browsertrix

Author	SHA1	Message	Date
Henry Wilkinson	f507f1d2ec	Fixes allowed actions for viewers and crawlers throughout the app (#1326 ) Closes #1294 ### Changes - `crawl-list` component - Adds a check if there are any items in the actions menu. If not, skip rendering the actions menu. - This allows us to give the component no actions! Currently required to remove them for viewers! - Collection Details - Hides "Remove from Collection" option for viewers - Crawls List - Removes the single "View Crawl Details" option from archived items for viewers - All the other actions were already set up correctly to be used by all roles! - Dashboard - Hides org settings gear icon button unless the user is an admin - Hides "Create New" dropdown for viewers - Workflow Details - Hides workflow edit icon button for viewers - Hides the "Delete Crawl" option in archived items for viewers - Hides the "Run Crawl" option for viewers - Workflow List - Hides all edit-related options for viewers, the only option now is copying tags - Removes the deactivate / delete options (were only visible when running a crawl) in the workflow list actions --------- Co-authored-by: Ilya Kreymer <ikreymer@gmail.com> Co-authored-by: sua yoo <sua@suayoo.com>	2023-11-17 14:41:21 -08:00
Ilya Kreymer	dfba4b3940	Replace partial_complete -> stopped_by_user or stopped_quota_reached + operator edge cases (#1368 ) - Adds two new crawl finished states, stopped_by_user and stopped_quota_reached - Tracking other possible 'stop reasons' in operator, though not making them distinct states for now. - Updated frontend with 'Stopped by User' and 'Stopped: Time Quota Reached', shown with same icon as current partial_complete - Added migration of partial_complete to either stopped_by_user or complete (no historical quota data available) - Addresses edge case in scaling: if crawl never scaled (no redis entry, no pod), automatically scale down - Edge case in status: if crawl is somehow 'canceled' but not deleted, immediately delete crawl object and begin finalizing. --------- Co-authored-by: Tessa Walsh <tessa@bitarchivist.net>	2023-11-14 11:17:16 -08:00
Ilya Kreymer	0935d43a97	exclusion optimizations: dynamic exclusions (part of #1216 ): (#1268 ) - instead of restarting crawler when exclusion added/removed, add a message to a redis list (per crawler instance) - no longer filtering existing queue on backend, now handled via crawler (implemented in 0.12.0 via webrecorder/browsertrix-crawler#408) - match response optimization: instead of returning first 1000 matches, limits response to 500K and returns however many matches fit in that response size (for optional pagination on frontend)	2023-11-06 09:36:25 -08:00
Tessa Walsh	38f32f11ea	Enforce quota and hard cap for monthly execution minutes (#1284 ) Fixes #1261 Closes #1092 The quota for monthly execution minutes is treated as a hard cap. Once it is exceeded, an alert indicating that an org has exceeded its monthly execution minutes will display and the user will be unable to start new crawls. Any running crawls will be stopped once the quota is exceeded. An execution minutes meter bar is also added in the Org Dashboard and displayed if a quota is set. More detail in #1305 which was merged into this branch. ## Changes - Enable setting 'maxExecMinutesPerMonth' in orgs list quotas by superadmin - Enforce quota by stopping crawls in operator once quota is reached - Show alert banner once execution time quota is hit: - Once quota is hit, disable Run Crawl buttons in frontend, return 403 message with `exec_minutes_quota_reached` detail in backend from crawl config `/run` endpoint, and don't run new workflows on creation (similar to storage quota) - Display execution time for crawls in the crawl details overview, immediately below - Show execution minutes meter on dashboard (from #1305) --------- Co-authored-by: Henry Wilkinson <henry@wilkinson.graphics> Co-authored-by: Ilya Kreymer <ikreymer@gmail.com> Co-authored-by: sua yoo <sua@webrecorder.org>	2023-10-26 15:38:51 -07:00
Henry Wilkinson	e274462ba0	Update tag spacing and styling for remove button (#1283 ) ### Context - Adds custom padding to each side based on if the tag is removable or not - Improves hover state for the remove button when the tag is focused - Adds padding to the remove button	2023-10-20 16:02:32 -07:00
Henry Wilkinson	40da1f8541	Make URLs in the settings viewer clickable, removes deeplinked titles (#1247 ) ### Changes - URLs on the config review pages are now links that open in a new tab - Does not do anything with the `Extra URLs in Scope` field (which we currently render as a regex so left that alone) - Hides / removes the previously deep-linked but now broken config section rendering.	2023-10-18 16:20:22 -07:00
Ilya Kreymer	9a2787f9c4	User refactor + remove fastapi_users dependency + update fastapi (#1290 ) Fixes #1050 Major refactor of the user/auth system to remove fastapi_users dependency. Refactors users.py to be standalone and adds new auth.py module for handling auth. UserManager now works similarly to other ops classes. The auth should be fully backwards compatible with fastapi_users auth, including accepting previous JWT tokens w/o having to re-login. The User data model in mongodb is also unchanged. Additional fixes: - allows updating fastapi to latest - add webhook docs to openapi (follow up to #1041) API changes: - Removes the `GET, PATCH, DELETE /users/<id>` endpoints, which were not in use before, as users are scoped to orgs. For deletion, probably auto-delete when user is removed from last org (to be implemented). - `/users/me-with-orgs` is renamed to just `/users/me/` - New `PUT /users/me/change-password` endpoint with password required to update password, fixes #1269, supersedes #1272 Frontend changes: - Fixes from #1272 to support new change password endpoint. --------- Co-authored-by: Tessa Walsh <tessa@bitarchivist.net> Co-authored-by: sua yoo <sua@suayoo.com>	2023-10-18 10:49:23 -07:00
sua yoo	4610d95cd7	Use org slug in place of UUIDs in app URLs (#1277 ) - Replaces org UUID in URL/browser location bar with org slug. - Refactor: Adds shared app state utility using https://sijakret.github.io/lit-shared-state/ to access org data from deep descendants. - Backwards compatible: org UUID URLs should auto-redirect to org slug URLs. - Show the org UUID in org settings general tab for use with APIs (Resolves #1258, Follows #1279)	2023-10-18 09:28:30 -07:00
sua yoo	6b897e281c	hotfix: display workflow list date as utc	2023-10-17 15:51:24 -07:00
Henry Wilkinson	0bd8748e68	Minor Workflow Creator UX Changes (#1267 ) - Adds `position: sticky` to the workflow creator / editor controls to affix them to the bottom of the screen, they are now always visible! - Renames "Extra URLs in Scope" to "Extra URL Prefixes in Scope" - Updates documentation accordingly - Adjusts casing for checkboxes - Adds the multiplication sign to the crawler instances settings to better communicate that they are increases in scale and not arbitrary numbers.	2023-10-13 16:55:54 -07:00
sua yoo	630c00c5b0	Enforce strong passwords in UI (#1266 )	2023-10-12 19:36:59 -07:00
sua yoo	f1dcc7e48a	Allow users to change display name and email (#1265 )	2023-10-11 13:42:41 -07:00
sua yoo	38efeccc25	Limit URL list entry to maximum URLs (#1242 ) - Limits URL list entry to 1,000 URLs - Limits additional URL list entry to 100 URLs - Shows first invalid URL in list in error message - Quick and dirty fix for long URLs wrapping: Show URLs in list on one line, with entire container scrolling --------- Co-authored-by: Henry Wilkinson <henry@wilkinson.graphics>	2023-10-03 21:02:32 -07:00
Tessa Walsh	bbdb7f8ce5	Require that all passwords are between 8 and 64 characters (#1239 ) - Require that all passwords are between 8 and 64 characters - Fixes account settings password reset form to only trigger logged-in event after successful password change. - Password validation can be extended within the UserManager's validate_password method to add or modify requirements. - Add tests for password validation	2023-10-03 18:57:46 -07:00
Tessa Walsh	b1ead614ee	Add --failOnFailedSeed checkbox to URL list workflows (#1236 ) - If set, and any of the seeds fails, the entire crawl is marked as a failure. - Add checkbox which adds --failOnFailedSeed to URL list workflows - Add 'Fail Crawl On Failed URL' to crawl workflow setup docs	2023-10-03 18:46:09 -07:00
sua yoo	df190e12b9	Show running workflow error logs (#1224 ) - Adds "Logs" tab to workflow detail - Shows error logs in expandable section in "Watch" tab - Show corresponding message (no logs yet or logs temporarily unavailable) when `/errors` returns 503 based on crawl state - text tweaks: use error logs instead of logs, change 'crawl start' -> 'crawl continue' in log message --------- Co-authored-by: Ilya Kreymer <ikreymer@gmail.com>	2023-10-03 00:03:21 -07:00
sua yoo	3fea4cabe2	Show storage meter even with no quota (#1240 ) - Displays how much storage items and browser profiles take up even when quota is not specified	2023-10-02 20:01:39 -07:00
sua yoo	941a75ef12	Separate seeds into a new endpoints (#1217 ) - Remove config.seeds from workflow and crawl detail endpoints - Add new paginated GET /crawls/{crawl_id}/seeds and /crawlconfigs/{cid}/seeds endpoints to retrieve seeds for a crawl or workflow - Include firstSeed in GET /crawlconfigs/{cid} endpoint (was missing before) - Modify frontend to fetch seeds from new /seeds endpoints with loading indicator --------- Co-authored-by: Tessa Walsh <tessa@bitarchivist.net>	2023-10-02 10:56:12 -07:00
Henry Wilkinson	e93f195d59	fix: Right Align Copy Buttons & `<btrix-desc-list>` vertical `width: 100%` (#1177 ) * Reorders actions, adds tooltip - All copy buttons on the collection share dialog are now on the right side - Adds a tooltip to tell the user the button opens the link in a new tab * Make vertical `desc-list` items fill 100% width of their parent container - Allows for better placement of items within the container - Adds horizontal padding to info bars * Right align copy button in item details page	2023-09-28 12:08:27 -07:00
sua yoo	e5cc70754e	Show org storage quotas in dashboard (#1210 ) - Displays storage quota in subdivided meter - Updates icon colors - Adds new <btrix-meter> component --------- Co-authored-by: Henry Wilkinson <henry@wilkinson.graphics>	2023-09-27 10:38:59 -07:00
sua yoo	730a160f75	New org home page dashboard (#1201 )	2023-09-21 19:20:08 -07:00
Tessa Walsh	9224f52f51	Remove config from list endpoints to speed up responses (#1193 ) * Remove config from list endpoints - Remove config field from workflow and crawl list endpoints - Add seedCount to CrawlConfigOut on backend and Workflow on frontend - Refactor CrawlConfig and CrawlConfigOut to extend CrawlConfigCore + CrawlConfigAdditional - Refactor workflow list in frontend to use firstSeed and seedCount - Frontend uses ListWorkflow type which is Omit<Workflow, "config">	2023-09-19 11:05:48 -05:00
Ilya Kreymer	c9c39d47b7	Scheduled Crawl Refactor: Handle via Operator + Add Skipped Crawls on Quota Reached (#1162 ) * use metacontroller's decoratorcontroller to create CrawlJob from Job * scheduled job work: - use existing job name for scheduled crawljob - use suspended job, set startTime, completionTime and succeeded status on job when crawljob is done - simplify cronjob template: remove job_image, cron_namespace, using same namespace as crawls, placeholder job image for cronjobs * move storage quota check to crawljob handler: - add 'skipped_quota_reached' as new failed status type - check for storage quota before checking if crawljob can be started, fail if not (check before any pods/pvcs created) * frontend: - show all crawls in crawl workflow, no need to filter by status - add 'skipped_quota_reached' status, show as 'Skipped (Quota Reached)', render same as failed * migration: make release namespace available as DEFAULT_NAMESPACE, delete old cronjobs in DEFAULT_NAMESPACE and recreate in crawlers namespace with new template	2023-09-12 13:05:43 -07:00
Tessa Walsh	9377a6f456	Issue all non-upload storage-quota-update events from LiteElement (#1151 ) - More specific toast notification error messages to the action being attempted - Single dismissable global banner shown when org storage is reached - Removed check for storage quota reached in `runNow`, since buttons are disabled in UI, and errors handled if request fails. - Allow creating new workflow when storage quota reached - More responsive storage quota updates: add storageQuotaReached to archived item replay.json, updates w/o reload when crawl pushes quota over limit - Modify LiteElement to check for storageQuotaReached on GET requests --------- Co-authored-by: sua yoo <sua@suayoo.com>	2023-09-11 18:17:48 -07:00
Tessa Walsh	d2ededc895	Add and enforce org storage quota (#1106 ) * Implement in backend - Track bytesStored in org - Add migration to pre-calculate based on size of crawlfiles and profilefiles - Add methods to increase or decrease org storage when crawl or profile files are added or deleted - Include storageQuotaReached boolean in API responses that alter storage - Don't start new crawls and fail uploads if storage quota reached * Implement in frontend - Add to orgs-list quotas - Update org's storageQuotaReached based on backend endpoint responses - Disable buttons when storage quota is met - Show toast notification when attempting to run a crawl when org storage quota is met	2023-09-07 12:45:43 -04:00
Henry Wilkinson	8850e35f7a	Changes "Crawls" → "Items" (#1145 )	2023-09-05 23:58:12 -04:00
Tessa Walsh	93573d0bfe	Use base10 for sizes in frontend (#1133 ) * Use base10 for sizes in frontend * Simplify renderSize	2023-09-05 21:35:20 -04:00
sua yoo	ff6650d481	Manage collection from archived item details (#1085 ) - Lists collections that an archived item belongs to in item detail view - Improves performance of collection add component --------- Co-authored-by: Tessa Walsh <tessa@bitarchivist.net>	2023-09-05 17:52:17 -04:00
Henry Wilkinson	1af796bd0e	fix: Terminology unification "crawls" & "archive data" → "items" (#1127 ) Co-authored-by: Ilya Kreymer <ikreymer@users.noreply.github.com>	2023-09-01 11:09:06 -04:00
Tessa Walsh	e667fe2e97	Add max crawl size option to backend and frontend (#1045 ) Backend: - add 'maxCrawlSize' to models and crawljob spec - add 'MAX_CRAWL_SIZE' to configmap - add maxCrawlSize to new crawlconfig + update APIs - operator: gracefully stop crawl if current size (from stats) exceeds maxCrawlSize - tests: add max crawl size tests Frontend: - Add Max Crawl Size text box to Limits tab - Users enter max crawl size in GB, convert to bytes - Add BYTES_PER_GB as constant for converting to bytes - docs: Crawl Size Limit to user guide workflow setup section Operator Refactor: - use 'status.stopping' instead of 'crawl.stopping' to indicate crawl is being stopped, as changing later has no effect in operator - add is_crawl_stopping() to return if crawl is being stopped, based on crawl.stopping or size or time limit being reached - crawlerjob status: store byte size under 'size', human readable size under 'sizeHuman' for clarity - size stat always exists so remove unneeded conditional (defaults to 0) - store raw byte size in 'size', human readable size in 'sizeHuman' Charts: - subchart: update crawlerjob crd in btrix-crds to show status.stopping instead of spec.stopping - subchart: show 'sizeHuman' property instead of 'size' - bump subchart version to 0.1.1 --------- Co-authored-by: Ilya Kreymer <ikreymer@gmail.com>	2023-08-26 22:00:37 -07:00
Tessa Walsh	ce5b52f8af	Add and enforce org maxPagesPerCrawl quota (#1044 )	2023-08-23 10:38:36 -04:00
sua yoo	54cf4f23e4	Paginate Workflows and refactor to use server-side queries (#1078 ) - Paginates Crawl Workflows when there are more than 10 workflows - Refactors workflow search and crawl search to use the same component - Adds sort by first seed, workflow creation date, and workflow modified date - Separates "last run" date from "modified" date - Update column layout into Name & Schedule (or Manual Run), Latest Crawl (<finish time> in <duration>), total size, and last modified (modified by and modified time)	2023-08-22 16:29:17 -07:00
Ilya Kreymer	223571b18b	exclusion regex: show unmodified regex string, avoid dropping the '\' when displaying escaped regexes (#1094 )	2023-08-22 10:16:23 -07:00
sua yoo	270e134359	Show details in crawl error log (#1079 ) Shows crawl error log details in a dialog. Since the detail object does not always follow a specific format, this iteration uses the detail key in uppercase as the label.	2023-08-15 21:14:08 -07:00
Ilya Kreymer	768d1181f8	frontend: fixes for queue / exclusions: (#1076 ) - fix 'Edit Crawler Instances' not showing up when crawl running - urlencode regex params to properly encode '+' - catch server-side regex error, display 'Invalid Regex'	2023-08-15 13:15:43 -07:00
sua yoo	89983542f9	Update archived item URLs (#1064 ) - Changes to URLs in "Crawling", "All Archived Items", and "Collections": - Rename Artifacts -> Items - Unifies view crawl view as loaded from All Archived Items and from Workflows - Includes redirect for /artifacts/uploads -> /items/uploads to support archiveweb.page usage	2023-08-14 18:28:37 -07:00
Ilya Kreymer	8ea3dd5dae	terminology tweaks in frontend: (part of #922 ) (#1062 ) * terminology tweaks in frontend: (part of #922) - use 'crawl workflow' instead of 'workflow' where possible - use 'replay' instead of 'replay crawl' - localization: rerun string extraction / processing - "Review Config" → "Review Settings" - "Workflow" → "Crawl Workflow" in error message --------- Co-authored-by: Henry Wilkinson <henry@wilkinson.graphics>	2023-08-09 15:38:58 -07:00
sua yoo	37733483d5	Standardize archived item filtering, sorting and labels (#1054 ) Frontend: - Renames list view to "All Archived Items" - Refactors fetches to use single all-crawls endpoints - Removes search by config ID for more search parity with uploads - Adds sort by size - Refactors property and method names to replace crawl* - Replaces remaining references to "crawl" in copy with "item" - Rename Upload Archive button to Upload WACZ - Fix focusout in item menu so menus close Backend: - Filter search values by type as well - Only get list of cids for crawls in search values - Don't list crawl/workflow ids in search values --------- Co-authored-by: Tessa Walsh <tessa@bitarchivist.net>	2023-08-09 12:13:55 -07:00
sua yoo	b494070e43	Collection share dialog + copy updates (#1056 ) - Always shows primary "Share" action button in Collection detail page. - Enables toggling shareable status and share info from dialog. Difference from mockups: I made the "Done" button neutral to differentiate from our submit action buttons in the dialog, since toggling will apply changes immediately. - Menu item: "Go to Public View"/"Go to Shareable View" -> "Visit Shareable URL". - Toggle label: "Make Collection Shareable" -> "Collection is Shareable". - Additional dialog copy: adds "This collection can be viewed by anyone with the link." under "Link to Share" and "Share this collection by embedding it into an existing webpage." under "Embed Collection". - Moves share status icon to its own column in list view. - Adds new syntax-highlighted code component that supports js and html. Co-authored-by: Henry Wilkinson <henry@wilkinson.graphics>	2023-08-09 10:12:46 -07:00
Anish Lakhwara	9236a07800	fix: run `yarn format` in frontend dir (#1043 )	2023-08-03 19:12:48 -07:00
sua yoo	62d3399223	Add info bar to Collection detail view (#1036 ) - Adds Collection info bar to detail view - Update "Web Captures" -> "Archived Items" - Updates Collection list columns to match - Refactors `btrix-desc-list` and usage in `workflow-details` to reuse horizontal info bar component	2023-08-03 16:58:56 -07:00
sua yoo	54e2b2c703	List web captures in Collection (#1024 ) - Adds tab for "Web Captures" in Collection detail view - Move Collection description under Replay section - Fixes app reloading when clicking into a Collection - Standardizes Web Capture list headers from "Finished" -> "Created Date"	2023-08-01 09:14:27 -07:00
Ilya Kreymer	06cf9c7cc3	add crawl ending states: 'generate-wacz', 'uploading-wacz', 'pending-wait' that occur after a crawl is finished or is being stopped (#1022 ) operator: ensure transitions from each of these states is supported, including to 'waiting_capacity' add extra check on stopping to avoid transitioning back to a running state after crawl is finished ui: add states to UI display, localization, add as active states fixes #263	2023-08-01 00:15:59 -07:00
Tessa Walsh	c21153255a	Rename notes to description in frontend and backend (#1011 ) - Rename crawl notes to description - Add migration renaming notes -> description - Stop inheriting workflow description in crawl - Update frontend to replace crawl/upload notes with description - Remove setting of config description from crawl list - Adjust tests for changes	2023-07-26 13:00:04 -07:00
sua yoo	75b011f951	Upload WACZ via UI (#992 ) - Users can now upload .WACZ archives from the "Archived Data" page. - Can specify name, description, tags and collection(s) to add upload to - Show progress of upload - Support canceling upload	2023-07-21 16:45:52 +02:00
Tessa Walsh	d5c3a8519f	Add crawler Use Sitemap option to Browsertrix Cloud (#978 ) * Add user-guide docs for Use Sitemap option --------- Co-authored-by: Henry Wilkinson <henry@wilkinson.graphics>	2023-07-19 13:57:52 -04:00
Ilya Kreymer	8eeb66e11f	Frontend more upload path fixes (#961 ) * additional fixes for #935: - don't use artifactType for detail pages, ensure correct artifact selected based on path * naming tweaks: - from uploads detail, return to 'All Uploads' with filter - from crawls detail, return to 'All Crawls' with filter - rename general to 'All Archived Data'	2023-07-07 15:41:03 -07:00
Ilya Kreymer	d3a757e20b	partial fix for: #935 : (#960 ) - add route for /artifacts/upload/<id> to be used for uploads - link uploads to /artifacts/upload/<id> instead of /artifacts/crawl/<id>	2023-07-07 14:23:26 -07:00
sua yoo	de4b18aa67	List crawls, uploads, and all objects in UI (#941 ) - Adds top-level "Archived Data" view, replacing "Finished Crawls" and moving it as "Crawls" into view - Adds list for viewing all artifacts/data - Adds list for viewing all uploaded crawls - Updates crawl detail view to show upload details - Edit upload metadata, including 'name' - Delete uploads --------- Co-authored-by: Ilya Kreymer <ikreymer@gmail.com> Co-authored-by: Henry Wilkinson <henry@wilkinson.graphics>	2023-07-07 13:20:28 -07:00
Henry Wilkinson	8a240ad044	Fixes z-index (#939 )	2023-07-04 23:05:09 -04:00

181 Commits