browsertrix

Author	SHA1	Message	Date
Emma Segal-Grossman	b15c5ccddd	ESLint & Typescript fixes (#1407 ) Closes #1405 - Properly uses `typescript-eslint`: we were missing the preset from it, so some of the default `eslint` rules (that don't properly work with typescript) were being applied and causing false positives - I also moved the `eslint` config into its own file, and enabled `typescript-eslint`'s type-awareness, so that we can enable more type-aware rules in the future if we like - Adds `ts-lit-plugin` to the typescript config, which _hopefully_ will allow us to catch issues during build (in CI) - It looks like `ts-lit-plugin` is sort of abandonware at the moment, and unfortunately _doesn't_ actually work for this purpose right now, but the lit team is working on a replacement here: https://www.npmjs.com/package/@lit-labs/analyzer - Adds `fork-ts-checker-webpack-plugin`, which allows the typescript checking process to be run on a separate forked thread in Webpack, which can help speed up builds & checking - Enables incremental type checking for better speed - Fixes a whole bunch of `eslint`-auto-fixable issues (unused imports and variables, some type issues, etc) - Fixes a bunch of `lit-analyzer` issues (mostly attribute naming, some type issues as well) - Fixes various other type issues: - Improves type safety in a bunch of places, notably anywhere `apiFetch` and `APIPaginatedList` are used - Removes some `any`s	2023-11-24 12:32:53 -05:00
sua yoo	006ce5a013	Prompt user to confirm workflow crawl deletion (#1401 ) - Adds confirmation dialog for workflow crawls - Changes archived item confirmation from default browser dialog to shoelace dialog - Increase dialog title size - Out of scope: Localizes other workflow detail confirmation buttons - Out of scope: Reword missed "Archive" reference in file uploader	2023-11-22 12:40:49 -08:00
Emma Segal-Grossman	232a29f7a2	Merge pull request #1381 from webrecorder/1379-refactor-components-index-file Refactor components index file, and add better vscode extensions and settings	2023-11-20 16:59:13 -05:00
Henry Wilkinson	f507f1d2ec	Fixes allowed actions for viewers and crawlers throughout the app (#1326 ) Closes #1294 ### Changes - `crawl-list` component - Adds a check if there are any items in the actions menu. If not, skip rendering the actions menu. - This allows us to give the component no actions! Currently required to remove them for viewers! - Collection Details - Hides "Remove from Collection" option for viewers - Crawls List - Removes the single "View Crawl Details" option from archived items for viewers - All the other actions were already set up correctly to be used by all roles! - Dashboard - Hides org settings gear icon button unless the user is an admin - Hides "Create New" dropdown for viewers - Workflow Details - Hides workflow edit icon button for viewers - Hides the "Delete Crawl" option in archived items for viewers - Hides the "Run Crawl" option for viewers - Workflow List - Hides all edit-related options for viewers, the only option now is copying tags - Removes the deactivate / delete options (were only visible when running a crawl) in the workflow list actions --------- Co-authored-by: Ilya Kreymer <ikreymer@gmail.com> Co-authored-by: sua yoo <sua@suayoo.com>	2023-11-17 14:41:21 -08:00
emma	bc6f362861	update pages as well	2023-11-15 18:24:50 -05:00
Ilya Kreymer	0935d43a97	exclusion optimizations: dynamic exclusions (part of #1216 ): (#1268 ) - instead of restarting crawler when exclusion added/removed, add a message to a redis list (per crawler instance) - no longer filtering existing queue on backend, now handled via crawler (implemented in 0.12.0 via webrecorder/browsertrix-crawler#408) - match response optimization: instead of returning first 1000 matches, limits response to 500K and returns however many matches fit in that response size (for optional pagination on frontend)	2023-11-06 09:36:25 -08:00
Tessa Walsh	38f32f11ea	Enforce quota and hard cap for monthly execution minutes (#1284 ) Fixes #1261 Closes #1092 The quota for monthly execution minutes is treated as a hard cap. Once it is exceeded, an alert indicating that an org has exceeded its monthly execution minutes will display and the user will be unable to start new crawls. Any running crawls will be stopped once the quota is exceeded. An execution minutes meter bar is also added in the Org Dashboard and displayed if a quota is set. More detail in #1305 which was merged into this branch. ## Changes - Enable setting 'maxExecMinutesPerMonth' in orgs list quotas by superadmin - Enforce quota by stopping crawls in operator once quota is reached - Show alert banner once execution time quota is hit: - Once quota is hit, disable Run Crawl buttons in frontend, return 403 message with `exec_minutes_quota_reached` detail in backend from crawl config `/run` endpoint, and don't run new workflows on creation (similar to storage quota) - Display execution time for crawls in the crawl details overview, immediately below - Show execution minutes meter on dashboard (from #1305) --------- Co-authored-by: Henry Wilkinson <henry@wilkinson.graphics> Co-authored-by: Ilya Kreymer <ikreymer@gmail.com> Co-authored-by: sua yoo <sua@webrecorder.org>	2023-10-26 15:38:51 -07:00
sua yoo	4610d95cd7	Use org slug in place of UUIDs in app URLs (#1277 ) - Replaces org UUID in URL/browser location bar with org slug. - Refactor: Adds shared app state utility using https://sijakret.github.io/lit-shared-state/ to access org data from deep descendants. - Backwards compatible: org UUID URLs should auto-redirect to org slug URLs. - Show the org UUID in org settings general tab for use with APIs (Resolves #1258, Follows #1279)	2023-10-18 09:28:30 -07:00
Henry Wilkinson	0bd8748e68	Minor Workflow Creator UX Changes (#1267 ) - Adds `position: sticky` to the workflow creator / editor controls to affix them to the bottom of the screen, they are now always visible! - Renames "Extra URLs in Scope" to "Extra URL Prefixes in Scope" - Updates documentation accordingly - Adjusts casing for checkboxes - Adds the multiplication sign to the crawler instances settings to better communicate that they are increases in scale and not arbitrary numbers.	2023-10-13 16:55:54 -07:00
Tessa Walsh	e9bac4c088	API delete endpoint improvements (#1232) - Applies user permissions check before deleting anything in all /delete endpoints - Shuts down running crawls before deleting anything in /all-crawls/delete as well as /crawls/delete - Splits delete_list.crawl_ids into crawls and upload lists at the same time as checks in /all-crawls/delete - Updates frontend notification message to "Only org owners can delete other users' archived items." when a crawler user attempts to delete another user's archived items	2023-10-03 13:05:00 -07:00
sua yoo	df190e12b9	Show running workflow error logs (#1224 ) - Adds "Logs" tab to workflow detail - Shows error logs in expandable section in "Watch" tab - Show corresponding message (no logs yet or logs temporarily unavailable) when `/errors` returns 503 based on crawl state - text tweaks: use error logs instead of logs, change 'crawl start' -> 'crawl continue' in log message --------- Co-authored-by: Ilya Kreymer <ikreymer@gmail.com>	2023-10-03 00:03:21 -07:00
sua yoo	941a75ef12	Separate seeds into new endpoints (#1217) - Remove config.seeds from workflow and crawl detail endpoints - Add new paginated GET /crawls/{crawl_id}/seeds and /crawlconfigs/{cid}/seeds endpoints to retrieve seeds for a crawl or workflow - Include firstSeed in GET /crawlconfigs/{cid} endpoint (was missing before) - Modify frontend to fetch seeds from new /seeds endpoints with loading indicator --------- Co-authored-by: Tessa Walsh <tessa@bitarchivist.net>	2023-10-02 10:56:12 -07:00
Henry Wilkinson	e93f195d59	fix: Right Align Copy Buttons & `<btrix-desc-list>` vertical `width: 100%` (#1177) * Reorders actions, adds tooltip - All copy buttons on the collection share dialog are now on the right side - Adds a tooltip to tell the user the button opens the link in a new tab * Make vertical `desc-list` items fill 100% width of their parent container - Allows for better placement of items within the container - Adds horizontal padding to info bars * Right align copy button in item details page	2023-09-28 12:08:27 -07:00
sua yoo	730a160f75	New org home page dashboard (#1201 )	2023-09-21 19:20:08 -07:00
sua yoo	d05a27e8a4	Separate "run now" switch from scheduling options (#1175 )	2023-09-21 19:18:57 -07:00
Ilya Kreymer	c9c39d47b7	Scheduled Crawl Refactor: Handle via Operator + Add Skipped Crawls on Quota Reached (#1162 ) * use metacontroller's decoratorcontroller to create CrawlJob from Job * scheduled job work: - use existing job name for scheduled crawljob - use suspended job, set startTime, completionTime and succeeded status on job when crawljob is done - simplify cronjob template: remove job_image, cron_namespace, using same namespace as crawls, placeholder job image for cronjobs * move storage quota check to crawljob handler: - add 'skipped_quota_reached' as new failed status type - check for storage quota before checking if crawljob can be started, fail if not (check before any pods/pvcs created) * frontend: - show all crawls in crawl workflow, no need to filter by status - add 'skipped_quota_reached' status, show as 'Skipped (Quota Reached)', render same as failed * migration: make release namespace available as DEFAULT_NAMESPACE, delete old cronjobs in DEFAULT_NAMESPACE and recreate in crawlers namespace with new template	2023-09-12 13:05:43 -07:00
Tessa Walsh	9377a6f456	Issue all non-upload storage-quota-update events from LiteElement (#1151 ) - More specific toast notification error messages to the action being attempted - Single dismissable global banner shown when org storage is reached - Removed check for storage quota reached in `runNow`, since buttons are disabled in UI, and errors handled if request fails. - Allow creating new workflow when storage quota reached - More responsive storage quota updates: add storageQuotaReached to archived item replay.json, updates w/o reload when crawl pushes quota over limit - Modify LiteElement to check for storageQuotaReached on GET requests --------- Co-authored-by: sua yoo <sua@suayoo.com>	2023-09-11 18:17:48 -07:00
Tessa Walsh	d2ededc895	Add and enforce org storage quota (#1106 ) * Implement in backend - Track bytesStored in org - Add migration to pre-calculate based on size of crawlfiles and profilefiles - Add methods to increase or decrease org storage when crawl or profile files are added or deleted - Include storageQuotaReached boolean in API responses that alter storage - Don't start new crawls and fail uploads if storage quota reached * Implement in frontend - Add to orgs-list quotas - Update org's storageQuotaReached based on backend endpoint responses - Disable buttons when storage quota is met - Show toast notification when attempting to run a crawl when org storage quota is met	2023-09-07 12:45:43 -04:00
Ilya Kreymer	768d1181f8	frontend: fixes for queue / exclusions: (#1076 ) - fix 'Edit Crawler Instances' not showing up when crawl running - urlencode regex params to properly encode '+' - catch server-side regex error, display 'Invalid Regex'	2023-08-15 13:15:43 -07:00
sua yoo	89983542f9	Update archived item URLs (#1064 ) - Changes to URLs in "Crawling", "All Archived Items", and "Collections": - Rename Artifacts -> Items - Unifies view crawl view as loaded from All Archived Items and from Workflows - Includes redirect for /artifacts/uploads -> /items/uploads to support archiveweb.page usage	2023-08-14 18:28:37 -07:00
sua yoo	62d3399223	Add info bar to Collection detail view (#1036 ) - Adds Collection info bar to detail view - Update "Web Captures" -> "Archived Items" - Updates Collection list columns to match - Refactors `btrix-desc-list` and usage in `workflow-details` to reuse horizontal info bar component	2023-08-03 16:58:56 -07:00
Ilya Kreymer	8eeb66e11f	Frontend more upload path fixes (#961 ) * additional fixes for #935: - don't use artifactType for detail pages, ensure correct artifact selected based on path * naming tweaks: - from uploads detail, return to 'All Uploads' with filter - from crawls detail, return to 'All Crawls' with filter - rename general to 'All Archived Data'	2023-07-07 15:41:03 -07:00
sua yoo	de4b18aa67	List crawls, uploads, and all objects in UI (#941 ) - Adds top-level "Archived Data" view, replacing "Finished Crawls" and moving it as "Crawls" into view - Adds list for viewing all artifacts/data - Adds list for viewing all uploaded crawls - Updates crawl detail view to show upload details - Edit upload metadata, including 'name' - Delete uploads --------- Co-authored-by: Ilya Kreymer <ikreymer@gmail.com> Co-authored-by: Henry Wilkinson <henry@wilkinson.graphics>	2023-07-07 13:20:28 -07:00
Tessa Walsh	29a6f0f6bc	Fix links in watch crawl after workflow crawl completes (#943 )	2023-07-06 15:04:26 -07:00
Tessa Walsh	bd6dc79449	Add frontend support for auto-adding collections to workflows (#916 ) - Adds collections search and list to workflow editor - Adds collections to workflow details component - Adds namePrefix filter to backend GET /orgs/{oid}/collections endpoint to support case-insensitive searching of collections - Adds documentation for new setting --------- Co-authored-by: Henry Wilkinson <henry@wilkinson.graphics>	2023-06-12 18:18:05 -07:00
Henry Wilkinson	2364433932	Admin Panel Minor Frontend Style Updates (#915 ) - Unifies trash icons on all pages to use trash3 (there were a few stragglers!) - Brings styling of org quotas dialogue in-line with the rest of our dialogues - Adds missing localization strings - Swaps button with icon button to match table row action styling elsewhere	2023-06-10 19:21:34 -07:00
sua yoo	66b3befef9	Frontend collections beta UI (#886 ) - Support for creating new collections and editing existing collections - Can select crawling workflows which adds entire workflow, and then deselect individual crawls - Can edit existing collections and add more crawls - Can view, create and delete collections via new Collections top-level nav entry	2023-06-06 17:52:01 -07:00
Ilya Kreymer	00fb8ac048	Concurrent Crawl Limit (#874) concurrent crawl limits: (addresses #866) - support limits on concurrent crawls that can be run within a single org - change 'waiting' state to 'waiting_org_limit' for concurrent crawl limit and 'waiting_capacity' for capacity-based limits orgs: - add 'maxConcurrentCrawl' to new 'quotas' object on orgs - add /quotas endpoint for updating quotas object operator: - add all crawljobs as related, appear to be returned in creation order - operator: if concurrent crawl limit set, ensures current job is in the first N set of crawljobs (as provided via 'related' list of crawljob objects) before it can proceed to 'starting', otherwise set to 'waiting_org_limit' - api: add org /quotas endpoint for configuring quotas - remove 'new' state, always start with 'starting' - crawljob: add 'oid' to crawljob spec and label for easier querying - more stringent state transitions: add allowed_from to set_state() - ensure state transitions only happen from allowed states, while failed/canceled can happen from any state - ensure finished and state synced from db if transition not allowed - add crawl indices by oid and cid frontend: - show different waiting states on frontend: 'Waiting (Crawl Limit)' and 'Waiting (At Capacity)' - add gear icon on orgs admin page - and initial popup for setting org quotas, showing all properties from org 'quotas' object tests: - add concurrent crawl limit nightly tests - fix state waiting -> waiting_capacity - ci: add logging of operator output on test failure	2023-05-30 15:38:03 -07:00
Henry Wilkinson	f788934ef5	Fix copy tags button disabling when no tags on Crawl Details page (#877 )	2023-05-24 12:30:31 -04:00
Tessa Walsh	bd8b306fbd	Improve sorting workflows by lastUpdated (#826) * Precompute config crawl stats Includes a database migration to move previously dynamically computed crawl stats for workflows into the CrawlConfig model. * Add lastRun sorting option and enable it by default * Add modified as final sort key to order non-run workflows * Remove currCrawl* fields and update frontend accordingly * Add isCrawlRunning field to backend and use in frontend	2023-05-22 18:42:30 -04:00
sua yoo	821fbc12d8	Upgrade Shoelace to stable version (v2) (#856 )	2023-05-22 10:01:48 -07:00
sua yoo	b5781c8869	Fix workflow edit back button (#857 )	2023-05-17 12:07:12 -07:00
sua yoo	f250293794	Fix workflow edit page not loading (#848 ) * fix workflow not loading * don't add hash if editing * remove controller	2023-05-12 07:33:35 +02:00
sua yoo	98d82184e6	Fix superadmin running crawls views (#846 ) - Updates superadmin "Running Crawls" to show active crawls (starting, waiting, running, stopping) and sort by start by default - Navigates to crawl workflow watch view on clicking crawl item - Adds "Copy Crawl ID" to crawl actions for easy paste into "Jump to crawl" - Navigates to crawl workflow watch when jumping to crawl	2023-05-11 08:15:52 +02:00
sua yoo	a6435ae3d0	Improve Workflow Detail tab and button UX (#840 ) - Adds primary action button next to "Actions" dropdown - Switches "Edit Workflow Settings" button to icon button - Redirects user to "Watch Crawl" tab when starting crawl - Now uses crawl ID from `data.started` in API `/run` response for more responsive UI - Keeps "Watch Crawl" tab navigation button in list but disable when crawl is not running - Also handles watch view when workflow is not running to cover navigational edge cases - Adds banner in "Crawls" list to direct users to the Watch Crawl when workflow is running - Shows notification when crawl is done to make redirect to Crawls tab smoother - Uses workflow scale when updating crawl scale - Removes "All" from "View: All Finished Crawls" on Finished Crawl page for wording consistency	2023-05-11 02:57:38 +02:00
sua yoo	42794cad46	Add stop crawl confirmation dialog (#841 ) * switch dialog control * wait for workflow update to complete before showing dialog * add stop dialog * close scale after save * update crawl text	2023-05-10 07:21:16 +02:00
Ilya Kreymer	82b21b6813	frontend crawl stopping improvements (#836 ) (#838 ) * frontend crawl stopping improvements (#836) - support new backend 'stopping' property - for now, keep 'stopping' indicator state when crawl is running but stopping set to true	2023-05-08 23:52:49 -07:00
Ilya Kreymer	2cae065c46	Add Waiting state on the backend and frontend (#839 ) * operator: add waiting state - add pods as related objects - inspect pod status, set crawl status to 'waiting' if no pods are running frontend: - frontend support for 'waiting' state - show waiting icon from mocks --------- Co-authored-by: Henry Wilkinson <henry@wilkinson.graphics>	2023-05-08 17:05:01 -07:00
sua yoo	0d23b45dac	Crawl workflow detail page improvements (#823 ) Resolves #817 - Adds relevant action buttons to each Workflow detail tab header - Adds "Delete" action menu item to crawls in Crawls tab - Prevent automatically switching to "Watch" tab after running crawl from detail page - Removes "Stop" confirmation prompt and only shows "Cancel" confirmation prompt if there are one or more pages crawled - Replaces "Cancel" confirmation prompt with web component dialog (partially addresses Switch to in-page dialogue boxes #619) - Fixes hash routing to fix going back with browser back button	2023-05-05 13:50:45 -07:00
Henry Wilkinson	23e398d327	Icon updates - Changes `trash` for `trash3` which I believe wasn't originally available in the version of bootstrap-icons we were using but now it is and I like the tapered edges better :P - Makes browser profiles action button small to fit with the rest of the dropdown components used elsewhere - Changes previous file-earmark delete icon to trash icons used everywhere else for delete actions	2023-05-01 03:26:34 -04:00
sua yoo	e6e46b522a	hotfix: prevent polling during workflow edit	2023-04-26 13:41:41 -07:00
sua yoo	7888c4fde3	Frontend crawl workflows rework (#775 )	2023-04-25 14:16:07 -07:00
sua yoo	1458e2cdd9	hotfix: delete crawl workflow without crawls	2023-04-24 15:18:20 -07:00
Ilya Kreymer	88497d2a64	text: rename workflowuration -> workflow (#741 )	2023-04-04 08:48:06 -07:00
Tessa Walsh	4724754efc	Filter and sort crawl and workflow list API endpoints in backend (#724 ) * Re-implement pagination and paginate crawlconfig revs First step toward simplifying pagination to set us up for sorting and filtering of list endpoints. This commit removes fastapi-pagination as a dependency. * Migrate all HttpUrl seeds to Seeds This commit also updates the frontend to always use Seeds and to fix display issues resulting from the change. * Filter and sort crawls and workflows Crawls: - Filter by createdBy (via userid param) - Filter by state (comma-separated string for multiple values) - Filter by first_seed, name, description - Sort by started, finished, fileSize, firstSeed - Sort descending by default to match frontend Workflows: - Filter by createdBy (formerly userid) and modifiedBy - Filter by first_seed, name, description - Sort by created, modified, firstSeed, lastCrawlTime * Add crawlconfigs search-values API endpoint and test	2023-03-28 17:55:40 -04:00
sua yoo	8ca4276c57	Migrate crawl config frontend -> workflow (#686 )	2023-03-10 11:39:42 -08:00

96 Commits