browsertrix

Author	SHA1	Message	Date
Ilya Kreymer	6dca2f1c03	supports overriding the replayweb.page version without having to be r… (#1122 ) * supports overriding the replayweb.page version without having to be rebuild frontend image: - ensures 'rwp_base_url' from helm chart is passed to nginx - ensures both ui.js and sw.js are loaded based on nginx environment variable, not hard-coded - ui.js loaded via redirect from new /replay/ui.js path - pin RWP to known working release in default values.yaml - remove RWP_BASE_URL from Dockerfile, no longer needed, set via chart env var - set default RWP_BASE_URL for devserver to use CDN - set RWP version to 1.8.11	2023-09-05 20:10:21 -04:00
sua yoo	ff6650d481	Manage collection from archived item details (#1085 ) - Lists collections that an archived item belongs to in item detail view - Improves performance of collection add component --------- Co-authored-by: Tessa Walsh <tessa@bitarchivist.net>	2023-09-05 17:52:17 -04:00
Henry Wilkinson	1af796bd0e	fix: Terminology unification "crawls" & "archive data" → "items" (#1127 ) Co-authored-by: Ilya Kreymer <ikreymer@users.noreply.github.com>	2023-09-01 11:09:06 -04:00
Tessa Walsh	e667fe2e97	Add max crawl size option to backend and frontend (#1045 ) Backend: - add 'maxCrawlSize' to models and crawljob spec - add 'MAX_CRAWL_SIZE' to configmap - add maxCrawlSize to new crawlconfig + update APIs - operator: gracefully stop crawl if current size (from stats) exceeds maxCrawlSize - tests: add max crawl size tests Frontend: - Add Max Crawl Size text box Limits tab - Users enter max crawl size in GB, convert to bytes - Add BYTES_PER_GB as constant for converting to bytes - docs: Crawl Size Limit to user guide workflow setup section Operator Refactor: - use 'status.stopping' instead of 'crawl.stopping' to indicate crawl is being stopped, as changing later has no effect in operator - add is_crawl_stopping() to return if crawl is being stopped, based on crawl.stopping or size or time limit being reached - crawlerjob status: store byte size under 'size', human readable size under 'sizeHuman' for clarity - size stat always exists so remove unneeded conditional (defaults to 0) - store raw byte size in 'size', human readable size in 'sizeHuman' Charts: - subchart: update crawlerjob crd in btrix-crds to show status.stopping instead of spec.stopping - subchart: show 'sizeHuman' property instead of 'size' - bump subchart version to 0.1.1 --------- Co-authored-by: Ilya Kreymer <ikreymer@gmail.com>	2023-08-26 22:00:37 -07:00
Ilya Kreymer	2da6c1c905	1.6.3 Fixes - Fix workflow sort order for Latest Crawl + 'Remove From Collection' action menu on archived items in collections (#1113 ) * fix latest crawl (lastRun) sort: - don't cast 'started' value to string when setting as starting crawl time (regression from #937) - caused incorrect sorting as finished crawl time was a datetime, while starting crawl time was a string - move updated config crawl info in one place, simplify to avoid returning started time altogether, just set directly - pass mdb crawlconfigs and crawls collections directly to add_new_crawl() function - fixes #1108 * Add dropdown menu containing 'Remove from Collection' to archived items in collection view (#1110) - Enables users to remove an item from a collection from the collection detail view - menu was previously missing - Fixes: #1102 (missing dropdown menu) by making use of the inactive menu trigger button. - Updates collection items page size to match "Archived Items" page size (20 items per page) --------- Co-authored-by: sua yoo <sua@webrecorder.org>	2023-08-25 21:08:47 -07:00
Anish Lakhwara	8b16124675	feat: implement 'collections' array with {name, id} for archived item details (#1098 ) - rename 'collections' -> 'collectionIds', adding migration 0014 - only populate 'collections' array with {name, id} pair for get_crawl() / single archived item path, but not for aggregate/list methods - remove Crawl.get_crawl(), redundant with BaseCrawl.get_crawl() version - ensure _files_to_resources returns an empty [] instead of none if empty (matching BaseCrawl.get_crawl() behavior to Crawl.get_crawl()) - tests: update tests to use collectionIds for id list, add 'collections' for {name, id} test - frontend: change Crawl object to have collectionIds instead of collections --------- Co-authored-by: Ilya Kreymer <ikreymer@gmail.com>	2023-08-25 00:26:46 -07:00
Ilya Kreymer	989ed2a8da	Use Shared Services for Crawling, Redis, Profile Browsers (#1088 ) * refactor to use shared role-based service shared across pods: - 'crawler' service for all crawler screencasting, scales 0 .. N with crawler-<ID>-N.crawl - 'redis' service for all redis access, redis-<ID>-0.redis - 'browser' service for all browser access (profile browsers), browser-<ID>-0.browser - don't create a new service per crawl/profile at all - enable 'publishNotReadyAddresses' for potentially faster resolving, esp for redis - remove service as type managed by operator as no longer creating services dynamically - remove frontend var CRAWLER_SVC_SUFFIX, suffix always '.crawler' to match crawler service name	2023-08-24 20:08:53 -07:00
Ilya Kreymer	e7f2d93f80	bump version to 1.7.0-beta.0	2023-08-23 12:03:45 -07:00
Tessa Walsh	ce5b52f8af	Add and enforce org maxPagesPerCrawl quota (#1044 )	2023-08-23 10:38:36 -04:00
sua yoo	54cf4f23e4	Paginate Workflows and refactor to use server-side queries (#1078 ) - Paginates Crawl Workflows when there are more than 10 workflows - Refactors workflow search and crawl search to use the same component - Adds sort by first seed, workflow creation date, and workflow modified date - Separates "last run" date from "modified" date - Update column layout into Name & Schedule (or Manual Run), Latest Crawl (<finish time> in <duration>), total size, and last modified (modified by and modified time)	2023-08-22 16:29:17 -07:00
Ilya Kreymer	223571b18b	exclusion regex: show unmodified regex string, avoid dropping the '\' when displaying escaped regexes (#1094 )	2023-08-22 10:16:23 -07:00
Ilya Kreymer	422452b5c1	bump to 1.6.2	2023-08-18 18:27:37 -07:00
sua yoo	6044486190	Add button to download error logs (#1080 ) * add button to download logs * render if logs are present * add icon	2023-08-15 21:14:32 -07:00
sua yoo	270e134359	Show details in crawl error log (#1079 ) Shows crawl error log details in a dialog. Since the detail object does not always follow a specific format, this iteration uses the detail key in uppercase as the label.	2023-08-15 21:14:08 -07:00
Ilya Kreymer	768d1181f8	frontend: fixes for queue / exclusions: (#1076 ) - fix 'Edit Crawler Instances' not showing up when crawl running - urlencode regex params to properly encode '+' - catch server-side regex error, display 'Invalid Regex'	2023-08-15 13:15:43 -07:00
sua yoo	4c74fadf91	Update frontend local dev guide (#1073 ) - Clarifies use case for frontend development server - Fixes incorrect sample API URLs - Adds additional detail around requirements and quickstart - Links back to docs from frontend README --------- Co-authored-by: Henry Wilkinson <henry@wilkinson.graphics> Co-authored-by: Ilya Kreymer <ikreymer@users.noreply.github.com>	2023-08-15 12:03:39 -07:00
sua yoo	89983542f9	Update archived item URLs (#1064 ) - Changes to URLs in "Crawling", "All Archived Items", and "Collections": - Rename Artifacts -> Items - Unifies view crawl view as loaded from All Archived Items and from Workflows - Includes redirect for /artifacts/uploads -> /items/uploads to support archiveweb.page usage	2023-08-14 18:28:37 -07:00
sua yoo	ffd0e525d9	Webpack config improvements (#1063 ) - Upgrades webpack and webpack-dev-server for bugfixes and performance updates - Removes unnecessary file watching - Enables persistent build cache in dev - Switches to faster dev source map	2023-08-11 13:16:24 -07:00
Ilya Kreymer	d93ddaf620	bump version to 1.6.1	2023-08-11 12:50:41 -07:00
Ilya Kreymer	35ab6d6df6	bump to 1.6.0!	2023-08-09 15:40:27 -07:00
Ilya Kreymer	8ea3dd5dae	terminology tweaks in frontend: (part of #922 ) (#1062 ) * terminology tweaks in frontend: (part of #922) - use 'crawl workflow' instead of 'workflow' where possible - use 'replay' instead of 'replay crawl' - localization: rerun string extraction / processing - "Review Config" → "Review Settings" - "Workflow" → "Crawl Workflow" in error message --------- Co-authored-by: Henry Wilkinson <henry@wilkinson.graphics>	2023-08-09 15:38:58 -07:00
sua yoo	37733483d5	Standardize archived item filtering, sorting and labels (#1054 ) Frontend: - Renames list view to "All Archived Items" - Refactors fetches to use single all-crawls endpoints - Removes search by config ID for more search parity with uploads - Adds sort by size - Refactors property and method names to replace crawl* - Replaces remaining references to "crawl" in copy with "item"' - Rename Upload Archive button to Upload WACZ - Fix focusout in item menu so menus close Backend: - Filter search values by type as well - Only get list of cids for crawls in search values - Don't list crawl/workflow ids in search values --------- Co-authored-by: Tessa Walsh <tessa@bitarchivist.net>	2023-08-09 12:13:55 -07:00
Ilya Kreymer	7a8f370bc2	bump version to 1.6.0-beta.4 for testing	2023-08-09 12:09:37 -07:00
Ilya Kreymer	38f67a6cc0	Optimize Frontend Image Build on CI (#1057 ) * Always run yarn only on build platform with --platform=$BUILDPLATFORM * Remove optional dependencies (playwright + chromium) from build with --ignore-optional and move some devDependencies to be optional * Disable husky pre-commit hook checks on frontend Co-authored-by: sua yoo <sua@suayoo.com>	2023-08-09 12:06:20 -07:00
sua yoo	b494070e43	Collection share dialog + copy updates (#1056 ) - Always shows primary "Share" action button in Collection detail page. - Enables toggling shareable status and share info from dialog. Difference from mockups: I made the "Done" button neutral to differentiate from our submit action buttons in the dialog, since toggling will apply changes immediately. - Menu item: "Go to Public View"/"Go to Shareable View" -> "Visit Shareable URL". - Toggle label: "Make Collection Shareable" -> "Collection is Shareable". - Additional dialog copy: adds "This collection can be viewed by anyone with the link." under "Link to Share" and "Share this collection by embedding it into an existing webpage." under "Embed Collection". - Moves share status icon to its own column in list view. - Adds new syntax-highlighted code component that supports js and html. Co-authored-by: Henry Wilkinson <henry@wilkinson.graphics>	2023-08-09 10:12:46 -07:00
Anish Lakhwara	9236a07800	fix: run `yarn format` in frontend dir (#1043 )	2023-08-03 19:12:48 -07:00
Ilya Kreymer	362afa47bd	Support for Public / Shareable Collections (#1038 ) * collections: support toggling collections public/private, viewable via RWP - backend: add 'public' to collection model, support patching to update - backend: add .../collections/<id>/public/replay.json for public access - backend: add CORS handling for public endpoint - frontend: support 'make shareable / make private' dropdown actions on collection detail + collection list views - frontend: show shareable / private icons by collection name on detail + list views - frontend: link to replayweb.page for standalone browsing - frontend: add embed code popup when a collection is shareable - refer to public collections as 'shareable' for now --------- Co-authored-by: Henry Wilkinson <henry@wilkinson.graphics>	2023-08-03 19:11:01 -07:00
sua yoo	62d3399223	Add info bar to Collection detail view (#1036 ) - Adds Collection info bar to detail view - Update "Web Captures" -> "Archived Items" - Updates Collection list columns to match - Refactors `btrix-desc-list` and usage in `workflow-details` to reuse horizontal info bar component	2023-08-03 16:58:56 -07:00
Anish Lakhwara	af09d56ef6	Merge pull request #1035 from webrecorder/backend-init feat: Display waiting message while backend is initializing	2023-08-02 17:39:47 -07:00
Anish Lakhwara	fa58e77167	fix: remove strange character?	2023-08-02 17:34:09 -07:00
Anish Lakhwara	5ed2faaecc	fix: need to use `window.timeOut` to get a timerId back	2023-08-02 17:31:01 -07:00
Anish Lakhwara	6ecfd8ec24	fix: timerId not timeoutId	2023-08-02 17:28:07 -07:00
Anish Lakhwara	3985cf014e	fix: clear timeout on disconnect callback	2023-08-02 17:26:26 -07:00
Anish Lakhwara	196b26c60e	fix: center text	2023-08-02 17:21:36 -07:00
Anish Lakhwara	f1d91e3bf9	fix: add styling	2023-08-02 17:18:40 -07:00
Anish Lakhwara	a8bedeffb5	fix: take Sua's suggestions, less code needed	2023-08-02 17:10:45 -07:00
Anish Lakhwara	2f26fcefce	fix: make pretty & work correctly	2023-08-02 16:36:28 -07:00
Anish Lakhwara	06918c967b	feat: use html dialog instead	2023-08-02 11:37:55 -07:00
Anish Lakhwara	84a60b54e4	feat: Display waiting message while backend is initializing	2023-08-01 17:18:05 -07:00
Ilya Kreymer	45eaa0b3a3	version: bump to 1.6.0-beta.3	2023-08-01 09:48:17 -07:00
sua yoo	cc52dfd940	Sort Collections by size (#1026 ) - Adds "Size" column to Collections list view - Adds "Size" option to sort dropdown	2023-08-01 09:47:47 -07:00
sua yoo	54e2b2c703	List web captures in Collection (#1024 ) - Adds tab for "Web Captures" in Collection detail view - Move Collection description under Replay section - Fixes app reloading when clicking into a Collection - Standardizes Web Capture list headers from "Finished" -> "Created Date"	2023-08-01 09:14:27 -07:00
Ilya Kreymer	06cf9c7cc3	add crawl ending states: 'generate-wacz', 'uploading-wacz', 'pending-wait' that occur after a crawl is finished or is being stopped (#1022 ) operator: ensure transitions from each of these states is supported, including to 'waiting_capacity' add extra check on stopping to avoid transitioning back to a running state after crawl is finished ui: add states to UI display, localization, add as active states fixes #263	2023-08-01 00:15:59 -07:00
Anish Lakhwara	d8502da885	fix(build): use `/usr/bin/env bash` instead of `/bin/bash` (#1020 ) * fix: add to various other shell scripts	2023-07-28 21:50:04 -07:00
sua yoo	7069b33646	Show only running crawls in superadmin view (#1015 ) - Show separate crawls list for admin view, fixes #1010	2023-07-26 15:48:20 -07:00
Ilya Kreymer	6506965d98	Streaming Download for Collections (#1012 ) * support streaming download of collections (part of #927) - WACZ zip created on the fly using stream-zip - add 'Download Collection' option to collection detail and list - after editing collection, return to collection view - tests: add test for streaming download, ensure WACZ files + datapackage present, STORE compression used --------- Co-authored-by: sua yoo <sua@suayoo.com>	2023-07-26 15:42:17 -07:00
Tessa Walsh	c21153255a	Rename notes to description in frontend and backend (#1011 ) - Rename crawl notes to description - Add migration renaming notes -> description - Stop inheriting workflow description in crawl - Update frontend to replace crawl/upload notes with description - Remove setting of config description from crawl list - Adjust tests for changes	2023-07-26 13:00:04 -07:00
sua yoo	75b011f951	Upload WACZ via UI (#992 ) - Users can now upload .WACZ archives from the "Archived Data" page. - Can specify name, description, tags and collection(s) to add upload to - Show progress of upload - Support canceling upload	2023-07-21 16:45:52 +02:00
sua yoo	85913112a2	Upgrade lit + shoelace to reduce build size (#938 ) * upgrade lit * upgrade shoelace * upgrade testing libraries * add webpack bundle analyzer * revert shoelace changes * remove bundle analyzer * remove console log	2023-07-20 11:50:05 +02:00
Tessa Walsh	d5c3a8519f	Add crawler Use Sitemap option to Browsertrix Cloud (#978 ) * Add user-guide docs for Use Sitemap option --------- Co-authored-by: Henry Wilkinson <henry@wilkinson.graphics>	2023-07-19 13:57:52 -04:00

426 Commits