browsertrix

Author	SHA1	Message	Date
Ilya Kreymer	f6dc26eeb5	nginx: enable worker processes autotune to correctly set the number of processes for nginx, possible fix for #780 (#785)	2023-04-21 18:13:22 -07:00
Tessa Walsh	a2435a013b	Add totalSize to workflow API endpoints (#783)	2023-04-20 17:23:59 -04:00
Ilya Kreymer	3f41498c5c	quickfix: fix typo, remove unnecessary async	2023-04-18 16:14:15 -07:00
Tessa Walsh	a9c1c54194	Make btrix helper work with microk8s (#768) * Check for microk8s * Use python3 * Add note about installing pytest * Add chart/local.yaml to .gitignore to avoid committing	2023-04-18 08:50:46 -04:00
Ilya Kreymer	821d29bd2a	crawlconfig api: add 'currCrawlState' and 'currCrawlTimeStart' to crawlconfig list api (already queried on backend) (#770) * crawlconfig api: add 'currCrawlState' and 'currCrawlTimeStart' to crawlconfig list api (already queried on backend)	2023-04-17 23:13:13 -07:00
Tessa Walsh	6b19f72a89	Add crawl errors endpoint (#757) * Add crawl errors endpoint If this endpoint is called while the crawl is running, errors are pulled directly from redis. If this endpoint is called when the crawl is finished, errors are pulled from mongodb, where they're written when crawls complete. * Add nightly backend test for errors endpoint * Add errors for failed and cancelled crawls to mongo Co-authored-by: Ilya Kreymer <ikreymer@users.noreply.github.com>	2023-04-17 12:59:25 -04:00
Ilya Kreymer	4a46f894a2	backend: add 'lastCrawlStartTime' and 'lastStartedByName' fields to crawlconfigs apis (#753)	2023-04-17 08:34:29 -07:00
Tessa Walsh	59e49eacd5	Update collections backend API (#759) * Re-implement collections, storing crawlIds in collection * Return collections for crawl endpoints and filter on coll name * Remove crawl from all collections when deleted * Revert get_collection_crawls to flat array of resources * Fix tests	2023-04-14 12:17:18 -04:00
Henry Wilkinson	a62a452c07	Merge pull request #758 from webrecorder/docs-fonts&icons	2023-04-13 22:05:48 -04:00
Tessa Walsh	1ad82a63e6	Add crawl timeout nightly test (#762)	2023-04-11 19:36:18 -07:00
Ilya Kreymer	85b6a05419	Upgrade to mongo 6 and use sortArray for workflow crawls (#764) (#765) fixes from 1.4.1: * Upgrade to mongo 6 and use for workflow crawls * update readiness probe with timeouts doubled, and failure threshold increased for slower 'mongosh' readiness check update versions to 1.5.0-beta.0 in backend and frontend Co-authored-by: Tessa Walsh <tessa@bitarchivist.net>	2023-04-11 18:22:07 -07:00
Henry Wilkinson	d50fab67a9	Link accessibility improvements - Nav bar text is now 20% higher opacity, hover state also differentiated with weight - In-body links are now underlined - Lightened BG colour and darkened link colour — now achieves an APCA score of 84!	2023-04-11 19:51:48 -04:00
Sara Tavares	07fb7317fe	Delete proofread-action.yaml (#760) Resulting in a lot of false positives (to revisit later)	2023-04-11 15:49:27 -07:00
Tessa Walsh	f261967de8	Bump version to 1.5.0-beta.0	2023-04-11 11:51:17 -04:00
Tessa Walsh	fb80a04f18	Add crawl /log API endpoint If a crawl is completed, the endpoint streams the logs from the log files in all of the created WACZ files, sorted by timestamp. The API endpoint supports filtering by log_level and context whether the crawl is still running or not. This is not yet proper streaming because the entire log file is read into memory before being streamed to the client. We will want to switch to proper streaming eventually, but are currently blocked by an aiobotocore bug - see: https://github.com/aio-libs/aiobotocore/issues/991?#issuecomment-1490737762	2023-04-11 11:51:17 -04:00
Henry Wilkinson	128aa89d33	Adds the specific icons currently required - Updates writing docs page regarding adding icons	2023-04-10 18:58:24 -04:00
Henry Wilkinson	ec324799c9	removes icons	2023-04-10 03:05:32 -04:00
Henry Wilkinson	8e8f59ec13	Updates main & code block background colors	2023-04-07 00:06:26 -04:00
Henry Wilkinson	f90a85aa66	Merge branch 'main' into docs-fonts&icons	2023-04-06 23:40:49 -04:00
Henry Wilkinson	4852259f1c	Adds the bootstrap icon library to the docs dir	2023-04-06 23:33:07 -04:00
Henry Wilkinson	8d60984760	Typography updates - Sets Recursive as the main typeface for code and text! - Adjusts variable axes and sets stylistic alternates accordingly. - Self hosts the font	2023-04-06 23:28:23 -04:00
Henry Wilkinson	ab8088aec4	merge main into update	2023-04-06 18:39:23 -04:00
Henry Wilkinson	25800b924b	update admonition icons	2023-04-06 17:49:29 -04:00
Henry Wilkinson	883da0bc89	Adds footer license & links - Updates license section in readme clarifying docs licensing	2023-04-06 17:20:50 -04:00
Henry Wilkinson	63bbe4c1ae	Adds bootstrap icons to the docs repo	2023-04-06 17:20:13 -04:00
Ilya Kreymer	631c84e488	version: bump to 1.4.0!	2023-04-06 10:12:43 -07:00
Henry Wilkinson	ba3daf326d	Adds `inputmode` attributes to workflow config fields (#755) - Now the appropriate virtual keyboards are shown! :) - Also adjusts type weight for workflow config headers to match mockups	2023-04-06 09:16:48 -07:00
Henry Wilkinson	c6aec84af4	Changes the autoscroll setting to true by default (#756) As per my note on #745, currently all our other check boxes turn features on when enabled. For consistency I have reversed the states of the autoscroll checkbox so the page autoscrolls when it is checked and does not run the behavior when it is unchecked. Checked is also now the default state. - Updates help text accordingly - Renames `disableAutoscrollBehavior` → autoscrollBehavior	2023-04-06 09:06:55 -07:00
Ilya Kreymer	3ab62547a9	version: bump to 1.4.0-beta.2	2023-04-06 02:45:20 -07:00
Henry Wilkinson	0a1f5eff8e	Docs: adds mkdocs features, adds theming (#728) * Add stylesheet & mkdocs features - Adds a custom stylesheet & brand colours - Adds Recursive as the code font - Adds repo info to the nav bar - Adds auto tracking ID links for deep linking to sections as users scroll the page - Index pages are now a part of their section as determined by their H1 - Removes mkdocs info from future footer * Reorganize content - Renames "Dev" to "Develop" for improved navigation labels - Adds links to tools the first time they're mentioned - Rewords part of the homepage - Hides section navigation on the homepage (now we don't have a blank section nav bar! - Adds some syntax highlighting - Removes some manual word wrapping — this was done very rarely / inconsistently * Rename "Developer Docs" index page - Better title for sidebar * Update docs.md - Adds links to tools - Adds future docs style guide section - Updates name and makes it an H1 - Replaces hyphens on the homepage with em dashes * deployment index page: changed title, removed non-k8s section, cleaned up intro * develop index page: changed title fixed typo on main page --------- Co-authored-by: Ilya Kreymer <ikreymer@gmail.com>	2023-04-06 02:44:19 -07:00
Tessa Walsh	11ca3e678a	Configure crawler disk utilization threshold via helm chart (#748)	2023-04-05 21:51:53 -07:00
Tessa Walsh	f6f3b7abba	Add btrix CLI dev helper (#732) * Add btrix CLI dev helper * Fix identation * Use bash syntax for ifs	2023-04-05 21:51:22 -07:00
sua yoo	80bc4a3eb9	Fix additional URLs (#752)	2023-04-05 20:11:09 -07:00
sua yoo	91c2c1ad62	Allow users to set additional page time limits (#744)	2023-04-05 20:06:46 -07:00
sua yoo	72967a0381	Frontend Docker build improvements (#749)	2023-04-05 20:05:45 -07:00
sua yoo	c60dc5d086	Crawls list backend pagination (#735)	2023-04-05 10:55:42 -07:00
Ilya Kreymer	63be81d835	ci: make playwright integration tests run only on PRs involving frontend	2023-04-05 09:57:34 -07:00
Ilya Kreymer	7f757d396a	config: add 'pageLoadTimeout' and 'pageExtraDelay' options to backend… (#742) * config: add 'pageLoadTimeout' and 'pageExtraDelay' options to backend config - add 'default_page_load_timeout_seconds' to values.yaml, defaulting to 120, for pageLoadTimeout - add 'defaultPageLoadTimeSeconds ' to /api/settings, update tests for /api/settings addresses issue in #636	2023-04-04 19:52:23 -07:00
Ilya Kreymer	67172ca1e2	fix: only include finished crawls in crawlCount value for /api/crawlconfigs (#746)	2023-04-04 19:50:14 -07:00
Ilya Kreymer	88497d2a64	text: rename workflowuration -> workflow (#741)	2023-04-04 08:48:06 -07:00
sua yoo	370b8cbd4d	Set max pages to API default (#739)	2023-04-04 08:47:37 -07:00
Ilya Kreymer	2b0d5ff8b3	misc frontend build fixes: playwright version + chunking (#740) * misc frontend build fixes: - fix playwright version to be consistent to fix playwright test - chunking: set max number of chunks generated * lock playwright version * remove intl polyfill --------- Co-authored-by: sua yoo <sua@suayoo.com>	2023-04-03 21:27:44 -07:00
Ilya Kreymer	1c47a648a9	Max page limit override (#737) * more page limit: update to #717, instead of setting --limit in each crawlconfig, apply override --maxPageLimit setting, implemented in crawler, to override individually configured page limit * update tests, no longer returning 'crawl_page_limit_exceeds_allowed'	2023-04-03 14:01:32 -07:00
Tessa Walsh	3b99bdf26a	Update nightly test fixtures to use Seed objects (#734)	2023-04-03 16:21:25 -04:00
Tessa Walsh	e9b61c632d	Add pageSize to pagination format (#736)	2023-04-03 15:57:47 -04:00
Henry Wilkinson	68ec47cb7f	Moves deployment docs back to the root docs directory - Replaces hyphens on the homepage with em dashes	2023-03-31 00:06:45 -04:00
Ilya Kreymer	887cb16146	Allow configurable max pages per crawl in deployment settings (#717) * backend: max pages per crawl limit, part of fix for #716: - set 'max_pages_crawl_limit' in values.yaml, default to 100,000 - if set/non-0, automatically set limit if none provided - if set/non-0, return 400 if adding config with limit exceeding max limit - return limit as 'maxPagesPerCrawl' in /api/settings - api: /all/crawls - add runningOnly=0 to show all crawls, default to 1/true (for more reliable testing) tests: add test for 'max_pages_per_crawl' setting - ensure 'limit' can not be set higher than max_pages_per_crawl - ensure pages crawled is at the limit - set test limit to max 2 pages - add settings test - check for pages.jsonl and extraPages.jsonl when crawling 2 pages	2023-03-28 16:26:29 -07:00
Sara Tavares	948cce3d30	Add README.md related to run playwright tests locally (#722)	2023-03-28 16:08:28 -07:00
Tessa Walsh	4724754efc	Filter and sort crawl and workflow list API endpoints in backend (#724) * Re-implement pagination and paginate crawlconfig revs First step toward simplifying pagination to set us up for sorting and filtering of list endpoints. This commit removes fastapi-pagination as a dependency. * Migrate all HttpUrl seeds to Seeds This commit also updates the frontend to always use Seeds and to fix display issues resulting from the change. * Filter and sort crawls and workflows Crawls: - Filter by createdBy (via userid param) - Filter by state (comma-separated string for multiple values) - Filter by first_seed, name, description - Sort by started, finished, fileSize, firstSeed - Sort descending by default to match frontend Workflows: - Filter by createdBy (formerly userid) and modifiedBy - Filter by first_seed, name, description - Sort by created, modified, firstSeed, lastCrawlTime * Add crawlconfigs search-values API endpoint and test	2023-03-28 17:55:40 -04:00
Sara Tavares	36cfb2591f	ci: fix version related to @playwright/test (#729) * fix version, add resolutions to have fixed playwright version	2023-03-28 14:30:36 -07:00


550 Commits