browsertrix

Tessa Walsh e667fe2e97 Add max crawl size option to backend and frontend (#1045)	2023-08-26 22:00:37 -07:00

Backend:
- add 'maxCrawlSize' to models and crawljob spec
- add 'MAX_CRAWL_SIZE' to configmap
- add maxCrawlSize to new crawlconfig + update APIs
- operator: gracefully stop crawl if current size (from stats) exceeds maxCrawlSize
- tests: add max crawl size tests

Frontend:
- Add Max Crawl Size text box Limits tab
- Users enter max crawl size in GB, convert to bytes
- Add BYTES_PER_GB as constant for converting to bytes
- docs: Crawl Size Limit to user guide workflow setup section

Operator Refactor:
- use 'status.stopping' instead of 'crawl.stopping' to indicate crawl is being stopped, as changing later has no effect in operator
- add is_crawl_stopping() to return if crawl is being stopped, based on crawl.stopping or size or time limit being reached
- crawlerjob status: store byte size under 'size', human readable size under 'sizeHuman' for clarity
- size stat always exists so remove unneeded conditional (defaults to 0)
- store raw byte size in 'size', human readable size in 'sizeHuman'

Charts:
- subchart: update crawlerjob crd in btrix-crds to show status.stopping instead of spec.stopping
- subchart: show 'sizeHuman' property instead of 'size'
- bump subchart version to 0.1.1

---------

Co-authored-by: Ilya Kreymer <ikreymer@gmail.com>
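
The stop conditions named in the commit message (an explicit stop request, a size limit, a time limit) and the GB-to-bytes conversion can be sketched roughly as follows. This is a minimal illustration, not the code in operator.py: the `CrawlState` class, its field names, the `gb_to_bytes` helper, and the decimal value chosen for `BYTES_PER_GB` are all assumptions made for the example. Only the names `is_crawl_stopping`, `BYTES_PER_GB`, and `maxCrawlSize` come from the commit message.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Optional

# Assumption: decimal gigabytes. The commit only names the constant,
# not its value.
BYTES_PER_GB = 1_000_000_000


def gb_to_bytes(size_gb: float) -> int:
    """Convert a user-entered size in GB to a byte count (hypothetical helper)."""
    return int(size_gb * BYTES_PER_GB)


@dataclass
class CrawlState:
    """Hypothetical stand-in for the crawl spec/status the operator inspects."""
    stopping: bool = False                  # user asked for a graceful stop
    size: int = 0                           # current crawl size in bytes (defaults to 0)
    max_crawl_size: int = 0                 # 0 is treated as "no size limit" here
    expire_time: Optional[datetime] = None  # None is treated as "no time limit"


def is_crawl_stopping(crawl: CrawlState, now: Optional[datetime] = None) -> bool:
    """Return True if the crawl should be gracefully stopped."""
    # 1. Explicit stop request.
    if crawl.stopping:
        return True

    # 2. Time limit reached.
    now = now or datetime.now(timezone.utc)
    if crawl.expire_time is not None and now >= crawl.expire_time:
        return True

    # 3. Current size exceeds the configured maximum.
    if crawl.max_crawl_size and crawl.size > crawl.max_crawl_size:
        return True

    return False
```

For example, under these assumptions a workflow limited to 2 GB would use `max_crawl_size=gb_to_bytes(2)`, and `is_crawl_stopping` would start returning True once the reported size passes that byte count.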
migrations	feat: implement 'collections' array with {name, id} for archived item details (#1098)	2023-08-25 00:26:46 -07:00
templates	Add max crawl size option to backend and frontend (#1045)	2023-08-26 22:00:37 -07:00
__init__.py
basecrawls.py	feat: implement 'collections' array with {name, id} for archived item details (#1098)	2023-08-25 00:26:46 -07:00
colls.py	feat: implement 'collections' array with {name, id} for archived item details (#1098)	2023-08-25 00:26:46 -07:00
crawlconfigs.py	Add max crawl size option to backend and frontend (#1045)	2023-08-26 22:00:37 -07:00
crawlmanager.py	Add max crawl size option to backend and frontend (#1045)	2023-08-26 22:00:37 -07:00
crawls.py	Add max crawl size option to backend and frontend (#1045)	2023-08-26 22:00:37 -07:00
db.py	feat: implement 'collections' array with {name, id} for archived item details (#1098)	2023-08-25 00:26:46 -07:00
emailsender.py
invites.py	Move pydantic models to separate module + refactor crawl response endpoints to be consistent (#983)	2023-07-20 13:05:33 +02:00
k8sapi.py	Add max crawl size option to backend and frontend (#1045)	2023-08-26 22:00:37 -07:00
main_op.py
main_scheduled_job.py	1.6.3 Fixes - Fix workflow sort order for Latest Crawl + 'Remove From Collection' action menu on archived items in collections (#1113)	2023-08-25 21:08:47 -07:00
main.py	feat: implement 'collections' array with {name, id} for archived item details (#1098)	2023-08-25 00:26:46 -07:00
models.py	Add max crawl size option to backend and frontend (#1045)	2023-08-26 22:00:37 -07:00
operator.py	Add max crawl size option to backend and frontend (#1045)	2023-08-26 22:00:37 -07:00
orgs.py	Add and enforce org maxPagesPerCrawl quota (#1044)	2023-08-23 10:38:36 -04:00
pagination.py	Move pydantic models to separate module + refactor crawl response endpoints to be consistent (#983)	2023-07-20 13:05:33 +02:00
profiles.py	Use Shared Services for Crawling, Redis, Profile Browsers (#1088)	2023-08-24 20:08:53 -07:00
storages.py	Streaming Download for Collections (#1012)	2023-07-26 15:42:17 -07:00
uploads.py	feat: implement 'collections' array with {name, id} for archived item details (#1098)	2023-08-25 00:26:46 -07:00
users.py	Support for Public / Shareable Collections (#1038)	2023-08-03 19:11:01 -07:00
utils.py
version.py	bump version to 1.7.0-beta.0	2023-08-23 12:03:45 -07:00
zip.py