browsertrix/backend
Ilya Kreymer c9c39d47b7
Scheduled Crawl Refactor: Handle via Operator + Add Skipped Crawls on Quota Reached (#1162)
* use metacontroller's decoratorcontroller to create CrawlJob from Job
* scheduled job work:
- use existing job name for scheduled crawljob
- use suspended job, set startTime, completionTime and succeeded status on job when crawljob is done
- simplify cronjob template: remove job_image and cron_namespace, use the same namespace as crawls and a placeholder job image for cronjobs
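
As a rough illustration of the "set startTime, completionTime and succeeded status" step, the sketch below builds a status fragment for a finished scheduled Job. The field names `startTime`, `completionTime` and `succeeded` are standard Kubernetes Job status fields; the helper names `to_k8s_time` and `finished_job_status` are hypothetical and do not come from this commit, and how the operator actually applies the status is not shown here.

```python
from datetime import datetime, timezone


def to_k8s_time(dt: datetime) -> str:
    """Format an aware datetime as an RFC 3339 UTC timestamp, as used in Kubernetes objects."""
    return dt.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")


def finished_job_status(started: datetime, finished: datetime) -> dict:
    """Hypothetical helper: status fragment marking a suspended Job as done.

    Assumes the crawljob created for this Job has already finished.
    """
    return {
        "startTime": to_k8s_time(started),
        "completionTime": to_k8s_time(finished),
        "succeeded": 1,
    }


status = finished_job_status(
    datetime(2023, 9, 12, 20, 5, 43, tzinfo=timezone.utc),
    datetime(2023, 9, 12, 20, 35, 0, tzinfo=timezone.utc),
)
```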

* move storage quota check to crawljob handler:
- add 'skipped_quota_reached' as new failed status type
- check for storage quota before checking if crawljob can be started, fail if not (check before any pods/pvcs created)
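
The ordering described above (quota check first, then the "can this crawljob start" check) can be sketched as follows. This is a minimal sketch under stated assumptions, not the handler from this commit: only the status name `skipped_quota_reached` comes from the commit message, while `OrgStorage`, `storage_quota_reached`, `next_crawljob_state`, the `"waiting"` and `"starting"` labels, and the convention that a quota of 0 means "no quota" are all hypothetical.

```python
from dataclasses import dataclass

# Status name from the commit message; all other names below are hypothetical.
SKIPPED_QUOTA_REACHED = "skipped_quota_reached"


@dataclass
class OrgStorage:
    """Hypothetical view of an organization's storage usage, in bytes."""

    bytes_stored: int
    storage_quota: int  # assumption: 0 means no quota is configured


def storage_quota_reached(org: OrgStorage) -> bool:
    """Return True when a quota is set and the stored bytes have reached it."""
    return org.storage_quota > 0 and org.bytes_stored >= org.storage_quota


def next_crawljob_state(org: OrgStorage, running_crawls: int, max_concurrent: int) -> str:
    """Decide the next state before any pods or PVCs would be created."""
    # Quota is checked first, so a skipped crawl never allocates resources.
    if storage_quota_reached(org):
        return SKIPPED_QUOTA_REACHED
    # Only then check whether the crawljob is allowed to start right now.
    if running_crawls >= max_concurrent:
        return "waiting"
    return "starting"
```

Checking the quota first means a crawl that is over quota is reported as skipped even when it would otherwise have been queued.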

* frontend:
- show all crawls in crawl workflow, no need to filter by status
- add 'skipped_quota_reached' status, show as 'Skipped (Quota Reached)', render same as failed

* migration: make release namespace available as DEFAULT_NAMESPACE, delete old cronjobs in DEFAULT_NAMESPACE and recreate in crawlers namespace with new template
2023-09-12 13:05:43 -07:00
btrixcloud Scheduled Crawl Refactor: Handle via Operator + Add Skipped Crawls on Quota Reached (#1162) 2023-09-12 13:05:43 -07:00
test Operator refactor to control pods + pvcs directly instead of statefulsets (#1149) 2023-09-11 10:38:04 -07:00
test_nightly Add max crawl size option to backend and frontend (#1045) 2023-08-26 22:00:37 -07:00
.pylintrc quickfix: pydantic / lint fix (#452) 2023-01-10 18:54:11 -08:00
Dockerfile
requirements.txt better resources scaling by number of browsers per crawler container (#1103) 2023-09-06 01:42:44 -04:00