browsertrix/frontend
Ilya Kreymer ad9bca2e92
Operator refactor to control pods + pvcs directly instead of statefulsets (#1149)
- Ability for a pod to be Completed, unlike in a StatefulSet, where e.g. if 3 pods are running and the first one finishes, all 3 must keep running until all 3 are done. With this setup, the first finished pod can remain in Completed state.
- Fixed shutdown order - crawler pods now correctly shut down first, before redis pods, by switching to background deletion.
- Pod priority decreases with scale: 1st instance of a new crawl can preempt 3rd or 2nd instance of another crawl
- Create priority classes up to 'max_crawl_scale', configured in values.yaml
- Improved scale change reconciliation: if increasing scale, immediately scale up. If decreasing scale,
gracefully stop the scaled-down instances via the redis 'stopone' key and wait until they exit with Completed state
before adjusting status.scale / removing the scaled-down pods. This ensures unaccepted interrupts don't cause scaled-down data to be deleted.
- Redis pod remains inactive until crawler is first active, or after no crawl pods are active for 60 seconds
- Configurable Redis storage with 'redis_storage' value, set to 3Gi by default
- CrawlJob deletion starts as soon as post-finish crawl operations are run
- Post-crawl operations get their own redis instance, since the existing one is being cleaned up in the finalizer
- Finalizer ignores request with incorrect state (returns 400 if reported as not finished while crawl is finished)
- Current resource usage added to status
- Profile browser: also manage single pod directly without statefulset for consistency.
- Restart pods via restartTime value: if spec.restartTime != status.restartTime, clear out pods and update status.restartTime (using OnDelete policy to avoid recreate loops in edge cases).
- Update to latest metacontroller (v4.11.0)
- Add --restartOnError flag for crawler (for browsertrix-crawler 0.11.0)
- Failed crawl logging: add 'fail_crawl()' to be used for failing a crawl, which prints logs for the default container (if enabled) as well as pod status
- tests: check other finished states to avoid getting stuck in an infinite loop if a crawl fails
- tests: disable disk utilization check, which adds unpredictability to crawl testing!
fixes #1147 
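
The scale change reconciliation described above can be sketched roughly as follows. This is an illustrative model only, not the operator's actual code: `CrawlState`, `reconcile_scale`, and the in-memory `stop_requested` set (standing in for the redis 'stopone' key) are hypothetical names, and stopping the highest-numbered instances first is an assumption.

```python
from dataclasses import dataclass, field


@dataclass
class CrawlState:
    """Hypothetical, simplified view of one crawl (not the operator's real model)."""
    status_scale: int              # stand-in for status.scale
    pod_phases: dict[int, str]     # pod index -> phase, e.g. "Running" or "Completed"
    # Stand-in for the per-instance 'stopone' key in redis.
    stop_requested: set[int] = field(default_factory=set)


def reconcile_scale(state: CrawlState, desired: int) -> int:
    """Move status_scale toward `desired` and return the new status_scale."""
    if desired >= state.status_scale:
        # Scaling up (or no change): apply immediately and drop old stop requests.
        state.stop_requested.clear()
        state.status_scale = desired
        return desired

    # Scaling down: ask each surplus instance to stop gracefully.
    # Assumption: the highest-numbered instances are the ones scaled down.
    for index in range(state.status_scale - 1, desired - 1, -1):
        state.stop_requested.add(index)

    # Only remove trailing pods that have actually exited as Completed,
    # so an unaccepted interrupt never leads to their data being deleted.
    new_scale = state.status_scale
    while new_scale > desired and state.pod_phases.get(new_scale - 1) == "Completed":
        state.pod_phases.pop(new_scale - 1)
        state.stop_requested.discard(new_scale - 1)
        new_scale -= 1

    state.status_scale = new_scale
    return new_scale
```

In this sketch, calling `reconcile_scale` repeatedly (as a reconcile loop would) lowers `status_scale` one pod at a time, and only once each scaled-down pod has reported Completed.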

---------
Co-authored-by: Tessa Walsh <tessa@bitarchivist.net>
2023-09-11 10:38:04 -07:00
..
.husky
.vscode
assets
config supports overriding the replayweb.page version without having to be r… (#1122) 2023-09-05 20:10:21 -04:00
scripts
src Add and enforce org storage quota (#1106) 2023-09-07 12:45:43 -04:00
tests Add README.md related to run playwright tests locally (#722) 2023-03-28 16:08:28 -07:00
xliff terminology tweaks in frontend: (part of #922) (#1062) 2023-08-09 15:38:58 -07:00
.dockerignore
.editorconfig
.gitignore
.prettierignore
00-browsertrix-nginx-init.sh fix(build): use /usr/bin/env bash instead of /bin/bash (#1020) 2023-07-28 21:50:04 -07:00
Dockerfile supports overriding the replayweb.page version without having to be r… (#1122) 2023-09-05 20:10:21 -04:00
frontend.conf.template Operator refactor to control pods + pvcs directly instead of statefulsets (#1149) 2023-09-11 10:38:04 -07:00
index.d.ts Frontend collections beta UI (#886) 2023-06-06 17:52:01 -07:00
lit-localize.json
minio.conf
package.json bump version to 1.7.0-beta.0 2023-08-23 12:03:45 -07:00
playwright.config.ts
postcss.config.js
README.md Update frontend local dev guide (#1073) 2023-08-15 12:03:39 -07:00
sample.env.local
tailwind.config.js
tsconfig.json
web-test-runner.config.mjs misc frontend build fixes: playwright version + chunking (#740) 2023-04-03 21:27:44 -07:00
webpack.config.js supports overriding the replayweb.page version without having to be r… (#1122) 2023-09-05 20:10:21 -04:00
webpack.dev.js Webpack config improvements (#1063) 2023-08-11 13:16:24 -07:00
webpack.prod.js
yarn.lock Webpack config improvements (#1063) 2023-08-11 13:16:24 -07:00