browsertrix/backend
Ilya Kreymer 95969ec747
Attempt to auto-adjust storage if usage is running out while crawl is running (#2023)
Attempt to auto-adjust PVC storage while a crawl is running:
- triggers when used storage (as reported in redis by the crawler) * 2.5 >
total_storage
- causes the PVC to resize, if possible (not supported by all drivers)
- uses multiples of 1Gi, rounding up to the next GB
- AVAIL_STORAGE_RATIO is hard-coded to 2.5 for now, to account for 2x space
for WACZ plus some headroom for fast-updating crawls
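
The resize rule above can be sketched as follows. This is a minimal illustration, not the code from this commit: the function name `needed_storage_gi` and the `GI` constant are hypothetical, and it assumes sizes are given in bytes and that 1Gi means 2**30 bytes.

```python
import math
from typing import Optional

# Ratio named in the commit message; hard-coded there to 2.5.
AVAIL_STORAGE_RATIO = 2.5

# Assumption: 1Gi is 2**30 bytes, as in Kubernetes quantity notation.
GI = 1024 ** 3


def needed_storage_gi(
    used_bytes: int, total_bytes: int, ratio: float = AVAIL_STORAGE_RATIO
) -> Optional[str]:
    """Return a new PVC size such as "5Gi", or None if no resize is needed.

    Hypothetical helper: a resize is requested only when
    used_bytes * ratio exceeds the currently allocated total.
    """
    if used_bytes <= 0 or used_bytes * ratio <= total_bytes:
        return None
    # Round up to a whole number of Gi.
    new_gi = math.ceil(used_bytes * ratio / GI)
    return f"{new_gi}Gi"
```

With a 4Gi volume, 1Gi of usage leaves the volume alone (2.5Gi is within 4Gi), while 3Gi of usage asks for 8Gi (7.5Gi rounded up).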

Some caveats:
- only works if the storageClass used for PVCs has
`allowVolumeExpansion: true`; otherwise it has no effect
- designed as a last-resort option: the `crawl_storage` setting in values,
together with `--sizeLimit` and `--diskUtilization`, should generally make
this unnecessary
- can be useful when a crawl is rapidly capturing a lot of content in one
page and there is no time to interrupt / restart, since the other limits
apply only at page end
- the crawler may need to update disk usage more frequently, not just at
page end, to make this more effective
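
The first caveat can be checked up front. A minimal sketch, assuming the StorageClass manifest has already been loaded into a dict (for example from YAML or from the Kubernetes API); the helper name `allows_volume_expansion` is hypothetical and not part of this commit.

```python
from typing import Any, Mapping


def allows_volume_expansion(storage_class: Mapping[str, Any]) -> bool:
    """Report whether a StorageClass manifest enables PVC expansion.

    Hypothetical helper: the field is treated as disabled unless it is
    explicitly set to true.
    """
    return storage_class.get("allowVolumeExpansion") is True
```

If this returns False for the storageClass backing the crawler PVCs, the auto-adjust described above has no effect and the other limits remain the only safeguard.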
2024-08-26 14:19:20 -07:00
btrixcloud Attempt to auto-adjust storage if usage is running out while crawl is running (#2023) 2024-08-26 14:19:20 -07:00
test add a crawling defaults on the Org to allow setting certain crawl workflow fields as defaults: (#2031) 2024-08-22 10:36:04 -07:00
test_nightly
.pylintrc
dev-requirements.txt
Dockerfile
mypy.ini
requirements.txt
test-requirements.txt