browsertrix/backend/btrixcloud
Ilya Kreymer 95969ec747
Attempt to auto-adjust storage if usage is running out while crawl is running (#2023)
Attempt to auto-adjust PVC storage if:
- used storage (as reported in redis by the crawler) * 2.5 >
total_storage

When that condition holds:
- the PVC is resized, if possible (not supported by all drivers)
- the new size is a multiple of 1Gi, rounding up to the next GB
- AVAIL_STORAGE_RATIO is hard-coded to 2.5 for now, to account for 2x
space for WACZ plus some headroom for fast-updating crawls
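
The resize check described above can be sketched roughly as follows. This is
a minimal illustration, not the operator's actual code: the function name
`desired_storage_gib` and the `GIB` constant are hypothetical, and it assumes
"1Gi" means 2**30 bytes (the message also says "GB", so the real unit may
differ).

```python
import math
from typing import Optional

# Ratio taken from the commit message (hard-coded to 2.5 for now).
AVAIL_STORAGE_RATIO = 2.5

# Assumption: one "Gi" is 2**30 bytes; the real operator may use 10**9.
GIB = 1024**3


def desired_storage_gib(used_bytes: int, total_bytes: int) -> Optional[int]:
    """Return a new PVC size in whole Gi, or None if no resize is needed.

    Hypothetical helper: a resize is requested only when
    used * AVAIL_STORAGE_RATIO exceeds the current total.
    """
    required = used_bytes * AVAIL_STORAGE_RATIO
    if required <= total_bytes:
        return None
    # Round up to the next whole Gi so the request is a multiple of 1Gi.
    return math.ceil(required / GIB)


if __name__ == "__main__":
    # 3Gi used on a 5Gi volume: 3 * 2.5 = 7.5Gi needed, rounded up to 8Gi.
    new_size = desired_storage_gib(3 * GIB, 5 * GIB)
    if new_size is not None:
        print(f"request PVC resize to {new_size}Gi")
```

A caller would then patch the PVC's requested storage to the returned size;
whether that takes effect depends on the storage driver, as noted in the
caveats.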

Some caveats:
- only works if the storageClass used for PVCs has
`allowVolumeExpansion: true`; if not, it will have no effect
- designed as a last-resort option: the `crawl_storage` in values,
together with `--sizeLimit` and `--diskUtilization`, should generally
result in this not being needed.
- can be useful in cases where a crawl is rapidly capturing a lot of
content in one page, and there's no time to interrupt / restart, since
the other limits apply only at page end.
- may want to have the crawler update the disk usage more frequently,
not just at page end, to make this more effective.
2024-08-26 14:19:20 -07:00
migrations fix resetting of invalid logins: (#2002) 2024-08-07 12:36:06 -07:00
operator Attempt to auto-adjust storage if usage is running out while crawl is running (#2023) 2024-08-26 14:19:20 -07:00
__init__.py refactoring to use statefulsets + job (#245) 2022-06-05 10:37:17 -07:00
auth.py Include user and user org info in login response (#2014) 2024-08-12 18:51:42 -07:00
background_jobs.py api docs cleanup + readd webhooks: (#1949) 2024-07-22 09:00:59 -07:00
basecrawls.py quickfix: webhooks: ensure the 'crawl_reviewed' webhook is sent async, doesn't delay submitting a review (#2033) 2024-08-20 17:50:18 -07:00
colls.py Document all API endpoints with response models (#1928) 2024-07-16 12:48:38 -07:00
crawlconfigs.py stats recompute fixes: (#2022) 2024-08-26 14:18:59 -07:00
crawlmanager.py Remove Crawl Workflow Configmaps (#1894) 2024-06-28 15:25:23 -07:00
crawls.py type fixes on util functions (#2009) 2024-08-12 10:54:45 -07:00
db.py fix resetting of invalid logins: (#2002) 2024-08-07 12:36:06 -07:00
emailsender.py Subscription Update Quotas (#1988) 2024-08-05 15:59:47 -07:00
invites.py Add created date to Organization and fix datetimes across backend (#1921) 2024-07-15 19:46:32 -07:00
k8sapi.py Remove Crawl Workflow Configmaps (#1894) 2024-06-28 15:25:23 -07:00
main_op.py Add superuser API endpoints to export and import org data (#1394) 2024-07-02 17:14:34 -04:00
main.py Add support e-mail to settings (#1960) 2024-07-23 20:58:12 -04:00
models.py add a crawling defaults on the Org to allow setting certain crawl workflow fields as defaults: (#2031) 2024-08-22 10:36:04 -07:00
orgs.py add a crawling defaults on the Org to allow setting certain crawl workflow fields as defaults: (#2031) 2024-08-22 10:36:04 -07:00
pages.py fix resetting of invalid logins: (#2002) 2024-08-07 12:36:06 -07:00
pagination.py Format backend with Black 24 (#1507) 2024-02-07 11:35:34 -08:00
profiles.py optimize org quota lookups (#1973) 2024-07-25 14:00:16 -07:00
storages.py Implement downloading archived item + QA runs as multi-WACZ (#1933) 2024-07-25 10:28:57 -07:00
subs.py Subscription Update Quotas (#1988) 2024-08-05 15:59:47 -07:00
uploads.py optimize org quota lookups (#1973) 2024-07-25 14:00:16 -07:00
users.py fix resetting of invalid logins: (#2002) 2024-08-07 12:36:06 -07:00
utils.py type fixes on util functions (#2009) 2024-08-12 10:54:45 -07:00
version.py version: update to 1.11.4 2024-08-26 12:31:56 -07:00
webhooks.py Add webhooks for qaAnalysisStarted, qaAnalysisFinished, and crawlReviewed (#1974) 2024-07-25 16:53:49 -07:00