browsertrix/backend/btrixcloud
Ilya Kreymer b4fd5e6e94
Crawl Timeout via elapsed time (#1338)
Fixes #1337 

Crawl timeout is tracked via the `elapsedCrawlTime` field on the crawl
status. It is similar to the regular crawl execution time, but counts only
one pod if scale > 1. If scale == 1, the two times are equivalent.

The crawl is gracefully stopped when the elapsed execution time exceeds the
timeout. For better responsiveness, the time the crawl has been running since
the last update interval is also added to the elapsed time.
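
A minimal sketch of the timeout check described above, not the operator's actual code. The function name, its parameters, and the rule that a non-positive timeout means "no timeout" are assumptions made for illustration.

```python
from datetime import datetime, timedelta, timezone


def should_stop_for_timeout(
    elapsed_crawl_time: float,
    last_updated: datetime,
    now: datetime,
    timeout: float,
) -> bool:
    """Return True once stored elapsed time plus time since the last update exceeds timeout.

    elapsed_crawl_time: seconds accumulated up to the last update (hypothetical name).
    timeout: limit in seconds; <= 0 is treated as "no timeout" (assumption).
    """
    if timeout <= 0:
        return False
    # Time the crawl has been running since the last stored update, never negative.
    since_update = max((now - last_updated).total_seconds(), 0.0)
    return elapsed_crawl_time + since_update > timeout


if __name__ == "__main__":
    last = datetime(2023, 11, 6, 16, 0, 0, tzinfo=timezone.utc)
    now = last + timedelta(seconds=20)
    print(should_stop_for_timeout(50.0, last, now, 60.0))
```

Adding the time since the last update means the stop can be requested between update intervals instead of waiting for the next stored value.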

Details:
- handle crawl timeout via elapsed crawl time (the longest running time of a
single pod) instead of an expire time.
- include the time the crawl has been running since the last update, for best precision
- more accurately count the elapsed time during which the crawl is actually running
- store elapsedCrawlTime in addition to crawlExecTime, storing the
longest single-pod duration since the last test interval
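
The difference between the two stored values can be sketched as follows. This is an illustration under assumptions, not the code from this commit: the `CrawlTimes` class and `add_interval` helper are hypothetical, and it assumes crawlExecTime sums the running time of all pods while elapsedCrawlTime adds only the longest single-pod duration per interval.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class CrawlTimes:
    """Hypothetical container mirroring crawlExecTime and elapsedCrawlTime (seconds)."""

    crawl_exec_time: float = 0.0
    elapsed_crawl_time: float = 0.0


def add_interval(times: CrawlTimes, pod_durations: List[float]) -> CrawlTimes:
    """Fold one update interval's per-pod running durations into the totals."""
    if not pod_durations:
        return times
    # Assumption: execution time counts every pod.
    times.crawl_exec_time += sum(pod_durations)
    # Elapsed time counts only the longest-running pod in this interval.
    times.elapsed_crawl_time += max(pod_durations)
    return times


if __name__ == "__main__":
    times = add_interval(CrawlTimes(), [10.0, 7.0])  # scale == 2
    print(times.crawl_exec_time, times.elapsed_crawl_time)
```

With a single pod the two totals grow by the same amount each interval, which matches the statement that the times are equivalent when scale == 1.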

---------
Co-authored-by: Tessa Walsh <tessa@bitarchivist.net>
2023-11-06 16:32:58 -08:00
migrations Storage Refactor: Replication + Custom Storage Support (#1296) 2023-10-26 21:44:09 -07:00
__init__.py
auth.py Additional Type Hints / Type Fix Pass (#1320) 2023-10-30 12:59:24 -04:00
background_jobs.py Background Jobs Work (#1321) 2023-11-02 13:02:17 -07:00
basecrawls.py Background Jobs Work (#1321) 2023-11-02 13:02:17 -07:00
colls.py Additional Type Hints / Type Fix Pass (#1320) 2023-10-30 12:59:24 -04:00
crawlconfigs.py Additional Type Hints / Type Fix Pass (#1320) 2023-10-30 12:59:24 -04:00
crawlmanager.py Background Jobs Work (#1321) 2023-11-02 13:02:17 -07:00
crawls.py exclusion optimizations: dynamic exclusions (part of #1216): (#1268) 2023-11-06 09:36:25 -08:00
db.py Additional Type Hints / Type Fix Pass (#1320) 2023-10-30 12:59:24 -04:00
emailsender.py Additional Type Hints / Type Fix Pass (#1320) 2023-10-30 12:59:24 -04:00
invites.py Additional Type Hints / Type Fix Pass (#1320) 2023-10-30 12:59:24 -04:00
k8sapi.py Crawl Timeout via elapsed time (#1338) 2023-11-06 16:32:58 -08:00
main_op.py Background Jobs Work (#1321) 2023-11-02 13:02:17 -07:00
main.py Background Jobs Work (#1321) 2023-11-02 13:02:17 -07:00
models.py Background Jobs Work (#1321) 2023-11-02 13:02:17 -07:00
operator.py Crawl Timeout via elapsed time (#1338) 2023-11-06 16:32:58 -08:00
orgs.py Additional Type Hints / Type Fix Pass (#1320) 2023-10-30 12:59:24 -04:00
pagination.py Move pydantic models to separate module + refactor crawl response endpoints to be consistent (#983) 2023-07-20 13:05:33 +02:00
profiles.py Background Jobs Work (#1321) 2023-11-02 13:02:17 -07:00
storages.py Additional Type Hints / Type Fix Pass (#1320) 2023-10-30 12:59:24 -04:00
uploads.py Background Jobs Work (#1321) 2023-11-02 13:02:17 -07:00
users.py Additional Type Hints / Type Fix Pass (#1320) 2023-10-30 12:59:24 -04:00
utils.py Add slugs to org backend (#1250) 2023-10-10 18:30:09 -07:00
version.py version: bump to 1.8.0-beta.1 2023-10-27 14:35:24 -07:00
webhooks.py Additional Type Hints / Type Fix Pass (#1320) 2023-10-30 12:59:24 -04:00
zip.py Fix: Stream log downloading from WACZ (#1225) 2023-09-28 18:54:52 -07:00