browsertrix/backend/btrixcloud
Ilya Kreymer c134b576ae
Optimize presigning for replay.json (#2516)
Fixes #2515.

This PR introduces significantly optimized logic for presigning URLs
for crawls and collections.
- For collections, the files needed from all crawls are looked up, and
then the 'presign_urls' table is merged in one pass, resulting in a
unified iterator containing files and presign urls for those files.
- For crawls, the presign URLs are also looked up once, and the same
iterator is used for a single crawl with a passed-in list of CrawlFiles.
- URLs that are already signed are added to the return list.
- For any remaining URLs to be signed, a bulk presigning function is
added, which shares an HTTP connection and signs 8 files in parallel
(customizable via the Helm chart, though this may not be needed). This function
is used to call the presigning API in parallel.
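
The flow described above (reuse URLs that are already signed, then sign the rest in parallel with a cap of 8) could be sketched roughly as follows. This is an illustration under stated assumptions, not the code from this PR: `presign_all`, `sign_one`, `cached`, and `batch_size` are hypothetical names, and `sign_one` stands in for whatever coroutine calls the storage presigning API, presumably over one shared HTTP client.

```python
import asyncio
from typing import Awaitable, Callable, Dict, Iterable, List, Tuple

# Hypothetical signer type: takes a storage filename, returns a presigned URL.
# In a real backend this would wrap a single shared HTTP/S3 client.
Signer = Callable[[str], Awaitable[str]]


async def presign_all(
    filenames: Iterable[str],
    cached: Dict[str, str],
    sign_one: Signer,
    batch_size: int = 8,
) -> List[Tuple[str, str]]:
    """Return (filename, url) pairs in input order.

    Files already present in `cached` reuse their URL; the rest are
    signed concurrently, with at most `batch_size` calls in flight.
    """
    names = list(filenames)

    # Start from the URLs that are already signed.
    results: Dict[str, str] = {n: cached[n] for n in names if n in cached}

    # Everything else still needs signing (deduplicated, order preserved).
    pending = list(dict.fromkeys(n for n in names if n not in cached))

    # The semaphore caps how many signing calls run at the same time.
    sem = asyncio.Semaphore(batch_size)

    async def guarded(name: str) -> Tuple[str, str]:
        async with sem:
            return name, await sign_one(name)

    for name, url in await asyncio.gather(*(guarded(n) for n in pending)):
        results[name] = url

    return [(n, results[n]) for n in names]
```

The default of 8 mirrors the parallelism mentioned in the commit message; how the real value is read from the Helm chart is not shown here.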
2025-05-20 12:09:35 -07:00
migrations Add ISO-639-1 language code validation to backend (#2602) 2025-05-13 16:54:33 -04:00
operator Ensure error and behavior logs are written to database in order (#2540) 2025-04-08 09:35:50 -04:00
__init__.py
auth.py
background_jobs.py
basecrawls.py Optimize presigning for replay.json (#2516) 2025-05-20 12:09:35 -07:00
colls.py Optimize presigning for replay.json (#2516) 2025-05-20 12:09:35 -07:00
crawlconfigs.py Add ISO-639-1 language code validation to backend (#2602) 2025-05-13 16:54:33 -04:00
crawlmanager.py feat: Apply saved workflow settings to current crawl (#2514) 2025-04-29 11:43:14 -07:00
crawls.py Add behavior logs from Redis to database and add endpoint to serve (#2526) 2025-04-08 02:16:10 +02:00
db.py Add ISO-639-1 language code validation to backend (#2602) 2025-05-13 16:54:33 -04:00
emailsender.py
invites.py
k8sapi.py
main_bg.py
main_migrations.py
main_op.py
main.py
models.py Optimize presigning for replay.json (#2516) 2025-05-20 12:09:35 -07:00
ops.py
orgs.py Add ISO-639-1 language code validation to backend (#2602) 2025-05-13 16:54:33 -04:00
pages.py Optimize presigning for replay.json (#2516) 2025-05-20 12:09:35 -07:00
pagination.py
profiles.py support overriding crawler image pull policy per channel (#2523) 2025-03-31 14:11:41 -07:00
storages.py Optimize presigning for replay.json (#2516) 2025-05-20 12:09:35 -07:00
subs.py Add API endpoint to check if subscription is activated (#2582) 2025-05-06 17:36:58 -07:00
uploads.py
users.py
utils.py Add ISO-639-1 language code validation to backend (#2602) 2025-05-13 16:54:33 -04:00
version.py Bump version to 1.16.1 (#2606) 2025-05-13 17:29:49 -04:00
webhooks.py