browsertrix/backend
Ilya Kreymer c134b576ae
Optimize presigning for replay.json (#2516)
Fixes #2515.

This PR introduces significantly optimized logic for presigning URLs
for crawls and collections.
- For collections, the files needed from all crawls are looked up, and
then the 'presign_urls' table is merged in one pass, resulting in a
unified iterator containing files and presign URLs for those files.
- For crawls, the presign URLs are also looked up once, and the same
iterator is used for a single crawl with a passed-in list of CrawlFiles.
- URLs that are already signed are added to the return list.
- For any remaining URLs to be signed, a bulk presigning function is
added, which shares an HTTP connection and signs 8 files in parallel
(customizable via helm chart, though may not be needed). This function
is used to call the presigning API in parallel.
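
The flow described above (reuse URLs that are already signed, then sign the
remainder with bounded parallelism) can be sketched roughly as follows. This
is an illustrative sketch only, not the code from this PR: `presign_all`,
`fake_presign`, `DEFAULT_PARALLEL` and the example URLs are hypothetical
names, and the real signing call is replaced by a stand-in that does no
network I/O.

```python
import asyncio
from typing import Awaitable, Callable, Dict, Iterable, List, Tuple

# Hypothetical default; the commit message mentions 8 files signed in parallel.
DEFAULT_PARALLEL = 8


async def fake_presign(filename: str) -> str:
    """Stand-in for a real presigning call (hypothetical, no network I/O)."""
    await asyncio.sleep(0)
    return f"https://storage.example/{filename}?sig=demo"


async def presign_all(
    filenames: Iterable[str],
    already_signed: Dict[str, str],
    sign_one: Callable[[str], Awaitable[str]] = fake_presign,
    parallel: int = DEFAULT_PARALLEL,
) -> List[Tuple[str, str]]:
    """Return (filename, url) pairs, signing only files with no cached URL."""
    names = list(filenames)
    results: Dict[str, str] = {}
    to_sign: List[str] = []

    # Single pass: reuse URLs that are already signed, queue the rest.
    for name in names:
        if name in already_signed:
            results[name] = already_signed[name]
        else:
            to_sign.append(name)

    # The semaphore caps how many signing calls are in flight at once.
    sem = asyncio.Semaphore(parallel)

    async def sign(name: str) -> None:
        async with sem:
            results[name] = await sign_one(name)

    await asyncio.gather(*(sign(name) for name in to_sign))

    # Preserve the caller's original file order.
    return [(name, results[name]) for name in names]
```

In a real backend, `sign_one` would presumably wrap a storage client shared
across all calls, so that every request reuses one HTTP connection; that
part is left out here.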
2025-05-20 12:09:35 -07:00
btrixcloud Optimize presigning for replay.json (#2516) 2025-05-20 12:09:35 -07:00
test Add ISO-639-1 language code validation to backend (#2602) 2025-05-13 16:54:33 -04:00
test_nightly Fix nightly tests (#2460) 2025-03-06 16:23:30 -08:00
.pylintrc security: tweak get /invite endpoints / InviteOut to: (#2087) 2024-09-20 11:52:56 -07:00
dev-requirements.txt
Dockerfile Add backend support for custom behaviors + validation endpoint (#2505) 2025-04-02 16:20:51 -07:00
mypy.ini
requirements.txt Add ISO-639-1 language code validation to backend (#2602) 2025-05-13 16:54:33 -04:00
test-requirements.txt Fix nightly tests: Add boto3 as test requirement (#2116) 2024-10-23 13:41:22 -07:00