browsertrix/chart/app-templates
Ilya Kreymer 8ae032ff88 More friendly WARC prefix inside WACZ based on Org slug + Crawl Name / First Seed URL. (#1537)
Supports setting the prefix for WARCs inside a WACZ to
`<org slug>-<slug [crawl name | first seed host]>`.
- The prefix is set via the `WARC_PREFIX` env var, supported in
browsertrix-crawler 1.0.0-beta.4 or higher
If a crawl name is provided, the crawl name is used; otherwise, the
hostname of the first seed is used. The name is 'slugified' into
lowercase alphanumeric characters separated by dashes.

Ex: in an organization called `Default Org`, a crawl of
`https://specs.webrecorder.net/` with no name will have WARCs named:
`default-org-specs-webrecorder-net-....warc.gz`
If the crawl is given the name `SPECS`, the WARCs will be named:
`default-org-specs-manual-....warc.gz`
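
The naming rule above can be sketched in a few lines of Python. This is an illustrative sketch only, not the code from this commit: `slugify` and `warc_prefix` are hypothetical helper names, and the sketch covers only the `<org slug>-<slug>` portion of the prefix, not whatever follows it in the final WARC filename.

```python
import re
from typing import Optional
from urllib.parse import urlsplit


def slugify(text: str) -> str:
    """Lowercase the text and join its alphanumeric runs with single dashes."""
    # Hypothetical helper: any run of characters outside a-z / 0-9 becomes one dash.
    return re.sub(r"[^a-z0-9]+", "-", text.lower()).strip("-")


def warc_prefix(org_slug: str, crawl_name: Optional[str], first_seed_url: str) -> str:
    """Build '<org slug>-<slug>' from the crawl name, else the first seed's hostname."""
    # Assumption: an empty or missing crawl name falls back to the seed hostname.
    source = crawl_name or (urlsplit(first_seed_url).hostname or "")
    return f"{org_slug}-{slugify(source)}"


if __name__ == "__main__":
    # Unnamed crawl: the prefix is derived from the first seed's hostname.
    print(warc_prefix("default-org", None, "https://specs.webrecorder.net/"))
    # Named crawl: the prefix is derived from the slugified crawl name.
    print(warc_prefix("default-org", "SPECS", "https://specs.webrecorder.net/"))
```

In a deployment, a value built this way would be handed to the crawler through the `WARC_PREFIX` env var described above.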

Fixes #412 in a default way.
2024-02-22 23:54:23 -08:00
crawl_cron_job.yaml charts cleanup: (#1360) 2023-11-08 19:24:00 -08:00
crawl_job.yaml More friendly WARC prefix inside WACZ based on Org slug + Crawl Name / First Seed URL. (#1537) 2024-02-22 23:54:23 -08:00
crawler.yaml More friendly WARC prefix inside WACZ based on Org slug + Crawl Name / First Seed URL. (#1537) 2024-02-22 23:54:23 -08:00
profile_job.yaml Support multiple crawler versions (#1420) 2024-01-16 15:32:12 -08:00
profilebrowser.yaml node affinity: set to required instead of preferred to keep crawlers on dedicated infrastructure (#1366) 2023-11-13 10:02:05 -08:00
redis.yaml
replica_job.yaml quickfix: bump replica_job memory to 200Mi 2023-11-13 13:45:24 -08:00