browsertrix/.github/workflows
Vinzenz Sinapius bb6e703f6a
Configure browsertrix proxies (#1847)
Resolves #1354

Supports crawling through pre-configured proxy servers, allowing users to select which proxy servers to use (requires browsertrix crawler 1.3+)

Config:
- proxies defined in btrix-proxies subchart
- can be configured via the btrix-proxies key, or via a separate proxies.yaml file deployed as a separate subchart
- proxies list refreshed automatically when crawler_proxies.json changes, if the subchart is deployed
- support for ssh and socks5 proxies
- proxy keys added to secrets in subchart
- support for a default proxy that is always used if no other proxy is configured; prevent starting the cluster if the default proxy is not available
- prevent starting manual crawl if previously configured proxy is no longer available, return error
- force 'btrix' username and group name on browsertrix-crawler non-root user to support ssh
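
The default-proxy and availability rules above can be sketched roughly as follows. This is a minimal illustration only, not the code from this change: `resolve_proxy`, `ProxyUnavailableError`, and the example proxy ids are hypothetical names, and the real checks may be split across the backend and operator.

```python
from typing import Optional, Set


class ProxyUnavailableError(Exception):
    """Hypothetical error raised when a configured proxy is no longer available."""


def resolve_proxy(
    requested: Optional[str],
    available: Set[str],
    default: Optional[str] = None,
) -> Optional[str]:
    """Pick the proxy id to use for a crawl (illustrative sketch).

    - An explicitly configured proxy must still be available, else error.
    - With no proxy configured, fall back to the default proxy, if any.
    - The default proxy must itself be available, else error.
    - With neither configured, return None (no proxy).
    """
    if requested:
        if requested not in available:
            raise ProxyUnavailableError(f"proxy '{requested}' is not available")
        return requested
    if default:
        if default not in available:
            raise ProxyUnavailableError(f"default proxy '{default}' is not available")
        return default
    return None
```

For example, `resolve_proxy(None, {"proxy-a"}, default="proxy-a")` falls back to `"proxy-a"`, while `resolve_proxy("proxy-b", {"proxy-a"})` raises the error instead of starting the crawl.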

Operator:
- support crawling through proxies, pass proxyId in CrawlJob
- support running profile browsers with a designated proxy, pass proxyId to ProfileJob
- prevent starting scheduled crawl if previously configured proxy is no longer available

API / Access:
- /api/orgs/all/crawlconfigs/crawler-proxies - get all proxies (superadmin only)
- /api/orgs/{oid}/crawlconfigs/crawler-proxies - get proxies available to particular org
- /api/orgs/{oid}/proxies - update allowed proxies for particular org (superadmin only)
- superadmin can configure which orgs can use which proxies, stored on the org
- superadmin can also allow an org to access all 'shared' proxies, to avoid having to allow a shared proxy on each org.
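
As a small illustration of the endpoint layout listed above, the paths could be assembled like this. The helper names are hypothetical, and the HTTP methods, request bodies, and response shapes are not specified in this message, so they are left out.

```python
from urllib.parse import quote

API_PREFIX = "/api/orgs"


def all_proxies_path() -> str:
    """Path listing all proxies (superadmin only, per the list above)."""
    return f"{API_PREFIX}/all/crawlconfigs/crawler-proxies"


def org_proxies_path(oid: str) -> str:
    """Path listing the proxies available to one org."""
    return f"{API_PREFIX}/{quote(oid, safe='')}/crawlconfigs/crawler-proxies"


def org_allowed_proxies_path(oid: str) -> str:
    """Path for updating an org's allowed proxies (superadmin only)."""
    return f"{API_PREFIX}/{quote(oid, safe='')}/proxies"
```

The org id is percent-encoded before being placed in the path, so an unexpected character in `oid` cannot change the route.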

UI:
- Superadmin has 'Edit Proxies' dialog to configure, for each org, whether it has dedicated proxies and whether it has access to shared proxies.
- User can select a proxy in Crawl Workflow browser settings
- Users can choose to launch a browser profile with a particular proxy
- Display which proxy is used to create profile in profile selector
- Users can choose which default proxy to use for new workflows in Crawling Defaults

---------
Co-authored-by: Ilya Kreymer <ikreymer@gmail.com>
Co-authored-by: Tessa Walsh <tessa@bitarchivist.net>
2024-10-02 18:35:45 -07:00
ansible-lint.yaml Update ansible pipfile (#2088) 2024-09-20 11:41:21 -07:00
deploy-dev.yaml ci fix: deploy-dev.yaml fix, install poetry earlier, add decrypt values to sparse checkout 2024-02-23 18:40:36 -08:00
docs-publish.yaml docs: Publish only on release or manual run (#2055) 2024-08-28 15:28:27 -07:00
frontend-build-prepare.yaml chore: Auto-commit extracted localization strings (#2089) 2024-09-30 10:48:13 -07:00
k3d-ci.yaml Ensure email comparisons are case-insensitive, emails stored as lowercase (#2084) (#2086) (fixes from 1.11.7) 2024-09-19 12:20:34 -07:00
k3d-log-ci.yaml ci: 2023-02-08 11:24:54 -08:00
k3d-nightly-ci.yaml Refactor Invites and Registration, Flatten Per-User Invites (#1902) 2024-07-02 15:13:27 -07:00
lint.yaml quickfix: pin mypy version to avoid issues with latest release 2024-07-19 18:30:57 -07:00
microk8s-ci.yaml Adds Subscription API (#1914) 2024-07-10 17:41:16 -07:00
password-check.yaml Add Repository Index + Chart Rename + Docs Rename (#1708) 2024-04-21 09:42:25 -07:00
project-assign-issue.yml chore: switch actions for issue assign automation 2023-03-08 10:01:00 -08:00
publish-helm-chart.yaml Configure browsertrix proxies (#1847) 2024-10-02 18:35:45 -07:00
release.yaml Add Repository Index + Chart Rename + Docs Rename (#1708) 2024-04-21 09:42:25 -07:00
ui-tests-playwright.yml Serialize datetimes with Z suffix (#2058) 2024-09-12 16:16:13 -07:00