Commit Graph

6 Commits

Author SHA1 Message Date
Vinzenz Sinapius
bb6e703f6a
Configure browsertrix proxies (#1847)
Resolves #1354

Supports crawling through pre-configured proxy servers, allowing users to select which proxy servers to use (requires browsertrix crawler 1.3+)

Config:
- proxies defined in btrix-proxies subchart
- can be configured via btrix-proxies key or separate proxies.yaml file via separate subchart
- proxies list refreshed automatically if crawler_proxies.json changes if subchart is deployed
- support for ssh and socks5 proxies
- proxy keys added to secrets in subchart
- support for default proxy to be always used if no other proxy configured, prevent starting cluster if default proxy not available
- prevent starting manual crawl if previously configured proxy is no longer available, return error
- force 'btrix' username and group name on browsertrix-crawler non-root user to support ssh

Operator:
- support crawling through proxies, pass proxyId in CrawlJob
- support running profile browsers which designated proxy, pass proxyId to ProfileJob
- prevent starting scheduled crawl if previously configured proxy is no longer available

API / Access:
- /api/orgs/all/crawlconfigs/crawler-proxies - get all proxies (superadmin only)
- /api/orgs/{oid}/crawlconfigs/crawler-proxies - get proxies available to particular org
- /api/orgs/{oid}/proxies - update allowed proxies for particular org (superadmin only)
- superadmin can configure which orgs can use which proxies, stored on the org
- superadmin can also allow an org to access all 'shared' proxies, to avoid having to allow a shared proxy on each org.

UI:
- Superadmin has 'Edit Proxies' dialog to configure for each org if it has: dedicated proxies, has access to shared proxies.
- User can select a proxy in Crawl Workflow browser settings
- Users can choose to launch a browser profile with a particular proxy
- Display which proxy is used to create profile in profile selector
- Users can choose with default proxy to use for new workflows in Crawling Defaults

---------
Co-authored-by: Ilya Kreymer <ikreymer@gmail.com>
Co-authored-by: Tessa Walsh <tessa@bitarchivist.net>
2024-10-02 18:35:45 -07:00
sua yoo
ecac4f6939
docs: Reorganize user guide (#2050)
Reorganizes user guide to be more solutions based

---------

Co-authored-by: Henry Wilkinson <henry@wilkinson.graphics>
Co-authored-by: Emma Segal-Grossman <hi@emma.cafe>
Co-authored-by: Tessa Walsh <tessa@bitarchivist.net>
2024-08-28 09:50:42 -07:00
Ilya Kreymer
b574f00d2b
Add Repository Index + Chart Rename + Docs Rename (#1708)
Repository Index: Generate an index.yaml in ./docx/helm-repo/index.yaml
to allow for browsertrix to be a helm repository.
docs: rename docs.browsertrix.cloud -> docs.browsertrix.com
docs: update deployment doc to mention helm repo as preferred way to
install
docs build action: generate repository index in GH action
publish action: update auto-generated message to mention installing from
the repo.

---------
Co-authored-by: Tessa Walsh <tessa@bitarchivist.net>
2024-04-21 09:42:25 -07:00
Ilya Kreymer
17f49a52de
email templates update + customization + doc update (fixes #1652) (#1653)
- modify invite email template to answer common questions
- email templates: make each email template overridable with --set-file
- docs: update customization doc to document how to customize email
templates

---------
Co-authored-by: Tessa Walsh <tessa@bitarchivist.net>
2024-04-08 12:27:47 -07:00
Ilya Kreymer
c1817cbe04
add horizontal pod autoscaler for backend and frontend via helm charts (#1633)
Supports horizontal pod autoscaling (hpa) for backend and frontend pods:
- use cpu and memory averages
- adjust base memory + cpu for backend
- threshold set to 80% cpu and 95% memory utilization by default
(configurable in values.yaml)
- instead of backend and frontend replicas, set max replicas in
values.yaml
- only enable hpa if backend_max_replicas or frontend_max_replicas is
>1, default to 1 for now
2024-03-28 16:39:27 -07:00
Tessa Walsh
144000c7a3
Add guide for customizing Helm chart values (#1556)
Fixes #1555 

This is a first pass at some of the configuration options within the
Helm chart that might be most applicable to users. Emphasis is placed on
configuration that's particular to our application, such as storage and
crawler channels.

---------

Co-authored-by: Henry Wilkinson <henry@wilkinson.graphics>
2024-03-04 12:03:11 -05:00