Resolves#1354
Supports crawling through pre-configured proxy servers, allowing users to select which proxy servers to use (requires browsertrix crawler 1.3+)
Config:
- proxies defined in btrix-proxies subchart
- can be configured via btrix-proxies key or separate proxies.yaml file via separate subchart
- proxies list refreshed automatically if crawler_proxies.json changes if subchart is deployed
- support for ssh and socks5 proxies
- proxy keys added to secrets in subchart
- support for default proxy to be always used if no other proxy configured, prevent starting cluster if default proxy not available
- prevent starting manual crawl if previously configured proxy is no longer available, return error
- force 'btrix' username and group name on browsertrix-crawler non-root user to support ssh
Operator:
- support crawling through proxies, pass proxyId in CrawlJob
- support running profile browsers which designated proxy, pass proxyId to ProfileJob
- prevent starting scheduled crawl if previously configured proxy is no longer available
API / Access:
- /api/orgs/all/crawlconfigs/crawler-proxies - get all proxies (superadmin only)
- /api/orgs/{oid}/crawlconfigs/crawler-proxies - get proxies available to particular org
- /api/orgs/{oid}/proxies - update allowed proxies for particular org (superadmin only)
- superadmin can configure which orgs can use which proxies, stored on the org
- superadmin can also allow an org to access all 'shared' proxies, to avoid having to allow a shared proxy on each org.
UI:
- Superadmin has 'Edit Proxies' dialog to configure for each org if it has: dedicated proxies, has access to shared proxies.
- User can select a proxy in Crawl Workflow browser settings
- Users can choose to launch a browser profile with a particular proxy
- Display which proxy is used to create profile in profile selector
- Users can choose with default proxy to use for new workflows in Crawling Defaults
---------
Co-authored-by: Ilya Kreymer <ikreymer@gmail.com>
Co-authored-by: Tessa Walsh <tessa@bitarchivist.net>
Updates docs to clarify difference between self-hosting and hosted
subscription.
---------
Co-authored-by: Henry Wilkinson <henry@wilkinson.graphics>
Co-authored-by: Tessa Walsh <tessa@bitarchivist.net>
Fixes#890
This PR introduces new streaming superuser-only API endpoints to export
and import database information for an organization. New Adminstrator
deployment documentation on how to manage the process and copy files
between S3 buckets as needed is also included.
---------
Co-authored-by: Henry Wilkinson <henry@wilkinson.graphics>
Co-authored-by: Ilya Kreymer <ikreymer@gmail.com>
Part of #1241
### Changes
- Renames all instances of "Browsertrix Cloud" to "Browsertrix" on the
front end, emails, and documentation
---------
Co-authored-by: emma <hi@emma.cafe>
Fixes#1555
This is a first pass at some of the configuration options within the
Helm chart that might be most applicable to users. Emphasis is placed on
configuration that's particular to our application, such as storage and
crawler channels.
---------
Co-authored-by: Henry Wilkinson <henry@wilkinson.graphics>
* Add stylesheet & mkdocs features
- Adds a custom stylesheet & brand colours
- Adds Recursive as the code font
- Adds repo info to the nav bar
- Adds auto tracking ID links for deep linking to sections as users scroll the page
- Index pages are now a part of their section as determined by their H1
- Removes mkdocs info from future footer
* Reorganize content
- Renames "Dev" to "Develop" for improved navigation labels
- Adds links to tools the first time they're mentioned
- Rewords part of the homepage
- Hides section navigation on the homepage (now we don't have a blank section nav bar!
- Adds some syntax highlighting
- Removes some manual word wrapping — this was done very rarely / inconsistently
* Rename "Developer Docs" index page
- Better title for sidebar
* Update docs.md
- Adds links to tools
- Adds future docs style guide section
- Updates name and makes it an H1
- Replaces hyphens on the homepage with em dashes
* deployment index page: changed title, removed non-k8s section, cleaned up intro
* develop index page: changed title
fixed typo on main page
---------
Co-authored-by: Ilya Kreymer <ikreymer@gmail.com>
* Initial docs move
* Setup mkdocs
* Adds instructions for building docs
* add new deployment docs, local and prod
* set up three sections: deployment, dev and user guide
* remove old deployment docs
* ci: mkdocs gh-pages publish
Co-authored-by: sua yoo <sua@suayoo.com>
Co-authored-by: Ilya Kreymer <ikreymer@gmail.com>