docs: formatting fixes & minor content updates (#1091)

Additional tweaks on Browser Profiles pages + general consistency pass
Henry Wilkinson 2023-08-21 16:26:43 -04:00 committed by GitHub
parent 02a01e7abb
commit 2952988864
GPG Key ID: 4AEE18F83AFDEB23
12 changed files with 42 additions and 53 deletions

@ -2,18 +2,18 @@
*Playbook Path: [ansible/playbooks/install_microk8s.yml](https://github.com/webrecorder/browsertrix-cloud/blob/main/ansible/playbooks/do_setup.yml)* *Playbook Path: [ansible/playbooks/install_microk8s.yml](https://github.com/webrecorder/browsertrix-cloud/blob/main/ansible/playbooks/do_setup.yml)*
This playbook provides an easy way to install BrowserTrix Cloud on DigitalOcean. It automatically sets up Browsertrix with, LetsEncrypt certificates. This playbook provides an easy way to install Browsertrix Cloud on DigitalOcean. It automatically sets up Browsertrix with LetsEncrypt certificates.
### Requirements ### Requirements
To run this ansible playbook, you need to: To run this ansible playbook, you need to:
* Have a [DigitalOcean Account](https://m.do.co/c/e0db3814e33e) where this will run. - Have a [DigitalOcean Account](https://m.do.co/c/e0db3814e33e) where this will run.
* Create a [DigitalOcean API Key](https://cloud.digitalocean.com/account/api) which will need to be set in your terminal sessions environment variables `export DO_API_TOKEN` - Create a [DigitalOcean API Key](https://cloud.digitalocean.com/account/api) which will need to be set in your terminal sessions environment variables `export DO_API_TOKEN`
* `doctl` command line client configured (run `doctl auth init`) - `doctl` command line client configured (run `doctl auth init`)
* Create a [DigitalOcean Spaces](https://docs.digitalocean.com/reference/api/spaces-api/) API Key which will also need to be set in your terminal sessions environment variables, which should be set as `DO_AWS_ACCESS_KEY` and `DO_AWS_SECRET_KEY` - Create a [DigitalOcean Spaces](https://docs.digitalocean.com/reference/api/spaces-api/) API Key which will also need to be set in your terminal sessions environment variables, which should be set as `DO_AWS_ACCESS_KEY` and `DO_AWS_SECRET_KEY`
* Configure a DNS A Record and CNAME record. - Configure a DNS A Record and CNAME record.
* Have a working python and pip configuration through your OS Package Manager - Have a working python and pip configuration through your OS Package Manager
#### Install #### Install
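
The requirements in the hunk above name three environment variables (`DO_API_TOKEN`, `DO_AWS_ACCESS_KEY`, `DO_AWS_SECRET_KEY`). A minimal pre-flight sketch for checking them before running the playbook might look like the following. The helper name `missing_env_vars` is hypothetical and not part of the repository.

```python
import os

# Variable names taken from the requirements list in the hunk above.
REQUIRED_VARS = ("DO_API_TOKEN", "DO_AWS_ACCESS_KEY", "DO_AWS_SECRET_KEY")


def missing_env_vars(env=None):
    """Return the required variable names that are unset or empty.

    `env` defaults to the current process environment; passing a dict
    makes the check easy to exercise without touching real credentials.
    """
    env = os.environ if env is None else env
    return [name for name in REQUIRED_VARS if not env.get(name)]


if __name__ == "__main__":
    missing = missing_env_vars()
    if missing:
        raise SystemExit("Missing environment variables: " + ", ".join(missing))
    print("All DigitalOcean environment variables are set.")
```

Run in the same terminal session that will run the playbook, this only confirms that the variables are present; it does not validate the keys against DigitalOcean.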

@ -2,17 +2,16 @@
*Playbook Path: [ansible/playbooks/install_microk8s.yml](https://github.com/webrecorder/browsertrix-cloud/blob/main/ansible/playbooks/install_microk8s.yml)* *Playbook Path: [ansible/playbooks/install_microk8s.yml](https://github.com/webrecorder/browsertrix-cloud/blob/main/ansible/playbooks/install_microk8s.yml)*
This playbook provides an easy way to install Browsertrix Cloud on an Ubuntu (tested on Jammy Jellyfish) and a RedHat 9 (tested on Rocky Linux 9). This playbook provides an easy way to install Browsertrix Cloud on Ubuntu (tested on Jammy Jellyfish) and RedHat 9 (tested on Rocky Linux 9). It automatically sets up Browsertrix with Letsencrypt certificates.
It automatically sets up Browsertrix with, Letsencrypt certificates.
### Requirements ### Requirements
To run this ansible playbook, you need to: To run this ansible playbook, you need to:
* Have a server / VPS where browsertrix will run. - Have a server / VPS where browsertrix will run.
* Configure a DNS A Record to point at your server's IP address. - Configure a DNS A Record to point at your server's IP address.
* Make sure you can ssh to it, with a sudo user: ssh <your-user>@<your-domain> - Make sure you can ssh to it, with a sudo user: ssh <your-user>@<your-domain>
* Install Ansible on your local machine (the control machine). - Install Ansible on your local machine (the control machine).
#### Install #### Install

@ -10,6 +10,4 @@ The main requirements for Browsertrix Cloud are:
- [Helm 3](https://helm.sh/) (package manager for Kubernetes) - [Helm 3](https://helm.sh/) (package manager for Kubernetes)
We have prepared a [Local Deployment Guide](./local) which covers several options for testing Browsertrix Cloud locally on a single machine, We have prepared a [Local Deployment Guide](./local) which covers several options for testing Browsertrix Cloud locally on a single machine, as well as a [Production (Self-Hosted and Cloud) Deployment](./production) guides to help with setting up Browsertrix Cloud for different production scenarios.
as well as a [Production (Self-Hosted and Cloud) Deployment](./production) guides to help with
setting up Browsertrix Cloud for different production scenarios.

@ -8,13 +8,13 @@ Before running Browsertrix Cloud, you'll need to set up a running [Kubernetes](h
Today, there are numerous ways to deploy Kubernetes fairly easily, and we recommend trying one of the single-node options, which include Docker Desktop, microk8s, minikube and k3s. Today, there are numerous ways to deploy Kubernetes fairly easily, and we recommend trying one of the single-node options, which include Docker Desktop, microk8s, minikube and k3s.
The instructions below assume you have cloned the [https://github.com/webrecorder/browsertrix-cloud](https://github.com/webrecorder/browsertrix-cloud) repository locally, and have local package managers for your platform (eg. `brew` for Mac, `choco` for Windows, etc...) already installed. The instructions below assume you have cloned the [https://github.com/webrecorder/browsertrix-cloud](https://github.com/webrecorder/browsertrix-cloud) repository locally, and have local package managers for your platform (eg. `brew` for macOS, `choco` for Windows, etc...) already installed.
Here are some environment specific instructions for setting up a local cluster from different Kubernetes vendors: Here are some environment specific instructions for setting up a local cluster from different Kubernetes vendors:
??? info "Docker Desktop (recommended for Mac and Windows)" ??? info "Docker Desktop (recommended for macOS and Windows)"
For Mac and Windows, we recommend testing out Browsertrix Cloud using Kubernetes support in Docker Desktop as that will be one of the simplest options. For macOS and Windows, we recommend testing out Browsertrix Cloud using Kubernetes support in Docker Desktop as that will be one of the simplest options.
1. [Install Docker Desktop](https://www.docker.com/products/docker-desktop/) if not already installed. 1. [Install Docker Desktop](https://www.docker.com/products/docker-desktop/) if not already installed.
@ -22,7 +22,7 @@ Here are some environment specific instructions for setting up a local cluster f
3. Restart Docker Desktop if asked, and wait for it to fully restart. 3. Restart Docker Desktop if asked, and wait for it to fully restart.
4. Install [Helm](https://helm.sh/), which can be installed with `brew install helm` (Mac) or `choco install kubernetes-helm` (Windows) or following some of the [other install options](https://helm.sh/docs/intro/install/) 4. Install [Helm](https://helm.sh/), which can be installed with `brew install helm` (macOS) or `choco install kubernetes-helm` (Windows) or following some of the [other install options](https://helm.sh/docs/intro/install/)
??? info "MicroK8S (recommended for Ubuntu)" ??? info "MicroK8S (recommended for Ubuntu)"
@ -36,19 +36,19 @@ Here are some environment specific instructions for setting up a local cluster f
Note: microk8s comes with its own version helm, so you don't need to install it separately. Replace `helm` with `microk8s helm3` in the subsequent instructions below. Note: microk8s comes with its own version helm, so you don't need to install it separately. Replace `helm` with `microk8s helm3` in the subsequent instructions below.
??? info "Minikube (Windows, Mac or Linux)" ??? info "Minikube (Windows, macOS, or Linux)"
1. Install Minikube [following installation instructions](https://minikube.sigs.k8s.io/docs/start/), eg. `brew install minikube`. 1. Install Minikube [following installation instructions](https://minikube.sigs.k8s.io/docs/start/), eg. `brew install minikube`.
Note that Minikube also requires Docker or another container management system to be installed as well. Note that Minikube also requires Docker or another container management system to be installed as well.
2. Install [Helm](https://helm.sh/), which can be installed with `brew install helm` (Mac) or `choco install kubernetes-helm` (Windows) or following some of the [other install options](https://helm.sh/docs/intro/install/) 2. Install [Helm](https://helm.sh/), which can be installed with `brew install helm` (macOS) or `choco install kubernetes-helm` (Windows) or following some of the [other install options](https://helm.sh/docs/intro/install/)
??? info "K3S (recommended for non-Ubuntu Linux)" ??? info "K3S (recommended for non-Ubuntu Linux)"
1. Install K3s [as per the instructions](https://docs.k3s.io/quick-start) 1. Install K3s [as per the instructions](https://docs.k3s.io/quick-start)
2. Install [Helm](https://helm.sh/), which can be installed with `brew install helm` (Mac) or `choco install kubernetes-helm` (Windows) or following some of the [other install options](https://helm.sh/docs/intro/install/) 2. Install [Helm](https://helm.sh/), which can be installed with `brew install helm` (macOS) or `choco install kubernetes-helm` (Windows) or following some of the [other install options](https://helm.sh/docs/intro/install/)
3. Set `KUBECONFIG` to point to the config for K3S: `export KUBECONFIG=/etc/rancher/k3s/k3s.yaml` to ensure Helm will use the correct version. 3. Set `KUBECONFIG` to point to the config for K3S: `export KUBECONFIG=/etc/rancher/k3s/k3s.yaml` to ensure Helm will use the correct version.
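
For scripted setups, the `KUBECONFIG` step above can also be applied per command rather than exported for the whole shell. The sketch below assumes the K3s config path quoted in that step; the helper name `helm_env` is hypothetical.

```python
import os

# Path quoted in the K3s step above.
K3S_KUBECONFIG = "/etc/rancher/k3s/k3s.yaml"


def helm_env(base_env=None):
    """Return a copy of the environment with KUBECONFIG pointing at K3s.

    The caller's environment is copied, never modified in place.
    """
    env = dict(os.environ if base_env is None else base_env)
    env["KUBECONFIG"] = K3S_KUBECONFIG
    return env
```

The returned mapping can be passed as `env=helm_env()` to `subprocess.run` when invoking `helm`, which leaves the calling shell's own `KUBECONFIG` untouched.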
@ -105,9 +105,9 @@ The command will exit when all pods have been loaded, or if there is an error an
If the command succeeds, you should be able to access Browsertrix Cloud by loading: **[http://localhost:30870/](http://localhost:30870/)** in your browser. If the command succeeds, you should be able to access Browsertrix Cloud by loading: **[http://localhost:30870/](http://localhost:30870/)** in your browser.
??? info "Minikube (on Mac)" ??? info "Minikube (on macOS)"
When using Minikube on a Mac, the port will not be 30870. Instead, Minikube opens a tunnel to a random port, When using Minikube on macOS, the port will not be 30870. Instead, Minikube opens a tunnel to a random port,
obtained by running `minikube service browsertrix-cloud-frontend --url` in a separate terminal. obtained by running `minikube service browsertrix-cloud-frontend --url` in a separate terminal.
Use the provided URL (in the format `http://127.0.0.1:<TUNNEL_PORT>`) instead. Use the provided URL (in the format `http://127.0.0.1:<TUNNEL_PORT>`) instead.
@ -140,8 +140,7 @@ To uninstall, run `helm uninstall btrix`.
By default, the database + storage volumes are not automatically deleted, so you can run `helm upgrade ...` again to restart the cluster in its current state. By default, the database + storage volumes are not automatically deleted, so you can run `helm upgrade ...` again to restart the cluster in its current state.
If you are upgrading from a previous version, and run into issues with `helm upgrade ...`, we recommend If you are upgrading from a previous version, and run into issues with `helm upgrade ...`, we recommend uninstalling and then re-running upgrade.
uninstalling and then re-running upgrade.
## Deleting all Data ## Deleting all Data
@ -149,6 +148,4 @@ To fully delete all persistent data (db + archives) created in the cluster, also
## Deploying for Local Development ## Deploying for Local Development
These instructions are intended for deploying the cluster from the latest release. These instructions are intended for deploying the cluster from the latest release. See [setting up cluster for local development](../develop/local-dev-setup.md) for additional customizations related to developing Browsertrix Cloud and deploying from local images.
See [setting up cluster for local development](../develop/local-dev-setup.md) for additional customizations related to
developing Browsertrix Cloud and deploying from local images.

@ -1,7 +1,6 @@
# Production: Self-Hosted and Cloud # Production: Self-Hosted and Cloud
For production and hosted deployments (both on a single machine or in the cloud), the only requirement is to have a designed domain For production and hosted deployments (both on a single machine or in the cloud), the only requirement is to have a designed domain and (strongly recommended, but not required) second domain for signing web archives.
and (strongly recommended, but not required) second domain for signing web archives.
We are also experimenting with [Ansible playbooks](../deploy/ansible) for cloud deployment setups. We are also experimenting with [Ansible playbooks](../deploy/ansible) for cloud deployment setups.

@ -110,7 +110,7 @@ There are a lot of different options provided by Material for MkDocs — So many
???+ Note ???+ Note
The default call-out, used to highlight something if there isn't a more relevant one — should generally be expanded by default but can be collapsable by the user if the note is long. The default call-out, used to highlight something if there isn't a more relevant one — should generally be expanded by default but can be collapsable by the user if the note is long.
!!! Tip !!! Tip "Tip — May have a title stating the tip or best practice"
Used to highlight a point that is useful for everyone to understand about the documented subject — should be expanded and kept brief. Used to highlight a point that is useful for everyone to understand about the documented subject — should be expanded and kept brief.
???+ Info "Info — Must have a title describing the context under which this information is useful" ???+ Info "Info — Must have a title describing the context under which this information is useful"

@ -72,9 +72,9 @@ If connecting to a local deployment cluster, set `API_BASE_URL` to:
API_BASE_URL=http://localhost:30870 API_BASE_URL=http://localhost:30870
``` ```
??? info "Port when using Minikube (on Mac)" ??? info "Port when using Minikube (on macOS)"
When using Minikube on a Mac, the port will not be 30870. Instead, Minikube opens a tunnel to a random port, When using Minikube on macOS, the port will not be 30870. Instead, Minikube opens a tunnel to a random port,
obtained by running `minikube service browsertrix-cloud-frontend --url` in a separate terminal. obtained by running `minikube service browsertrix-cloud-frontend --url` in a separate terminal.
Set API_BASE_URL to provided URL instead, eg. `API_BASE_URL=http://127.0.0.1:<TUNNEL_PORT>` Set API_BASE_URL to provided URL instead, eg. `API_BASE_URL=http://127.0.0.1:<TUNNEL_PORT>`
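
As an illustration of the Minikube note above, the URL printed by `minikube service browsertrix-cloud-frontend --url` can be turned into the `API_BASE_URL` setting with a small helper. This is a sketch only: the function name `api_base_url_line` is hypothetical, and it assumes the command prints a single URL in the `http://127.0.0.1:<TUNNEL_PORT>` form described in the note.

```python
from urllib.parse import urlsplit


def api_base_url_line(tunnel_url: str) -> str:
    """Build an `API_BASE_URL=...` line from a Minikube tunnel URL.

    Raises ValueError if the input is not an http(s) URL with an
    explicit port, since the tunnel port is the part that changes.
    """
    parts = urlsplit(tunnel_url.strip())
    if parts.scheme not in ("http", "https") or parts.hostname is None or parts.port is None:
        raise ValueError(f"Expected a URL like http://127.0.0.1:<port>, got {tunnel_url!r}")
    return f"API_BASE_URL={parts.scheme}://{parts.hostname}:{parts.port}"


if __name__ == "__main__":
    # Example with a made-up tunnel port.
    print(api_base_url_line("http://127.0.0.1:53412\n"))
```

Because the tunnel port is random, the value has to be regenerated each time the tunnel is reopened.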

@ -13,8 +13,7 @@ The deployment can then be [further customized for local development](./local-de
### Backend ### Backend
The backend is an API-only system, using the FastAPI framework. The latest API reference is available The backend is an API-only system, using the FastAPI framework. The latest API reference is available under ./api of a running cluster.
under ./api of a running cluster.
At this time, the backend must be deployed in the Kubernetes cluster. At this time, the backend must be deployed in the Kubernetes cluster.

@ -125,12 +125,10 @@ Refer back to the [Local Development guide](../deploy/local.md#waiting-for-clust
## Update the Images ## Update the Images
After making any changes to backend code (in `./backend`) or frontend code (in `./frontend`), After making any changes to backend code (in `./backend`) or frontend code (in `./frontend`), you'll need to rebuild the images as specified above, before running `helm upgrade ...` to re-deploy.
you'll need to rebuild the images as specified above, before running `helm upgrade ...` to re-deploy.
Changes to settings in `./chart/local.yaml` can be deployed with `helm upgrade ...` directly. Changes to settings in `./chart/local.yaml` can be deployed with `helm upgrade ...` directly.
## Deploying Frontend Only ## Deploying Frontend Only
If you are just making changes to the frontend, you can also [deploy the frontend separately](frontend-dev.md) If you are just making changes to the frontend, you can also [deploy the frontend separately](frontend-dev.md) using a dev server for quicker iteration.
using a dev server for quicker iteration.

@ -1,21 +1,20 @@
# Browser Profiles # Browser Profiles
Browser Profiles are saved instances of a web browsing session that can be reused to crawl websites as they were configured, with any cookies or saved login sessions. They are specifically useful for crawling websites as a logged in user or accepting cookie consent popups. Browser profiles are saved instances of a web browsing session that can be reused to crawl websites as they were configured, with any cookies or saved login sessions. Using a pre-configured profile also means that content that can only be viewed by logged in users can be archived, without archiving the actual login credentials.
Using a pre-created profile means that paywalled content can be archived, without archiving the actual login credentials. !!! tip "Best practice: Create and use web archiving-specific accounts for crawling with browser profiles"
??? info "Best practice: Create and use web archiving-specific accounts" For the following reasons, we recommend creating dedicated accounts for archiving anything that is locked behind login credentials but otherwise public, especially on social media platforms.
Some websites may rate limit or lock your account if they deem crawling-related activity to be suspicious, such as logging in from a new location. - While user names and passwords are not, the access tokens for logged in websites used in the browser profile creation process _are stored_ by the server.
While your login information (username, password) is not archived, *other* data such as cookies, location, etc.. may be part of a logged in content (after all, personalized content is often the goal of paywalls). - Some websites may rate limit or lock accounts for reasons they deem to be suspicious, such as logging in from a new location or any crawling-related activity.
Due to nature of social media especially, existing accounts may have personally identifiable information, even when accessing otherwise public content. - While login information (username, password) is not archived, *other* data such as cookies, location, etc.. may be included in the resulting crawl (after all, personalized content is often the goal of sites that require credentials to view content).
For these reasons, we recommend creating dedicated accounts for archiving anything that is paywalled but otherwise public, especially on social media platforms. - Due to nature of social media specifically, existing accounts may have personally identifiable information, even when accessing otherwise public content.
Of course, there are exceptions -- such as when the goal is to archive personalized or private content accessible only from designated accounts.
Of course, there are exceptions — such as when the goal is to archive personalized or private content accessible only from designated accounts.
## Creating New Browser Profiles ## Creating New Browser Profiles
@ -28,4 +27,3 @@ Press the _Next_ button to save the browser profile with a _Name_ and _Descripti
Sometimes websites will log users out or expire cookies after a period of time. In these cases, when crawling the browser profile can still be loaded but may not behave as it did when it was initially set up. Sometimes websites will log users out or expire cookies after a period of time. In these cases, when crawling the browser profile can still be loaded but may not behave as it did when it was initially set up.
To update the profile, go to the profile's details page and press the _Edit Browser Profile_ button to load and interact with the sites that need to be re-configured. When finished, press the _Save Browser Profile_ button to return to the profile's details page. To update the profile, go to the profile's details page and press the _Edit Browser Profile_ button to load and interact with the sites that need to be re-configured. When finished, press the _Save Browser Profile_ button to return to the profile's details page.

@ -10,7 +10,8 @@ If you have been sent an [invite](org-settings#members), enter a password and na
If the server has enabled signups and you have been given a registration link, enter your email address, password, and name to create a new account. Your account will be added to the server's default organization. If the server has enabled signups and you have been given a registration link, enter your email address, password, and name to create a new account. Your account will be added to the server's default organization.
!!! info "At this time, the name field is not yet editable." !!! note
Names chosen on signup cannot be changed later.
--- ---

@ -26,7 +26,7 @@ It is also available under the _Additional URLs_ section for Seeded Crawls where
When enabled, the crawler will visit all the links it finds within each page defined in the _List of URLs_ field. When enabled, the crawler will visit all the links it finds within each page defined in the _List of URLs_ field.
??? tip "Crawling tags & search queries with URL List crawls" ??? example "Crawling tags & search queries with URL List crawls"
This setting can be useful for crawling the content of specific tags or search queries. Specify the tag or search query URL(s) in the _List of URLs_ field, e.g: `https://example.com/search?q=tag`, and enable _Include Any Linked Page_ to crawl all the content present on that search query page. This setting can be useful for crawling the content of specific tags or search queries. Specify the tag or search query URL(s) in the _List of URLs_ field, e.g: `https://example.com/search?q=tag`, and enable _Include Any Linked Page_ to crawl all the content present on that search query page.
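
When several tags or queries are involved, the entries for the _List of URLs_ field can be generated rather than typed by hand. The sketch below assumes the site takes its query in a `q` parameter, as in the `https://example.com/search?q=tag` example above; real sites may use a different parameter name, and the helper `search_urls` is hypothetical.

```python
from urllib.parse import urlencode


def search_urls(base_url, queries, param="q"):
    """Return one search URL per query, with the query safely encoded."""
    return [f"{base_url}?{urlencode({param: query})}" for query in queries]


if __name__ == "__main__":
    # One URL per line, ready to paste into the List of URLs field.
    for url in search_urls("https://example.com/search", ["tag", "web archive"]):
        print(url)
```

Encoding the query through `urlencode` keeps spaces and special characters from producing malformed URLs.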
### Crawl Start URL ### Crawl Start URL