From 251aef3ac19fd8c08172a422a97ddb6e261c743e Mon Sep 17 00:00:00 2001
From: Henry Wilkinson
Date: Thu, 30 May 2024 14:50:10 -0400
Subject: [PATCH] Docs: Elaborates on using user agents (#1841)

- Provides a link to Mozilla's page explaining what they are (good for folks new to the concept)
- Provides a link to useragents.me, the same site we link to in the app
- Provides two examples of situations where they may be helpful to get around content restrictions
---
 docs/user-guide/workflow-setup.md | 15 ++++++++++++++-
 1 file changed, 14 insertions(+), 1 deletion(-)

diff --git a/docs/user-guide/workflow-setup.md b/docs/user-guide/workflow-setup.md
index f6d91eeb..e2668386 100644
--- a/docs/user-guide/workflow-setup.md
+++ b/docs/user-guide/workflow-setup.md
@@ -168,7 +168,20 @@ Will prevent any content from the domains listed in [Steven Black's Unified Host
 
 ### User Agent
 
-Sets the browser's user agent in outgoing requests to the specified value. If left blank, the crawler will use the browser's default user agent.
+Sets the browser's [user agent](https://developer.mozilla.org/en-US/docs/Web/HTTP/Headers/User-Agent) in outgoing requests to the specified value. If left blank, the crawler will use the Brave browser's default user agent. For a list of common user agents, see [useragents.me](https://www.useragents.me/).
+
+??? example "Using custom user agents to get around restrictions"
+    Although it is against best practices, some websites block specific browsers based on their user agent: a string of text that browsers send to web servers to identify what type of browser or operating system is requesting content. If Brave is blocked, using the user agent string of a different browser (such as Chrome or Firefox) may be enough to convince the website that a different browser is being used.
+
+    User agents can also be used to voluntarily identify your crawling activity, which can be useful when working with a website's owner to ensure crawls can be completed successfully. We recommend using a user agent string similar to the following, replacing the `orgname` and URL comment with your own:
+
+    ```
+    Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/125.0.0.0 Safari/537.3 orgname.browsertrix (+https://example.com/crawling-explanation-page)
+    ```
+
+    If you have no webpage identifying your organization or describing your crawling activities to link to, omit the parenthetical comment at the end entirely.
+
+    This string must be provided to the website's owner so they can allowlist Browsertrix and prevent it from being blocked.
 
 ### Language
 
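Before relying on a custom user agent for a large crawl, it can help to confirm that the target site responds normally when that string is sent. The sketch below is not part of Browsertrix or this patch; it is a minimal check using only Python's standard library, and the URL and user agent values are placeholders to replace with the site you plan to crawl and the string configured in your workflow.

```python
# Minimal sketch: send one request with a custom User-Agent header and report
# whether the target site accepts or rejects it. The URL and user agent string
# below are placeholders; substitute your own values.
import urllib.error
import urllib.request

USER_AGENT = (
    "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) "
    "Chrome/125.0.0.0 Safari/537.3 orgname.browsertrix "
    "(+https://example.com/crawling-explanation-page)"
)

request = urllib.request.Request(
    "https://example.com/",  # placeholder: page you intend to crawl
    headers={"User-Agent": USER_AGENT},
)

try:
    with urllib.request.urlopen(request, timeout=30) as response:
        print("Accepted:", response.status)
except urllib.error.HTTPError as err:
    # A 403 or similar status here suggests this user agent is being blocked.
    print("Rejected:", err.code)
```

A single check like this is only indicative: some sites vary their blocking by path, request rate, or IP address rather than by user agent alone.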