# Your First Crawl

Let's crawl your first webpage! Start by opening up a webpage that you'd like to crawl, and note its URL for later.
## Logging in

To start crawling with hosted Browsertrix, you'll need a Browsertrix account. [Sign up for an account](./signup.md) and log in.

!!! note "Self-hosting"

    If you'd like to try Browsertrix before signing up, or if you have specialized hosting requirements, you can host Browsertrix yourself. [Set up Browsertrix](../deploy/index.md) on your system and log in as your admin user.
## Starting the crawl

Once you've logged in, you should see your org [overview](overview.md). If you land somewhere else, navigate to **Overview**.

1. Tap the _Create New..._ shortcut and select **Crawl Workflow**.
2. Choose **Page List**. We'll get into the details of the options [later](./crawl-workflows.md), but this is a good starting point for a simple crawl.
3. Enter the URL of the webpage that you noted earlier in **Page URL(s)**.
4. Tap _Review & Save_.
5. Tap _Save Workflow_.
6. You should now see your new crawl workflow. Give the crawler a few moments to warm up, and then watch as it archives the webpage!
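Under the hood, saving a workflow amounts to saving a crawl configuration that pairs your listed URLs with a page-scoped crawl. As a rough sketch only (the field names below are illustrative assumptions, not Browsertrix's actual API schema), a minimal "Page List" workflow for a single URL might be assembled like this:

```python
# Hypothetical sketch of a minimal "Page List" crawl configuration.
# Field names are illustrative assumptions, not Browsertrix's actual schema.

def make_page_list_workflow(name: str, urls: list[str]) -> dict:
    """Build a crawl-workflow payload for a fixed list of page URLs."""
    if not urls:
        raise ValueError("a Page List workflow needs at least one URL")
    return {
        "name": name,
        "config": {
            "scopeType": "page",  # crawl only the listed pages, not linked ones
            "seeds": [{"url": u} for u in urls],
        },
    }

workflow = make_page_list_workflow(
    "My first crawl",
    ["https://example.com/"],
)
print(workflow["config"]["seeds"])  # [{'url': 'https://example.com/'}]
```

The key idea the sketch illustrates is that a "Page List" crawl treats each listed URL as its own seed with page-level scope, which is why it's a good fit for a simple first crawl.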
---
## Next steps

After running your first crawl, check out the following to learn more about Browsertrix's features:

- A detailed list of [crawl workflow setup](workflow-setup.md) options.
- Adding [exclusions](workflow-setup.md#exclusions) to limit your crawl's scope, and evading crawler traps by [editing exclusion rules while crawling](running-crawl.md#live-exclusion-editing).
- Best practices for crawling with [browser profiles](browser-profiles.md) to capture content only available when logged in to a website.
- Managing archived items, including [uploading previously archived content](archived-items.md#uploading-web-archives).
- Organizing and combining archived items with [collections](collections.md) for sharing and export.
- [Inviting collaborators](org-members.md) to your org.