61 Commits

Author SHA1 Message Date
Emma Segal-Grossman
c0cf6e6fdc
update docs nav with emails page (#2794)
Quick follow-up to #1375. Makes email dev docs visible.

---------

Co-authored-by: Ilya Kreymer <ikreymer@gmail.com>
2025-08-07 22:22:40 -07:00
Ilya Kreymer
d6cce1961c
docs: additional tweaks to docs for 'list of pages' (#2793)
Co-authored-by: emma <hi@emma.cafe>
2025-08-07 16:01:13 -04:00
DaleLore
ebfe36a03f
docs: Update new feature of upload seed URL list as file (#2792)
Closes #2653

Updated docs to reflect the uploading seed URL list as file

RE: #2646

---------

Co-authored-by: Emma Segal-Grossman <hi@emma.cafe>
2025-08-06 21:26:51 -04:00
Emma Segal-Grossman
72d1529993
Fix ui docs showing up in main docs site (#2787)
Quick fix for some 404 links on the docs site. UI documentation has been
moved to storybook in
https://github.com/webrecorder/browsertrix/pull/2597, but links to these
pages in the docs sidebar weren't removed in that PR.
2025-08-04 21:53:53 -07:00
Emma Segal-Grossman
8db0e44843
Feat: New email templating system & service (#2712)
Co-authored-by: Ilya Kreymer <ikreymer@gmail.com>
2025-08-01 17:00:24 -04:00
Tessa Walsh
3a05002491
Add saveStorage option to workflow (#2757)
Fixes #2753 

- Adds `saveStorage` to `RawCrawlConfig` model in backend
- Adds option to Browser Settings pane of workflow editor
- Adds option to config details component
- Adds setting to docs
2025-07-31 22:58:15 -07:00
sua yoo
74324cdab4
build: Discard console debug in frontend production build (#2775)
Discards `console.debug`, allows `console.info`, and documents both in
the UI development guide.
2025-07-28 23:03:24 -07:00
sua yoo
ed580c41e4
feat: Update storage stats with seed file and collection thumbnails (#2767)
Resolves https://github.com/webrecorder/browsertrix/issues/2733

## Changes

- Always displays storage size breakdown in dashboard panel
- Includes "Miscellaneous" storage size
- Fixes storage meter bar background color (tested with
https://www.color-blindness.com/coblis-color-blindness-simulator/)
2025-07-28 23:01:17 -07:00
Tessa Walsh
0c8c397fca
Add option to fail crawl if not logged in (#2754)
This PR adds a new checkbox to both page and seed crawl workflow types,
which will fail the crawl if behaviors detect the browser is not logged
in for supported sites.

Changes include:

- Backend support for the new crawler flag
- A new `failed_not_logged_in` crawl state
- Checkbox in workflow editor and config details in the frontend (currently
in the Scope section - I think it makes sense to have this option up
front, but worth considering)
- User Guide documentation of new option
- A new nightly test for the new workflow option and
`failed_not_logged_in` state


---------
Co-authored-by: Ilya Kreymer <ikreymer@gmail.com>
Co-authored-by: sua yoo <sua@webrecorder.org>
2025-07-28 22:58:43 -07:00
DaleLore
1824fa0757
Deleted QA Run Analysis while it's WIP (#2770)
2025-07-24 13:39:28 -04:00
DaleLore
4c5185d973
[Docs:] Add docs for quality assurance (#2769)
2025-07-24 13:10:49 -04:00
sua yoo
309977f7e5
docs: Link to MLA style title case (#2766)
## Changes

Update docs on writing documentation to include MLA style.
2025-07-23 21:09:57 -04:00
sua yoo
f8eeb0a7d3
docs: Update UI localization docs + fix broken links (#2756)
- Adds more guidelines around translatable strings
- Removes broken links to design files migrated to Storybook
2025-07-23 15:56:00 -07:00
Ilya Kreymer
2f9a61f6be
custom prefix additional fixes (#2746)
- follow-up to #2736: remove '^' from custom prefix URLs via a utility function, to avoid accumulating '^'
- Show URL prefix list in settings for custom prefix scope.
- Update user guide with correct custom prefix field.

---------

Co-authored-by: sua yoo <sua@webrecorder.org>
2025-07-18 18:21:32 -07:00
Ilya Kreymer
5d2b34f3b6
Custom Page Prefix Usability Fixes (#2736)
- Automatically compute prefix from starting URL, if no other prefix is
set in custom prefix mode.
- Ensure each prefix is actually a prefix: add '^' to each custom prefix
URL, as include URL path is a regex
- rename 'Extra URL Prefixes' to just 'URL Prefixes' and adjust help
text to indicate that the prefix list is the list that is in scope
- fixes #2735, follow up to #2722
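
The prefix handling described above can be sketched roughly as follows. This is a minimal illustration, not the code from this PR: the function names are hypothetical, and the rule for deriving a prefix from the starting URL (drop the last path segment) is an assumption. It does not escape other regex metacharacters in the URL.

```python
from urllib.parse import urlsplit, urlunsplit


def default_prefix(start_url: str) -> str:
    """Derive a prefix from the starting URL.

    Assumed rule (not confirmed by the PR): keep everything up to and
    including the last '/' of the path, and drop query and fragment.
    """
    parts = urlsplit(start_url)
    path = parts.path[: parts.path.rfind("/") + 1] or "/"
    return urlunsplit((parts.scheme, parts.netloc, path, "", ""))


def anchor_prefix(url: str) -> str:
    """Prepend '^' exactly once so the include regex only matches from the start."""
    return "^" + url.lstrip("^")


def strip_anchor(pattern: str) -> str:
    """Remove leading '^' characters, e.g. before showing the prefix in a form."""
    return pattern.lstrip("^")
```

Stripping before display and anchoring before saving keeps the round trip idempotent, so repeated edits do not pile up '^' characters.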

---------

Co-authored-by: Tessa Walsh <tessa@bitarchivist.net>
Co-authored-by: sua yoo <sua@webrecorder.org>
2025-07-15 13:19:20 -07:00
Pierre
80a225c677
docs(deployment): add note about potential firewall issues on RHEL (#2707)
Add a warning box to the troubleshooting
advice (https://docs.browsertrix.com/deploy/local/#debugging-pod-issues)
in the local deployment guide about firewall rules and disabling firewalld on RHEL.

See https://forum.webrecorder.net/t/browsertrix-deployment-stalls-when-initializing-container-migrations/916/5 for context.
2025-07-07 17:16:12 -07:00
Ilya Kreymer
001277ac9d
docs: add docs for path / virtual addressing (#2669)
Add docs about path / virtual 'access_addressing_style' that is
available for each storage option.
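
As a rough illustration of the two addressing styles, the sketch below builds an object URL either way. The `object_url` helper, the endpoint, and the bucket name are hypothetical and only show the general S3-style convention, not how Browsertrix constructs URLs internally.

```python
def object_url(endpoint: str, bucket: str, key: str, style: str) -> str:
    """Build an object URL using 'path' or 'virtual' addressing (illustrative only)."""
    scheme, host = endpoint.split("://", 1)
    host = host.rstrip("/")
    if style == "path":
        # Path style: the bucket is the first path segment.
        return f"{scheme}://{host}/{bucket}/{key}"
    if style == "virtual":
        # Virtual-hosted style: the bucket is a subdomain of the endpoint host.
        return f"{scheme}://{bucket}.{host}/{key}"
    raise ValueError(f"unknown addressing style: {style!r}")
```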

---------

Co-authored-by: Tessa Walsh <tessa@bitarchivist.net>
2025-06-12 13:08:27 -04:00
sua yoo
826d70b649
docs: Update frontend dev docs (#2666)
Resolves https://github.com/webrecorder/browsertrix/issues/2633

## Changes

- Replaces broken link in frontend README with hosted and local links
- Clarifies that frontend is not deployed separately in dev docs
2025-06-12 10:21:58 -04:00
DaleLore
1a6d2a20c2
Documentation Update for Pausing and Resuming Crawl section (#2639)
- Rename 'Modifying Running Crawls' to 'Running Crawls'
- Add section about pausing/resuming crawls, and that paused crawls will eventually become stopped if not resumed.
- Add new crawl pausing, paused, resuming statuses and icons.

Co-authored-by: Ilya Kreymer <ikreymer@users.noreply.github.com>
Co-authored-by: Tessa Walsh <tessa@bitarchivist.net>
2025-06-10 21:47:03 -07:00
Vinzenz Sinapius
0e0e663363
helm: add crawler_network_policy_additional_egress (#2641)
- Adds `crawler_network_policy_additional_egress` setting, to add egress
rules to the existing crawler network policy. Useful for when you want
to allow-list a single IP without replacing the whole network policy.

- Adds docs about `crawler_network_policy_additional_egress` to the customization page.

- Resolves #2121

---------

Co-authored-by: Ilya Kreymer <ikreymer@gmail.com>
2025-06-10 16:19:42 -07:00
Emma Segal-Grossman
86c4d326e9
Normalize & document icon usage, and move design documents into storybook (#2597)
- Updates status icons & colors in several places in the app
- Moves "Action Menus" and updated "Status Indicators" design docs from
public docs to Storybook
  - [Storybook] Adds `remark-gfm` to enable tables in MDX
  - [Storybook] Adds a custom `ColorSwatch` block
- [Browsertrix Docs] Swaps out custom colors and fonts included with
docs for color variables from Hickory and Webrecorder CDN's hosted font
files, respectively

---------

Co-authored-by: sua yoo <sua@suayoo.com>
2025-06-10 10:58:18 -07:00
sua yoo
9e581cbb7d
fix: Improve embedded user guide UX (#2630)
Resolves https://github.com/webrecorder/browsertrix/issues/2629

## Changes

- Fixes user guide not opening to the correct page when not using the
workflow editor
- Fixes out of date instructions in "starting a crawl" user guide
- Updates user guide so that the content makes more sense for both
logged in and non-logged in users, including moving the introduction
section so that the user guide navigation categories are all displayed
(see screenshot)

## Screenshots

| Page | Image/video |
| ---- | ----------- |
| Dashboard | <img width="517" alt="Screenshot 2025-05-27 at 5 09 07 PM"
src="https://github.com/user-attachments/assets/481ac817-d591-4ca9-a4be-532fad586fcf"
/> |



---------

Co-authored-by: Emma Segal-Grossman <hi@emma.cafe>
2025-06-03 13:38:51 -07:00
Pierre
8b54444b7e
docs: update remote deployment docs with working nginx-install example (#2625)
- Update the docs on k3s deployment for installing `ingress-nginx`, fixes
#2619.
- Also fix the indentation on the code blocks so markdown carries on list
numbering. At the moment the numbering confusingly resets after point 3.
- Update indentation on all code blocks so they show up as part of list +
wrap long commands.
---------

Co-authored-by: Ilya Kreymer <ikreymer@gmail.com>
2025-05-28 20:07:02 -07:00
sua yoo
858ae15ce6
feat: Handle paused state + workflow performance improvements (#2610)
- Handles `paused` workflow state.
- Adds "Copy Crawl ID" and "View Archived Item" buttons to workflow
detail
- Fixes file size not updating in workflow crawls list
- Fixes superadmin banner showing over workflow tabs
- Refactors workflow detail API calls to use `Task` to improve poll
performance.
- Fixes execution time rendering when less than a minute

---------

Co-authored-by: Ilya Kreymer <ikreymer@gmail.com>
2025-05-28 19:26:38 -07:00
sua yoo
ef93c5ad90
docs: Document latest crawl (#2613)
Follows https://github.com/webrecorder/browsertrix/issues/2603

## Changes

- Updates documentation on "Latest Crawl" tab
- Fixes extra fetch in workflow detail page
- Reverts workflow detail labels from "Duration" back to "Run Duration"
and "Pages" back to "Pages Crawled"
2025-05-20 12:19:09 -07:00
sua yoo
6b510fe89c
fix: Sync user guide to correct workflow section (#2592)
Resolves https://github.com/webrecorder/browsertrix/issues/2560

## Changes

- Syncs workflow current form section with user guide section.
- Stickies "User Guide" button to top of viewport so that user guide can
be opened.
- Makes content behind user guide clickable (fixes issues with stickied
elements shifting when user guide is not contained to the parent
element.)
- Decreases size of user guide text when embedded in an iframe.
- Refactors overflow scrim to reuse CSS variables.

---------
Co-authored-by: Emma Segal-Grossman <hi@emma.cafe>
2025-05-08 14:41:35 -07:00
Tessa Walsh
f34b42cb59
Add custom behavior docs to user guide (#2559)
2025-04-23 14:27:39 -04:00
sua yoo
78e2dadf0a
devex: Add Storybook for component development (#2556)
Adds Storybook in preparation for UI component refactoring.
2025-04-21 13:06:31 -07:00
sua yoo
58749602ff
Move custom behaviors behind checkbox (#2545)
WIP for https://github.com/webrecorder/browsertrix/issues/2541

## Changes

- Moves custom behaviors table to behind "Use Custom Behaviors"
checkbox.
- Updates autoclick selector to match checkbox reveal layout.
- Adds minimum viable user guide documentation of custom behaviors.
2025-04-09 00:16:02 +02:00
sua yoo
ba57b85322
feat: Display behavior logs (#2531)
- Displays behavior logs wherever error logs are shown
- Makes page URL in detail dialog clickable rather than in row column to
prevent accidental navigation
- Rename "Download Logs" -> "Download All Logs" and add tooltip with
additional context

---------

Co-authored-by: Tessa Walsh <tessa@bitarchivist.net>
Co-authored-by: Emma Segal-Grossman <hi@emma.cafe>
Co-authored-by: Ilya Kreymer <ikreymer@gmail.com>
2025-04-08 14:38:59 -07:00
Tessa Walsh
55bedcb0b7
feat: Custom autoclick selector (#2517)
Resolves #2504

## Changes

- Allows users to customize autoclick selector in workflows
- Refactors `btrix-syntax-input` to support rendering label and help
text `sl-input`
- Show autoclick selector in workflow / crawl settings
- Adds 'clickSelector' with default of 'a' to backend crawl config.

---------

Co-authored-by: sua yoo <sua@suayoo.com>
Co-authored-by: sua yoo <sua@webrecorder.org>
Co-authored-by: Emma Segal-Grossman <hi@emma.cafe>
2025-04-08 05:53:40 +02:00
sua yoo
f6481272f4
feat: Specify custom link selectors (#2487)
- Allows users to specify page link selectors in workflow "Scope"
section
- Adds new `<btrix-syntax-input>` component for syntax-highlighted
inputs
- Refactors highlight.js implementation to prevent unnecessary language
loading
- Updates exclusion table header styles

---------

Co-authored-by: Tessa Walsh <tessa@bitarchivist.net>
Co-authored-by: Henry Wilkinson <henry@wilkinson.graphics>
2025-04-02 00:32:34 -07:00
Ilya Kreymer
62e47a8817
support overriding crawler image pull policy per channel (#2523)
- add 'imagePullPolicy' field to each crawler channel declaration
- if unset, defaults to the setting in the existing
'crawler_image_pull_policy' field.
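
The fallback described above amounts to something like the following sketch. The `resolve_pull_policy` helper and the shape of the channel dictionaries are assumptions for illustration; only the `imagePullPolicy` and `crawler_image_pull_policy` names come from this commit.

```python
def resolve_pull_policy(channel: dict, crawler_image_pull_policy: str) -> str:
    """Use the channel's own 'imagePullPolicy' if set, else the global default."""
    return channel.get("imagePullPolicy") or crawler_image_pull_policy


# Hypothetical channel declarations, for illustration only.
channels = [
    {"id": "default", "image": "example/crawler:latest"},
    {"id": "test", "image": "example/crawler:dev", "imagePullPolicy": "Always"},
]
policies = {c["id"]: resolve_pull_policy(c, "IfNotPresent") for c in channels}
```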

fixes #2522

---------

Co-authored-by: Tessa Walsh <tessa@bitarchivist.net>
2025-03-31 14:11:41 -07:00
Henry Wilkinson
c797e8446d
docs: Add UI documentation page on status icons (#2506)
### Changes
- Adds status icons page
- Moves action menus page to the UI development docs folder
- Fixes sentence fragment
2025-03-20 16:51:20 -07:00
Henry Wilkinson
cf6690e74a
docs: add development section on action menus (#2429)
Closes #2428
2025-03-19 18:46:09 -04:00
sua yoo
0bc210d905
devex: Add frontend code snippet & update dev docs (#2494)
- Adds VSCode file template for component unit testing.
- Updates development docs with details on UI dev
2025-03-19 14:22:20 -07:00
sua yoo
bcb73932d4
docs: Organize readme and fix doc links (#2479)
Resolves https://github.com/webrecorder/browsertrix/issues/2478

## Changes

- Organizes README
- Fixes relative links in mkdocs

---------

Co-authored-by: Tessa Walsh <tessa@bitarchivist.net>
2025-03-11 18:37:20 -07:00
sua yoo
ac1236f15b
feat: Add behaviors section to workflow form (#2464)
- Moves "Per-Page Limits" fields to new "Page Behavior" section
- Fixes workflow settings closing tags with refactor to how sections are
rendered
- Updates user guide with behaviors documentation

---------

Co-authored-by: Henry Wilkinson <henry@wilkinson.graphics>
2025-03-11 11:40:20 -07:00
Ilya Kreymer
00a42515c8
docs: add public collections gallery howto (#2462)
- Updated the collections gallery how-to and the presentation and sharing pages
- Collections gallery page content extracted from blog post, linked from blog post
- Each page has one video covering the gallery setting and individual collection presentation
- Cleaned up text on both to avoid duplicated content (thanks @DaleLore)



---------

Co-authored-by: Henry Wilkinson <henry@wilkinson.graphics>
Co-authored-by: DaleLore <DaleLoreNY@gmail.com>
2025-03-08 15:57:13 -08:00
Tessa Walsh
f8fb2d2c8d
Rework crawl page migration + MongoDB Query Optimizations (#2412)
Fixes #2406 

Converts migration 0042 to launch a background job (parallelized across
several pods) to migrate all crawls by optimizing their pages and
setting `version: 2` on the crawl when complete.

Also optimizes MongoDB queries for better performance.

Migration Improvements:

- Add `isMigrating` and `version` fields to `BaseCrawl`
- Add new background job type to use in migration with accompanying
`migration_job.yaml` template that allows for parallelization
- Add new API endpoint to launch this crawl migration job, and ensure
that we have list and retry endpoints for superusers that work with
background jobs that aren't tied to a specific org
- Rework background job models and methods now that not all background
jobs are tied to a single org
- Ensure new crawls and uploads have `version` set to `2`
- Modify crawl and collection replay.json endpoints to only include
fields for replay optimization (`initialPages`, `pageQueryUrl`,
`preloadResources`) if all relevant crawls/uploads have `version` set to
`2`
- Remove `distinct` calls from migration pathways
- Consolidate collection recompute stats
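
The version gate on the replay.json fields could look roughly like this. It is a sketch under assumptions: `build_replay_json` and its arguments are hypothetical, and only the field names and the `version` check come from the description above.

```python
REPLAY_OPTIMIZATION_FIELDS = ("initialPages", "pageQueryUrl", "preloadResources")


def build_replay_json(base: dict, optimized: dict, items: list) -> dict:
    """Add the replay optimization fields only when every item is at version 2."""
    result = dict(base)
    if all(item.get("version") == 2 for item in items):
        for field in REPLAY_OPTIMIZATION_FIELDS:
            if field in optimized:
                result[field] = optimized[field]
    return result
```

A single unmigrated crawl or upload is enough to fall back to the unoptimized response, which keeps replay correct while the background migration is still running.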

Query Optimizations:
- Remove all uses of $group and $facet
- Optimize /replay.json endpoints to precompute preload_resources, avoid
fetching crawl list twice
- Optimize /collections endpoint by not fetching resources 
- Rename /urls -> /pageUrlCounts and avoid $group, instead sort with
index, either by seed + ts or by url to get top matches.
- Use $gte instead of $regex to get prefix matches on URL
- Use $text instead of $regex to get text search on title
- Remove total from /pages and /pageUrlCounts queries by not using
$facet
- frontend: only call /pageUrlCounts when dialog is opened.
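
To illustrate the `$gte` prefix match, the sketch below builds a range filter that an index on the field can serve, unlike an unanchored `$regex`. The `prefix_range_filter` helper is hypothetical, and the `$lt` upper bound is an addition assumed here to close the range; the commit itself only mentions `$gte`. Incrementing the last character is a simplification that ignores edge cases at the top of the Unicode range.

```python
def prefix_range_filter(field: str, prefix: str) -> dict:
    """Build a MongoDB-style filter matching values that start with `prefix`."""
    if not prefix:
        return {}  # empty prefix matches everything
    # Smallest string that sorts after every string starting with `prefix`.
    upper = prefix[:-1] + chr(ord(prefix[-1]) + 1)
    return {field: {"$gte": prefix, "$lt": upper}}


query = prefix_range_filter("url", "https://example.com/a")
```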


---------

Co-authored-by: Ilya Kreymer <ikreymer@gmail.com>
Co-authored-by: Emma Segal-Grossman <hi@emma.cafe>
Co-authored-by: Ilya Kreymer <ikreymer@users.noreply.github.com>
2025-02-20 15:26:11 -08:00
Henry Wilkinson
edf1edbbd1
docs: Add Documentation for Sharing Collections (#2368)
- Merges existing collection content into one page
- Updates ArchiveWeb.page link
- Adds redirect from /collections → /collection
- Moves content relevant to presentation & sharing out of the intro
- Adds new content about sharing collections!

---------

Co-authored-by: Emma Segal-Grossman <hi@emma.cafe>
Co-authored-by: sua yoo <sua@webrecorder.org>
2025-02-12 14:05:52 -05:00
Henry Wilkinson
3586412da1
docs: Adds section for autoclick behavior addition from 1.13.3 (#2385)
- Adds section for the autoclick behavior 
- Removes sections that were removed with the new workflow form... and
in some cases much earlier! 😅
2025-02-12 00:22:05 -05:00
Emma Segal-Grossman
f8a44258d8
Merge pull request #2332 from webrecorder/frontend-collection-editing-dialog
Collection editing and sharing revamp
2025-02-11 18:27:35 -05:00
sua yoo
3c860775b9
feat: Update references to org public profile -> gallery (#2330)
- Renames public URL prefix to `explore`
- Updates org settings sections
- Removes or renames references to "org profile"

---------

Co-authored-by: Ilya Kreymer <ikreymer@gmail.com>
Co-authored-by: Henry Wilkinson <henry@wilkinson.graphics>
2025-01-27 13:48:38 -08:00
Tessa Walsh
4583babecb
feat: Add slug to collections and use it in public collection URLs (#2301)
Resolves https://github.com/webrecorder/browsertrix/issues/2298

## Changes

- Slugs added to collections; a slug can be specified separately when
creating or updating a collection, or else is based on the supplied
collection name
- Migration added to backfill slugs for existing collections
- Redirect collection to newest slug if changed
- Adds option to copy public profile link to "Public Collections" action
menu
- Show "Back to <Org>" link instead of breadcrumbs
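
Deriving a slug from the collection name could be sketched as below. The exact slug rules used by the backend are not stated in this commit, so the normalization here (ASCII-folding, lowercasing, hyphen-separated) is an assumption, and both function names are hypothetical.

```python
import re
import unicodedata
from typing import Optional


def slug_from_name(name: str) -> str:
    """Lowercase, ASCII-fold, and hyphenate a name (assumed slug rules)."""
    folded = unicodedata.normalize("NFKD", name).encode("ascii", "ignore").decode("ascii")
    return re.sub(r"[^a-z0-9]+", "-", folded.lower()).strip("-")


def collection_slug(name: str, slug: Optional[str] = None) -> str:
    """Prefer an explicitly supplied slug, otherwise derive one from the name."""
    return slug_from_name(slug) if slug else slug_from_name(name)
```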

---------
Co-authored-by: sua yoo <sua@suayoo.com>
Co-authored-by: Ilya Kreymer <ikreymer@gmail.com>
2025-01-15 22:44:32 -08:00
Emma Segal-Grossman
04e9127d35
Remove ANALYTICS_NAMESPACE, as it's only usable at build time (#2293)
Replaces `ANALYTICS_NAMESPACE` with setting `window.btrixEvent` via
`inject_extra` config

---------

Co-authored-by: sua yoo <sua@suayoo.com>
2025-01-13 20:13:30 -08:00
sua yoo
b36ed9f730
feat: Track collection events (#2256)
- Renames `inject_analytics` to `inject_extra` and updates docs
- Manually tracks page views to enable passing custom props
- Tracks copying collection share link and downloading a public
collection

---------

Co-authored-by: emma <hi@emma.cafe>
2025-01-13 15:15:49 -08:00
sua yoo
6e48f957f9
feat: Public org profile page (#2172)
- Enables creating a public org profile page with description and
website at `/profile/<org slug>`
- Updates current "Overview" page to be "Dashboard", found under
`/dashboard`
- Organizes org "General" settings tab by "General", "Profile", and
"Developer Tools"
- Adds sign up banner to log in page for consistent CTA banners
- Updates copy and docs to support changes
- Allows user to set collection to private, public, or unlisted
- Adds route for public collection page with basic page layout
- Refactors copy button to abstract clipboard functionality
---------

Co-authored-by: Henry Wilkinson <henry@wilkinson.graphics>
Co-authored-by: emma <hi@emma.cafe>
2025-01-13 15:15:48 -08:00
sua yoo
3b6f63f030
deps: Upgrade to Node 22 (#2274)
- Upgrades build to use Node 22
- Adds version matrix to GH workflow to test compatibility with 20
2025-01-07 11:58:23 -08:00
Tessa Walsh
589819682e
Optionally delay replica deletion (#2252)
Fixes #2170

The number of days to delay file replication deletion by is configurable
in the Helm chart with `replica_deletion_delay_days` (set by default to
7 days in `values.yaml` to encourage good practice, though we could
change this).

When `replica_deletion_delay_days` is set to an int above 0, when a
delete replica job would otherwise be started as a Kubernetes Job,
a CronJob is created instead with a cron schedule set to run yearly,
starting x days from the current moment. This cronjob is then deleted by
the operator after the job successfully completes. If a failed
background job is retried, it is re-run immediately as a Job rather
than being scheduled out into the future again.
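
The scheduling trick described above can be sketched as follows: a cron expression pinned to a specific minute, hour, day, and month fires once a year, so its first run lands the configured number of days from now. Both helpers are hypothetical illustrations, not the operator's actual code.

```python
from datetime import datetime, timedelta, timezone


def delayed_cron_schedule(now: datetime, delay_days: int) -> str:
    """Cron schedule (minute hour day month weekday) first firing delay_days from now."""
    run_at = now + timedelta(days=delay_days)
    return f"{run_at.minute} {run_at.hour} {run_at.day} {run_at.month} *"


def job_kind(delay_days: int, is_retry: bool) -> str:
    """Retries and zero delays run immediately as a Job; otherwise use a CronJob."""
    return "Job" if is_retry or delay_days <= 0 else "CronJob"


schedule = delayed_cron_schedule(datetime(2024, 12, 19, 18, 50, tzinfo=timezone.utc), 7)
```

Because the schedule would fire again a year later, the CronJob has to be removed once its job succeeds, which matches the cleanup step described in the commit message.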

---------
Co-authored-by: Ilya Kreymer <ikreymer@users.noreply.github.com>
2024-12-19 18:50:28 -08:00