Commit Graph

1666 Commits

Author SHA1 Message Date
Emma Segal-Grossman
30e1224e8b
Add hardcoded exceptions to the prevention of Enter keypresses in workflow form (#2674)
Fixes https://github.com/webrecorder/browsertrix/issues/2675
2025-06-18 13:05:02 -04:00
Ilya Kreymer
dde23426b2 version: bump to 1.17.0! 2025-06-12 17:37:07 -04:00
sua yoo
9a65102274
Make trial banner informational at start of trial (#2667)
## Changes

Following
bbd5fb81c4,
since the banner is shown throughout the duration of the trial, it should be
made informational at the beginning of the trial so that it's not as obtrusive.
2025-06-12 16:07:20 -04:00
Tessa Walsh
67bf949802
Set fields in AIOConfig to prevent MissingContentLength error on upload (#2665)
Needed to support upload with certain S3 providers.
Fixes #2664
2025-06-12 15:27:38 -04:00
Ilya Kreymer
d4a2a66d6d
additional scale / browser window cleanup to properly support QA: (#2663)
- follow up to #2627 
- use qa_num_browser_windows to set exact number of QA browsers,
fallback to qa_scale
- set num_browser_windows and num_browsers_per_pod using crawler / qa
values depending on whether this is a QA crawl
- scale_from_browser_windows() accepts optional browsers_per_pod if
dealing with possible QA override
- store 'desiredScale' in CrawlStatus to avoid recomputing for later
scale resolving
- ensure status.scale is always the actual scale observed
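
The scale computation described above can be sketched as follows. This is a minimal illustration, assuming "scale" means the number of crawler pods needed to host the requested browser windows. `scale_for_windows` and `DEFAULT_BROWSERS_PER_POD` are hypothetical names chosen for this sketch, not the project's actual `scale_from_browser_windows()` implementation.

```python
import math
from typing import Optional

# Hypothetical default; a real deployment would read this from configuration.
DEFAULT_BROWSERS_PER_POD = 2


def scale_for_windows(browser_windows: int, browsers_per_pod: Optional[int] = None) -> int:
    """Return how many pods are needed to host `browser_windows` browsers.

    `browsers_per_pod` is optional so that a QA crawl could pass its own
    per-pod value instead of the crawler default (assumed behavior).
    """
    per_pod = DEFAULT_BROWSERS_PER_POD if browsers_per_pod is None else browsers_per_pod
    if browser_windows < 1 or per_pod < 1:
        raise ValueError("browser_windows and browsers_per_pod must be >= 1")
    # Ceiling division: a partially filled pod still counts as one pod.
    return math.ceil(browser_windows / per_pod)
```

Under these assumptions, 5 windows at 4 browsers per pod would need a scale of 2, while 4 windows would fit in a scale of 1.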
2025-06-12 13:09:04 -04:00
Ilya Kreymer
001277ac9d
docs: add docs for path / virtual addressing (#2669)
Add docs about path / virtual 'access_addressing_style' that is
available for each storage option.

---------

Co-authored-by: Tessa Walsh <tessa@bitarchivist.net>
2025-06-12 13:08:27 -04:00
Ilya Kreymer
8516d70486
Weblate -> Main Merge (#2670)
Merge changes from Weblate -> Main from #2470 and #2647

Co-authored-by: Weblate (bot) <hosted@weblate.org>
Co-authored-by: Anne Paz <anelisespaz@gmail.com>
Co-authored-by: weblate <1607653+weblate@users.noreply.github.com>
Co-authored-by: Bricaud Frédéric <frederic.bricaud@banq.qc.ca>
Co-authored-by: Carole Gagné <carole.gagne@banq.qc.ca>
2025-06-12 12:28:50 -04:00
sua yoo
826d70b649
docs: Update frontend dev docs (#2666)
Resolves https://github.com/webrecorder/browsertrix/issues/2633

## Changes

- Replaces broken link in frontend README with hosted and local links
- Clarifies that frontend is not deployed separately in dev docs
2025-06-12 10:21:58 -04:00
DaleLore
1a6d2a20c2
Documentation Update for Pausing and Resuming Crawl section (#2639)
- Rename 'Modifying Running Crawls' to 'Running Crawls'
- Add section about pausing/resuming crawls, and that paused crawls will eventually become stopped if not resumed.
- Add new crawl pausing, paused, resuming statuses and icons.

Co-authored-by: Ilya Kreymer <ikreymer@users.noreply.github.com>
Co-authored-by: Tessa Walsh <tessa@bitarchivist.net>
2025-06-10 21:47:03 -07:00
Ilya Kreymer
3fa0c68922
crawl status related fixes: (#2662)
- only set state to 'paused' if shouldPause is true and crawl is still
running (using FAILED_STATES list)
- treat failed/canceled crawl as inactive, don't show replay (using
RUNNING_STATES list)

---------

Co-authored-by: sua yoo <sua@webrecorder.org>
2025-06-10 21:45:07 -07:00
Vinzenz Sinapius
0e0e663363
helm: add crawler_network_policy_additional_egress (#2641)
- Adds `crawler_network_policy_additional_egress` setting, to add egress
rules to the existing crawler network policy. Useful for when you want
to allow-list a single IP without replacing the whole network policy.

- Adds docs about `crawler_network_policy_additional_egress` to the customization page.

- Resolves #2121

---------

Co-authored-by: Ilya Kreymer <ikreymer@gmail.com>
2025-06-10 16:19:42 -07:00
sua yoo
40ebbd11d3
fix: Handle trial ending without cancelation (#2651)
Resolves https://github.com/webrecorder/browsertrix/issues/2650

## Changes

Differentiates between `trialing` and `trialing_canceled` when displaying
messages:
- No changes to messages if `trialing_canceled`.
- If `trialing`, show messaging that subscription will automatically
continue.

---------

Co-authored-by: Ilya Kreymer <ikreymer@gmail.com>
Co-authored-by: Ilya Kreymer <ikreymer@users.noreply.github.com>
2025-06-10 15:20:57 -07:00
Ilya Kreymer
223221c31e
Add securityContext for Redis pod (#2640)
It seems the latest redis image changed security settings so
root-mounted volumes no longer work.
This change:
- mount redis volumes as redis user/group 999
- needed to run with redis >=8.0.2
2025-06-10 15:20:18 -07:00
sua yoo
1fdd1bf2e4
fix: Display correct page after renaming org slug (#2659)
Resolves https://github.com/webrecorder/browsertrix/issues/2658

## Changes

Removes unnecessary `await` which was causing the 404 page introduced in
7c32e27f94 to show instead.

## Manual testing

See repro steps in
https://github.com/webrecorder/browsertrix/issues/2658
2025-06-10 13:27:55 -07:00
Ilya Kreymer
8ea16393c5
Optimize single-page crawl workflows (#2656)
For single page crawls:
- Always force 1 browser to be used, ignoring browser windows/scale
setting
- Don't use custom PVC volumes in crawler / redis, just use emptyDir -
no chance of crawler being interrupted and restarted on different
machine for a single page.

Adds an 'is_single_page' check to CrawlConfig, checking for either limit
or scopeType / no extra hops.
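
One possible reading of that check, as a hedged sketch: a workflow is treated as single-page if its page limit is 1, or if its scope type is `page` and it follows no extra hops. The `CrawlSettings` class and its field names are hypothetical stand-ins for illustration, and the exact conditions are an assumption rather than the project's confirmed logic.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class CrawlSettings:
    """Hypothetical subset of workflow settings, for illustration only."""

    limit: Optional[int] = None  # maximum number of pages, if set
    scope_type: str = "prefix"   # e.g. "page", "prefix"
    extra_hops: int = 0          # additional link hops beyond the scope


def is_single_page(settings: CrawlSettings) -> bool:
    """Return True if the crawl can only ever visit one page (assumed rule)."""
    if settings.limit == 1:
        return True
    return settings.scope_type == "page" and settings.extra_hops == 0
```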

Fixes #2655
2025-06-10 12:13:57 -07:00
Emma Segal-Grossman
86c4d326e9
Normalize & document icon usage, and move design documents into storybook (#2597)
- Updates status icons & colors in several places in the app
- Moves "Action Menus" and updated "Status Indicators" design docs from
public docs to Storybook
  - [Storybook] Adds `remark-gfm` to enable tables in MDX
  - [Storybook] Adds a custom `ColorSwatch` block
- [Browsertrix Docs] Swaps out custom colors and fonts included with
docs for color variables from Hickory and Webrecorder CDN's hosted font
files, respectively

---------

Co-authored-by: sua yoo <sua@suayoo.com>
2025-06-10 10:58:18 -07:00
Emma Segal-Grossman
54d29aec05
Quick fix: use custom getFns for user-related keys in superadmin (#2649) 2025-06-05 13:13:45 -04:00
sua yoo
580fc6dbb9
devex: Replace inverted tooltip style with popover component (#2644)
Replaces all instances of `sl-tooltip.invert-tooltip` with
`<btrix-popover>`
2025-06-04 10:43:28 -07:00
Emma Segal-Grossman
7f44f43647
Fix issues with superadmin org filtering logic (#2638)
Fixes #2636

## Changes
- Displays trials scheduled for cancellation alongside non-trials
scheduled for cancellation
- Adds filter for "bad states" — active orgs that have a cancelled
subscription, orgs with a cancellation date in the past, and empty
subscription ids currently, but could be extended as necessary
- Displays scheduled-for-cancellation trials in the "trialing" filter as
well
- Improves display of future cancellation durations for both active
subscriptions and trials
- Surfaces issues where a trial cancellation was scheduled for the past
but the org is still active
- Swaps out `sl-tooltip`s for `btrix-popover`s in popovers with longer
details
- Adds correct heading levels, `tabindex`, and orientation for popovers
in use here

## Follow-ups
Once #2637 is merged we can ~~swap out the `sl-tooltip`s for
`btrix-popover`s here~~ _done!_ & in the superadmin stats card
2025-06-04 03:28:49 -04:00
sua yoo
199e28ce7c
gh: Update issue contact links (#2645)
Links directly to help forum.
2025-06-03 18:54:20 -07:00
sua yoo
9e581cbb7d
fix: Improve embedded user guide UX (#2630)
Resolves https://github.com/webrecorder/browsertrix/issues/2629

## Changes

- Fixes user guide not opening to the correct page when not using the
workflow editor
- Fixes out of date instructions in "starting a crawl" user guide
- Updates user guide so that the content makes more sense for both
logged in and non-logged in users, including moving the introduction
section so that the user guide navigation categories are all displayed
(see screenshot)

## Screenshots

| Page | Image/video |
| ---- | ----------- |
| Dashboard | <img width="517" alt="Screenshot 2025-05-27 at 5 09 07 PM" src="https://github.com/user-attachments/assets/481ac817-d591-4ca9-a4be-532fad586fcf" /> |



---------

Co-authored-by: Emma Segal-Grossman <hi@emma.cafe>
2025-06-03 13:38:51 -07:00
Tessa Walsh
dc41468daf
Allow users to run crawls with 1 or 2 browser windows (#2627)
Fixes #2425 

## Changed

- Switch backend to primarily using number of browser windows rather
than scale multiplier (including migration to calculate `browserWindows`
from `scale` for existing workflows and crawls)
- Still support `scale` in addition to `browserWindows` in input models
for creating and updating workflows and re-adjusting live crawl scale
for backwards compatibility
- Adds new `max_browser_windows` value to Helm chart, but calculates the
value from `max_crawl_scale` as fallback for users with that value
already set in local charts
- Rework frontend to allow users to select multiples of
`crawler_browser_instances` or any value below
`crawler_browser_instances` for browser windows. For instance, with
`crawler_browser_instances=4` and `max_browser_windows=8`, the user
would be presented with the following options: 1, 2, 3, 4, 8
- Sets maximum width of screencast to image width returned by `message`
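
The browser window options described in the list above can be sketched as a small function. This is an illustration only: `browser_window_options` is a hypothetical helper name, and the behavior when `max_browser_windows` is smaller than `crawler_browser_instances` is an assumption.

```python
from typing import List


def browser_window_options(browsers_per_pod: int, max_browser_windows: int) -> List[int]:
    """List selectable browser window counts.

    Every value below `browsers_per_pod` is allowed, followed by each
    multiple of `browsers_per_pod` up to `max_browser_windows`.
    """
    if browsers_per_pod < 1 or max_browser_windows < 1:
        raise ValueError("both arguments must be >= 1")
    # Values strictly below one full pod, capped at the maximum (assumed cap).
    below = list(range(1, min(browsers_per_pod - 1, max_browser_windows) + 1))
    # Whole-pod multiples: browsers_per_pod, 2 * browsers_per_pod, ...
    multiples = list(range(browsers_per_pod, max_browser_windows + 1, browsers_per_pod))
    return below + multiples


print(browser_window_options(4, 8))  # prints [1, 2, 3, 4, 8]
```

This reproduces the example from the list above: with `crawler_browser_instances=4` and `max_browser_windows=8`, the options are 1, 2, 3, 4, 8.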

---------

Co-authored-by: Ilya Kreymer <ikreymer@gmail.com>
Co-authored-by: sua yoo <sua@suayoo.com>
Co-authored-by: Ilya Kreymer <ikreymer@users.noreply.github.com>
2025-06-03 13:37:30 -07:00
Ilya Kreymer
f5c120b529
Don't clobber existing helm chart in release! (#2643)
Switch to different github release action:
- avoids clobbering existing release if already published, updates
existing draft only with latest Helm chart
- also sets name to `Browsertrix <version>`, fills in changelist.
- fixes #2642 

Tested:
- New draft release created (since branch ends in `-release`)
- Running multiple times to ensure chart is updated in draft
- Switching to older release to ensure chart is *NOT* clobbered
2025-06-03 09:28:34 -07:00
Ilya Kreymer
0e06ccd746 version: bump to 1.17.0-beta.0 2025-06-02 14:46:32 -07:00
Emma Segal-Grossman
4ed1a37f9d
Popover styling fixes (#2637) 2025-06-02 13:51:24 -04:00
Pierre
8b54444b7e
docs: update remote deployment docs with working nginx-install example (#2625)
- Update the docs on k3s deployment for installing `ingress-nginx`, fixes
#2619.
- Also fix the indentation on the code blocks so markdown carries on list
numbering. At the moment the numbering confusingly resets after point 3.
- Update indentation on all code blocks so they show up as part of list +
wrap long commands.
---------

Co-authored-by: Ilya Kreymer <ikreymer@gmail.com>
2025-05-28 20:07:02 -07:00
sua yoo
2aad7b8dc0
feat: Make saving simple workflow more efficient (#2626)
- Sticks workflow form save/run buttons to the viewport if all the
required fields are filled
- Adds keyboard shortcuts to save (cmd/ctrl + S to save, cmd/ctrl +
Enter to save and run)
- Adds "Cancel" button to new workflow
2025-05-28 20:04:07 -07:00
sua yoo
858ae15ce6
feat: Handle paused state + workflow performance improvements (#2610)
- Handles `paused` workflow state.
- Adds "Copy Crawl ID" and "View Archived Item" buttons to workflow
detail
- Fixes file size not updating in workflow crawls list
- Fixes superadmin banner showing over workflow tabs
- Refactors workflow detail API calls to use `Task` to improve poll
performance.
- Fixes execution time rendering when less than a minute

---------

Co-authored-by: Ilya Kreymer <ikreymer@gmail.com>
2025-05-28 19:26:38 -07:00
sua yoo
9f17264aa9
devex: Create btrix-popover component (#2632)
Adds and documents new `btrix-popover` component.
2025-05-28 18:29:30 -07:00
sua yoo
7e3e8a594f
gh: Update issue templates (#2621)
- Adds issue type to each template
- Differentiates user-submitted "Change Request" from internal "Planned
Feature". This allows us to separate user-submitted ideas from work
we've planned through the new feature workflow, and automatically set
the github project.
- Adds template for docs change
- Makes additional context section optional, I noticed many issues put
"n/a" or similar in this section anyway.
- Disables blank issues and adds a generic "Task" issue template
2025-05-27 18:11:55 -07:00
sua yoo
7c32e27f94
fix: Show 404 page for nonexistent org (#2620)
Renders 404 page if org in URL doesn't exist.
2025-05-27 18:10:49 -07:00
Ilya Kreymer
5b0f851857
Fix securityContext for pod (#2623)
Some of the `securityContext` settings need to be on the container, not
on the pod, including the read-only file system, which was not previously enabled.
This now enables the read-only file system.
Also map the crawler /tmp directory to use the same volume as crawls (as
crawler currently uses /tmp dir) as /tmp becomes read-only otherwise.
2025-05-27 10:59:50 -07:00
sua yoo
7674672027
feat: Update superadmin active crawls view (#2618)
- Renames "Running Crawls" -> "Active Crawls" in superadmin app bar
- Shows number of active crawls next to link
- Refreshes active crawl list every 30 seconds
- Standardizes browser title
2025-05-26 12:22:38 -07:00
Ilya Kreymer
cb50c7c2c2
Pause / Resume Crawls Initial Implementation. (#2572)
- add 'pause' crawl state (fixes #2567)
- gracefully shut down crawler pods, and then redis pod when paused
- crawler uploads WACZ before shutting down (dependent on
webrecorder/browsertrix-crawler#824, supported in 1.6.1+)
- add 'paused_at' on crawl spec to indicate when crawl is paused
- support max pause time limit, after which crawl becomes automatically
stopped.
- add 'stopped_pause_expired' when pause automatically expires and crawl
is stopped
- /crawl/<id>/{pause,resume} apis to toggle 'paused' on crawl spec
- ui: add pause/resume button, paused state (partially addresses #2568)
- ui: add pausing/resuming derivative states when crawl is running and
pausing, or paused and not pausing (partially addresses #2569)
- Designed to work with crawler 1.6.1+, which supports pausing + uploading on pause
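
The pause time limit described above can be sketched as a small state decision. This is a minimal illustration, assuming the limit is compared against the `paused_at` timestamp. The function name `resolve_paused_state` and its parameters are hypothetical; only the state names `paused` and `stopped_pause_expired` come from the list above.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional


def resolve_paused_state(
    paused_at: Optional[datetime], max_pause: timedelta, now: datetime
) -> str:
    """Decide the crawl state from the pause timestamp (assumed logic)."""
    if paused_at is None:
        # No pause requested, so the crawl keeps running.
        return "running"
    if now - paused_at >= max_pause:
        # The pause outlived the limit: the crawl is stopped automatically.
        return "stopped_pause_expired"
    return "paused"


paused = datetime(2025, 5, 21, 12, 0, tzinfo=timezone.utc)
print(resolve_paused_state(paused, timedelta(hours=1), paused + timedelta(hours=2)))
# prints stopped_pause_expired
```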

Work on #2566, Fixes #2576 

---------
Co-authored-by: sua yoo <sua@webrecorder.org>
Co-authored-by: Tessa Walsh <tessa@bitarchivist.net>
Co-authored-by: sua yoo <sua@suayoo.com>
2025-05-21 14:05:16 -07:00
Ilya Kreymer
e995811dd4 version: bump to 1.16.2 2025-05-20 18:43:22 -07:00
Ilya Kreymer
8a713155ef
remove deleted collections from crawlconfigs (#2615)
simplified version of #2608, add a remove_collection_from_all_configs() in CrawlConfigs, also check org.
update tests to ensure removal
2025-05-20 18:38:40 -07:00
Ilya Kreymer
86e35e358d
Add Org Check for Collection access (#2616)
Ensure collection access checks org membership
2025-05-20 15:30:22 -07:00
Ilya Kreymer
e29db33629
tests: fix nightly test config after #2611 (#2614)
remove namespace from minio config to match settings
2025-05-20 12:25:15 -07:00
sua yoo
ef93c5ad90
docs: Document latest crawl (#2613)
Follows https://github.com/webrecorder/browsertrix/issues/2603

## Changes

- Updates documentation on "Latest Crawl" tab
- Fixes extra fetch in workflow detail page
- Reverts workflow detail labels from "Duration" back to "Run Duration"
and "Pages" back to "Pages Crawled"
2025-05-20 12:19:09 -07:00
Ilya Kreymer
c134b576ae
Optimize presigning for replay.json (#2516)
Fixes #2515.

This PR introduces a significantly optimized logic for presigning URLs
for crawls and collections.
- For collections, the files needed from all crawls are looked up, and
then the 'presign_urls' table is merged in one pass, resulting in a
unified iterator containing files and presign urls for those files.
- For crawls, the presign URLs are also looked up once, and the same
iterator is used for a single crawl with passed in list of CrawlFiles
- URLs that are already signed are added to the return list.
- For any remaining URLs to be signed, a bulk presigning function is
added, which shares an HTTP connection and signs 8 files in parallel
(customizable via helm chart, though may not be needed). This function
is used to call the presigning API in parallel.
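
The bulk presigning approach described above might look roughly like the sketch below. It is an illustration under stated assumptions, not the project's code: `presign_all` and the `presign_one` callback are hypothetical, and a semaphore stands in for whatever mechanism actually bounds the parallelism. URLs that are already signed are reused, and the remainder are signed at most 8 at a time.

```python
import asyncio
from typing import Awaitable, Callable, Dict, Iterable


async def presign_all(
    filenames: Iterable[str],
    presign_one: Callable[[str], Awaitable[str]],
    already_signed: Dict[str, str],
    parallel: int = 8,
) -> Dict[str, str]:
    """Return a filename -> presigned URL map, signing only what is missing."""
    names = list(filenames)
    # Reuse URLs that were signed earlier.
    results = {n: already_signed[n] for n in names if n in already_signed}
    semaphore = asyncio.Semaphore(parallel)

    async def sign(name: str) -> None:
        # At most `parallel` signing calls run at the same time.
        async with semaphore:
            results[name] = await presign_one(name)

    pending = [n for n in names if n not in already_signed]
    await asyncio.gather(*(sign(n) for n in pending))
    return results
```

In this sketch a real `presign_one` would wrap a storage client call; sharing one client (and so one HTTP connection) across all calls is left to the caller.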
2025-05-20 12:09:35 -07:00
Ilya Kreymer
f1fd11c031
storage: use s3v4 signature for presigning urls (#2611)
Use V4 ('s3v4') signature version for all presigning URLs to support
backblaze, fixes #2472
- add 'access_addressing_style' to be able to choose virtual/path
addressing for access endpoint (default to 'virtual' as before)
- fix minio presigning with v4 by using 'path' addressing style for
minio
- if path matches '/data/' for internal minio bucket, then always use
'path'
- also make minio access path '/data/' configurable

also simplify running in any namespace with default settings:
- don't hardcode 'local-minio.default'
- in crawlers namespace, add a 'local-minio' externalName service which
maps to the main namespace service.
2025-05-19 15:44:36 -07:00
sua yoo
4b1e416eb6
feat: Workflow "latest crawl" tab (#2605)
- Combines "Watch" and "Logs" into single "Latest Crawl" tab
- Updates workflow routes and adds redirects
- Enables replaying and downloading latest crawl from the workflow
detail view
- Tweaks crawl list table header labels and archived item download
button labels for consistency
- Fixes crawl queue showing error when stopping crawl
2025-05-14 10:23:36 -07:00
sua yoo
7c9627f4bb
chore: Clean up data grid component (#2604)
- Moves data grid styles to separate stylesheet.
- Adds `rowsSelectable` option, renames `rows-` properties to match.
- Adds WIP `rowsExpandable` option.
- Fixes showing tooltip on focus.
- Cleans up rows controller typing.

---------

Co-authored-by: Emma Segal-Grossman <hi@emma.cafe>
2025-05-14 09:44:07 -07:00
Tessa Walsh
c73512dbd4
Bump version to 1.16.1 (#2606) 2025-05-13 17:29:49 -04:00
Tessa Walsh
1492397656
Add ISO-639-1 language code validation to backend (#2602)
- Add backend validation for language codes
- Add migration to look for invalid ISO-639-1 language codes in
workflows, crawls, and org crawling defaults, and fix any found
2025-05-13 16:54:33 -04:00
Emma Segal-Grossman
e17772145e
Add minimized superadmin banner (#2598) 2025-05-13 16:32:35 -04:00
Tessa Walsh
6f81d588a9
Ensure crawl page counts are correct when re-adding pages (#2601)
Fixes #2600 

This PR fixes the issue by ensuring that crawl page counts (total,
unique, files, errors) are reset to 0 when crawl pages are deleted, such
as right before being re-added.

It also adds a migration that recalculates file and error page counts
for each crawl without re-adding pages from the WACZ files.
2025-05-13 14:05:41 -04:00
sua yoo
594f5bc171
devex: Data grid component (#2561)
- Adds new `<btrix-data-grid>` component
- Refactors `<btrix-usage-history-table>` to data grid
- Refactors `<btrix-syntax-input>` and
`<btrix-link-selector-table>` to be form-associated controls.

---------

Co-authored-by: Emma Segal-Grossman <hi@emma.cafe>
2025-05-12 10:36:14 -07:00
sua yoo
6b510fe89c
fix: Sync user guide to correct workflow section (#2592)
Resolves https://github.com/webrecorder/browsertrix/issues/2560

## Changes

- Syncs workflow current form section with user guide section.
- Stickies "User Guide" button to top of viewport so that user guide can
be opened.
- Makes content behind user guide clickable (fixes issues with stickied
elements shifting when user guide is not contained to the parent
element.)
- Decreases size of user guide text when embedded in an iframe.
- Refactors overflow scrim to reuse CSS variables.

---------
Co-authored-by: Emma Segal-Grossman <hi@emma.cafe>
2025-05-08 14:41:35 -07:00
Ilya Kreymer
652e8a6085 version: bump to 1.16.0 2025-05-08 14:30:00 -07:00