1654 Commits

Author SHA1 Message Date
Ilya Kreymer
223221c31e
Add securityContext for Redis pod (#2640)
It seems the latest redis image changed security settings so
root-mounted volumes no longer work.
This change:
- mount redis volumes as redis user/group 999
- needed to run with redis >=8.0.2
2025-06-10 15:20:18 -07:00
sua yoo
1fdd1bf2e4
fix: Display correct page after renaming org slug (#2659)
Resolves https://github.com/webrecorder/browsertrix/issues/2658

## Changes

Removes unnecessary `await` which was causing the 404 page introduced in
7c32e27f94 to show instead.

## Manual testing

See repro steps in
https://github.com/webrecorder/browsertrix/issues/2658
2025-06-10 13:27:55 -07:00
Ilya Kreymer
8ea16393c5
Optimize single-page crawl workflows (#2656)
For single page crawls:
- Always force 1 browser to be used, ignoring browser windows/scale
setting
- Don't use custom PVC volumes in crawler / redis, just use emptyDir -
no chance of crawler being interrupted and restarted on different
machine for a single page.

Adds an 'is_single_page' check to CrawlConfig, checking for either limit
or scopeType / no extra hops.
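
A minimal sketch of what such a check might look like. The `RawConfig` class, its field names (`limit`, `scopeType`, `extraHops`), and the `seed_count` condition are assumptions for illustration, not the actual `CrawlConfig` code:

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class RawConfig:
    """Hypothetical subset of a crawl config; all field names are assumptions."""

    limit: Optional[int] = None
    scopeType: Optional[str] = None
    extraHops: int = 0
    seed_count: int = 1  # assumed: number of seed URLs in the workflow


def is_single_page(config: RawConfig) -> bool:
    # A page limit of 1 means at most one page is ever crawled.
    if config.limit == 1:
        return True
    # Otherwise require one seed, page-only scope, and no extra hops.
    return (
        config.seed_count == 1
        and config.scopeType == "page"
        and config.extraHops == 0
    )
```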

Fixes #2655
2025-06-10 12:13:57 -07:00
Emma Segal-Grossman
86c4d326e9
Normalize & document icon usage, and move design documents into storybook (#2597)
- Updates status icons & colors in several places in the app
- Moves "Action Menus" and updated "Status Indicators" design docs from
public docs to Storybook
  - [Storybook] Adds `remark-gfm` to enable tables in MDX
  - [Storybook] Adds a custom `ColorSwatch` block
- [Browsertrix Docs] Swaps out custom colors and fonts included with
docs for color variables from Hickory and Webrecorder CDN's hosted font
files, respectively

---------

Co-authored-by: sua yoo <sua@suayoo.com>
2025-06-10 10:58:18 -07:00
Emma Segal-Grossman
54d29aec05
Quick fix: use custom getFns for user-related keys in superadmin (#2649) 2025-06-05 13:13:45 -04:00
sua yoo
580fc6dbb9
devex: Replace inverted tooltip style with popover component (#2644)
Replaces all instances of `sl-tooltip.invert-tooltip` with
`<btrix-popover>`
2025-06-04 10:43:28 -07:00
Emma Segal-Grossman
7f44f43647
Fix issues with superadmin org filtering logic (#2638)
Fixes #2636

## Changes
- Displays trials scheduled for cancellation alongside non-trials
scheduled for cancellation
- Adds filter for "bad states" — active orgs that have a cancelled
subscription, orgs with a cancellation date in the past, and empty
subscription ids currently, but could be extended as necessary
- Displays scheduled-for-cancellation trials in the "trialing" filter as
well
- Improves display of future cancellation durations for both active
subscriptions and trials
- Surfaces issues where a trial cancellation was scheduled for the past
but the org is still active
- Swaps out `sl-tooltip`s for `btrix-popover`s in popovers with longer
details
- Adds correct heading levels, `tabindex`, and orientation for popovers
in use here

## Follow-ups
Once #2637 is merged we can ~~swap out the `sl-tooltip`s for
`btrix-popover`s here~~ _done!_ & in the superadmin stats card
2025-06-04 03:28:49 -04:00
sua yoo
199e28ce7c
gh: Update issue contact links (#2645)
Links directly to help forum.
2025-06-03 18:54:20 -07:00
sua yoo
9e581cbb7d
fix: Improve embedded user guide UX (#2630)
Resolves https://github.com/webrecorder/browsertrix/issues/2629

## Changes

- Fixes user guide not opening to the correct page when not using the
workflow editor
- Fixes out of date instructions in "starting a crawl" user guide
- Updates user guide so that the content makes more sense for both
logged in and non-logged in users, including moving the introduction
section so that the user guide navigation categories are all displayed
(see screenshot)

## Screenshots

| Page | Image/video |
| ---- | ----------- |
| Dashboard | <img width="517" alt="Screenshot 2025-05-27 at 5 09 07 PM"
src="https://github.com/user-attachments/assets/481ac817-d591-4ca9-a4be-532fad586fcf"
/> |


<!-- ## Follow-ups -->

---------

Co-authored-by: Emma Segal-Grossman <hi@emma.cafe>
2025-06-03 13:38:51 -07:00
Tessa Walsh
dc41468daf
Allow users to run crawls with 1 or 2 browser windows (#2627)
Fixes #2425 

## Changed

- Switch backend to primarily using number of browser windows rather
than scale multiplier (including migration to calculate `browserWindows`
from `scale` for existing workflows and crawls)
- Still support `scale` in addition to `browserWindows` in input models
for creating and updating workflows and re-adjusting live crawl scale
for backwards compatibility
- Adds new `max_browser_windows` value to Helm chart, but calculates the
value from `max_crawl_scale` as fallback for users with that value
already set in local charts
- Rework frontend to allow users to select multiples of
`crawler_browser_instances` or any value below
`crawler_browser_instances` for browser windows. For instance, with
`crawler_browser_instances=4` and `max_browser_windows=8`, the user
would be presented with the following options: 1, 2, 3, 4, 8
- Sets maximum width of screencast to image width returned by `message`
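
The browser window option rule described in the list above can be sketched as follows. The function name is hypothetical and this is only an illustration of the rule as stated, not the frontend implementation:

```python
def browser_window_options(
    crawler_browser_instances: int, max_browser_windows: int
) -> list[int]:
    """List selectable browser window counts (hypothetical helper)."""
    if crawler_browser_instances < 1 or max_browser_windows < 1:
        raise ValueError("both values must be at least 1")
    # Any value below crawler_browser_instances, capped at the maximum.
    below = range(1, min(crawler_browser_instances, max_browser_windows + 1))
    # Multiples of crawler_browser_instances up to the maximum.
    multiples = range(
        crawler_browser_instances,
        max_browser_windows + 1,
        crawler_browser_instances,
    )
    return [*below, *multiples]


print(browser_window_options(4, 8))  # [1, 2, 3, 4, 8]
```

With `crawler_browser_instances=4` and `max_browser_windows=8` this reproduces the options given in the example above.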

---------

Co-authored-by: Ilya Kreymer <ikreymer@gmail.com>
Co-authored-by: sua yoo <sua@suayoo.com>
Co-authored-by: Ilya Kreymer <ikreymer@users.noreply.github.com>
2025-06-03 13:37:30 -07:00
Ilya Kreymer
f5c120b529
Don't clobber existing helm chart in release! (#2643)
Switch to different github release action:
- avoids clobbering existing release if already published, updates
existing draft only with latest Helm chart
- also sets name to `Browsertrix <version>`, fills in changelist.
- fixes #2642 

Tested:
- New draft release created (since branch ends in `-release`)
- Running multiple times to ensure chart is updated in draft
- Switching to older release to ensure chart is *NOT* clobbered
2025-06-03 09:28:34 -07:00
Ilya Kreymer
0e06ccd746 version: bump to 1.17.0-beta.0 2025-06-02 14:46:32 -07:00
Emma Segal-Grossman
4ed1a37f9d
Popover styling fixes (#2637) 2025-06-02 13:51:24 -04:00
Pierre
8b54444b7e
docs: update remote deployment docs with working nginx-install example (#2625)
- Update the docs on k3s deployment for installing `ingress-nginx`, fixes
#2619.
- Also fix the indentation on the code blocks so markdown carries on list
numbering. At the moment the numbering confusingly resets after point 3.
- Update indentation on all code blocks so they show up as part of list +
wrap long commands.
---------

Co-authored-by: Ilya Kreymer <ikreymer@gmail.com>
2025-05-28 20:07:02 -07:00
sua yoo
2aad7b8dc0
feat: Make saving simple workflow more efficient (#2626)
- Sticks workflow form save/run buttons to the viewport if all the
required fields are filled
- Adds keyboard shortcuts to save (cmd/ctrl + S to save, cmd/ctrl +
Enter to save and run)
- Adds "Cancel" button to new workflow
2025-05-28 20:04:07 -07:00
sua yoo
858ae15ce6
feat: Handle paused state + workflow performance improvements (#2610)
- Handles `paused` workflow state.
- Adds "Copy Crawl ID" and "View Archived Item" buttons to workflow
detail
- Fixes file size not updating in workflow crawls list
- Fixes superadmin banner showing over workflow tabs
- Refactors workflow detail API calls to use `Task` to improve poll
performance.
- Fixes execution time rendering when less than a minute

---------

Co-authored-by: Ilya Kreymer <ikreymer@gmail.com>
2025-05-28 19:26:38 -07:00
sua yoo
9f17264aa9
devex: Create btrix-popover component (#2632)
Adds and documents new `btrix-popover` component.
2025-05-28 18:29:30 -07:00
sua yoo
7e3e8a594f
gh: Update issue templates (#2621)
- Adds issue type to each template
- Differentiates user-submitted "Change Request" from internal "Planned
Feature". This allows us to separate user-submitted ideas from work
we've planned through the new feature workflow, and automatically set
the github project.
- Adds template for docs change
- Makes additional context section optional, I noticed many issues put
"n/a" or similar in this section anyway.
- Disables blank issues and adds a generic "Task" issue template
2025-05-27 18:11:55 -07:00
sua yoo
7c32e27f94
fix: Show 404 page for nonexistent org (#2620)
Renders 404 page if org in URL doesn't exist.
2025-05-27 18:10:49 -07:00
Ilya Kreymer
5b0f851857
Fix securityContext for pod (#2623)
Some of the `securityContext` settings need to be on the container, not
on the pod, including the read-only file system, which was not previously enabled.
This now enables the read-only file system.
Also map the crawler /tmp directory to use the same volume as crawls (as
crawler currently uses /tmp dir) as /tmp becomes read-only otherwise.
2025-05-27 10:59:50 -07:00
sua yoo
7674672027
feat: Update superadmin active crawls view (#2618)
- Renames "Running Crawls" -> "Active Crawls" in superadmin app bar
- Shows number of active crawls next to link
- Refreshes active crawl list every 30 seconds
- Standardizes browser title
2025-05-26 12:22:38 -07:00
Ilya Kreymer
cb50c7c2c2
Pause / Resume Crawls Initial Implementation. (#2572)
- add 'pause' crawl state (fixes #2567)
- gracefully shut down crawler pods, and then redis pod when paused
- crawler uploads WACZ before shutting down (dependent on
webrecorder/browsertrix-crawler#824, supported in 1.6.1+)
- add 'paused_at' on crawl spec to indicate when crawl is paused
- support max pause time limit, after which crawl becomes automatically
stopped.
- add 'stopped_pause_expired' when pause automatically expires and crawl
is stopped
- /crawl/<id>/{pause,resume} apis to toggle 'paused' on crawl spec
- ui: add pause/resume button, paused state (partially addresses #2568)
- ui: add pausing/resuming derivative states when crawl is running and
pausing, or paused and not pausing (partially addresses #2569)
- Designed to work with crawler 1.6.1+, which supports pausing + uploading on pause
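
As a rough illustration of the pause time limit described above, here is a sketch that derives a state from a pause timestamp. The function name, the `max_pause` parameter, and the `running` label are assumptions; only `paused` and `stopped_pause_expired` come from the change description:

```python
from datetime import datetime, timedelta, timezone
from typing import Optional


def state_after_pause_check(
    paused_at: Optional[datetime], max_pause: timedelta, now: datetime
) -> str:
    """Hypothetical helper: decide the crawl state from its pause timestamp."""
    if paused_at is None:
        # Not paused at all; "running" is an assumed label.
        return "running"
    if now - paused_at >= max_pause:
        # Pause lasted longer than the limit, so the crawl is stopped.
        return "stopped_pause_expired"
    return "paused"


now = datetime(2025, 5, 21, 12, 0, tzinfo=timezone.utc)
print(state_after_pause_check(now - timedelta(hours=1), timedelta(days=7), now))
```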

Work on #2566, Fixes #2576 

---------
Co-authored-by: sua yoo <sua@webrecorder.org>
Co-authored-by: Tessa Walsh <tessa@bitarchivist.net>
Co-authored-by: sua yoo <sua@suayoo.com>
2025-05-21 14:05:16 -07:00
Ilya Kreymer
e995811dd4 version: bump to 1.16.2 2025-05-20 18:43:22 -07:00
Ilya Kreymer
8a713155ef
remove deleted collections from crawlconfigs (#2615)
simplified version of #2608, add a remove_collection_from_all_configs() in CrawlConfigs, also check org.
update tests to ensure removal
2025-05-20 18:38:40 -07:00
Ilya Kreymer
86e35e358d
Add Org Check for Collection access (#2616)
Ensure collection access checks org membership
2025-05-20 15:30:22 -07:00
Ilya Kreymer
e29db33629
tests: fix nightly test config after #2611 (#2614)
remove namespace from minio config to match settings
2025-05-20 12:25:15 -07:00
sua yoo
ef93c5ad90
docs: Document latest crawl (#2613)
Follows https://github.com/webrecorder/browsertrix/issues/2603

## Changes

- Updates documentation on "Latest Crawl" tab
- Fixes extra fetch in workflow detail page
- Reverts workflow detail labels from "Duration" back to "Run Duration"
and "Pages" back to "Pages Crawled"
2025-05-20 12:19:09 -07:00
Ilya Kreymer
c134b576ae
Optimize presigning for replay.json (#2516)
Fixes #2515.

This PR introduces a significantly optimized logic for presigning URLs
for crawls and collections.
- For collections, the files needed from all crawls are looked up, and
then the 'presign_urls' table is merged in one pass, resulting in a
unified iterator containing files and presign urls for those files.
- For crawls, the presign URLs are also looked up once, and the same
iterator is used for a single crawl with passed in list of CrawlFiles
- URLs that are already signed are added to the return list.
- For any remaining URLs to be signed, a bulk presigning function is
added, which shares an HTTP connection and signs 8 files in parallel
(customizable via helm chart, though may not be needed). This function
is used to call the presigning API in parallel.
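
A minimal sketch of bounded-parallel signing, assuming an async `sign_one` callable. The names `presign_all` and `fake_sign` and the placeholder URL are hypothetical, and the shared HTTP connection mentioned above is not modeled here:

```python
import asyncio
from typing import Awaitable, Callable


async def presign_all(
    filenames: list[str],
    sign_one: Callable[[str], Awaitable[str]],
    max_parallel: int = 8,
) -> dict[str, str]:
    """Sign every filename, running at most max_parallel calls at once."""
    semaphore = asyncio.Semaphore(max_parallel)

    async def sign_limited(name: str) -> tuple[str, str]:
        # The semaphore caps how many signing calls are in flight.
        async with semaphore:
            return name, await sign_one(name)

    pairs = await asyncio.gather(*(sign_limited(name) for name in filenames))
    return dict(pairs)


async def fake_sign(name: str) -> str:
    # Stand-in for a real presigning call; the URL is a placeholder.
    await asyncio.sleep(0)
    return f"https://storage.example/{name}?signed=1"


urls = asyncio.run(presign_all(["a.wacz", "b.wacz"], fake_sign))
```

A semaphore is used here only because it is a simple way to cap concurrency in `asyncio`; the actual change may batch requests differently.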
2025-05-20 12:09:35 -07:00
Ilya Kreymer
f1fd11c031
storage: use s3v4 signature for presigning urls (#2611)
Use V4 ('s3v4') signature version for all presigning URLs to support
backblaze, fixes #2472
- add 'access_addressing_style' to be able to choose virtual/path
addressing for access endpoint (default to 'virtual' as before)
- fix minio presigning with v4 by using 'path' addressing style for
minio
- if path matches '/data/' for internal minio bucket, then always use
'path'
- also make minio access path '/data/' configurable

also simplify running in any namespace with default settings:
- don't hardcode 'local-minio.default'
- in crawlers namespace, add a 'local-minio' externalName service which
maps to the main namespace service.
2025-05-19 15:44:36 -07:00
sua yoo
4b1e416eb6
feat: Workflow "latest crawl" tab (#2605)
- Combines "Watch" and "Logs" into single "Latest Crawl" tab
- Updates workflow routes and adds redirects
- Enables replaying and downloading latest crawl from the workflow
detail view
- Tweaks crawl list table header labels and archived item download
button labels for consistency
- Fixes crawl queue showing error when stopping crawl
2025-05-14 10:23:36 -07:00
sua yoo
7c9627f4bb
chore: Clean up data grid component (#2604)
- Moves data grid styles to separate stylesheet.
- Adds `rowsSelectable` option, renames `rows-` properties to match.
- Adds WIP `rowsExpandable` option.
- Fixes showing tooltip on focus.
- Cleans up rows controller typing.

---------

Co-authored-by: Emma Segal-Grossman <hi@emma.cafe>
2025-05-14 09:44:07 -07:00
Tessa Walsh
c73512dbd4
Bump version to 1.16.1 (#2606) 2025-05-13 17:29:49 -04:00
Tessa Walsh
1492397656
Add ISO-639-1 language code validation to backend (#2602)
- Add backend validation for language codes
- Add migration to look for invalid ISO-639-1 language codes in
workflows, crawls, and org crawling defaults, and fix any found
2025-05-13 16:54:33 -04:00
Emma Segal-Grossman
e17772145e
Add minimized superadmin banner (#2598) 2025-05-13 16:32:35 -04:00
Tessa Walsh
6f81d588a9
Ensure crawl page counts are correct when re-adding pages (#2601)
Fixes #2600 

This PR fixes the issue by ensuring that crawl page counts (total,
unique, files, errors) are reset to 0 when crawl pages are deleted, such
as right before being re-added.

It also adds a migration that recalculates file and error page counts
for each crawl without re-adding pages from the WACZ files.
2025-05-13 14:05:41 -04:00
sua yoo
594f5bc171
devex: Data grid component (#2561)
- Adds new `<btrix-data-grid>` component
- Refactors `<btrix-usage-history-table>` to data grid
- Refactors `<btrix-syntax-input>` and
`<btrix-link-selector-table>` to be form-associated controls.

---------

Co-authored-by: Emma Segal-Grossman <hi@emma.cafe>
2025-05-12 10:36:14 -07:00
sua yoo
6b510fe89c
fix: Sync user guide to correct workflow section (#2592)
Resolves https://github.com/webrecorder/browsertrix/issues/2560

## Changes

- Syncs workflow current form section with user guide section.
- Stickies "User Guide" button to top of viewport so that user guide can
be opened.
- Makes content behind user guide clickable (fixes issues with stickied
elements shifting when user guide is not contained to the parent
element.)
- Decreases size of user guide text when embedded in an iframe.
- Refactors overflow scrim to reuse CSS variables.

---------
Co-authored-by: Emma Segal-Grossman <hi@emma.cafe>
2025-05-08 14:41:35 -07:00
Ilya Kreymer
652e8a6085 version: bump to 1.16.0 2025-05-08 14:30:00 -07:00
Ilya Kreymer
1570011ec7
compute top page origins for each collection (#2483)
A quick PR to fix #2482:
- compute topPageHosts as part of existing collection stats compute
- store top 10 results in collection for now.
- display in collection About sidebar
- fixes #2482 

Co-authored-by: Tessa Walsh <tessa@bitarchivist.net>
2025-05-08 14:22:40 -07:00
Emma Segal-Grossman
0691f43be6
Sort running crawls first by default (#2587) 2025-05-08 17:21:17 -04:00
Emma Segal-Grossman
5915c24c18
Add "cancellation scheduled" state to superadmin org list (#2594)
Fixes https://github.com/webrecorder/browsertrix/issues/2595

## Changes

Adds "Subscription Cancellation Scheduled" state/icon/tooltip to
superadmin org list, with future cancellation duration/date.

Adds more subscription-related info and features to the action menu in
the same org list
- "Open in Stripe" action is visible if subscription id is a Stripe
object id
- "Plan ID" and "Action on Cancel" correspond to `planId` and
`readOnlyOnCancel` properties on `subscription` object
- There's also some additional highlighting for possible errors
(hopefully only visible on dev) — see the last screenshot for an example

Adds first pass at filters for superadmin org list
- The filters' counts update when searching
- I took an initial pass at figuring out which filters would be most
useful — we can always go back and tweak them later
2025-05-06 18:59:29 -07:00
Tessa Walsh
3e169ebc15
Add API endpoint to check if subscription is activated (#2582)
Subscription Management: used as a check to ensure a subscription can be
auto-canceled if not activated.

---------
Co-authored-by: Ilya Kreymer <ikreymer@gmail.com>
2025-05-06 17:36:58 -07:00
sua yoo
cb6e279a3c
fix: Hide incorrect menu item for running workflow crawl (#2591)
- Hides the "Delete" menu item for a running crawl in the workflows
crawls list.
- Slightly grays out row for running crawl to indicate that it's not
clickable.
2025-05-06 15:19:33 -07:00
sua yoo
0ec94098a5
fix: Show correct button for workflow without crawls (#2590)
Shows "Run Now" button instead of "QA Latest Crawl" in workflow "Watch"
tab when there aren't any crawls.
2025-05-06 14:31:26 -07:00
sua yoo
62a53d01d6
fix: Correct post load delay label (#2593) 2025-05-06 10:09:51 -07:00
Emma Segal-Grossman
8b6e1ca9af
Add overflow scroll component with scroll scrim/shadow (#2578) 2025-05-05 20:24:47 -04:00
Emma Segal-Grossman
4ce769ecab
Ensure primary button in button group has its border appear (#2583) 2025-05-05 20:24:34 -04:00
Emma Segal-Grossman
8a707e3b3a
Fix table grid column CSS variable, superadmin list menus being hidden/inoperable, and various other table tweaks (#2573)
Closes #2574
cc @SuaYoo 

## Changes

This adds an internal `--btrix-table-grid-template-columns--internal`
css property to `btrix-table` to set table grid cols, which uses the
`--btrix-table-grid-template-columns` value if defined and otherwise
defaults to the number of header cols **from within the css
declaration**, rather than using JS. In Chrome at least,
`this.style.getPropertyValue` wasn't picking up on css variables defined
outside of the custom component boundary, so this gets around that.

Other changes:
- Adds an additional column to the superadmin org list, as it was
missing one
- Fixes `overflow-dropdown` unintentionally setting its internal
button's size to `undefined` if `size` wasn't set on it
- Swaps the remaining tables to use
`--btrix-table-grid-template-columns` instead of directly setting
`grid-template-columns`
- Adds a min-width of `min-content` to the table container, because
doing so is necessary for left/right scrolling, and this is a common
enough pattern it seems that upstreaming this into the table itself
makes sense — it shouldn't cause breakages, this already generally is
the expected behaviour
- Allows tables to scroll left/right when necessary
- Fix padding/margin for a few left/right scrolling tables
- Allows primary column of collections list to shrink to a smaller min
width

## Testing

Test that none of the other tables are broken. I couldn't find any!
2025-04-29 21:00:16 -04:00
sua yoo
1fa43335c0
feat: Apply saved workflow settings to current crawl (#2514)
Resolves https://github.com/webrecorder/browsertrix/issues/2366

## Changes

Allows users to update current crawl with newly saved workflow settings.

## Manual testing

1. Log in as crawler
2. Start a crawl
3. Go to edit workflow. Verify "Update Crawl" button is shown
4. Click "Update Crawl". Verify crawl is updated with new settings

---------

Co-authored-by: Ilya Kreymer <ikreymer@gmail.com>
Co-authored-by: Tessa Walsh <tessa@bitarchivist.net>
2025-04-29 11:43:14 -07:00
Tessa Walsh
c4a7ebce29
Update button text from "Setup Guide" to "User Guide" for consistency (#2565)
Fixes #2564
2025-04-24 10:58:26 -04:00