Commit Graph

1030 Commits

Author SHA1 Message Date
sua yoo
2aad7b8dc0
feat: Make saving simple workflow more efficient (#2626)
- Sticks workflow form save/run buttons to the viewport if all the
required fields are filled
- Adds keyboard shortcuts to save (cmd/ctrl + S to save, cmd/ctrl +
Enter to save and run)
- Adds "Cancel" button to new workflow
2025-05-28 20:04:07 -07:00
sua yoo
858ae15ce6
feat: Handle paused state + workflow performance improvements (#2610)
- Handles `paused` workflow state.
- Adds "Copy Crawl ID" and "View Archived Item" buttons to workflow
detail
- Fixes file size not updating in workflow crawls list
- Fixes superadmin banner showing over workflow tabs
- Refactors workflow detail API calls to use `Task` to improve poll
performance.
- Fixes execution time rendering when less than a minute

---------

Co-authored-by: Ilya Kreymer <ikreymer@gmail.com>
2025-05-28 19:26:38 -07:00
sua yoo
9f17264aa9
devex: Create btrix-popover component (#2632)
Adds and documents new `btrix-popover` component.
2025-05-28 18:29:30 -07:00
sua yoo
7c32e27f94
fix: Show 404 page for nonexistent org (#2620)
Renders 404 page if org in URL doesn't exist.
2025-05-27 18:10:49 -07:00
sua yoo
7674672027
feat: Update superadmin active crawls view (#2618)
- Renames "Running Crawls" -> "Active Crawls" in superadmin app bar
- Shows number of active crawls next to link
- Refreshes active crawl list every 30 seconds
- Standardizes browser title
2025-05-26 12:22:38 -07:00
Ilya Kreymer
cb50c7c2c2
Pause / Resume Crawls Initial Implementation. (#2572)
- add 'pause' crawl state (fixes #2567)
- gracefully shut down crawler pods, and then redis pod when paused
- crawler uploads WACZ before shutting down (dependent on
webrecorder/browsertrix-crawler#824, supported in 1.6.1+)
- add 'paused_at' on crawl spec to indicate when crawl is paused
- support max pause time limit, after which crawl becomes automatically
stopped.
- add 'stopped_pause_expired' when pause automatically expires and crawl
is stopped
- /crawl/<id>/{pause,resume} apis to toggle 'paused' on crawl spec
- ui: add pause/resume button, paused state (partially addresses #2568)
- ui: add pausing/resuming derivative states when crawl is running and
pausing, or paused and not pausing (partially addresses #2569)
- Designed to work with crawler 1.6.1+, which supports pausing + uploading on pause
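
A minimal sketch of the pause-expiry rule described in the list above, not the operator's actual code. The state names `paused` and `stopped_pause_expired` come from this commit; the function names and the `max_pause_seconds` parameter are hypothetical.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional


def pause_expired(paused_at: Optional[datetime], max_pause_seconds: int, now: datetime) -> bool:
    """Return True once a paused crawl has exceeded the maximum pause time."""
    if paused_at is None:
        # No 'paused_at' on the crawl spec means the crawl is not paused.
        return False
    return now - paused_at >= timedelta(seconds=max_pause_seconds)


def next_state(state: str, paused_at: Optional[datetime], max_pause_seconds: int, now: datetime) -> str:
    """Move a paused crawl to 'stopped_pause_expired' when the pause limit is hit."""
    if state == "paused" and pause_expired(paused_at, max_pause_seconds, now):
        return "stopped_pause_expired"
    return state


if __name__ == "__main__":
    paused = datetime(2025, 5, 21, 12, 0, tzinfo=timezone.utc)
    print(next_state("paused", paused, 3600, paused + timedelta(hours=2)))
```

Keeping the expiry check a pure function of `paused_at` and the current time makes it safe to re-evaluate on every reconcile pass.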

Work on #2566, Fixes #2576 

---------
Co-authored-by: sua yoo <sua@webrecorder.org>
Co-authored-by: Tessa Walsh <tessa@bitarchivist.net>
Co-authored-by: sua yoo <sua@suayoo.com>
2025-05-21 14:05:16 -07:00
Ilya Kreymer
e995811dd4 version: bump to 1.16.2 2025-05-20 18:43:22 -07:00
sua yoo
ef93c5ad90
docs: Document latest crawl (#2613)
Follows https://github.com/webrecorder/browsertrix/issues/2603

## Changes

- Updates documentation on "Latest Crawl" tab
- Fixes extra fetch in workflow detail page
- Reverts workflow detail labels from "Duration" back to "Run Duration"
and "Pages" back to "Pages Crawled"
2025-05-20 12:19:09 -07:00
Ilya Kreymer
f1fd11c031
storage: use s3v4 signature for presigning urls (#2611)
Use V4 ('s3v4') signature version for all presigning URLs to support
backblaze, fixes #2472
- add 'access_addressing_style' to be able to choose virtual/path
addressing for access endpoint (default to 'virtual' as before)
- fix minio presigning with v4 by using 'path' addressing style for
minio
- if path matches '/data/' for internal minio bucket, then always use
'path'
- also make minio access path '/data/' configurable
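
One way to read the addressing-style rules above, as a hedged sketch only. The function name and parameters are hypothetical, and treating "path matches '/data/'" as a prefix check on the access endpoint path is an assumption.

```python
from urllib.parse import urlsplit


def choose_addressing_style(
    access_endpoint_url: str,
    configured_style: str = "virtual",   # stands in for 'access_addressing_style'
    minio_access_path: str = "/data/",   # configurable per the commit message
) -> str:
    """Pick 'path' or 'virtual' addressing for presigned URLs."""
    path = urlsplit(access_endpoint_url).path
    if path.startswith(minio_access_path):
        # Internal minio bucket: always use path-style addressing.
        return "path"
    # Otherwise fall back to the configured style ('virtual' by default, as before).
    return configured_style


if __name__ == "__main__":
    print(choose_addressing_style("https://example.com/data/"))
    print(choose_addressing_style("https://bucket.s3.example.com/"))
```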

also simplify running in any namespace with default settings:
- don't hardcode 'local-minio.default'
- in crawlers namespace, add a 'local-minio' externalName service which
maps to the main namespace service.
2025-05-19 15:44:36 -07:00
sua yoo
4b1e416eb6
feat: Workflow "latest crawl" tab (#2605)
- Combines "Watch" and "Logs" into single "Latest Crawl" tab
- Updates workflow routes and adds redirects
- Enables replaying and downloading latest crawl from the workflow
detail view
- Tweaks crawl list table header labels and archived item download
button labels for consistency
- Fixes crawl queue showing error when stopping crawl
2025-05-14 10:23:36 -07:00
sua yoo
7c9627f4bb
chore: Clean up data grid component (#2604)
- Moves data grid styles to separate stylesheet.
- Adds `rowsSelectable` option, renames `rows-` properties to match.
- Adds WIP `rowsExpandable` option.
- Fixes showing tooltip on focus.
- Cleans up rows controller typing.

---------

Co-authored-by: Emma Segal-Grossman <hi@emma.cafe>
2025-05-14 09:44:07 -07:00
Tessa Walsh
c73512dbd4
Bump version to 1.16.1 (#2606) 2025-05-13 17:29:49 -04:00
Tessa Walsh
1492397656
Add ISO-639-1 language code validation to backend (#2602)
- Add backend validation for language codes
- Add migration to look for invalid ISO-639-1 language codes in
workflows, crawls, and org crawling defaults, and fix any found
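
A rough sketch of what the validation and the migration's lookup step could look like. The code set below is a small illustrative subset of ISO 639-1, and the `id` and `lang` field names are assumptions rather than the backend's confirmed schema.

```python
# Illustrative subset only: the full ISO 639-1 list has many more two-letter codes.
KNOWN_LANGUAGE_CODES = frozenset({"de", "en", "es", "fr", "it", "ja", "ko", "pt", "ru", "zh"})


def is_valid_language(code: str) -> bool:
    """Return True if `code` is one of the known lowercase two-letter codes."""
    return code in KNOWN_LANGUAGE_CODES


def find_invalid_languages(configs: list[dict]) -> list[str]:
    """Return the ids of configs whose `lang` is set but is not a known code."""
    return [
        config["id"]
        for config in configs
        if config.get("lang") and not is_valid_language(config["lang"])
    ]


if __name__ == "__main__":
    sample = [{"id": "a", "lang": "en"}, {"id": "b", "lang": "english"}, {"id": "c"}]
    print(find_invalid_languages(sample))
```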
2025-05-13 16:54:33 -04:00
Emma Segal-Grossman
e17772145e
Add minimized superadmin banner (#2598) 2025-05-13 16:32:35 -04:00
sua yoo
594f5bc171
devex: Data grid component (#2561)
- Adds new `<btrix-data-grid>` component
- Refactors `<btrix-usage-history-table>` to data grid
- Refactors `<btrix-syntax-input>` and
`<btrix-link-selector-table>` to be form-associated controls.

---------

Co-authored-by: Emma Segal-Grossman <hi@emma.cafe>
2025-05-12 10:36:14 -07:00
sua yoo
6b510fe89c
fix: Sync user guide to correct workflow section (#2592)
Resolves https://github.com/webrecorder/browsertrix/issues/2560

## Changes

- Syncs workflow current form section with user guide section.
- Stickies "User Guide" button to top of viewport so that user guide can
be opened.
- Makes content behind user guide clickable (fixes issues with stickied
elements shifting when user guide is not contained to the parent
element.)
- Decreases size of user guide text when embedded in an iframe.
- Refactors overflow scrim to reuse CSS variables.

---------
Co-authored-by: Emma Segal-Grossman <hi@emma.cafe>
2025-05-08 14:41:35 -07:00
Ilya Kreymer
652e8a6085 version: bump to 1.16.0 2025-05-08 14:30:00 -07:00
Ilya Kreymer
1570011ec7
compute top page origins for each collection (#2483)
A quick PR to fix #2482:
- compute topPageHosts as part of existing collection stats compute
- store top 10 results in collection for now.
- display in collection About sidebar
- fixes #2482 
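
For illustration, the top-hosts computation can be sketched in plain Python. This is an assumption-laden stand-in: the real backend presumably computes `topPageHosts` inside the existing collection stats step, and the `host`/`count` output shape here is hypothetical.

```python
from collections import Counter
from urllib.parse import urlsplit


def top_page_hosts(page_urls, limit=10):
    """Count pages per host and keep the `limit` most frequent hosts."""
    hosts = Counter(urlsplit(url).netloc for url in page_urls)
    return [{"host": host, "count": count} for host, count in hosts.most_common(limit)]


if __name__ == "__main__":
    urls = ["https://a.example/1", "https://a.example/2", "https://b.example/"]
    print(top_page_hosts(urls))
```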

Co-authored-by: Tessa Walsh <tessa@bitarchivist.net>
2025-05-08 14:22:40 -07:00
Emma Segal-Grossman
5915c24c18
Add "cancellation scheduled" state to superadmin org list (#2594)
Fixes https://github.com/webrecorder/browsertrix/issues/2595

## Changes

Adds "Subscription Cancellation Scheduled" state/icon/tooltip to
superadmin org list, with future cancellation duration/date.

Adds more subscription-related info and features to the action menu in
the same org list
- "Open in Stripe" action is visible if subscription id is a Stripe
object id
- "Plan ID" and "Action on Cancel" correspond to `planId` and
`readOnlyOnCancel` properties on `subscription` object
- There's also some additional highlighting for possible errors
(hopefully only visible on dev) — see the last screenshot for an example

Adds first pass at filters for superadmin org list
- The filters' counts update when searching
- I took an initial pass at figuring out which filters would be most
useful — we can always go back and tweak them later
2025-05-06 18:59:29 -07:00
sua yoo
cb6e279a3c
fix: Hide incorrect menu item for running workflow crawl (#2591)
- Hides the "Delete" menu item for a running crawl in the workflows
crawls list.
- Slightly grays out row for running crawl to indicate that it's not
clickable.
2025-05-06 15:19:33 -07:00
sua yoo
0ec94098a5
fix: Show correct button for workflow without crawls (#2590)
Shows "Run Now" button instead of "QA Latest Crawl" in workflow "Watch"
tab when there aren't any crawls.
2025-05-06 14:31:26 -07:00
sua yoo
62a53d01d6
fix: Correct post load delay label (#2593) 2025-05-06 10:09:51 -07:00
Emma Segal-Grossman
8b6e1ca9af
Add overflow scroll component with scroll scrim/shadow (#2578) 2025-05-05 20:24:47 -04:00
Emma Segal-Grossman
4ce769ecab
Ensure primary button in button group has its border appear (#2583) 2025-05-05 20:24:34 -04:00
Emma Segal-Grossman
8a707e3b3a
Fix table grid column CSS variable, superadmin list menus being hidden/inoperable, and various other table tweaks (#2573)
Closes #2574
cc @SuaYoo 

## Changes

This adds an internal `--btrix-table-grid-template-columns--internal`
css property to `btrix-table` to set table grid cols, which uses the
`--btrix-table-grid-template-columns` value if defined and otherwise
defaults to the number of header cols **from within the css
declaration**, rather than using JS. In Chrome at least,
`this.style.getPropertyValue` wasn't picking up on css variables defined
outside of the custom component boundary, so this gets around that.

Other changes:
- Adds an additional column to the superadmin org list, as it was
missing one
- Fixes `overflow-dropdown` unintentionally setting its internal
button's size to `undefined` if `size` wasn't set on it
- Swaps the remaining tables to use
`--btrix-table-grid-template-columns` instead of directly setting
`grid-template-columns`
- Adds a min-width of `min-content` to the table container, because
doing so is necessary for left/right scrolling, and this is a common
enough pattern it seems that upstreaming this into the table itself
makes sense — it shouldn't cause breakages, this already generally is
the expected behaviour
- Allows tables to scroll left/right when necessary
- Fix padding/margin for a few left/right scrolling tables
- Allows primary column of collections list to shrink to a smaller min
width

## Testing

Test that none of the other tables are broken. I couldn't find any!
2025-04-29 21:00:16 -04:00
sua yoo
1fa43335c0
feat: Apply saved workflow settings to current crawl (#2514)
Resolves https://github.com/webrecorder/browsertrix/issues/2366

## Changes

Allows users to update current crawl with newly saved workflow settings.

## Manual testing

1. Log in as crawler
2. Start a crawl
3. Go to edit workflow. Verify "Update Crawl" button is shown
4. Click "Update Crawl". Verify crawl is updated with new settings

---------

Co-authored-by: Ilya Kreymer <ikreymer@gmail.com>
Co-authored-by: Tessa Walsh <tessa@bitarchivist.net>
2025-04-29 11:43:14 -07:00
Tessa Walsh
c4a7ebce29
Update button text from "Setup Guide" to "User Guide" for consistency (#2565)
Fixes #2564
2025-04-24 10:58:26 -04:00
sua yoo
573d8ca316
devex: Document workflow table components (#2558)
- Documents the following components in Storybook:
  - `btrix-data-table`
  - `btrix-table`
  - `btrix-crawl-log-table`
  - `btrix-custom-behaviors-table`
  - `btrix-link-selector-table`
  - `btrix-queue-exclusion-table`
  - `btrix-queue-exclusion-form`
- Refactors `btrix-table` and subcomponents to simplify CSS properties
- Fixes crawl exclusion table delete button not rendering
- Fixes Shoelace assets not loading in Storybook
2025-04-23 19:31:34 -07:00
Tessa Walsh
f34b42cb59
Add custom behavior docs to user guide (#2559) 2025-04-23 14:27:39 -04:00
Emma Segal-Grossman
76ab3e7eaa
Add grid view to collection list (#2403)
Closes #2498 

Yay for consistency!

## Changes

Adds a grid view to the collections list, alongside the default list
view.

- Refactors edit dialog into `collections-grid-with-edit-dialog`
component for dashboard — collections list already has its own edit
dialog, so no need for this to be duplicated in the grid component
- Adds getter/setter for `page` property of pagination component, which
fixes the dashboard not switching back to page 1 when switching between
"Public" and "All" collection views

## Manual testing

1. On the collections list page, click between "View as Grid" and "View
as List" in the toolbar
2. Verify that pagination, the collection editing dialog, and the action
menu work in grid view
3. On the dashboard in an org with multiple pages of collections, switch
to the second page of "All" collections, then switch back to "Public"
collections. Verify that the page search param disappears when switching
between views.

## Screenshots

| Page | Screenshot |
|--------|--------|
| Collection list | <img width="1282" alt="Screenshot 2025-04-17 at 3 46 55 PM" src="https://github.com/user-attachments/assets/f6dff74f-d56e-48f6-8d44-11b84bacbafb" /> |
| Collection list (detail) | <img width="165" alt="Screenshot 2025-04-17 at 3 46 29 PM" src="https://github.com/user-attachments/assets/3442c5e4-a67f-46a2-b475-ee4d3d1e0259" /> |

---



Remaining things to do:
- [x] Add full actions menu from list view to grid view, instead of just
having pencil icon
- [x] Reuse collection editing dialog from existing list view, instead
of the grid view having its own separate dialog instance
2025-04-23 14:08:50 -04:00
sua yoo
78e2dadf0a
devex: Add Storybook for component development (#2556)
Adds Storybook in preparation for UI component refactoring.
2025-04-21 13:06:31 -07:00
sua yoo
c2a11ccf10
deps: Upgrade main frontend dependencies (#2551)
- Upgrades typescript-eslint to a more performant version and related
dependencies. Note that these dependencies were not upgraded to the
latest version to avoid upgrading to eslint 9 at this time.
- Upgrades Lit one minor version
2025-04-15 13:31:50 -07:00
sua yoo
f2e6892729
fix: Update custom behavior file placeholder text (#2552)
Follows https://github.com/webrecorder/browsertrix/issues/2151

## Changes

Updates placeholder text for custom behavior files, since we now accept
JSON.
2025-04-09 21:41:53 +02:00
Emma Segal-Grossman
eeda4cd9ff
Persist pagination state in url (#2538)
Closes #1944 

## Changes
- Pagination stores page number in url search params, rather than
internal state, allowing going back to a specific page in a list
- Pagination navigation pushes to history stack, and listens to history
changes to be able to respond to browser history navigation
(back/forward)
- Search parameter reactive controller powers pagination component
- Pagination component allows for multiple simultaneous paginations via
custom `name` property
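
The URL handling behind this can be sketched as two pure functions. This is only an illustration in Python; the actual change is a frontend reactive controller, and dropping the parameter on page 1 as well as the function names are assumptions.

```python
from urllib.parse import parse_qs, urlencode, urlsplit, urlunsplit


def page_from_url(url: str, name: str = "page") -> int:
    """Read the current page from the search params, defaulting to 1."""
    values = parse_qs(urlsplit(url).query).get(name, [])
    try:
        page = int(values[0]) if values else 1
    except ValueError:
        page = 1  # ignore a malformed value instead of failing
    return max(page, 1)


def url_with_page(url: str, page: int, name: str = "page") -> str:
    """Return `url` with the named pagination param set (removed for page 1)."""
    parts = urlsplit(url)
    params = parse_qs(parts.query)
    if page <= 1:
        params.pop(name, None)
    else:
        params[name] = [str(page)]
    return urlunsplit(parts._replace(query=urlencode(params, doseq=True)))


if __name__ == "__main__":
    print(url_with_page("https://app.example/items?q=a", 2))
```

Passing a different `name` keeps several paginated lists on one page independent of each other, which is the role of the custom `name` property.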

## Manual testing

1. Log in as any role
2. Go to one of the list views on an org with enough items in the list
to span more than one page
3. Click on one of the pages, and navigate back in your browser. The
selected page should respect this navigation and return to the initial
numbered page.
4. Navigate forward in your browser. The selected page should respect
this navigation and switch to the numbered page from the previous step.
5. Click on a non-default page, and then click on one of the items in
the list to go to its detail page. Then, using your browser's back
button, return to the list page. You should be on the same numbered page
as before.

---------

Co-authored-by: sua yoo <sua@suayoo.com>
2025-04-09 15:40:30 -04:00
sua yoo
b0d1a35563
fix: Handle no crawling defaults (#2549)
Fixes regression introduced by
7c6bae8d61

## Changes

Handles orgs without any crawl defaults correctly. Areas that use
crawling defaults are also more strongly typed now to prevent similar
issues.
2025-04-09 12:48:12 -04:00
Ilya Kreymer
0cb3bd19f6 version: update to 1.15.0 2025-04-09 12:28:01 +02:00
sua yoo
7c6bae8d61
feat: Add custom behaviors to org crawling defaults (#2546)
Resolves https://github.com/webrecorder/browsertrix/issues/2513

## Changes

- Allows org admins to set custom behaviors as crawling defaults
- Shows warning text if both autoscroll/autoclick and custom behaviors
are enabled
- Refactors `infoTextStrings` -> `infoTextFor` to match other
label/string matchers

---------

Co-authored-by: Ilya Kreymer <ikreymer@gmail.com>
2025-04-09 04:10:30 -04:00
Emma Segal-Grossman
0a0d2d04d3
Add basic opengraph & twitter card tags & image to browsertrix root (#2547)
Closes #2480 

Put together a quick opengraph (OG) image for Browsertrix: 

![browsertrix-og](https://github.com/user-attachments/assets/355aa810-3a3a-46b4-b43a-1a894fae8a6e)
2025-04-08 19:23:26 -04:00
sua yoo
58749602ff
Move custom behaviors behind checkbox (#2545)
WIP for https://github.com/webrecorder/browsertrix/issues/2541

## Changes

- Moves custom behaviors table to behind "Use Custom Behaviors"
checkbox.
- Updates autoclick selector to match checkbox reveal layout.
- Adds minimum viable user guide documentation of custom behaviors.
2025-04-09 00:16:02 +02:00
sua yoo
ba57b85322
feat: Display behavior logs (#2531)
- Displays behavior logs wherever error logs are shown
- Makes page URL in detail dialog clickable rather than in row column to
prevent accidental navigation
- Rename "Download Logs" -> "Download All Logs" and add tooltip with
additional context

---------

Co-authored-by: Tessa Walsh <tessa@bitarchivist.net>
Co-authored-by: Emma Segal-Grossman <hi@emma.cafe>
Co-authored-by: Ilya Kreymer <ikreymer@gmail.com>
2025-04-08 14:38:59 -07:00
Tessa Walsh
55bedcb0b7
feat: Custom autoclick selector (#2517)
Resolves #2504

## Changes

- Allows users to customize autoclick selector in workflows
- Refactors `btrix-syntax-input` to support rendering label and help
text `sl-input`
- Show autoclick selector in workflow / crawl settings
- Adds 'clickSelector' with default of 'a' to backend crawl config.

---------

Co-authored-by: sua yoo <sua@suayoo.com>
Co-authored-by: sua yoo <sua@webrecorder.org>
Co-authored-by: Emma Segal-Grossman <hi@emma.cafe>
2025-04-08 05:53:40 +02:00
sua yoo
0aaae17110
fix: Enable saving workflow with default select links (#2537)
Allows users to save a workflow with an empty "Link Selectors" table,
using the default value. This is aligned with how we use default values
for other empty inputs, and prevents a case where a user may have
inadvertently removed a row and now cannot save a workflow with the
default link selector.

Also updates the info text to show the default value.
2025-04-07 19:19:36 -07:00
Tessa Walsh
f84f6f55e0
Add basic backend validation for selectLinks (#2510)
Follow-up to #2152 

Related to https://github.com/webrecorder/browsertrix/pull/2487

This PR provides very basic validation of the `config.selectLinks`
argument on workflow creation and update. Namely, it checks that:
- `config.selectLinks` is not an empty array
- Each entry consists of two non-empty text sequences separated by `->`

At this point we're not validating the actual CSS selector on the
backend, though we could add that down the road.

Tests have been added accordingly.
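
A minimal sketch of the two checks listed above, assuming a plain function that raises `ValueError`. The function name and error messages are hypothetical, and, as in the description, the CSS selector itself is not validated.

```python
def validate_select_links(select_links: list[str]) -> None:
    """Raise ValueError unless every entry looks like '<selector>-><attribute>'."""
    if len(select_links) == 0:
        raise ValueError("selectLinks must not be an empty array")
    for entry in select_links:
        # Split on the first '->' only; anything after it is treated as the second part.
        selector, separator, attribute = entry.partition("->")
        if not separator or not selector.strip() or not attribute.strip():
            raise ValueError(f"invalid selectLinks entry: {entry!r}")


if __name__ == "__main__":
    validate_select_links(["a[href]->href"])  # passes silently
    try:
        validate_select_links(["a[href]"])
    except ValueError as err:
        print(err)
```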

Co-authored-by: Ilya Kreymer <ikreymer@gmail.com>
2025-04-07 21:36:05 +02:00
sua yoo
23f9e08a22
feat: Add custom behaviors to workflow (#2520)
Resolves https://github.com/webrecorder/browsertrix/issues/2151
Follows https://github.com/webrecorder/browsertrix/pull/2505

## Changes

- Allows users to set custom behaviors in workflow editor.
- Allows one or more behaviors, as simple URL or Git URL to be added
- Calls validation endpoint to check if URL is valid.

---------

Co-authored-by: emma <hi@emma.cafe>
2025-04-02 17:45:27 -07:00
sua yoo
f6481272f4
feat: Specify custom link selectors (#2487)
- Allows users to specify page link selectors in workflow "Scope"
section
- Adds new `<btrix-syntax-input>` component for syntax-highlighted
inputs
- Refactors highlight.js implementation to prevent unnecessary language
loading
- Updates exclusion table header styles

---------

Co-authored-by: Tessa Walsh <tessa@bitarchivist.net>
Co-authored-by: Henry Wilkinson <henry@wilkinson.graphics>
2025-04-02 00:32:34 -07:00
Ilya Kreymer
b5b4c4da15 version: update to 1.14.8 2025-03-31 14:17:53 -07:00
Ilya Kreymer
62e47a8817
support overriding crawler image pull policy per channel (#2523)
- add 'imagePullPolicy' field to each crawler channel declaration
- if unset, defaults to the setting in the existing
'crawler_image_pull_policy' field.
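
The fallback above amounts to a one-line lookup. The sketch below assumes each channel is a dict; the `id` and `image` keys and the sample values are hypothetical, while `imagePullPolicy` and the global default come from this commit.

```python
def resolve_image_pull_policy(channel: dict, default_policy: str) -> str:
    """Use the channel's 'imagePullPolicy' if set, else the global default."""
    # An unset or empty value falls back to 'crawler_image_pull_policy'.
    return channel.get("imagePullPolicy") or default_policy


if __name__ == "__main__":
    default = "IfNotPresent"  # stands in for the 'crawler_image_pull_policy' setting
    channels = [
        {"id": "default", "image": "example/crawler:latest"},
        {"id": "dev", "image": "example/crawler:dev", "imagePullPolicy": "Always"},
    ]
    for channel in channels:
        print(channel["id"], resolve_image_pull_policy(channel, default))
```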

fixes #2522

---------

Co-authored-by: Tessa Walsh <tessa@bitarchivist.net>
2025-03-31 14:11:41 -07:00
sua yoo
df8c80f3cc
task: Display built-in behaviors as list (#2518)
- Displays built-in behaviors as single field in workflow settings
- Standardizes how "None" is displayed in workflow settings
- Refactors behavior names into enum
2025-03-26 17:09:02 -07:00
Ilya Kreymer
b3950dd03f version: update to 1.14.7 2025-03-25 17:25:24 -07:00
Ilya Kreymer
46be6a0cf6 version: bump to 1.14.6 2025-03-20 16:52:20 -07:00
Henry Wilkinson
c797e8446d
docs: Add UI documentation page on status icons (#2506)
### Changes
- Adds status icons page
- Moves action menus page to the UI development docs folder
- Fixes sentence fragment
2025-03-20 16:51:20 -07:00
Henry Wilkinson
c770b9ec22
frontend: move name field to the top of the signup form (#2508)
Fixes #2507

Does what it says on the tin!
2025-03-20 16:50:43 -07:00
Henry Wilkinson
cf6690e74a
docs: add development section on action menus (#2429)
Closes #2428
2025-03-19 18:46:09 -04:00
Ilya Kreymer
c9c32d86e2
login: don't set default slug if user not part of any orgs #2491 (#2492)
if logged in user is not part of any orgs, still allow logging in,
instead of throwing an exception due to accessing non-existent org

---------

Co-authored-by: sua yoo <sua@suayoo.com>
2025-03-19 15:23:16 -07:00
sua yoo
0bc210d905
devex: Add frontend code snippet & update dev docs (#2494)
- Adds VSCode file template for component unit testing.
- Updates development docs with details on UI dev
2025-03-19 14:22:20 -07:00
Emma Segal-Grossman
b471192cbc
Workflow editor footer button: ensure isCrawlRunning is false if editing a new workflow (#2496)
Reported by @tw4l 

Quick fix for the bug I introduced in 1bc3c35 in #2481. I didn't
properly test on the workflow editor in a "new workflow" state, and
didn't realize that the component that fetches the workflow state for an
existing workflow wouldn't be rendered for a new workflow, so the update
to the loading state never occurred for new workflows. This fix
explicitly sets `isCrawlRunning` to `false` instead of `null` for new
workflows, so that the loading state isn't displayed.

Tested locally with both new and existing workflows (in both non-running
and running states).
2025-03-19 15:44:16 -04:00
Ilya Kreymer
eb300815a7
Fixes #2488 (#2493)
- Fixes #2488 
- Adds a k8s api call to set `suspend=false` on Job when associated
CrawlJob is finished.
- bump version - released as 1.14.5
2025-03-19 10:06:25 -07:00
sua yoo
d2601a037e
feat: Show running crawl when editing workflow (#2481)
Part of https://github.com/webrecorder/browsertrix/issues/2366

## Changes

- Displays latest running crawl status when editing workflow
- Disables "Run Now" button if crawl is currently running

Currently, clicking "Run Now" will result in a preventable server error
if the crawl is already running. The change in this PR is in preparation
for being able to update a currently running crawl and doesn't require
any backend changes.

## Manual testing

1. Log in as crawler
2. Go to edit crawl workflow
3. Open same workflow in another tab
4. Run the workflow
5. Go back to edit tab. Verify "Starting" status is shown next to "Save"
button and "Run Crawl" button is disabled

## Screenshots

| Page | Image/video |
| ---- | ----------- |
| Edit Workflow | <img width="354" alt="Screenshot 2025-03-11 at 1 34 07 PM" src="https://github.com/user-attachments/assets/02f7fb4a-219d-43a4-bb1f-1f2b40ac1480" /> |



---------

Co-authored-by: emma <hi@emma.cafe>
2025-03-18 18:54:04 -04:00
Emma Segal-Grossman
89a6e84377
Fix broken thumbnail images not taking up appropriate size on ff (#2486)
Closes #2485 

Also adds alt text to collection thumbnail images.
2025-03-18 18:53:10 -04:00
sua yoo
bcb73932d4
docs: Organize readme and fix doc links (#2479)
Resolves https://github.com/webrecorder/browsertrix/issues/2478

## Changes

- Organizes README
- Fixes relative links in mkdocs

---------

Co-authored-by: Tessa Walsh <tessa@bitarchivist.net>
2025-03-11 18:37:20 -07:00
Emma Segal-Grossman
b2c5b9bc59
Hide breadcrumbs for private orgs (#2477)
Hides "Back to [org name]" breadcrumb when viewing a public/unlisted
collection when the public gallery isn't enabled for the org (except
when logged into that org).
2025-03-11 15:05:35 -04:00
sua yoo
ac1236f15b
feat: Add behaviors section to workflow form (#2464)
- Moves "Per-Page Limits" fields to new "Page Behavior" section
- Fixes workflow settings closing tags with refactor to how sections are
rendered
- Updates user guide with behaviors documentation

---------

Co-authored-by: Henry Wilkinson <henry@wilkinson.graphics>
2025-03-11 11:40:20 -07:00
Ilya Kreymer
d8365c734f version: bump to 1.14.4 2025-03-08 15:58:18 -08:00
Ilya Kreymer
00a42515c8
docs: add public collections gallery howto (#2462)
- Updated how collections gallery and presentation and sharing pages
- Collections gallery page content extracted from blog post, linked from blog post
- Each page has one video covering the gallery setting and individual collection presentation
- Cleaned up text on both to avoid duplicated content (thanks @DaleLore)



---------

Co-authored-by: Henry Wilkinson <henry@wilkinson.graphics>
Co-authored-by: DaleLore <DaleLoreNY@gmail.com>
2025-03-08 15:57:13 -08:00
Ilya Kreymer
75eb04c37b
Translations update from Hosted Weblate (#2467) (#2471)
Translations update from [Hosted Weblate](https://hosted.weblate.org)
for
[Browsertrix/Browsertrix](https://hosted.weblate.org/projects/browsertrix/browsertrix/).

Current translation status:

![Weblate translation
status](https://hosted.weblate.org/widget/browsertrix/browsertrix/horizontal-auto.svg)

---------

Co-authored-by: Weblate (bot) <hosted@weblate.org>
Co-authored-by: Anne Paz <anelisespaz@gmail.com>
Co-authored-by: weblate <1607653+weblate@users.noreply.github.com>
2025-03-07 12:40:43 -08:00
Emma Segal-Grossman
8078f3866b
Add missing "payment never made" subscription status to superadmin org list (#2457) 2025-03-07 12:38:09 -08:00
sua yoo
fa05d68292
fix: Open and highlight correct workflow form section on tab click (#2463)
Fixes https://github.com/webrecorder/browsertrix/issues/2461

## Changes

Opens workflow form section when clicking on section navigation link,
fixing issue with scroll position impacting unopened panels.
2025-03-07 12:35:24 -08:00
Ilya Kreymer
9466e83d18 version: bump to 1.14.3 2025-03-03 15:20:40 -08:00
sua yoo
65a40c4816
feat: Show additional collection details (#2455)
Resolves https://github.com/webrecorder/browsertrix/issues/2452

## Changes

- Displays page count and collection size in listing grid
- Displays month if collection period is in the same year
- Displays collection size in About > Details section
- Minor refactor: move byte formatting into `localize.ts` utility file,
move slash (`/`) separator into own utility file
2025-03-03 13:15:27 -08:00
Ilya Kreymer
631b019baf
optimize public collection loading: (#2444)
- remove query for /collections endpoint just to get the org name
- add orgName to single /collection endpoint, where it is already
available on the backend
2025-03-03 10:13:30 -08:00
Ilya Kreymer
2e86ee3fcc
Weblate (#2450)
Translations update from [Hosted Weblate](https://hosted.weblate.org)
for
[Browsertrix/Browsertrix](https://hosted.weblate.org/projects/browsertrix/browsertrix/).

Current translation status:

![Weblate translation
status](https://hosted.weblate.org/widget/browsertrix/browsertrix/horizontal-auto.svg)

Co-authored-by: Weblate (bot) <hosted@weblate.org>
Co-authored-by: Anne Paz <anelisespaz@gmail.com>
Co-authored-by: weblate <1607653+weblate@users.noreply.github.com>
2025-03-02 19:46:00 -08:00
Ilya Kreymer
64621ba6c0
frontend: fix rendering when backend not available yet (#2448)
- don't wait for languages to be ready to render UI, as this can result
in empty page if backend can not be reached.
- catch if /api/settings returns an invalid response to show 'backend
initializing' message
- will support initContainers where backend may return 5xx error while
backend is initializing, via #2449

Note: this results in locale picker showing all available locales if
backend is not available, not just filtered ones, but I think that's a
reasonable trade-off.
2025-03-01 14:02:37 -08:00
Emma Segal-Grossman
53b531ce3e
Show download button on public collection pages regardless of collection access (#2442)
Reported here
https://discord.com/channels/895426029194207262/1011678975636013066/1345095899008860224

Public-facing collections (whether public or unlisted) should have the
download button visible if "show download button" is enabled.
2025-02-28 22:07:38 -08:00
Ilya Kreymer
cb52da66dc version: bump to 1.14.2 2025-02-27 14:13:03 -08:00
Ilya Kreymer
376c9981dc version: bump to 1.14.1 2025-02-26 23:15:01 -08:00
Emma Segal-Grossman
00e85c3e94
Add "Copy <item type> ID" to a bunch of menus (#2426)
Addresses feedback from here
https://discord.com/channels/895426029194207262/910966759165657161/1344367205004873819
by @tw4l.

Add "Copy <item type> ID" to a bunch of menus, including all list and
detail pages, as well as all other item/crawl/page lists.

| Screenshots |
|--------|
| <img width="323" alt="Screenshot 2025-02-26 at 3 56 48 PM" src="https://github.com/user-attachments/assets/32044c47-65f3-4e80-8f39-df5fd2101324" /> |
| <img width="246" alt="Screenshot 2025-02-26 at 4 02 06 PM" src="https://github.com/user-attachments/assets/8f2d6272-f450-4923-b5c9-751a2eea9a26" /> |
| <img width="419" alt="Screenshot 2025-02-26 at 4 02 55 PM" src="https://github.com/user-attachments/assets/0c005a33-055d-4fb7-a79e-9bedae57b785" /> |
| <img width="1104" alt="Screenshot 2025-02-26 at 1 57 01 PM" src="https://github.com/user-attachments/assets/7ee43400-1b30-4c78-89a0-3ddb89ef90ca" /> |
| <img width="292" alt="Screenshot 2025-02-26 at 4 01 10 PM" src="https://github.com/user-attachments/assets/929f7870-aa83-4f3c-947a-efad377e0b49" /> |
| <img width="240" alt="Screenshot 2025-02-26 at 4 03 19 PM" src="https://github.com/user-attachments/assets/45bff838-f741-45ce-b1a7-a8cfefa9656b" /> |

---------

Co-authored-by: Henry Wilkinson <henry@wilkinson.graphics>
2025-02-26 16:58:00 -05:00
Ilya Kreymer
e67708bd4f version: update to 1.14.0 2025-02-24 14:49:46 -08:00
Henry Wilkinson
c56481fc66
Add deepLink attribute to public collection replay embed (#2420)
### Changes

- Public collections can now be deeplinked

### Caveats

- When users click the _About this Collection_ tab and then return to
the _Browse Collection_ tab, the deeplink is gone until they visit
another page.
2025-02-24 14:33:39 -08:00
Ilya Kreymer
8a507f0473
Consolidate list page endpoints + better QA sorting + optimize pages fix (#2417)
- consolidate list_pages() and list_replay_query_pages() into
list_pages()
- to keep backwards compatibility, add <crawl>/pagesSearch that does not
include page totals, keep <crawl>/pages with page total (slower)
- qa frontend: add default 'Crawl Order' sort order, to better show
pages in QA view
- bgjob: account for parallelism in bgjobs, add logging if succeeded
mismatches parallelism
- QA sorting: default to 'crawl order' to get better results.
- Optimize pages job: also cover crawls that may not have any pages but have pages listed in done stats
- Bgjobs: give custom op jobs more memory
2025-02-21 13:47:20 -08:00
sua yoo
06f6d9d4f2
feat: Move admin route to own namespace (#2405)
Resolves https://github.com/webrecorder/browsertrix/issues/2382

## Changes
- Moves superadmin to `/admin` URL namespace
- Removes superadmin views from main webpack chunks
2025-02-20 18:43:31 -08:00
sua yoo
8db80f5570
feat: Workflow form collapsible section enhancements (#2381)
Resolves https://github.com/webrecorder/browsertrix/issues/2359

## Changes

- Track when a workflow form section is opened
- Hide workflow form section navigation on small screens

---------

Co-authored-by: Ilya Kreymer <ikreymer@users.noreply.github.com>
Co-authored-by: Emma Segal-Grossman <hi@emma.cafe>
Co-authored-by: Ilya Kreymer <ikreymer@gmail.com>
2025-02-20 18:42:00 -08:00
Ilya Kreymer
3ca68bf1d2 version: 1.14.0-beta.6 2025-02-20 15:37:33 -08:00
Tessa Walsh
f8fb2d2c8d
Rework crawl page migration + MongoDB Query Optimizations (#2412)
Fixes #2406 

Converts migration 0042 to launch a background job (parallelized across
several pods) to migrate all crawls by optimizing their pages and
setting `version: 2` on the crawl when complete.

Also optimizes MongoDB queries for better performance.

Migration Improvements:

- Add `isMigrating` and `version` fields to `BaseCrawl`
- Add new background job type to use in migration with accompanying
`migration_job.yaml` template that allows for parallelization
- Add new API endpoint to launch this crawl migration job, and ensure
that we have list and retry endpoints for superusers that work with
background jobs that aren't tied to a specific org
- Rework background job models and methods now that not all background
jobs are tied to a single org
- Ensure new crawls and uploads have `version` set to `2`
- Modify crawl and collection replay.json endpoints to only include
fields for replay optimization (`initialPages`, `pageQueryUrl`,
`preloadResources`) if all relevant crawls/uploads have `version` set to
`2`
- Remove `distinct` calls from migration pathways
- Consolidate collection recompute stats

Query Optimizations:
- Remove all uses of $group and $facet
- Optimize /replay.json endpoints to precompute preload_resources, avoid
fetching crawl list twice
- Optimize /collections endpoint by not fetching resources 
- Rename /urls -> /pageUrlCounts and avoid $group, instead sort with
index, either by seed + ts or by url to get top matches.
- Use $gte instead of $regex to get prefix matches on URL
- Use $text instead of $regex to get text search on title
- Remove total from /pages and /pageUrlCounts queries by not using
$facet
- frontend: only call /pageUrlCounts when dialog is opened.
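
As an illustration of the `$gte` prefix-match item above, here is a minimal sketch of how a prefix lookup can be written as an index-friendly range filter rather than a `$regex`. The helper name `prefix_range_filter`, the `url` field name, and the added `$lt` upper bound are assumptions for illustration, not taken from this PR. The sketch also assumes MongoDB's default binary string comparison, which orders strings the same way as Python code point comparison.

```python
def prefix_range_filter(field: str, prefix: str) -> dict:
    """Build a MongoDB-style filter for values of `field` starting with `prefix`.

    Hypothetical helper: it bounds the range with $gte / $lt so a regular
    index on `field` can serve the query without a regex scan.
    """
    if not prefix:
        # An empty prefix matches everything, so no filter is needed.
        return {}

    last = ord(prefix[-1])
    if last >= 0x10FFFF:
        # No larger code point exists; fall back to a lower bound only.
        return {field: {"$gte": prefix}}

    # Smallest string that sorts after every string starting with `prefix`:
    # the prefix with its final character bumped by one code point.
    upper = prefix[:-1] + chr(last + 1)
    return {field: {"$gte": prefix, "$lt": upper}}


if __name__ == "__main__":
    flt = prefix_range_filter("url", "https://example.com/a")
    print(flt)

    # In-memory check that the range behaves like startswith().
    bounds = flt["url"]
    for candidate in ["https://example.com/about", "https://example.com/b"]:
        in_range = bounds["$gte"] <= candidate < bounds["$lt"]
        print(candidate, in_range)
```

The resulting dictionary could be passed as the filter to a `find()` call and combined with a sort on the same field, which is the kind of index-backed lookup the item describes.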


---------

Co-authored-by: Ilya Kreymer <ikreymer@gmail.com>
Co-authored-by: Emma Segal-Grossman <hi@emma.cafe>
Co-authored-by: Ilya Kreymer <ikreymer@users.noreply.github.com>
2025-02-20 15:26:11 -08:00
Ilya Kreymer
f7cd476b1a
Additional French Translations from Weblate (#2410)
Co-authored-by: Weblate (bot) <hosted@weblate.org>
Co-authored-by: Bricaud Frédéric <frederic.bricaud@banq.qc.ca>
Co-authored-by: Webrecorder Dev <dev@webrecorder.org>
Co-authored-by: Carole Gagné <carole.gagne@banq.qc.ca>
Co-authored-by: weblate <1607653+weblate@users.noreply.github.com>
2025-02-20 11:04:34 -08:00
Emma Segal-Grossman
905fe059a4
Add superadmin instance stats card (#2404)
Closes #2401


https://github.com/user-attachments/assets/cbd288d7-8e9c-4e86-ae87-6a308f6bdd58
2025-02-18 17:29:26 -05:00
Emma Segal-Grossman
f1dc790ab4
Org dashboard: update collection grid empty text state when view is set to "all" (#2402)
Tested locally.

cc @SuaYoo
2025-02-17 21:05:48 -05:00
Ilya Kreymer
a7c8ca4028 version: bump to 1.14.0-beta.1 2025-02-17 16:48:27 -08:00
Emma Segal-Grossman
629cf7c404
Add a small sticky banner when logged in as superadmin (#2393)
While ideally we don't need to use superadmin for many things, there are
still a lot of places where it's necessary, especially around customer
service. This makes it a little more visible when that's the case, just
as a reminder. I could see this coming in handy especially for newer
people who might not have the experience to know to look for the "admin"
and "running crawls" buttons.

<img width="1088" alt="Screenshot 2025-02-13 at 1 12 58 PM"
src="https://github.com/user-attachments/assets/70b975e1-af6b-4e8c-9e49-52c4c66e9721"
/>
2025-02-17 17:42:36 -05:00
Emma Segal-Grossman
44ca293999
Replace 2-digit years with numerical years everywhere in the frontend (#2394)
Closes #2365

---------

Co-authored-by: Ilya Kreymer <ikreymer@gmail.com>
2025-02-13 22:23:13 -08:00
Tessa Walsh
39d99e7c5d
Add support for custom link selectors to backend (#2346)
Related to #2152 

This PR adds backend support for custom link selectors via `selectLinks`
on the crawl workflow config. Tests have been updated as well.

It also adds `selectLinks` to the frontend in a minimal and for now
hardcoded way that we can use as a basis for proper frontend support
moving forward.

---------

Co-authored-by: Ilya Kreymer <ikreymer@users.noreply.github.com>
2025-02-13 22:22:27 -08:00
Emma Segal-Grossman
659e124168
Disable "Update collection thumbnail" checkbox on initial page selection dialog until thumbnail is loaded (#2392)
Closes #2391
2025-02-13 22:03:13 -08:00
Emma Segal-Grossman
0f2da4f785
Allow showing all collections as well as just public ones in org dashboard (#2379)
Adds a switch to toggle between viewing public collections only
(default) and all collections on org dashboard.

Also updates the `house-fill` icon to `house` in a couple places
(@Shrinks99)

---------

Co-authored-by: Henry Wilkinson <henry@wilkinson.graphics>
2025-02-13 21:59:29 -08:00
Ilya Kreymer
4516268a70
misc fixes: cors + disable buffering for uploads (#2395)
- ensure pages endpoint supports CORS for local dev
- disable proxy request buffering to support large uploads
2025-02-13 19:38:20 -08:00
Ilya Kreymer
b121076e63
quickfix: add missing dependency for docs (#2388)
follow-up to #2368:
- add mkdocs-redirects to frontend Docker, docs build ci
- build frontend when changing mkdocs
2025-02-12 16:39:06 -05:00
Henry Wilkinson
edf1edbbd1
docs: Add Documentation for Sharing Collections (#2368)
- Merges existing collection content into one page
- Updates ArchiveWeb.page link
- Adds redirect from /collections → /collection
- Moves content relevant to presentation & sharing out of the intro
- Adds new content about sharing collections!

---------

Co-authored-by: Emma Segal-Grossman <hi@emma.cafe>
Co-authored-by: sua yoo <sua@webrecorder.org>
2025-02-12 14:05:52 -05:00
sua yoo
f7b9b73a68
fix: Sort filtered collection page URLs (#2384)
Fixes https://github.com/webrecorder/browsertrix/issues/2383

- Fixes unpredictable sort order when typing in collection page URL
- Fixes page URL results flickering in and out while typing

---------

Co-authored-by: Tessa Walsh <tessa@bitarchivist.net>
2025-02-12 11:59:20 -05:00
Ilya Kreymer
5b02d81991
ensure collection is fully reloaded after an archived item is added or removed (#2386)

follow up to #2332

Testing:
1. Add or remove an archived item.
2. Switch to Replay view. Collection should reload and update the page
list.
2025-02-11 23:12:47 -08:00
Henry Wilkinson
3586412da1
docs: Adds section for autoclick behavior addition from 1.13.3 (#2385)
- Adds section for the autoclick behavior 
- Removes sections that were removed with the new workflow form... and
in some cases much earlier! 😅
2025-02-12 00:22:05 -05:00
sua yoo
7ce115588e
fix: Update links to running crawls (#2378)
- Updates links to running crawls to redirect to workflow "Watch" tab
- Removes unused "Jump to crawl" superadmin widgets
- Refactors archived item component to remove references to active
crawls
2025-02-11 17:08:27 -08:00
sua yoo
0e04fd98b1
fix: More accurate archived item details (#2364)
- Moves page count out from under "Size" label in archived item detail
- Renames "Pages Crawled" to "Pages" in archived item leading heading
and detail overview
- Renames "Crawl ID" to "Archived Item ID"

---------

Co-authored-by: Henry Wilkinson <henry@wilkinson.graphics>
2025-02-11 16:46:13 -08:00