Commit Graph

1537 Commits

Author SHA1 Message Date
sua yoo
83211b2f19
fix: Re-enable workflow setup guide button (#2358)
Fixes workflow setup guide not showing when button is clicked
2025-02-03 21:10:30 -08:00
Ilya Kreymer
ea3b5e7322 quickfix: fix typo (missing self) that did not make it into #2351 2025-01-30 13:11:42 -08:00
Tessa Walsh
0a8df62ab4
Ensure collection stats are updated when WACZ is added on upload (#2351)
Fixes #2350 

Collection earliest/latest dates and the collection modified date are
also now updated when crawls or uploads are added to a collection via
the collection auto-add feature.
2025-01-30 13:05:56 -08:00
Tessa Walsh
b0aebb599a
Reformat with Black for 2025 ruleset (#2349) 2025-01-29 16:57:06 -05:00
Ilya Kreymer
514811701f
Translations update from Hosted Weblate (#2317) (#2343)
Translations update from [Hosted Weblate](https://hosted.weblate.org)
for
[Browsertrix/Browsertrix](https://hosted.weblate.org/projects/browsertrix/browsertrix/).

Current translation status:

![Weblate translation status](https://hosted.weblate.org/widget/browsertrix/browsertrix/horizontal-auto.svg)

---------

Co-authored-by: Weblate (bot) <hosted@weblate.org>
Co-authored-by: Bricaud Frédéric <frederic.bricaud@banq.qc.ca>
Co-authored-by: Webrecorder Dev <dev@webrecorder.org>
2025-01-27 20:43:42 -08:00
Ilya Kreymer
4fa3bc492f
cleanup of loc messages that resulted in errors in some translations (#2340)
- remove str`` where it is not needed
- resolve templates to use simple variable in str``
- combine into single str``
2025-01-27 20:10:47 -08:00
sua yoo
3c860775b9
feat: Update references to org public profile -> gallery (#2330)
- Renames public URL prefix to `explore`
- Updates org settings sections
- Removes or renames references to "org profile"

---------

Co-authored-by: Ilya Kreymer <ikreymer@gmail.com>
Co-authored-by: Henry Wilkinson <henry@wilkinson.graphics>
2025-01-27 13:48:38 -08:00
sua yoo
84ae73df18
feat: UX improvements to collections with single URL (#2325)
Resolves https://github.com/webrecorder/browsertrix/issues/2322

## Changes

- Sets default start page if collection only contains one page
- Removes status code from snapshot options
2025-01-25 17:18:22 -08:00
Tessa Walsh
9363095d62
Validate exclusion regexes on backend (#2316) 2025-01-23 13:32:54 -05:00
Tessa Walsh
763c654484
feat: Update collection sorting, metadata, stats (#2327)
- Refactors dashboard and org profile preview to use private API
endpoint, to fix public collections not showing when the org
visibility is hidden
- Adds additional sorting options for collections
- Adds unique page url counts for archived items, collections, and
organizations to backend and exposes this in collections
- Shows collection period (i.e. `dateEarliest` to `dateLatest`) in
collections list
- Shows same collection metadata in private and public views, updates
private view info bar
- Fixes "Update Org Profile" action item showing for crawler roles

---------

Co-authored-by: sua yoo <sua@webrecorder.org>
Co-authored-by: sua yoo <sua@suayoo.com>
Co-authored-by: Ilya Kreymer <ikreymer@gmail.com>
2025-01-23 13:32:23 -05:00
sua yoo
f8976e688a
fix: Use default collection thumbnail if selected (#2331)
Fixes issue where collection thumbnail is always the screenshot, even if
a Browsertrix provided default thumbnail is selected after choosing the
screenshot.
2025-01-22 14:02:56 -08:00
Ilya Kreymer
28d39d8c4d
Fix migration to avoid duplicate collection slugs and names (#2318)
Follow-up to #2301 

Updates the 0039 migration to ensure collection slugs and names are
unique by:
- Removing all indexes
- Setting `slug` to random value
- Adding unique index to `slug` field.
- Attempting to set slug from name using `slug_from_name()`
- If rejected due to duplicate, append `-<counter>` at end of slug. Also
update name with ` <counter>`.
- Now that names should also be unique, add unique index on name field.
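
The de-duplication steps above can be sketched roughly as follows. This is a minimal in-memory illustration, not the actual 0039 migration: the `slug_from_name()` body here is a hypothetical stand-in for the backend helper of the same name, set lookups stand in for the unique indexes rejecting a duplicate, and the counter's starting value is an assumption.

```python
import re


def slug_from_name(name: str) -> str:
    """Hypothetical stand-in for the backend helper of the same name."""
    return re.sub(r"[^a-z0-9]+", "-", name.lower()).strip("-")


def assign_unique_slugs(names: list[str]) -> list[tuple[str, str]]:
    """Return (name, slug) pairs in which every name and every slug is unique."""
    used_slugs: set[str] = set()
    used_names: set[str] = set()
    result = []
    for name in names:
        base_slug = slug_from_name(name)
        new_name, new_slug = name, base_slug
        counter = 0  # assumption: the first duplicate gets suffix 1
        # A set lookup stands in for the unique index rejecting a duplicate.
        while new_slug in used_slugs or new_name in used_names:
            counter += 1
            new_slug = f"{base_slug}-{counter}"
            new_name = f"{name} {counter}"
        used_slugs.add(new_slug)
        used_names.add(new_name)
        result.append((new_name, new_slug))
    return result
```

Appending the same counter to both the slug and the name keeps the two visibly paired after the migration.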

---------

Co-authored-by: Tessa Walsh <tessa@bitarchivist.net>
2025-01-21 14:23:32 -08:00
Tessa Walsh
6797b41de0
Add pageCount to crawls and uploads and use in frontend for page counts (#2315)
Fixes #2257 

This is a follow-up to the public collections work, which adds pages to
the database for uploads. All crawls and uploads now have a `pageCount`
field which is populated when the item is successfully added. A new
migration is also added to populate the field for existing archived
items that don't have it set yet.

OrgMetrics have also been modified to include `crawlPageCount` and
`uploadPageCount`, and to include the total of both in `pageCount`. All
three are shown in the frontend org dashboard.
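
As a rough illustration of how the three org-level counts relate, here is a sketch that assumes each archived item is a dict with hypothetical `type` and `pageCount` keys. The real models and aggregation may well differ.

```python
def org_page_counts(items: list[dict]) -> dict[str, int]:
    """Aggregate per-item page counts into org-level metrics (illustrative only)."""
    crawl_pages = sum(
        item.get("pageCount", 0) for item in items if item.get("type") == "crawl"
    )
    upload_pages = sum(
        item.get("pageCount", 0) for item in items if item.get("type") == "upload"
    )
    return {
        "crawlPageCount": crawl_pages,
        "uploadPageCount": upload_pages,
        # pageCount is the total of crawl and upload pages
        "pageCount": crawl_pages + upload_pages,
    }
```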

The frontend has been updated to use `pageCount` rather than
`stats.done` wherever appropriate, meaning that in archived item lists
and details we now have a consistent page count for both crawls and
uploads.

### New functionality

- Deploy this branch
- Create new crawls and uploads and verify that page count appears
correctly throughout the frontend for all new crawls and uploads

### Migration

- Deploy from latest main
- Create some crawls and uploads
- Change to this branch and re-deploy
- Verify migration ran without errors in backend logs
- Verify that page count has been populated successfully by checking
archived items lists, crawl and upload detail pages, and dashboard to
ensure there are no longer any missing page counts.

---------

Co-authored-by: emma <hi@emma.cafe>
2025-01-16 14:41:14 -08:00
Tessa Walsh
5684e896af
Add support for autoclick (#2313)
Fixes #2259 

This PR brings backend and frontend support for the new autoclick
behavior in Browsertrix, introduced in Browsertrix Crawler 1.5.0+.

On the backend, we introduce `min_autoclick_crawler_image` to
`values.yaml`, with a default value of
`"docker.io/webrecorder/browsertrix-crawler:1.5.0"`. If this is set and
the crawler version for a new crawl is less than this value, the
autoclick behavior is removed from the behaviors list in the configmap
created for the crawl.

The one caveat for this is that a crawler image tag like "latest" will
always be parsed as greater than `min_autoclick_crawler_image`, so there
is the potential for the crawler to run into issues if using a
non-numeric image tag with an older version of the crawler. For
production we use hardcoded specific versions of the crawler except for
the dev channel, which from here on out will include autoclick
support, so I think this should be okay (and is also true of the
existing implementation for checking `min_qa_crawler_image`).
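
To make the caveat concrete, here is a hedged sketch of the kind of version check described above. The function names and the parsing approach are assumptions for illustration, not the backend's actual code. Only the default image value and the "non-numeric tags compare as greater" behavior come from the description.

```python
import math
import re

MIN_AUTOCLICK_CRAWLER_IMAGE = "docker.io/webrecorder/browsertrix-crawler:1.5.0"


def image_version(image: str) -> tuple[float, ...]:
    """Turn an image reference into a comparable version tuple (hypothetical helper)."""
    tag = image.rsplit(":", 1)[-1]
    match = re.match(r"(\d+)\.(\d+)\.(\d+)", tag)
    if not match:
        # Non-numeric tags such as "latest" sort above every numeric version.
        return (math.inf,)
    return tuple(float(part) for part in match.groups())


def filter_behaviors(
    behaviors: list[str],
    crawler_image: str,
    min_image: str = MIN_AUTOCLICK_CRAWLER_IMAGE,
) -> list[str]:
    """Drop autoclick when the crawler image is older than the minimum."""
    if min_image and image_version(crawler_image) < image_version(min_image):
        return [b for b in behaviors if b != "autoclick"]
    return list(behaviors)
```

With this sketch, an image tagged `1.4.2` loses autoclick, while `latest` keeps it even if the underlying crawler is older, which is the risk noted above.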

On the frontend, I've added a checkbox (unchecked by default) in the
"Limits" section just below the current checkbox for autoscroll. We
might want to move these to a different section eventually - I'm not
sure Limits is the right place for them - but I wanted to be consistent
with things as they are.

---------

Co-authored-by: Ilya Kreymer <ikreymer@users.noreply.github.com>
2025-01-16 12:44:00 -08:00
Dmitriy Pertsev
246bcc73c5
Use new ingressClassName only by default (#2268)
- By default, use only `ingressClassName` for ingress class name and
corresponding field in cert-manager
- Only use old 'kubernetes.io/ingress.class' if
ingress.useOldClassAnnotation is set
- Allow for using old annotation only for backwards compatibility, e.g.
for GCP
- Closes #2267 and #1570

---------

Co-authored-by: Ilya Kreymer <ikreymer@gmail.com>
2025-01-15 23:23:50 -08:00
Ilya Kreymer
bce75b35fa
Translations update from Hosted Weblate (#2296) (#2314)
Translations update from [Hosted Weblate](https://hosted.weblate.org)
for
[Browsertrix/Browsertrix](https://hosted.weblate.org/projects/browsertrix/browsertrix/).

Current translation status:

![Weblate translation status](https://hosted.weblate.org/widget/browsertrix/browsertrix/horizontal-auto.svg)

---------

Co-authored-by: Weblate (bot) <hosted@weblate.org>
Co-authored-by: Bricaud Frédéric <frederic.bricaud@banq.qc.ca>
Co-authored-by: Carole Gagné <carole.gagne@banq.qc.ca>
Co-authored-by: Webrecorder Dev <dev@webrecorder.org>
Co-authored-by: weblate <1607653+weblate@users.noreply.github.com>
2025-01-15 23:19:02 -08:00
sua yoo
a64f3a6c4c
fix: Fully load thumbnail before save (#2307)
Fixes https://github.com/webrecorder/browsertrix/issues/2306

## Changes

Refactors collection view configuration to wait for thumbnail preview
image (using `URL.createObjectURL`, like in QA screenshots) to be fully
loaded from `replay-web-page` before saving.
2025-01-15 22:58:32 -08:00
Tessa Walsh
4583babecb
feat: Add slug to collections and use it in public collection URLs (#2301)
Resolves https://github.com/webrecorder/browsertrix/issues/2298

## Changes

- Slugs added to collections; a slug can be specified separately when
creating or updating a collection, or else is derived from the supplied
collection name
- Migration added to backfill slugs for existing collections
- Redirect collection to newest slug if changed
- Adds option to copy public profile link to "Public Collections" action
menu
- Show "Back to <Org>" link instead of breadcrumbs

---------
Co-authored-by: sua yoo <sua@suayoo.com>
Co-authored-by: Ilya Kreymer <ikreymer@gmail.com>
2025-01-15 22:44:32 -08:00
sua yoo
21db8e1b83
fix: Fix workflow crawl list layout (#2309)
- Fixes workflow detail page crawls tab issue when the crawls list is
long
- Removes extraneous and incorrectly placed spinner
2025-01-15 09:23:18 -08:00
Henry Wilkinson
06eea7979a
ui: Replaces boring thumbnail gradients with fun squiggles! (#2305)
- Updates thumbnails
- Bonus ~30% size reduction per image due to better dialed in
compression settings!
2025-01-14 16:27:13 -05:00
sua yoo
dd22fd11ee
deps: Improve Webpack build performance (#2288)
- Upgrades webpack and webpack tool versions
- Updates dev source map to webpack recommendation
- Implements `webpack.DllPlugin` in dev for faster rebuilds
- Implements `thread-loader` to run `ts-loader` in a worker pool
2025-01-14 12:55:12 -08:00
sua yoo
c53528332b
fix: Validate collection page URL (#2291)
- Disables saving collection start page if valid snapshot is not
selected
- Shows full URL in page URL status check mark
- Shows error in page URL status exclamation mark
- Fixes pasting in URL
2025-01-14 12:54:33 -08:00
Emma Segal-Grossman
318acaf5b3
Ensure PR workflows can run on all PRs (but still skip when they're not needed) (#2237)
Closes #2279

This adds a `paths-filter` step to all workflows that run on PRs so that
we can enable auto-merge, for more info about this read [enabling
auto-merge](https://docs.github.com/en/pull-requests/collaborating-with-pull-requests/incorporating-changes-from-a-pull-request/automatically-merging-a-pull-request)!
In order to be able to have required checks, a workflow can't be
entirely skipped: see [Handling skipped but required
checks](https://docs.github.com/en/pull-requests/collaborating-with-pull-requests/collaborating-on-repositories-with-code-quality-features/troubleshooting-required-status-checks#handling-skipped-but-required-checks).

This also merges frontend CI workflows into one with multiple parallel
steps, which should speed things up a bit. It also upgrades node
versions to 20 and 22 across the board.
2025-01-14 14:24:27 -05:00
sua yoo
c563b622fe
refactor: Update component used in tabbed views (#2300)
- Refactors instances of `btrix-tab-list` except in workflow editor in
preparation for https://github.com/webrecorder/browsertrix/issues/2169
- Removes the visual space above navigation item since many tab headings
describe the first section in the tab, rather than the entire tab itself
2025-01-14 10:23:19 -08:00
sua yoo
a028ed1808
refactor: Update collections list empty state (#2303)
Makes collection list empty state more consistent with other empty
states.
2025-01-14 08:53:28 -08:00
sua yoo
4347fcdba5
feat: Show collection created date (#2302)
- Shows collection created date in detail view (if present)
- Adds `black` formatter to vscode extension recommendations

---------

Co-authored-by: Tessa Walsh <tessa@bitarchivist.net>
2025-01-14 11:22:00 -05:00
Tessa Walsh
cbcf087a48
Add last crawl and subscription status indicators to org list (#2273)
Fixes #2260 

- Adds `lastCrawlFinished` to Organization model, updated after crawls
are added/deleted and with an idempotent migration to backfill existing
orgs
- Adds Last Crawl column to end of admin orgs list table
- Adds subscription icon next to existing status icon in orgs list
- Adds "lastCrawlFinished", "subscriptionStatus", and "subscriptionPlan"
sort options to orgs list backend endpoint in anticipation of future
sorting/filtering of orgs list
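
One way to see why the backfill can be idempotent: if `lastCrawlFinished` is always recomputed from the org's current crawls, running it twice gives the same result. The sketch below assumes crawl records are dicts with a hypothetical `finished` timestamp. It is not the actual migration code.

```python
from datetime import datetime
from typing import Optional


def last_crawl_finished(crawls: list[dict]) -> Optional[datetime]:
    """Return the most recent finish time, or None if no crawl has finished."""
    finished = [c["finished"] for c in crawls if c.get("finished")]
    # Recomputing from scratch makes repeated runs safe (idempotent).
    return max(finished, default=None)
```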

---------

Co-authored-by: emma <hi@emma.cafe>
Co-authored-by: Henry Wilkinson <henry@wilkinson.graphics>
Co-authored-by: Ilya Kreymer <ikreymer@gmail.com>
2025-01-14 10:57:06 -05:00
Emma Segal-Grossman
04e9127d35
Remove ANALYTICS_NAMESPACE, as it's only usable at build time (#2293)
Replaces `ANALYTICS_NAMESPACE` with setting `window.btrixEvent` via
`inject_extra` config

---------

Co-authored-by: sua yoo <sua@suayoo.com>
2025-01-13 20:13:30 -08:00
Ilya Kreymer
12f358b826
Merge pull request #2271 from webrecorder/public-collections-feature
feat: Public collections, includes:
- feat: Public org profile page #2172
- feat: Collection thumbnails, start page, and public view updates #2209
- feat: Track collection events #2256
2025-01-13 19:32:45 -08:00
Ilya Kreymer
bab5345ad5 version: bump to 1.14.0-beta.0 for public collections! 2025-01-13 19:29:54 -08:00
Henry Wilkinson
56a634e593
ui: Public Collections UI Nitpicks (#2287)
- Removes share link from the dialogue footer
- Removes stickied collection navigation, replaces with improved
viewport-based scaling!
- Adds a max-width for the collection description in the logged in view.
- Moves the markdown editor buttons to below the editor
- Controls are now in line with how we handle dialogue options
elsewhere, fixes a minor responsive design issue.
- Minor copy changes

---------

Co-authored-by: emma <hi@emma.cafe>
Co-authored-by: sua yoo <sua@webrecorder.org>
2025-01-13 15:15:49 -08:00
Tessa Walsh
d8655d3bc6
Use id for thumbnail size error detail 2025-01-13 15:15:49 -08:00
Tessa Walsh
be9ff04ee8
Make more explicit error message for large thumbnails 2025-01-13 15:15:49 -08:00
sua yoo
0c81a2f89e
chore: Refactor page headers (#2282)
- Refactors all page headers to use new `pageHeader`
- Removes border under org name/title in the org dashboard
2025-01-13 15:15:49 -08:00
sua yoo
b36ed9f730
feat: Track collection events (#2256)
- Renames `inject_analytics` to `inject_extra` and updates docs
- Manually tracks page views to enable passing custom props
- Tracks copying collection share link and downloading a public
collection

---------

Co-authored-by: emma <hi@emma.cafe>
2025-01-13 15:15:49 -08:00
Tessa Walsh
eb88e9f90c
Add missing os import 2025-01-13 15:15:48 -08:00
sua yoo
093b114479
feat: Collection thumbnails, start page, and public view updates (#2209)
- Allows user to choose collection replay home page and collection
thumbnail (resolves
https://github.com/webrecorder/browsertrix/issues/2182)
- Displays collection thumbnails on org dashboard and public page
- Enables downloading public collection (resolves
https://github.com/webrecorder/browsertrix/issues/2233)
- Adds caption as "Summary" to metadata dialog
- Moves description editor to "About" tab

---------

Co-authored-by: Emma Segal-Grossman <hi@emma.cafe>
2025-01-13 15:15:48 -08:00
Tessa Walsh
a031fab313
Backend work for public collections (#2198)
Fixes #2182 

This rather large PR adds the rest of what should be needed for public
collections work in the frontend.

New API endpoints include:

- Public collections endpoints: GET, streaming download
- Paginated list of URLs in collection with snapshot (page) info for
each
- Collection endpoint to set home URL
- Collection endpoint to upload thumbnail as stream
- DELETE endpoint to remove collection thumbnail

Changes to existing API endpoints include:

- Paginating public collection list results
- Several `pages` endpoints that previously only supported `/crawls/` in
their path, e.g. `/orgs/{oid}/crawls/all/pages/reAdd`, now support
`/uploads/` and `/all-crawls/` namespaces as well. This is necessitated
by adding pages for uploads to the database (see below). For
`/orgs/{oid}/namespace/all/pages/reAdd`, `crawls` or `uploads` will
serve as a filter to only affect crawls of that given type. Other
endpoints are more liberal at this point, and will perform the same
action regardless of the namespace used in the route (we'll likely want
to change this in a follow-up to be more consistent).
- `/orgs/{oid}/namespace/all/pages/reAdd` now kicks off a background job
rather than doing all of the computation in an asyncio task in the
backend container. The background job additionally updates collection
date ranges, page/size counts, and tags for each collection in the org
after pages have been (re)added.

Other big changes:

- New uploads will now have their pages read into the database!
Collection page counts now also include uploads
- A migration was added to start a background job for each org that will
add the pages for previously-uploaded WACZ files to the database and
update collections accordingly
- Adds a new `ImageFile` subclass of `BaseFile` for thumbnails that we
can use for other user-uploaded image files moving forward, with
separate output models for authenticated and public endpoints
2025-01-13 15:15:48 -08:00
sua yoo
f60a99cc26
feat: Make collection public (#2208) 2025-01-13 15:15:48 -08:00
sua yoo
6e48f957f9
feat: Public org profile page (#2172)
- Enables creating a public org profile page with description and
website at `/profile/<org slug>`
- Updates current "Overview" page to be "Dashboard", found under
`/dashboard`
- Organizes org "General" settings tab by "General", "Profile", and
"Developer Tools"
- Adds sign up banner to log in page for consistent CTA banners
- Updates copy and docs to support changes
- Allows user to set collection to private, public, or unlisted
- Adds route for public collection page with basic page layout
- Refactors copy button to abstract clipboard functionality
---------

Co-authored-by: Henry Wilkinson <henry@wilkinson.graphics>
Co-authored-by: emma <hi@emma.cafe>
2025-01-13 15:15:48 -08:00
Tessa Walsh
190bdeb868
Add public API endpoint for public collections (#2174)
Fixes #1051 

If org with provided slug doesn't exist or no public collections exist
for that org, return same 404 response with a detail of
"public_profile_not_found" to prevent people from using public endpoint
to determine whether an org exists.

Endpoint is `GET /api/public-collections/<org-slug>` (no auth needed) to
avoid collisions with existing org and collection endpoints.
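
The "same 404 in both cases" behavior can be sketched as below. This is a framework-free illustration under stated assumptions: the `NotFound` exception, the `Org` dataclass, and the `access` key on collections are hypothetical stand-ins, not the backend's real models or its HTTP layer.

```python
from dataclasses import dataclass, field


class NotFound(Exception):
    """Hypothetical stand-in for an HTTP 404 response with a detail string."""

    def __init__(self, detail: str):
        super().__init__(detail)
        self.status_code = 404
        self.detail = detail


@dataclass
class Org:
    slug: str
    collections: list[dict] = field(default_factory=list)


def get_public_collections(orgs: dict[str, Org], org_slug: str) -> list[dict]:
    """Return an org's public collections, hiding whether the org exists."""
    org = orgs.get(org_slug)
    public = (
        [c for c in org.collections if c.get("access") == "public"] if org else []
    )
    if not public:
        # Identical error for "no such org" and "org has nothing public".
        raise NotFound("public_profile_not_found")
    return public
```

Because both failure paths raise the same error, a caller cannot distinguish a missing org from an org with no public collections.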
2025-01-13 15:15:48 -08:00
Tessa Walsh
42ebfd303d
Make changes to collections to support publicly listed collections (#2164)
Fixes #2158 

- Adds `Organization.listPublicCollections` field and API endpoint to
update it
- Replaces `Collection.isPublic` boolean with `Collection.access`
(values: `private`, `unlisted`, `public`) and add database migration
- Update frontend to use `Collection.access` instead of `isPublic`,
otherwise not changing current behavior
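
A minimal sketch of what the `isPublic` to `access` conversion might look like for a single collection document. The enum class name is hypothetical, and mapping `isPublic: true` to `unlisted` is an assumption made here to keep previously shared collections reachable without listing them; the actual migration may map values differently.

```python
from enum import Enum


class CollAccessType(str, Enum):
    """The three access values named above (class name is hypothetical)."""

    PRIVATE = "private"
    UNLISTED = "unlisted"
    PUBLIC = "public"


def migrate_collection(doc: dict) -> dict:
    """Replace the legacy isPublic boolean with an access value."""
    migrated = {key: value for key, value in doc.items() if key != "isPublic"}
    if "access" not in migrated:
        # Assumption: formerly public collections become unlisted.
        access = (
            CollAccessType.UNLISTED if doc.get("isPublic") else CollAccessType.PRIVATE
        )
        migrated["access"] = access.value
    return migrated
```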

---------

Co-authored-by: sua yoo <sua@suayoo.com>
2025-01-13 15:15:47 -08:00
Emma Segal-Grossman
19c1d28349
Fix language selector using locale instead of lang (#2294)
Also fixes a shoelace menu-item bug where checkbox menu items would have
their checked state flipped on click, regardless of `checked` value.

~~Deploying to dev to test if this fixes language switching...~~ Yep!
Seems to fix the issues.
2025-01-13 15:01:57 -05:00
Ilya Kreymer
a21b2ff0df version: bump to 1.13.2 2025-01-08 22:58:33 -08:00
Ilya Kreymer
85e400d31a
Translations update from Hosted Weblate (#2254) (#2292)
Translations update from [Hosted Weblate](https://hosted.weblate.org)
for
[Browsertrix/Browsertrix](https://hosted.weblate.org/projects/browsertrix/browsertrix/).

Current translation status:

![Weblate translation status](https://hosted.weblate.org/widget/browsertrix/browsertrix/horizontal-auto.svg)

---------

Co-authored-by: Weblate (bot) <hosted@weblate.org>
Co-authored-by: Emma Segal-Grossman <emma@webrecorder.org>
Co-authored-by: Carole Gagné <carole.gagne@banq.qc.ca>
Co-authored-by: Bricaud Frédéric <frederic.bricaud@banq.qc.ca>
Co-authored-by: Webrecorder Dev <dev@webrecorder.org>
Co-authored-by: weblate <weblate@users.noreply.github.com>
2025-01-08 22:56:17 -08:00
sua yoo
6a5e070ffc
fix: Allow deleting workflows without any crawls (#2285)
- Uses crawl count to determine whether workflow can be deleted instead
of last crawl ID
- Display delete confirmation dialog when trying to delete a workflow
2025-01-08 16:02:53 -08:00
sua yoo
1260aec976
fix: Crawler proxy selection fixes (#2280)
- Hides proxy form control if there are no proxy servers available
- Fixes org default proxy value not being saved
2025-01-08 16:02:09 -08:00
Emma Segal-Grossman
d6189eee9a
Add fuse-backed org search to superadmin org list (#2277)
Closes #2276 

Adds a simple search bar to the superadmin interface that allows users
to search for orgs by org name, id, users (names and emails), and
subscriptions (subscription id and plan id).

[Extended search](https://www.fusejs.io/examples.html#extended-search)
is enabled, so exact search terms like `=stripe:sub_xxxxxxx` can be used
to find a specific org directly. [See the
docs](https://www.fusejs.io/examples.html#extended-search) for what
operators are available.

2025-01-07 14:58:33 -05:00
sua yoo
3b6f63f030
deps: Upgrade to Node 22 (#2274)
- Upgrades build to use Node 22
- Adds version matrix to GH workflow to test compatibility with 20
2025-01-07 11:58:23 -08:00
sua yoo
71a83bb2e4
fix: Update superadmin orgs list after create (#2278)
Fixes newly created org not showing in list
2025-01-07 11:12:11 -08:00