- Shows language selector on log in page
- Adds tabs to account settings
- Adds language preference selector on account settings
- Adds issue template for new language
- Updates dev docs for frontend localization
---------
Co-authored-by: SuaYoo <SuaYoo@users.noreply.github.com>
Co-authored-by: Henry Wilkinson <henry@wilkinson.graphics>
Co-authored-by: Tessa Walsh <tessa@bitarchivist.net>
Fixes#2106
Docs are now hosted as part of the frontend at `/docs` by default.
- If `docs_url` is set in the helm chart, the `/docs` endpoint will
redirect to that endpoint instead
- Use multi-stage python image to build mkdocs as part of frontend, then
copy static output
- Dir layout: mkdocs.yml and docs into frontend/docs
- CI: Update docs build GH action to use new path
- Update all frontend paths to use `/docs/` instead of
`https://docs.browsertrix.com/`
---------
Co-authored-by: Henry Wilkinson <henry@wilkinson.graphics>
- ensure user-guide iframe is always visible if not fading in when
'Setup Guide' button is clicked
- Clicking 'Setup Guide' scrolls to relevant section of guide (scope,
limits, browser settings, scheduling, metadata) based on current
workflow editor tab via hashtag
---------
Co-authored-by: sua yoo <sua@webrecorder.org>
- crawler isActive() should only look at running states, not 'stopping',
since stopping may not be reset to false after crawl is stopped
(probably should reset, but also a record that the crawl ended as being
stopped). crawl is guaranteed to remain in one of the active states
while stopping on the backend
- fixes#2100
Resolves the big red
`[DEP_WEBPACK_DEV_SERVER_ON_BEFORE_SETUP_MIDDLEWARE] DeprecationWarning:
'onBeforeSetupMiddleware' option is deprecated. Please use the
'setupMiddlewares' option.` warnings when starting Webpack.
---------
Co-authored-by: Ilya Kreymer <ikreymer@gmail.com>
Resolves https://github.com/webrecorder/browsertrix/issues/2091
<!-- Fixes #issue_number -->
### Changes
Saves scope type chosen from "+ New Workflow" dropdown to local storage, as well as from within workflow editor when creating a new workflow (but not editing an existing one).
Resolves#1354
Supports crawling through pre-configured proxy servers, allowing users to select which proxy servers to use (requires browsertrix crawler 1.3+)
Config:
- proxies defined in btrix-proxies subchart
- can be configured via btrix-proxies key or separate proxies.yaml file via separate subchart
- proxies list refreshed automatically if crawler_proxies.json changes if subchart is deployed
- support for ssh and socks5 proxies
- proxy keys added to secrets in subchart
- support for default proxy to be always used if no other proxy configured, prevent starting cluster if default proxy not available
- prevent starting manual crawl if previously configured proxy is no longer available, return error
- force 'btrix' username and group name on browsertrix-crawler non-root user to support ssh
Operator:
- support crawling through proxies, pass proxyId in CrawlJob
- support running profile browsers which designated proxy, pass proxyId to ProfileJob
- prevent starting scheduled crawl if previously configured proxy is no longer available
API / Access:
- /api/orgs/all/crawlconfigs/crawler-proxies - get all proxies (superadmin only)
- /api/orgs/{oid}/crawlconfigs/crawler-proxies - get proxies available to particular org
- /api/orgs/{oid}/proxies - update allowed proxies for particular org (superadmin only)
- superadmin can configure which orgs can use which proxies, stored on the org
- superadmin can also allow an org to access all 'shared' proxies, to avoid having to allow a shared proxy on each org.
UI:
- Superadmin has 'Edit Proxies' dialog to configure for each org if it has: dedicated proxies, has access to shared proxies.
- User can select a proxy in Crawl Workflow browser settings
- Users can choose to launch a browser profile with a particular proxy
- Display which proxy is used to create profile in profile selector
- Users can choose with default proxy to use for new workflows in Crawling Defaults
---------
Co-authored-by: Ilya Kreymer <ikreymer@gmail.com>
Co-authored-by: Tessa Walsh <tessa@bitarchivist.net>
Resolves https://github.com/webrecorder/browsertrix/issues/2073
### Changes
- Removes "URL List" and "Seeded Crawl" job type distinction and adds as
additional crawl scope types instead.
- 'New Workflow' button defaults to Single Page
- 'New Workflow' dropdown includes Page Crawl (Single Page, Page List, In-Page Links) and Site Crawl (Page in Same Directory, Page on Same Domain, + Subdomains and Custom Page Prefix)
- Enables specifying `DOCS_URL` in `.env`
- Additional follow-ups in #2090, #2091
- Ensures extracted strings get formatted before checking against index
- Fixes index check by switching from `git diff-index` to `git diff`,
and ensures the proper `--exit-code` flag is present (implicitly turned
on by `--quiet`)
- Adds actionable error message when the check fails
- Updates Github's actions versions from v3 to v4 (major version bump is
primarily just for default node version updates, but this way we'll get
future updates)
- Adds formatting step to npm script for extracting messages
- Runs a string extraction & format against current main
Use timezone aware datetimes instead of timezone naive datetimes:
- Update mongodb client to use tz-aware conversion
- Convert dt_now() to return timezone aware UTC date
- Rename to_k8s_date -> date_to_str, just returns ISO UTC date with 'Z'
(instead of '+00:00' suffix)
- Rename from_k8s_date -> str_to_date, returns timezone aware date from
str
- Standardize all string<->date conversion to use either date_to_str or
str_to_date
- Update frontend to assume iso date, not append 'Z' directly
- Update tests to check for 'Z' suffix on some dates
---------
Co-authored-by: Ilya Kreymer <ikreymer@gmail.com>
- Fixes empty watch tab during some active states
- Disables exclusion and browser window editing while a crawl is
stopping
- Refactors frontend crawl states to match backend
- Replaces QA Review breadcrumbs with "Back" button that takes you back
to the archived item
- Crawl breadcrumbs always point to workflow
- Shows archived item icon next to item name to differentiate from
workflow
- Adds "Edit Workflow" button to "Crawl Settings"
---------
Co-authored-by: Henry Wilkinson <henry@wilkinson.graphics>
Co-authored-by: Tessa Walsh <tessa@bitarchivist.net>
Fixes https://github.com/webrecorder/browsertrix/issues/2064
### Changes
- Switches MDE library to one that supports shadow DOM
- Refactors collection components to btrix components
- Fixes collection detail not expanding and contracting correctly
- Adds breadcrumb navigation to all detail views, returning to correct
origin for workflow and collection items
- Refactors list page headers into layout utility
- Refactors crawl tab labels and renames "Files" tab to "WACZ Files"
- Adds new org settings tab for updating crawl details
- Refactors `workflow-editor` to move out utility functions
- Updates user guide on org settings
---------
Co-authored-by: Henry Wilkinson <henry@wilkinson.graphics>
Co-authored-by: Ilya Kreymer <ikreymer@gmail.com>
Improves time to first load an org with the following:
- Users user info from login response to set org slug and route user on
log in
- Stores user info in session storage so that it's available on reload
- Stores app settings in local storage until user logs out
- Loads critical org components synchronously
WIP for https://github.com/webrecorder/browsertrix/issues/2041
<!-- Fixes #issue_number -->
### Changes
- Adds button to open embedded support guide
- Adds link to help forum
- Refactors app bar to look nicer on smaller screens
- Updates workflow job type copy and adds additional clarifying text
- Changes "List of URLs" label to "Crawl URL(s)"
- Refactors `NewWorkflowDialog` into tailwind element
- Renames "Org Settings" -> "Settings"
- Reduces gap between settings panel heading and panel
- Always show "Pending Invites" section and update heading styles to
match panel heading
- Update "Current Plan" and "Usage History" sections to be on the same
hierarchical level under "Billing"
- Refactors `<btrix-org>` to move `isAdmin` and `isCrawler` helpers to
app state
Also improves the slug editing experience by partially-slugifying the
value as it's entered.
Previously, submitting a org slug value of ".." or similar would cause
the frontend to redirect to a "page not found" page, with all accessible
links leading to only `/account/settings`. This also causes the backend
to reset the org slug to one generated from the org name on a reload.
---------
Co-authored-by: sua yoo <sua@webrecorder.org>
* Fixes issue in FailedLogin model:
- fix data-model to remove nested 'attempted.attempted'
- migrate existing data to remove nested field
* Also, avoid setting dt_now() in model as that results in fixed date for
all objects:
- update FailedLogin to update 'attempted' date on every attempt
- also update PageNote object to set date in constructor
* Update text for too many logins to make it clear it is set only if its a
valid email
* fixes#2001
Following https://github.com/webrecorder/browsertrix/pull/1995, we want
to keep the usage history table on the dashboard for now for video
demos.
### Changes
- Adds usage history table back to org dashboard
- Makes "No usage history" message more apparent
Tweaks to how execution time is tracked for more accuracy + excluding
waiting states:
- don't update if crawl state is in a 'waiting state' (waiting for
capacity or waiting for org limit)
- rename start states -> waiting states for clarity
- reset lastUpdatedTime if two consecutive updates of non-running state,
to ensure non-running states don't count, but also account for
occasional hiccups -- if only one update detects non-running state,
don't reset
- webhooks: move start webhook to when crawl actually starts for first
time (db lastUpdatedTime is not yet + crawl is running)
- don't set lastUpdatedTime until pods actually running
- set crawljob update interval to every 10 seconds for more accurate
execution time tracking
- frontend: show seconds in 'Execution Time' display
- Follow-up to #1914, allows SubscriptionUpdate event to also update
quotas.
- Passes current usage info + current billing page URL to portalUrl
request for external app to be able to respond with best portalUrl
- get_origin() moved to utils to be available more generally.
- Updates billing tab to show current plans, switches order of quotas to
list execution time, storage first
an org is made read-only while crawls are running:
- treat similar to other stopped_* states, do a graceful stop
- update UI to display "Stopped: Crawling Disabled" for this status
- don't add corresponding skipped status - just skip running crawls if org is read-only
Resolves https://github.com/webrecorder/browsertrix/issues/1951
### Changes
- Shows date org was created in superadmin org list
- Visually differentiates unnamed org ID
- Adds "Admin" badge to app header to make current login more apparent
- Fixes logic to show "create org" dialog if there are no orgs in an
instance
- Refactors `btrix-home` to remove unused references to non-superadmin
org list
---------
Co-authored-by: Henry Wilkinson <henry@wilkinson.graphics>
Resolves https://github.com/webrecorder/browsertrix/issues/1912
### Changes
- Show support email, if available, in invalid invite error message
- Separate error message for invite email that doesn't match current
user's
Fixes#1968
Changes:
- `stopped_quota_reached` and `skipped_quota_reached` migrated to new
values that indicate which quota was reached
- Before crawls are run, the operator checks if storage or exec mins
quotas are reached and if so fails the crawl with the appropriate state
of `skipped_storage_quota_reached` or `skipped_time_quota_reached`
- While crawls are running, the operator checks if the exec mins quota
is reached or if the size of all running crawls will mean the storage
quota is reached once uploaded; if so, the crawl is stopped gracefully
and given `stopped_storage_quota_needed` or `stopped_time_quota_reached`
state as appropriate
- Adds new nightly tests for enforcing storage quota
Fixes#1412
## Changes
### Backend
- Adds `all-crawls`, `crawls`, and `uploads` API endpoints to download
archived item as multi-WACZ
- Download QA runs as multi-WACZ
- Adds backend tests for new endpoints
- Update to new version of stream-zip library which does not require crc-32 to be present for ZIP members,
computes after streaming, fixing invalid crc-32 issues as previously computed crc-32s from crawler may be invalid.
### Frontend
Adds ability to download archived item from:
- Button in archived item detail Files tab
- Archived item details actions menu
- Archived items list menu
---------
Co-authored-by: Henry Wilkinson <henry@wilkinson.graphics>
Co-authored-by: sua yoo <sua@webrecorder.org>
Co-authored-by: Ilya Kreymer <ikreymer@gmail.com>
The monthly execution minutes meter was using the wrong metric,
`crawlExecSeconds` instead of `monthlyExecSeconds`.
Since the meter is showing minutes used out of the monthly quota, it
should be using `monthlyExecSeconds`.
This is a quick fix, however, this system may need another pass at some
point.
---------
Co-authored-by: sua yoo <sua@suayoo.com>
Fixes https://github.com/webrecorder/browsertrix/issues/1954
### Changes
- Refactors app state to include org data
- Fixes banner not showing if storage or execution minutes is exceeded
after page load
- Disables closing banners
- Refreshes org when tab changes
---------
Co-authored-by: emma <hi@emma.cafe>
Directs user that doesn't belong to any orgs to account settings page,
with banner.
Also contains some minor out-of-scope changes:
- Refactors `isAdmin` key to `isSuperAdmin` for more legibility on
whether current user is superadmin or regular user without orgs
- Adds "cancel" button to change password form
Fixes#1955
Orgs list endpoint sorting now works as follows:
- Default org is always sorted first
- Name sorting now works on a lowercased version of the org names to
ensure lexical sorting
The lodash `sortBy` resorting of orgs in the "All Organizations"
dropdown list in the nav bar has also been removed so that the backend
sorting is applied instead.
Tests have been updated accordingly.
### Changes
- Updates customer portal link label
- Opens billing portal in the same tab
- Shows separate cancel date
- Makes payment failed appear as error
- Fixes crawl time quota
Fixes a few things that have been bugging me:
- Overflow buttons in list view now (mostly) take up the their full cell
area, instead of there being a couple pixels around the button where
clicking would do nothing or cause navigation
- | before | after |
| --- | --- |
| <img width="238" alt="Screenshot 2024-07-16 at 3 35 25 PM"
src="https://github.com/user-attachments/assets/afbda6d6-703b-4ed8-96be-a9c37660430d">
| <img width="236" alt="Screenshot 2024-07-16 at 3 35 02 PM"
src="https://github.com/user-attachments/assets/417a326a-08d2-42b2-85c3-fa007ea3bff8">
|
- Changes the class that `tab-list` uses internally so that it doesn't
conflict with Tailwind's `container` class, which prevents the tab
content from being limited at the default Tailwind container width
- Adds a couple of Tailwind plugins for styling CSS parts
(`part-[...]:`) and for arbitrary attributes (`attr-[...]:`)
Initial implementation of #1892
- Modifies the backend to return `duplicate_org_name` or
`duplicate_org_slug` as appropriate on a pymongo `DuplicateKeyError`
- Updates frontend to handle `duplicate_org_name`, `duplicate_org_slug`,
and `invalid_slug` error details
- Update errors to be more consistent, also return `duplicate_org_subscription.subId` for duplicate subscription instead of the more generic `already_exists`
---------
Co-authored-by: Ilya Kreymer <ikreymer@gmail.com>
No issue created, but noticed issue here
ed0d489cda
### Changes
- Removes unused node module mocks and use commonjs plugin to import
modules in tests
- Fixes org form test after removing temporary stub