Commit Graph

1526 Commits

Author SHA1 Message Date
emma
bc6f362861 update pages as well 2023-11-15 18:24:50 -05:00
Ilya Kreymer
b23eed5003
Email Templates (#1375)
- Emails are now processed from Jinja2 templates found in
`charts/email-templates`, to support easier updates via helm chart in
the future.
- The available templates are: `invite`, `password_reset`, `validate` and
`failed_bg_job`.
- Each template can be text only or also include HTML. The format of the
template is:
```
subject
~~~
<html content>
~~~
text
```
- A new `support_email` field is also added to the email block in
values.yaml

Invite Template: 
- Currently, only the invite template includes an HTML version, other
templates are text only.
- The same template is used for new and existing users, with slightly
different text if adding user to an existing org.
- If user is invited by the superadmin, the invited by field is not
included, otherwise it also includes 'You have been invited by X to join Y'
2023-11-15 15:22:12 -08:00
emma
d8f8e6db73 update first half of custom elements to be defined at their class dfns 2023-11-15 18:14:17 -05:00
Ilya Kreymer
7d985a9688 version: bump to 1.8.0-beta.4 2023-11-14 11:59:04 -08:00
Henry Wilkinson
9d50916230
Remove collection share access column header icon (#1371)
Closes #1351

### Changes
- Removes the collection share access icon's header icon which was just
an icon. This is now mostly in line with how we display status icons in
the archived items list (there is a spacing difference between the two
lists regarding the placement of the icon vs the label and its alignment
with the text (here) vs the icon (archived items list).
2023-11-14 11:18:41 -08:00
Ilya Kreymer
dfba4b3940
Replace partial_complete -> stopped_by_user or stopped_quota_reached + operator edge cases (#1368)
- Adds two new crawl finished state, stopped_by_user and
stopped_quota_reached
- Tracking other possible 'stop reasons' in operator, though not making
them distinct states for now.
- Updated frontend with 'Stopped by User' and 'Stopped: Time Quota
Reached', shown with same icon as current partial_complete
- Added migration of partial_complete to either stopped_by_user or
complete (no historical quota data available)
- Addresses edge case in scaling: if crawl never scaled (no redis entry,
no pod), automatically scale down
- Edge case in status: if crawl is somehow 'canceled' but not deleted,
immediately delete crawl object and begin finalizing.

---------
Co-authored-by: Tessa Walsh <tessa@bitarchivist.net>
2023-11-14 11:17:16 -08:00
Emma Segal-Grossman
bf0227ccbf
Merge pull request #1373 from webrecorder/1329-show-consistent-unit-of-time-for-execution-minutes-in-dashboard
Update execution time display
2023-11-13 20:23:19 -05:00
emma
060b8d85c9 rename pretty-ms import to be clear about units 2023-11-13 19:37:15 -05:00
emma
5c5eba97a0 revert changes from i18n (will address elsewhere) 2023-11-13 19:36:03 -05:00
emma
d7dc71ae99 remove execution time formatter from non-execution-time bits 2023-11-13 19:31:45 -05:00
Henry Wilkinson
b9eab4c20f
Minor capitalization fixes for workflow options (#1374)
- Updates checkbox casing
2023-11-13 19:22:50 -05:00
emma
ee8ecb20de improve formatting using Intl.NumberFormat 2023-11-13 18:49:28 -05:00
emma
c35bc2b03f update execution and elapsed time display 2023-11-13 18:17:36 -05:00
Ilya Kreymer
3e82c4513b quickfix: bump replica_job memory to 200Mi 2023-11-13 13:45:24 -08:00
Ilya Kreymer
a6a78c9ef2
node affinity: set to required instead of preferred to keep crawlers on dedicated infrastructure (#1366)
Previously, the crawler pods use preferred node affinity, instead of
required node affinity. This results in crawler nodes running on the
main node pool. Instead, we want to ensure crawler nodes are running on
dedicated node pool (if configured).
- Converts 'preferred node affinity' to 'required node affinity' for
the node pool, while keeping preferred pod affinity for keeping all
crawler / redis pods together.
- For profiles, updates to same node affinity, and also adds
resource constraint to match a single crawler for profile browser,
which did not have resource constraints.
2023-11-13 10:02:05 -08:00
Ilya Kreymer
67892994a6 version: bump to 1.8.0-beta.3 2023-11-09 18:20:04 -08:00
Tessa Walsh
f3cbd9e179
Add crawl, upload, and collection delete webhook event notifications (#1363)
Fixes #1307
Fixes #1132
Related to #1306

Deleted webhook notifications include the org id and item/collection id.
This PR also includes API docs for the new webhooks and extends the
existing tests to account for the new webhooks.

This PR also does some additional cleanup for existing webhooks:
- Remove `downloadUrls` from item finished webhook bodies
- Rename collection webhook body `downloadUrls` to `downloadUrl`, since
we only ever have one per collection
- Fix API docs for existing webhooks, one of which had the wrong
response body
2023-11-09 18:19:08 -08:00
Tessa Walsh
1afc411114
Implement retry API endpoint for failed background jobs (#1356)
Fixes #1328 

- Adds /retry endpoint for retrying failed jobs.
- Returns 400 error if previous job still running or has succeeded
- Keeps track of previous failed attempts in previousAttempts array on failed job.
- Also amends the similar webhook /retry endpoint to use `POST` for consistency.
- Remove duplicate api tag for backgroundjobs
2023-11-09 18:09:37 -08:00
Tessa Walsh
82a5d1e4e4
Regression fix: add profiles/ prefix to profile filenames (#1365)
Fixes #1364 

Regression fix for issue introduced in storage refactoring (see issue
for more details).

Changes:
1. Add `profiles/` prefix to profile filename passed in to crawler for
profile creation and written into db
2. Remove hardcoded `profiles/` prefix from crawler YAML
3. Add migration to add `profiles/` prefix to profile filenames that
don't already have it, including updating PROFILE_FILENAME in ConfigMaps

This way between the related storage document and the profile filename,
we have the full path to the object in the database rather than relying
on additional prefixes hardcoded into k8s job YAML files.

Note that this as a follow-up it'll be necessary to manually move any
profiles that had been written into the `<oid>` "directory" in object
storage rather than `<oid>/profiles` to the latter. This should only
affect profiles created very recently in a 1.8.0-beta release.
2023-11-09 17:44:16 -08:00
Henry Wilkinson
a71815a342
Encode the collection sharing URL (#1362) 2023-11-09 16:24:39 -05:00
Tessa Walsh
30bbefbeaa
Send email to superuser when background job fails (#1355)
Fixes #1344

Sends email to superadmin when a background job fails.
2023-11-08 19:55:59 -08:00
Ilya Kreymer
ff10124d01
charts cleanup: (#1360)
- move authsign secret to signer and make port configurable
- rename storages to more general ops-configs
- put 'storages.json' path into env var
- rename backend secret to backend-auth
- cronjobs: don't keep succeeded jobs around, triggers operator update
2023-11-08 19:24:00 -08:00
Ilya Kreymer
e4660dd010 nightly tests: bump page limit to 200 to ensure crawls don't end too quickly when testing quotas/limits 2023-11-08 15:10:55 -08:00
Ilya Kreymer
d2d7240455
background jobs fix: ensure bucket is parsed correctly (#1359)
Follow-up to #1321 
- correctly parse the endpoint_url into prefix and bucket path
- also add region and s3 provider type to storage secrets
2023-11-08 15:08:23 -08:00
Tessa Walsh
ea5650f173
Add checkmark next to replicated/backed up files (#1343)
Co-authored-by: Emma Segal-Grossman <hi@emma.cafe>
2023-11-08 11:21:31 -05:00
Ilya Kreymer
3aebf2e37f version: bump to 1.8.0-beta.2 2023-11-06 16:35:15 -08:00
Ilya Kreymer
b4fd5e6e94
Crawl Timeout via elapsed time (#1338)
Fixes #1337 

Crawl timeout is tracked via `elapsedCrawlTime` field on the crawl
status, which is similar to regular crawl execution time, but only
counts one pod if scale > 1. If scale == 1, this time is equivalent.

Crawl is gracefully stopped when the elapsed execution time exceeds the
timeout. For more responsiveness, also adding current crawl time since
last update interval.

Details:
- handle crawl timeout via elapsed crawl time - longest running time of a
single pod, instead of expire time.
- include current running from last update for best precision
- more accurately count elapsed time crawl is actually running
- store elapsedCrawlTime in addition to crawlExecTime, storing the
longest duration of each pod since last test interval

---------
Co-authored-by: Tessa Walsh <tessa@bitarchivist.net>
2023-11-06 16:32:58 -08:00
Ilya Kreymer
5530ca92e1
Move backend app templates to be installed from configmap volume (#1331)
Instead of adding the app templates launched from the backend via
`backend/btrixcloud/templates`, add them to a configmap and mount the
configmap in the same location.

This allows these templates to be updated, like other values in
charts/... without having to rebuild any of the images, speeding up dev
and maintenance time.

Changes include:
- move backend/btrixcloud/templates -> chart/app-templates/
- add app-templates/*.yaml to app-templates configmap
- mount app-templates configmap to /app/btrixcloud/templates/ in api and op containers
2023-11-06 09:37:48 -08:00
Ilya Kreymer
0935d43a97
exclusion optimizations: dynamic exclusions (part of #1216): (#1268)
- instead of restarting crawler when exclusion added/removed, add a
message to a redis list (per crawler instance)
- no longer filtering existing queue on backend, now handled via crawler (implemented in 0.12.0 via webrecorder/browsertrix-crawler#408)
- match response optimization: instead of returning first 1000 matches,
limits response to 500K and returns however many matches fit in that
response size (for optional pagination on frontend)
2023-11-06 09:36:25 -08:00
Francesco Servida
0b8bbcf8e6
Allow User to specify custom cluster-issuer (#1332)
Implemented variable and defaults for cluster-issuer to allow users to
specify, if needed, their own cluster issuer. (eg. installations with
only outbound traffic that cannot solve ACME https challenge)
2023-11-04 13:29:17 -07:00
Francesco Servida
4998274ab0
correctly suffix Auth-Signer url when running in custom namespace (#1335) 2023-11-04 10:34:05 -07:00
Ilya Kreymer
a050646017 test: don't pin crawler image to non-working version 2023-11-04 01:02:38 -07:00
Ilya Kreymer
fb3d88291f
Background Jobs Work (#1321)
Fixes #1252 

Supports a generic background job system, with two background jobs,
CreateReplicaJob and DeleteReplicaJob.
- CreateReplicaJob runs on new crawls, uploads, profiles and updates the
`replicas` array with the info about the replica after the job succeeds.
- DeleteReplicaJob deletes the replica.
- Both jobs are created from the new `replica_job.yaml` template. The
CreateReplicaJob sets secrets for primary storage + replica storage,
while DeleteReplicaJob only needs the replica storage.
- The job is processed in the operator when the job is finalized
(deleted), which should happen immediately when the job is done, either
because it succeeds or because the backoffLimit is reached (currently
set to 3).
- /jobs/ api lists all jobs using a paginated response, including filtering and sorting
- /jobs/<job id> returns details for a particular job
- tests: nightly tests updated to check create + delete replica jobs for crawls as well as uploads, job api endpoints
- tests: also fixes to timeouts in nightly tests to avoid crawls finishing too quickly.

---------
Co-authored-by: Tessa Walsh <tessa@bitarchivist.net>
2023-11-02 13:02:17 -07:00
Ilya Kreymer
6384d8b5f1
Additional Type Hints / Type Fix Pass (#1320)
This PR adds more type safety to the backend codebase:
- All ops classes calls should be type checked
- Avoiding circular references with TYPE_CHECKING conditional
- Consistent UUID usage: uuid.UUID / UUID4 with just UUID
- Crawl states moved to models, made into lists
- Additional typing added as needed, fixed a few type related errors
- CrawlOps / UploadOps / BaseCrawlOps now all have same param init order
to simplify changes
2023-10-30 12:59:24 -04:00
Ilya Kreymer
72f1840ae7
fix regression in concurrent crawls: (#1324)
- check the 'btrix.org' instead of 'oid' labels in getting related
crawls
- fixes regression introduced in #1296 where labels where all org id
labels were switched to 'btrix.org' for consistency
2023-10-30 12:58:07 -04:00
Ilya Kreymer
8c09934298 version: bump to 1.8.0-beta.1 2023-10-27 14:35:24 -07:00
Ilya Kreymer
c1d3beda9c
users: add case-insensitive index to maintain backwards compatibility with fastapi-users (#1319)
follow up to #1290

Based on implementation in:
https://github.com/fastapi-users/fastapi-users-db-mongodb/blob/main/fastapi_users_db_mongodb/__init__.py
2023-10-27 14:31:29 -07:00
Henry Wilkinson
3c884f94c9
Fix z-index footer issue in crawl workflow form (#1313)
Closes #1312

- Adds z-index to footer element.
2023-10-26 21:44:50 -07:00
Ilya Kreymer
6dc452ebad
Storage Refactor: Replication + Custom Storage Support (#1296)
- Refactors storage to support replicas + custom storages on the Org.
- There is a default primary + replica storage, while an Org can also have
primary and replica storages.
- StorageRef object is used to store references to default and custom
storage.

- CrawlFile has been updated to contain a StorageRef instead of a
def_storage_name, which references
either a default storage (in StorageOps) or custom storage (in
Organization)
- There is also a 'replicas' Optional[List[StorageRef]] which contains
replicas, if any.
- CrawlFileOut contain a numReplicas for how many replicas exist for
a given file.
- Migration: migration 0020 added to migrate existing Orgs, CrawlFile and ProfileFile objects to new storage system (CrawlFile and ProfileFile now extend BaseFile)


Part of #1262

---------
Co-authored-by: Tessa Walsh <tessa@bitarchivist.net>
2023-10-26 21:44:09 -07:00
Henry Wilkinson
21905205dc
Adds <btrix-details> to org dashboard table (#1311)
- Updates text with "Elapsed Time" label in the table
- Makes the table collapsible and collapsed by default.
2023-10-26 19:46:35 -07:00
Tessa Walsh
38f32f11ea
Enforce quota and hard cap for monthly execution minutes (#1284)
Fixes #1261 Closes #1092

The quota for monthly execution minutes is treated as a hard cap. Once
it is exceeded, an alert indicating that an org has exceeded its monthly
execution minutes will display and the user will be unable to start new
crawls. Any running crawls will be stopped once the quota is exceeded.

An execution minutes meter bar is also added in the Org Dashboard and
displayed if a quota is set. More detail in #1305 which was
merged into this branch.

## Changes

- Enable setting 'maxExecMinutesPerMonth' in orgs list quotas by superadmin
- Enforce quota by stopping crawls in operator once quota is reached
- Show alert banner once execution time quota is hit:
- Once quota is hit, disable Run Crawl buttons in frontend, return 403
message with `exec_minutes_quota_reached` detail in backend from
crawl config `/run` endpoint, and don't run new workflows on creation
(similar to storage quota)
- Display execution time for crawls in the crawl details overview,
immediately below
- Show execution minutes meter on dashboard (from #1305)

---------
Co-authored-by: Henry Wilkinson <henry@wilkinson.graphics>
Co-authored-by: Ilya Kreymer <ikreymer@gmail.com>
Co-authored-by: sua yoo <sua@webrecorder.org>
2023-10-26 15:38:51 -07:00
Tessa Walsh
5fadc630ce
Check for empty string for SMTP password (#1317)
Follow-up fix for #1136 based on this comment:
https://github.com/webrecorder/browsertrix-cloud/issues/1136#issuecomment-1777119534
2023-10-26 09:44:55 -07:00
Ilya Kreymer
4591db1afe
More stringent UUID types for user input / avoid 500 errors (#1309)
Fixes #1297 
Ensures proper typing for UUIDs in FastAPI input models, to avoid
explicit conversions, which may throw errors.
This avoids possible 500 errors (due to ValueError exceptions) when
converting UUIDs from user input.
Instead, will get more 422 errors from FastAPI. 

UUID conversions remaining are in operator / profile handling where
UUIDs are retrieved from previously set fields, remaining user input
conversions in user auth and collection list are wrapped in exceptions.

For `profileid`, update fastapi models to support union of UUID, null,
and EmptyStr (new empty string only type), to differentiate removing
profile (empty string) vs not changing at all (null) for config updates
2023-10-25 15:15:53 -04:00
Ilya Kreymer
4b9ca44adb
Frontend typo fixes (#1315)
- fix missing org slug instead of org id change
- fix login validation to check for 429 response code
2023-10-25 13:28:41 -04:00
Tessa Walsh
d58747dfa2
Provide full resources in archived items finished webhooks (#1308)
Fixes #1306 

- Include full `resources` with expireAt (as string) in crawlFinished
and uploadFinished webhook notifications rather than using the
`downloadUrls` field (this is retained for collections).
- Set default presigned duration to one minute short of 1 week and enforce
maximum supported by S3
- Add 'storage_presign_duration_minutes' commented out to helm values.yaml
- Update tests

---------
Co-authored-by: Ilya Kreymer <ikreymer@gmail.com>
2023-10-23 19:01:58 -07:00
sua yoo
2e5952a444
Display crawl time usage history table (#1304)
Partially resolves #1223, fixes #1298

- Adds crawl usage table in dashboard under metrics
- Shows skeleton loading indicator when metrics are loading (@Shrinks99
feel free to adjust how this looks)
- Shows max number of concurrent crawls running if any are running ("`running` / `max` Crawls Running")
2023-10-23 16:25:16 -07:00
Henry Wilkinson
e274462ba0
Update tag spacing and styling for remove button (#1283)
### Context

- Adds custom padding to each side based on if the tag is removable or not
- Improves hover state for the remove button when the tag is focused
- Adds padding to the remove button
2023-10-20 16:02:32 -07:00
Tessa Walsh
5c5ef68a8a
Prevent user from logging in after 5 consecutive failed login attempts until pw is reset (#1281)
Fixes #1270 

After 5 consecutive failed logins from the same user, we now prevent the
user from logging in even with the correct password until they reset it
via their email, or wait an hour.
- After failure threshold is reached, all further login attempts are rejected
- Attempts for invalid email addresses are also tracked
- On 6th try, a reset password email is automatically sent, only once
- Failed login counter resets after an hour of no further logins after last attempted login.

---------
Co-authored-by: Ilya Kreymer <ikreymer@gmail.com>
2023-10-20 14:10:56 -07:00
Tessa Walsh
733809b5a8
Update user names in crawls and workflows after username update (#1299)
Fixes #1275
2023-10-19 23:34:49 -07:00
Tessa Walsh
fc6dc287c0
Update README features link to new landing site (#1302) 2023-10-20 00:06:36 -04:00