Fixes#1158
Introduces two new API endpoints that stream crawling statistics CSVs
(with a suggested attachment filename header):
- `GET /api/orgs/all/crawls/stats` - crawls from all orgs (superuser
only)
- `GET /api/orgs/{oid}/crawls/stats` - crawls from just one org
(available to org crawler/admin users as well as superusers)
Also includes tests for both endpoints.
If configmap is missing (eg. was accidentally deleted from k8s) recreate
the configmap when updating the crawl workflow or running a crawl.
Previously, this would result in an error, but now the configmap should
be correctly recreated.
Fixes#1358
- Adds `extraExecMinutes` and `giftedExecMinutes` org quotas, which are
not reset monthly but are updateable amounts that carry across months
- Adds `quotaUpdate` field to `Organization` to track when quotas were
updated with timestamp
- Adds `extraExecMinutesAvailable` and `giftedExecMinutesAvailable`
fields to `Organization` to help with tracking available time left
(includes tested migration to initialize these to 0)
- Modifies org backend to track time across multiple categories, using
monthlyExecSeconds, then giftedExecSeconds, then extraExecSeconds.
All time is also written into crawlExecSeconds, which is now the monthly
total and also contains any overage time above the quotas
- Updates Dashboard crawling meter to include all types of execution
time if `extraExecMinutes` and/or `giftedExecMinutes` are set above 0
- Updates Dashboard Usage History table to include all types of
execution time (only displaying columns that have data)
- Adds backend nightly test to check handling of quotas and execution
time
- Includes migration to add new fields and copy crawlExecSeconds to
monthlyExecSeconds for previous months
Co-authored-by: emma <hi@emma.cafe>
Fixes#1395
- Adds new `POST /orgs/<orgid>/jobs/retryFailed` API endpoint to retry all failed
background jobs for a specific org.
- Also adds `POST /orgs/all/jobs/retryFailed` for superadmin to retry all failed background jobs for all orgs
Closes#1294
### Changes
- `crawl-list` component
- Adds a check if there are any items in the actions menu. If not, skip
rendering the actions menu.
- This allows us to give the component no actions! Currently required to
remove them for viewers!
- Collection Details
- Hides "Remove from Collection" option for viewers
- Crawls List
- Removes the single "View Crawl Details" option from archived items for
viewers
- All the other actions were already set up correctly to be used by all
roles!
- Dashboard
- Hides org settings gear icon button unless the user is an admin
- Hides "Create New" dropdown for viewers
- Workflow Details
- Hides workflow edit icon button for viewers
- Hides the "Delete Crawl" option in archived items for viewers
- Hides the "Run Crawl" option for viewers
- Workflow List
- Hides all edit-related options for viewers, the only option now is
copying tags
- Removes the deactivate / delete options (were only visible when
running a crawl) in the workflow list actions
---------
Co-authored-by: Ilya Kreymer <ikreymer@gmail.com>
Co-authored-by: sua yoo <sua@suayoo.com>
- Emails are now processed from Jinja2 templates found in
`charts/email-templates`, to support easier updates via helm chart in
the future.
- The available templates are: `invite`, `password_reset`, `validate` and
`failed_bg_job`.
- Each template can be text only or also include HTML. The format of the
template is:
```
subject
~~~
<html content>
~~~
text
```
- A new `support_email` field is also added to the email block in
values.yaml
Invite Template:
- Currently, only the invite template includes an HTML version, other
templates are text only.
- The same template is used for new and existing users, with slightly
different text if adding user to an existing org.
- If user is invited by the superadmin, the invited by field is not
included, otherwise it also includes 'You have been invited by X to join Y'
- Adds two new crawl finished state, stopped_by_user and
stopped_quota_reached
- Tracking other possible 'stop reasons' in operator, though not making
them distinct states for now.
- Updated frontend with 'Stopped by User' and 'Stopped: Time Quota
Reached', shown with same icon as current partial_complete
- Added migration of partial_complete to either stopped_by_user or
complete (no historical quota data available)
- Addresses edge case in scaling: if crawl never scaled (no redis entry,
no pod), automatically scale down
- Edge case in status: if crawl is somehow 'canceled' but not deleted,
immediately delete crawl object and begin finalizing.
---------
Co-authored-by: Tessa Walsh <tessa@bitarchivist.net>
Fixes#1307Fixes#1132
Related to #1306
Deleted webhook notifications include the org id and item/collection id.
This PR also includes API docs for the new webhooks and extends the
existing tests to account for the new webhooks.
This PR also does some additional cleanup for existing webhooks:
- Remove `downloadUrls` from item finished webhook bodies
- Rename collection webhook body `downloadUrls` to `downloadUrl`, since
we only ever have one per collection
- Fix API docs for existing webhooks, one of which had the wrong
response body
Fixes#1328
- Adds /retry endpoint for retrying failed jobs.
- Returns 400 error if previous job still running or has succeeded
- Keeps track of previous failed attempts in previousAttempts array on failed job.
- Also amends the similar webhook /retry endpoint to use `POST` for consistency.
- Remove duplicate api tag for backgroundjobs
Fixes#1364
Regression fix for issue introduced in storage refactoring (see issue
for more details).
Changes:
1. Add `profiles/` prefix to profile filename passed in to crawler for
profile creation and written into db
2. Remove hardcoded `profiles/` prefix from crawler YAML
3. Add migration to add `profiles/` prefix to profile filenames that
don't already have it, including updating PROFILE_FILENAME in ConfigMaps
This way between the related storage document and the profile filename,
we have the full path to the object in the database rather than relying
on additional prefixes hardcoded into k8s job YAML files.
Note that this as a follow-up it'll be necessary to manually move any
profiles that had been written into the `<oid>` "directory" in object
storage rather than `<oid>/profiles` to the latter. This should only
affect profiles created very recently in a 1.8.0-beta release.
- move authsign secret to signer and make port configurable
- rename storages to more general ops-configs
- put 'storages.json' path into env var
- rename backend secret to backend-auth
- cronjobs: don't keep succeeded jobs around, triggers operator update
Fixes#1337
Crawl timeout is tracked via `elapsedCrawlTime` field on the crawl
status, which is similar to regular crawl execution time, but only
counts one pod if scale > 1. If scale == 1, this time is equivalent.
Crawl is gracefully stopped when the elapsed execution time exceeds the
timeout. For more responsiveness, also adding current crawl time since
last update interval.
Details:
- handle crawl timeout via elapsed crawl time - longest running time of a
single pod, instead of expire time.
- include current running from last update for best precision
- more accurately count elapsed time crawl is actually running
- store elapsedCrawlTime in addition to crawlExecTime, storing the
longest duration of each pod since last test interval
---------
Co-authored-by: Tessa Walsh <tessa@bitarchivist.net>
Instead of adding the app templates launched from the backend via
`backend/btrixcloud/templates`, add them to a configmap and mount the
configmap in the same location.
This allows these templates to be updated, like other values in
charts/... without having to rebuild any of the images, speeding up dev
and maintenance time.
Changes include:
- move backend/btrixcloud/templates -> chart/app-templates/
- add app-templates/*.yaml to app-templates configmap
- mount app-templates configmap to /app/btrixcloud/templates/ in api and op containers
- instead of restarting crawler when exclusion added/removed, add a
message to a redis list (per crawler instance)
- no longer filtering existing queue on backend, now handled via crawler (implemented in 0.12.0 via webrecorder/browsertrix-crawler#408)
- match response optimization: instead of returning first 1000 matches,
limits response to 500K and returns however many matches fit in that
response size (for optional pagination on frontend)
Fixes#1252
Supports a generic background job system, with two background jobs,
CreateReplicaJob and DeleteReplicaJob.
- CreateReplicaJob runs on new crawls, uploads, profiles and updates the
`replicas` array with the info about the replica after the job succeeds.
- DeleteReplicaJob deletes the replica.
- Both jobs are created from the new `replica_job.yaml` template. The
CreateReplicaJob sets secrets for primary storage + replica storage,
while DeleteReplicaJob only needs the replica storage.
- The job is processed in the operator when the job is finalized
(deleted), which should happen immediately when the job is done, either
because it succeeds or because the backoffLimit is reached (currently
set to 3).
- /jobs/ api lists all jobs using a paginated response, including filtering and sorting
- /jobs/<job id> returns details for a particular job
- tests: nightly tests updated to check create + delete replica jobs for crawls as well as uploads, job api endpoints
- tests: also fixes to timeouts in nightly tests to avoid crawls finishing too quickly.
---------
Co-authored-by: Tessa Walsh <tessa@bitarchivist.net>
This PR adds more type safety to the backend codebase:
- All ops classes calls should be type checked
- Avoiding circular references with TYPE_CHECKING conditional
- Consistent UUID usage: uuid.UUID / UUID4 with just UUID
- Crawl states moved to models, made into lists
- Additional typing added as needed, fixed a few type related errors
- CrawlOps / UploadOps / BaseCrawlOps now all have same param init order
to simplify changes
- check the 'btrix.org' instead of 'oid' labels in getting related
crawls
- fixes regression introduced in #1296 where labels where all org id
labels were switched to 'btrix.org' for consistency
- Refactors storage to support replicas + custom storages on the Org.
- There is a default primary + replica storage, while an Org can also have
primary and replica storages.
- StorageRef object is used to store references to default and custom
storage.
- CrawlFile has been updated to contain a StorageRef instead of a
def_storage_name, which references
either a default storage (in StorageOps) or custom storage (in
Organization)
- There is also a 'replicas' Optional[List[StorageRef]] which contains
replicas, if any.
- CrawlFileOut contain a numReplicas for how many replicas exist for
a given file.
- Migration: migration 0020 added to migrate existing Orgs, CrawlFile and ProfileFile objects to new storage system (CrawlFile and ProfileFile now extend BaseFile)
Part of #1262
---------
Co-authored-by: Tessa Walsh <tessa@bitarchivist.net>
Fixes#1261Closes#1092
The quota for monthly execution minutes is treated as a hard cap. Once
it is exceeded, an alert indicating that an org has exceeded its monthly
execution minutes will display and the user will be unable to start new
crawls. Any running crawls will be stopped once the quota is exceeded.
An execution minutes meter bar is also added in the Org Dashboard and
displayed if a quota is set. More detail in #1305 which was
merged into this branch.
## Changes
- Enable setting 'maxExecMinutesPerMonth' in orgs list quotas by superadmin
- Enforce quota by stopping crawls in operator once quota is reached
- Show alert banner once execution time quota is hit:
- Once quota is hit, disable Run Crawl buttons in frontend, return 403
message with `exec_minutes_quota_reached` detail in backend from
crawl config `/run` endpoint, and don't run new workflows on creation
(similar to storage quota)
- Display execution time for crawls in the crawl details overview,
immediately below
- Show execution minutes meter on dashboard (from #1305)
---------
Co-authored-by: Henry Wilkinson <henry@wilkinson.graphics>
Co-authored-by: Ilya Kreymer <ikreymer@gmail.com>
Co-authored-by: sua yoo <sua@webrecorder.org>
Fixes#1297
Ensures proper typing for UUIDs in FastAPI input models, to avoid
explicit conversions, which may throw errors.
This avoids possible 500 errors (due to ValueError exceptions) when
converting UUIDs from user input.
Instead, will get more 422 errors from FastAPI.
UUID conversions remaining are in operator / profile handling where
UUIDs are retrieved from previously set fields, remaining user input
conversions in user auth and collection list are wrapped in exceptions.
For `profileid`, update fastapi models to support union of UUID, null,
and EmptyStr (new empty string only type), to differentiate removing
profile (empty string) vs not changing at all (null) for config updates
Fixes#1306
- Include full `resources` with expireAt (as string) in crawlFinished
and uploadFinished webhook notifications rather than using the
`downloadUrls` field (this is retained for collections).
- Set default presigned duration to one minute short of 1 week and enforce
maximum supported by S3
- Add 'storage_presign_duration_minutes' commented out to helm values.yaml
- Update tests
---------
Co-authored-by: Ilya Kreymer <ikreymer@gmail.com>
Fixes#1270
After 5 consecutive failed logins from the same user, we now prevent the
user from logging in even with the correct password until they reset it
via their email, or wait an hour.
- After failure threshold is reached, all further login attempts are rejected
- Attempts for invalid email addresses are also tracked
- On 6th try, a reset password email is automatically sent, only once
- Failed login counter resets after an hour of no further logins after last attempted login.
---------
Co-authored-by: Ilya Kreymer <ikreymer@gmail.com>
- avoid exception if 'errors' (or 'files' keys) don't exist (part of
#1297)
- ensure 'errors' list always set on output model for consistency,
defaulting to empty list
- fix tests for 'errors' being an empty empty list
follow-up to #1300 (merging 1.7.1 release into main)
Fixes#1050
Major refactor of the user/auth system to remove fastapi_users
dependency. Refactors users.py to be standalone
and adds new auth.py module for handling auth. UserManager now works
similar to other ops classes.
The auth should be fully backwards compatible with fastapi_users auth,
including accepting previous JWT tokens w/o having to re-login. The User
data model in mongodb is also unchanged.
Additional fixes:
- allows updating fastapi to latest
- add webhook docs to openapi (follow up to #1041)
API changes:
- Removing the`GET, PATCH, DELETE /users/<id>` endpoints, which were not
in used before, as users are scoped to orgs. For deletion, probably
auto-delete when user is removed from last org (to be implemented).
- Rename `/users/me-with-orgs` is renamed to just `/users/me/`
- New `PUT /users/me/change-password` endpoint with password required to update password, fixes #1269, supersedes #1272
Frontend changes:
- Fixes from #1272 to support new change password endpoint.
---------
Co-authored-by: Tessa Walsh <tessa@bitarchivist.net>
Co-authored-by: sua yoo <sua@suayoo.com>
- Replaces org UUID in URL/browser location bar with org slug.
- Refactor: Adds shared app state utility using https://sijakret.github.io/lit-shared-state/ to
access org data from deep descendants.
- Backwards compatible: org UUID URLs should auto-redirect to org slug URLs.
- Show the org UUID in org settings general tab for use with APIs
(Resolves#1258, Follows #1279)
Optimizes webhooks by passing oid directly to webhooks:
- avoids extra crawl lookup
- possible for crawl to be deleted before webhook is processed via
operator (resulting in crawl lookup to fail)
- add more typing to operator and webhooks
Fixes#1271
Using .log for now due to broader support for opening with default viewers
---------
Co-authored-by: Ilya Kreymer <ikreymer@users.noreply.github.com>
Fixes#1278
- Adds `GET /orgs/slug-lookup` endpoint returning `{id: slug}` for all
orgs
- Restricts new endpoint and existing `GET /orgs/slugs` to superadmins
* storage ops: follow up to #1257:
- fix refactor typo
- add type hints for all storageops apis (add mypy_boto3_s3 and types_aiobotocore_s3 for type hints)
- Add slug field with uniqueness constraint to Organization
- Use python-slugify to generate slug from name and import that in migration
- Require name in all /rename and org creation requests
- Auto-generate slug for new org with no slug or when /rename is called w/o a slug
- Auto-generate slug for 'default-org' based on name
- Add /api/orgs/slugs GET endpoint to return all slugs in use
- tests: extend backend test-requirements.txt from requirements to allow testing slugify
- tests: move get_redis_crawl_stats() to avoid extra dependency in utils
* storage ops refactor:
- create StorageOps class similar to other ops classes
- init storages list in StorageOps, no longer require lookup up default storages via CrawlManager
- convert all storage functions to members, add storageops to operator
- remove unused params, ensure crawl exists for rollover restart
- add env var to determine if using local minio to use correct endpoint URL
* crawls /seeds endpoint: just return empty list if not a crawl (eg. upload)
* crawlmanager: remove unused code, rename check_storage -> has_storage