browsertrix

Author	SHA1	Message	Date
Ilya Kreymer	0c29008b7d	version: bump to 1.11.1	2024-07-30 11:23:41 -07:00
Ilya Kreymer	894aa29d4b	remove crc32 from CrawlFile (#1980 ) - no longer being used with latest stream-zip - was not computed correctly in the crawler - counterpart to webrecorder/browsertrix-crawler#657 --------- Co-authored-by: Tessa Walsh <tessa@bitarchivist.net>	2024-07-30 11:23:15 -07:00
Ilya Kreymer	4aca107710	version: bump to 1.11.0	2024-07-29 12:52:39 -07:00
Ilya Kreymer	e9aeff1836	add a 'stopped_org_readonly' state for crawls that are running while org is made read-only (#1977 ) If an org is made read-only while crawls are running: - treat similar to other stopped_* states, do a graceful stop - update UI to display "Stopped: Crawling Disabled" for this status - don't add corresponding skipped status - just skip running crawls if org is read-only	2024-07-29 12:24:40 -07:00
Ilya Kreymer	96691a33fa	Fix for cronjob skipping response (#1976 ) If a cronjob is disabled, the operator should quickly return a success value so that the job can be terminated. Was previously returning an incorrect response, causing disabled cronjobs to not be cleaned up. Add proper typing to always return correct response	2024-07-29 12:24:18 -07:00
Tessa Walsh	551660bb62	Add webhooks for qaAnalysisStarted, qaAnalysisFinished, and crawlReviewed (#1974 ) Fixes #1957 Adds three new webhook events related to QA: analysis started, analysis ended, and crawl reviewed. Tests have been updated accordingly. --------- Co-authored-by: Ilya Kreymer <ikreymer@gmail.com>	2024-07-25 16:53:49 -07:00
sua yoo	daeb7448f5	feat: Minor improvements to superadmin view (#1971 ) Resolves https://github.com/webrecorder/browsertrix/issues/1951 ### Changes - Shows date org was created in superadmin org list - Visually differentiates unnamed org ID - Adds "Admin" badge to app header to make current login more apparent - Fixes logic to show "create org" dialog if there are no orgs in an instance - Refactors `btrix-home` to remove unused references to non-superadmin org list --------- Co-authored-by: Henry Wilkinson <henry@wilkinson.graphics>	2024-07-25 15:47:40 -07:00
Ilya Kreymer	94e985ae13	optimize org quota lookups (#1973 ) - instead of looking up storage and exec min quotas from oid, and loading an org each time, load org once and then check quotas on the org object - many times the org was already available, and was looked up again - storage and exec quota checks become sync - rename can_run_crawl() to more generic can_write_data(), optionally also checks exec minutes - typing: get_org_by_id() always returns org, or throws, adjust methods accordingly (don't check for none, catch exception) - typing: fix typo in BaseOperator, catch type errors in operator 'org_ops' - operator quota check: use up-to-date 'status.size' for current job, ignore current job in all jobs list to avoid double-counting - follow up to #1969	2024-07-25 14:00:16 -07:00
sua yoo	dd6c33a59d	feat: Show details of invalid invite (#1970 ) Resolves https://github.com/webrecorder/browsertrix/issues/1912 ### Changes - Show support email, if available, in invalid invite error message - Separate error message for invite email that doesn't match current user's	2024-07-25 13:57:02 -07:00
Tessa Walsh	d38abbca7f	Standardize handling of storage and execution time quotas (#1969 ) Fixes #1968 Changes: - `stopped_quota_reached` and `skipped_quota_reached` migrated to new values that indicate which quota was reached - Before crawls are run, the operator checks if storage or exec mins quotas are reached and if so fails the crawl with the appropriate state of `skipped_storage_quota_reached` or `skipped_time_quota_reached` - While crawls are running, the operator checks if the exec mins quota is reached or if the size of all running crawls will mean the storage quota is reached once uploaded; if so, the crawl is stopped gracefully and given `stopped_storage_quota_reached` or `stopped_time_quota_reached` state as appropriate - Adds new nightly tests for enforcing storage quota	2024-07-25 12:49:11 -07:00
sua yoo	2c89edcc36	feat: Disable archiving for read-only orgs (#1965 ) Resolves https://github.com/webrecorder/browsertrix/issues/1915 ### Changes - Disables buttons to create resources, duplicate resources, run crawls, and configure browser profiles. - Updates copy from "read-only" -> "disable archiving"	2024-07-25 12:42:04 -07:00
Tessa Walsh	27ee16d308	Implement downloading archived item + QA runs as multi-WACZ (#1933 ) Fixes #1412 ## Changes ### Backend - Adds `all-crawls`, `crawls`, and `uploads` API endpoints to download archived item as multi-WACZ - Download QA runs as multi-WACZ - Adds backend tests for new endpoints - Update to new version of stream-zip library which does not require crc-32 to be present for ZIP members, computes after streaming, fixing invalid crc-32 issues as previously computed crc-32s from crawler may be invalid. ### Frontend Adds ability to download archived item from: - Button in archived item detail Files tab - Archived item details actions menu - Archived items list menu --------- Co-authored-by: Henry Wilkinson <henry@wilkinson.graphics> Co-authored-by: sua yoo <sua@webrecorder.org> Co-authored-by: Ilya Kreymer <ikreymer@gmail.com>	2024-07-25 10:28:57 -07:00
Ilya Kreymer	b288cd81cc	fix execution minutes meter using wrong metric (#1967 ) The monthly execution minutes meter was using the wrong metric, `crawlExecSeconds` instead of `monthlyExecSeconds`. Since the meter is showing minutes used out of the monthly quota, it should be using `monthlyExecSeconds`. This is a quick fix, however, this system may need another pass at some point. --------- Co-authored-by: sua yoo <sua@suayoo.com>	2024-07-24 09:14:15 -07:00
sua yoo	24c8963dba	chore: Rename upload dialog (#1966 ) Resolves https://github.com/webrecorder/browsertrix/issues/1961 Renames "Upload Archive" dialog to reference WACZ. --------- Co-authored-by: Ilya Kreymer <ikreymer@users.noreply.github.com>	2024-07-23 23:02:05 -07:00
sua yoo	08147ec77d	fix: Update org status banner on quota reached (#1956 ) Fixes https://github.com/webrecorder/browsertrix/issues/1954 ### Changes - Refactors app state to include org data - Fixes banner not showing if storage or execution minutes is exceeded after page load - Disables closing banners - Refreshes org when tab changes --------- Co-authored-by: emma <hi@emma.cafe>	2024-07-23 22:55:45 -07:00
Ilya Kreymer	b35669af8d	disable behaviors for QA runs via configmap (#1963 ) - make crawl args a reusable template - adds QA_ARGS to configmap, setting to same value as CRAWL_ARGS but with --behaviors= prepended to disable behaviors for QA, to improve performance of QA runs. fixes #1962	2024-07-23 19:54:21 -07:00
Ilya Kreymer	01ddf95a56	allow disabling of auto-resize of crawler pods (#1964 ) - only enable if 'enable_auto_resize' is true, default to false - if true, set memory limit to 1.2 of memory requests, resize when hitting 'soft oom' of initial request, adjust by 1.2 (current behavior) up to max_crawler_memory - if false, set memory limit to max_crawler_memory and never adjust memory requests or memory limits - part of #1959	2024-07-23 21:00:40 -04:00
Ilya Kreymer	a8c5f07b7c	Add support e-mail to settings (#1960 ) Adds support email to /api/settings Also adds a response model for this endpoint and consolidates api tests Addresses request in #1912	2024-07-23 20:58:12 -04:00
sua yoo	8c4e481bd3	feat: Improve UX when user doesn't belong to any orgs (#1953 ) Directs user that doesn't belong to any orgs to account settings page, with banner. Also contains some minor out-of-scope changes: - Refactors `isAdmin` key to `isSuperAdmin` for more legibility on whether current user is superadmin or regular user without orgs - Adds "cancel" button to change password form	2024-07-23 19:51:28 -04:00
Tessa Walsh	a02f7a6826	Ensure lexical sort for org names (#1958 ) Fixes #1955 Orgs list endpoint sorting now works as follows: - Default org is always sorted first - Name sorting now works on a lowercased version of the org names to ensure lexical sorting The lodash `sortBy` resorting of orgs in the "All Organizations" dropdown list in the nav bar has also been removed so that the backend sorting is applied instead. Tests have been updated accordingly.	2024-07-23 13:13:04 -07:00
Ilya Kreymer	8c0321bdea	Pydantic 2.x update + type fixes + python 3.12 (#1947 ) * updates pydantic to 2.x * also update to python 3.12 * additional type fixes: - all Optional[] types must have a default value - update to constrained types - URL types converted from str - test updates Fixes #1940	2024-07-22 17:23:03 -07:00
Ilya Kreymer	cb909ffc95	api docs cleanup + readd webhooks: (#1949 ) - readd webhooks (regression from #1941) - set order of tags in docs - add missing tag to route	2024-07-22 09:00:59 -07:00
Ilya Kreymer	cd00f52cca	Fix queue response models + additional testing for queue + exclusions (#1948 ) Follow-up to regressions from #1928, this PR: - Fixes response models for queue endpoints, which had incorrect model - Adds tests for queue get, queue match, and exclusions add / remove to ensure regressions like this can be caught via tests. This involves starting a new crawl in test_run_crawls() instead of relying on implicit running via fixtures, make it easier to test crawl while it's running. - Adds additional typing for crawls apis, including making delete_crawls() have correct typing, consistent derived class override - Adds check to ensure queue + exclusion operations can not be called when crawl is not running	2024-07-22 09:00:23 -07:00
Ilya Kreymer	0cc99044e7	quickfix: pin mypy version to avoid issues with latest release	2024-07-19 18:30:57 -07:00
Tessa Walsh	2237120cd5	Add API endpoint to recalculate org storage (#1943 ) Fixes #1942 This process might be a bit slow for large orgs, may consider moving it to background job in #1898.	2024-07-19 18:29:20 -07:00
Tessa Walsh	6ccaad26d8	Ensure org name and slug uniqueness is case-insensitive (#1929 ) Fixes #1927 Also adds tests to ensure index is working as expected, and migration to rename orgs that have names or slugs identical to other orgs except for case before the new case-insensitive index is built.	2024-07-18 15:30:12 -07:00
Ilya Kreymer	b1ccdc4d16	OpenAPI Metadata for API Endpoints (#1941 ) - Updates the `/docs` and `/redoc` API endpoints to have better metadata, including using Browsertrix favicon and our logo for the `/redoc` endpoint. - add new logo file 'docs-logo.svg' to root Based on info at: https://fastapi.tiangolo.com/how-to/extending-openapi/ https://fastapi.tiangolo.com/tutorial/metadata/ --------- Co-authored-by: Henry Wilkinson <henry@wilkinson.graphics>	2024-07-18 11:11:38 -07:00
Tessa Walsh	3bf7967754	Fix regression with saving new workflow due to profileid type error (#1946 ) Fixes #1945	2024-07-18 09:35:52 -07:00
sua yoo	f7a675ea2d	feat: Show single org status alert banner (#1937 ) Resolves #1876 ### Changes Displays single banner for critical org alerts. --------- Co-authored-by: Ilya Kreymer <ikreymer@users.noreply.github.com> Co-authored-by: Tessa Walsh <tessa@bitarchivist.net> Co-authored-by: Henry Wilkinson <henry@wilkinson.graphics> Co-authored-by: Ilya Kreymer <ikreymer@gmail.com>	2024-07-17 18:50:53 -07:00
sua yoo	42b4768b59	feat: Billing UI fast-follows (#1936 ) ### Changes - Updates customer portal link label - Opens billing portal in the same tab - Shows separate cancel date - Makes payment failed appear as error - Fixes crawl time quota	2024-07-17 17:13:28 -07:00
Tessa Walsh	c772ee2362	Fix response model for crawl errors API endpoint (#1939 ) Follow-up fix for #1920 for crawl errors endpoint, which returns a 500 following #1928, caught in nightly tests.	2024-07-17 10:52:14 -07:00
Ilya Kreymer	335700e683	Additional typing cleanup (#1938 ) Misc typing fixes, including in profiles and time functions --------- Co-authored-by: Tessa Walsh <tessa@bitarchivist.net>	2024-07-17 10:49:22 -07:00
Ilya Kreymer	4db3053a9f	fix crawlFilenameTemplate + add_crawl_config cleanup (fixes #1932 ) (#1935 ) - ensure crawlFilenameTemplate is part of the CrawlConfig model - change CrawlConfig init to use type-safe construction - add a run_now_internal() that is shared for starting crawl, either on demand or from new config - add OrgOps.can_run_crawls() to check against org quotas for crawling - cleanup profile updates, remove _lookup_profile, only check for EmptyStr in update --------- Co-authored-by: Tessa Walsh <tessa@bitarchivist.net>	2024-07-17 10:48:25 -07:00
Ilya Kreymer	27059c91a5	version: bump to 1.11.0-beta.1	2024-07-17 10:06:49 -07:00
Tessa Walsh	60afb19472	Add API endpoint to import subscription for existing org (#1930 ) Fixes #1926 - adds /subscriptions/import endpoint for importing an existing subscription to an existing org - add SubscriptionImport object and log as 'import' event in subscription events collection --------- Co-authored-by: Ilya Kreymer <ikreymer@gmail.com>	2024-07-16 16:17:02 -07:00
Emma Segal-Grossman	224b011070	Small UI fixes (#1934 ) Fixes a few things that have been bugging me: - Overflow buttons in list view now (mostly) take up their full cell area, instead of there being a couple pixels around the button where clicking would do nothing or cause navigation (before/after screenshots omitted) - Changes the class that `tab-list` uses internally so that it doesn't conflict with Tailwind's `container` class, which prevents the tab content from being limited at the default Tailwind container width - Adds a couple of Tailwind plugins for styling CSS parts (`part-[...]:`) and for arbitrary attributes (`attr-[...]:`)	2024-07-16 17:01:55 -04:00
sua yoo	5e9e897713	feat: Improve org name and slug validation (#1924 ) - Verifies org slug (name) availability when creating new org - Show org max length error when signing up - Highlights org error field when signing up - Fixes org name max length discrepancy - Standardizes org slug to lowercase	2024-07-16 13:07:09 -07:00
sua yoo	79ff806352	update org url errors	2024-07-16 12:59:54 -07:00
sua yoo	8577b5bd93	update superadmin error	2024-07-16 12:59:54 -07:00
sua yoo	38a877fa8d	Update frontend/src/utils/form.ts Co-authored-by: Emma Segal-Grossman <hi@emma.cafe>	2024-07-16 12:57:12 -07:00
Tessa Walsh	d41647e6c2	Document all API endpoints with response models (#1928 ) Fixes #1920 Adds response models to all API endpoints that were missing them, documenting current behavior without making any changes at this stage to standardize responses. Follow-up work will involve adding generics to some of the response models	2024-07-16 12:48:38 -07:00
Tessa Walsh	aaf18e70a0	Add created date to Organization and fix datetimes across backend (#1921 ) Fixes #1916 - Add `created` field to Organization and OrgOut, set on org creation - Add migration to backfill `created` dates from first workflow `created` - Replace `datetime.now()` and `datetime.utcnow()` across app with consistent timezone-aware `utils.dt_now` helper function, which now uses `datetime.now(timezone.utc)`. This is in part to ensure consistency in how we handle datetimes, and also to get ahead of timezone naive datetime creation methods like `datetime.utcnow()` being deprecated in Python 3.12. For more, see: https://blog.miguelgrinberg.com/post/it-s-time-for-a-change-datetime-utcnow-is-now-deprecated	2024-07-15 19:46:32 -07:00
sua yoo	a234a36057	standardize slugify	2024-07-15 12:06:43 -07:00
sua yoo	bafc96ac94	check org slug	2024-07-15 12:05:19 -07:00
sua yoo	adea46640e	standardize max length	2024-07-15 11:40:15 -07:00
sua yoo	6f031f1059	show correct field when validating	2024-07-15 11:02:10 -07:00
sua yoo	bdd279c4f8	show validation message	2024-07-15 10:36:16 -07:00
Tessa Walsh	a546fb6fe0	Improve handling of duplicate org name/slug (#1917 ) Initial implementation of #1892 - Modifies the backend to return `duplicate_org_name` or `duplicate_org_slug` as appropriate on a pymongo `DuplicateKeyError` - Updates frontend to handle `duplicate_org_name`, `duplicate_org_slug`, and `invalid_slug` error details - Update errors to be more consistent, also return `duplicate_org_subscription.subId` for duplicate subscription instead of the more generic `already_exists` --------- Co-authored-by: Ilya Kreymer <ikreymer@gmail.com>	2024-07-10 19:24:50 -07:00
Ilya Kreymer	9a67e28f13	Adds Subscription API (#1914 ) Fixes https://github.com/webrecorder/browsertrix/issues/1905 - adds a new top-level `/api/subscriptions` endpoint and SubOps handler on the backend. - subscriptions API endpoints are available only if `billing_enabled` is set in helm chart - new POST /subscriptions/create, /subscriptions/update, /subscriptions/cancel API endpoints - Subscriptions mongo collection storing timestamped /subscription API events - GET /subscriptions/events API to get subscription events, support for filtering and sorting - Subscription data model - Support for setting and handling readOnlyOnCancel on org - /orgs/<id>/billing-portal to lookup portalUrl using external API - subscription in org getter and list views - mark org as readOnly for subscription status `paused_payment_failed`, clears it on status `active` --------- Co-authored-by: Tessa Walsh <tessa@bitarchivist.net>	2024-07-10 17:41:16 -07:00
sua yoo	d4334d42bc	feat: Enable self-service user access to billing portal (#1908 ) Resolves https://github.com/webrecorder/browsertrix/issues/1875 Follows https://github.com/webrecorder/browsertrix/pull/1914 ### Changes - When billing is enabled, adds billing tab to org settings that displays billing information if applicable - Handles external link to manage plan - Refactors org quota type to always be present - Refactors org settings into `TailwindComponent`	2024-07-10 17:11:01 -07:00
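
The datetime change described in aaf18e70a0 (#1921) above, which replaces `datetime.now()` and `datetime.utcnow()` with a timezone-aware helper built on `datetime.now(timezone.utc)`, can be sketched roughly as follows. This is a minimal illustration and not the project's actual code: the helper name `dt_now` is taken from the commit message, while `is_aware` and the sample values are hypothetical and exist only to show the difference between aware and naive datetimes.

```python
from datetime import datetime, timezone


def dt_now() -> datetime:
    """Return the current time as a timezone-aware UTC datetime."""
    return datetime.now(timezone.utc)


def is_aware(value: datetime) -> bool:
    """Hypothetical check: True if the datetime carries a usable UTC offset."""
    return value.tzinfo is not None and value.utcoffset() is not None


# An aware timestamp, as produced by the helper.
now = dt_now()

# A naive timestamp (no tzinfo), the kind datetime.utcnow() used to return.
naive = datetime(2024, 7, 15, 19, 46, 32)
```

Comparing `now` with `naive` directly raises a `TypeError`, which is one reason to produce aware datetimes consistently through a single helper.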


1288 Commits