browsertrix

Author	SHA1	Message	Date
Ilya Kreymer	36bd228115	version: update to 1.8.0-beta.0	2023-10-17 18:06:55 -07:00
sua yoo	6b897e281c	hotfix: display workflow list date as utc	2023-10-17 15:51:24 -07:00
Henry Wilkinson	adf71f132e	Adds missing user documentation for launch! (#1286 ) Closes #1215 - Adds account settings page - Adds overview page - Adds archived items page - Adds note about browser profile metadata editing - Adds note on editing the crawler instances scale while crawling - Adds details on permission levels for the org settings - Removes note about not being able to change your display name (follows #1265)	2023-10-16 19:16:38 -07:00
Ilya Kreymer	a0def4f2b3	ansible microk8s additional cleanup (#1295 ) follow-up to #1264: - microk8s: move default inventory vars role defaults - microk8s: improve debugging of template output - do: move teardown tasks to new role	2023-10-16 18:55:35 -07:00
Ilya Kreymer	b3f530f8e6	version: bump to 1.7.0	2023-10-16 18:39:20 -07:00
sua yoo	ab8e82cd28	Update org custom URL label (#1292 ) Fast follower https://github.com/webrecorder/browsertrix-cloud/pull/1276 Updates label, info text, and preview text for org slug field to be more user-friendly use 'Custom URL Identifier' and 'Custom your organization's web address for accessing Browsertrix Cloud' --------- Co-authored-by: Ilya Kreymer <ikreymer@users.noreply.github.com> Co-authored-by: Henry Wilkinson <henry@wilkinson.graphics>	2023-10-16 15:08:43 -07:00
Ilya Kreymer	ddc4e03422	operator status typo fix: (#1293 ) - don't log normal exits as crashes! - set pod_status.exitCode to the exitCode - count exit code 13 as not-a-crash also (force interrupt)	2023-10-16 15:01:46 -07:00
Ilya Kreymer	1bc4697995	optimization: avoid updating whole org when only need to set one field (#1288 ) - add update_users and update_slug_and_name - rename update to update_full	2023-10-16 10:54:04 -07:00
Ilya Kreymer	dc8d510b11	webhook tweak: pass oid to crawl finished and upload finished webhooks (#1287 ) Optimizes webhooks by passing oid directly to webhooks: - avoids extra crawl lookup - possible for crawl to be deleted before webhook is processed via operator (resulting in crawl lookup to fail) - add more typing to operator and webhooks	2023-10-16 10:51:36 -07:00
Henry Wilkinson	6d6fa03ade	Disable collection share button actions for viewer users (#1282 ) Closes #1273 - Viewers can see the share button and the dialogue's sharing info if the collection is sharable - Viewers can't see or change the share toggle - Viewers can't see the share button if the collection is not sharable	2023-10-16 10:50:33 -07:00
Ilya Kreymer	a295f5d05d	version: bump to 1.7.0-beta.3	2023-10-15 18:31:03 -07:00
Tessa Walsh	2383b0d616	Set log download attachment name to crawl_id.log (#1280 ) Fixes #1271 Using .log for now due to broader support for opening with default viewers --------- Co-authored-by: Ilya Kreymer <ikreymer@users.noreply.github.com>	2023-10-13 20:00:37 -07:00
Tessa Walsh	aa3f1ebf5f	Add down command to uninstall and delete data (#1285 ) Small improvement to `btrix` helper. Adds `./btrix down` command to uninstall and delete data without resetting the dev environment.	2023-10-13 17:16:12 -07:00
Tessa Walsh	c5ca250f37	Add id-slug lookup and restrict slugs endpoints to superadmins (#1279 ) Fixes #1278 - Adds `GET /orgs/slug-lookup` endpoint returning `{id: slug}` for all orgs - Restricts new endpoint and existing `GET /orgs/slugs` to superadmins	2023-10-13 17:02:19 -07:00
sua yoo	8466caf1d9	Allow org admins to update slug (#1276 ) - Allows editing of org slugs (actual URL updates will be handled in https://github.com/webrecorder/browsertrix-cloud/issues/1258.) - Converts user input to slug using slugify - Adds help text to org name and slug - Renames tab from "information" to "general" settings	2023-10-13 17:00:43 -07:00
Henry Wilkinson	0bd8748e68	Minor Workflow Creator UX Changes (#1267 ) - Adds `position: sticky` to the workflow creator / editor controls to affix them to the bottom of the screen, they are now always visible! - Renames "Extra URLs in Scope" to "Extra URL Prefixes in Scope" - Updates documentation accordingly - Adjusts casing for checkboxes - Adds the multiplication sign to the crawler instances settings to better communicate that they are increases in scale and not arbitrary numbers.	2023-10-13 16:55:54 -07:00
sua yoo	22fbf92ed6	Show storage values for each item type when no quota (#1260 ) Hides chart and shows size values for each Storage line when org has no quota. No changes to orgs with quota. (Follow-up to #1188)	2023-10-13 14:31:33 -07:00
sua yoo	630c00c5b0	Enforce strong passwords in UI (#1266 )	2023-10-12 19:36:59 -07:00
Anish Lakhwara	834fa72baf	Refactor microk8s playbook to follow "new" structure (#1264 ) * Refactor microk8s playbook to follow structure with shared roles - Integrates with btrix/deploy role for deploying - Separated RedHat and Debian into separate roles - Created Common role - allow running remotely by default - use 'browsertrix_cloud_home' for charts path - add additional customizable options to btrix_values.j2 (todo: unify all the templates) - docs: update to new playbook path --------- Co-authored-by: Ilya Kreymer <ikreymer@gmail.com>	2023-10-11 19:33:30 -07:00
Ilya Kreymer	41c054d209	Storage ops followup type checking (#1274 ) * storage ops: follow up to #1257: - fix refactor typo - add type hints for all storageops apis (add mypy_boto3_s3 and types_aiobotocore_s3 for type hints)	2023-10-11 14:03:00 -07:00
sua yoo	f1dcc7e48a	Allow users to change display name and email (#1265 )	2023-10-11 13:42:41 -07:00
Ilya Kreymer	c591a5755d	test quickfix: microk8s crawls were not running due to exceeding CI resource capacity to fix: - disable metrics-server - lower per-browser mem/cpu requirements	2023-10-10 23:29:10 -07:00
Tessa Walsh	266afdf8d9	Add slugs to org backend (#1250 ) - Add slug field with uniqueness constraint to Organization - Use python-slugify to generate slug from name and import that in migration - Require name in all /rename and org creation requests - Auto-generate slug for new org with no slug or when /rename is called w/o a slug - Auto-generate slug for 'default-org' based on name - Add /api/orgs/slugs GET endpoint to return all slugs in use - tests: extend backend test-requirements.txt from requirements to allow testing slugify - tests: move get_redis_crawl_stats() to avoid extra dependency in utils	2023-10-10 18:30:09 -07:00
Ilya Kreymer	16e7a1d0a2	Storage Ops Refactor (#1257 ) * storage ops refactor: - create StorageOps class similar to other ops classes - init storages list in StorageOps, no longer require looking up default storages via CrawlManager - convert all storage functions to members, add storageops to operator - remove unused params, ensure crawl exists for rollover restart - add env var to determine if using local minio to use correct endpoint URL * crawls /seeds endpoint: just return empty list if not a crawl (eg. upload) * crawlmanager: remove unused code, rename check_storage -> has_storage	2023-10-10 15:04:23 -07:00
Ilya Kreymer	5cad9acee9	Compute crawl execution time in operator (#1256 ) * store execution time in operator: - rename isNewCrash -> isNewExit, crashTime -> exitTime - keep track of exitCode - add execTime counter, increment when state has a 'finishedAt' and 'startedAt' state - ensure pods are complete before deleting - store 'crawlExecSeconds' on crawl and org levels, add to Crawl, CrawlOut, Organization models * support for fast cancel: - set redis ':canceled' key to immediately cancel crawl - delete crawl pods to ensure pod exits immediately - in finalizer, don't wait for pods to complete when canceling (but still check if terminated) - add currentTime in pod.status.running.startedAt times for all existing pods - logging: log exec time, missing finishedAt - logging: don't log exit code 11 (interrupt due to time/size limits) as a crash * don't wait for pods completed on failed with existing browsertrix-crawler image --------- Co-authored-by: Tessa Walsh <tessa@bitarchivist.net>	2023-10-09 17:45:00 -07:00
Tessa Walsh	748c86700d	fix: lookup user object operator to pass to CrawlConfig.add_new_crawl (#1254 ) fixes #1253 Co-authored-by: Ilya Kreymer <ikreymer@gmail.com>	2023-10-05 21:30:10 -07:00
Ilya Kreymer	fa86555eed	Track pod resource usage, detect OOM crashes, handle auto-scaling (#1235 ) * keep track of per pod status on crawljob: - crashes time, and reason - 'used' vs 'allocated' resources - 'percent' used / allocated * crawl log errors: log error when crawler crashes via OOM, either via redis error log or to console * add initial autoscaling support! - detect if metrics server is available via K8SApi.is_pod_metrics_available() - if available, use metrics for 'used' fields - if no metrics, set memory used for redis only (using redis apis) - allow overriding memory and cpu via newMemory and newCpu settings on pod status - scale memory / cpu based on newMemory and newCpu setting - templates: update jinja templates to allow restarting crawler and redis with new resources - ci: enable metrics-server on k3d, microk8s and nightly k3d ci runs * roles: cleanup unused roles, add permissions for listing metrics * stats for running crawls: - update in db via operator - avoids losing stats if redis pod happens to be done - tradeoff is more db access in operator, but less extra connections to redis + already loading from db in backend - size stat: ensure size of previous files is added to the stats * crawler deployment tweaks: - adjust cpu/mem per browser - add --headless flag to configmap to use new headless mode by default!	2023-10-05 20:41:18 -07:00
Ilya Kreymer	20560abb81	version: bump to 1.7.0-beta.2	2023-10-05 20:33:38 -07:00
sua yoo	f2261bcb34	Fix frontend not redirecting on 401 (#1244 ) - Ensures need-login event bubbles until handled - Redirects on 401 from /refresh endpoint - Go to previous URL upon login, rather than always to home page - Shows accurate login notification (rather than less precise "couldn't retrieve org" or similar message)	2023-10-04 00:17:22 -07:00
sua yoo	38efeccc25	Limit URL list entry to maximum URLs (#1242 ) - Limits URL list entry to 1,000 URLs - Limits additional URL list entry to 100 URLs - Shows first invalid URL in list in error message - Quick and dirty fix for long URLs wrapping: Show URLs in list on one line, with entire container scrolling --------- Co-authored-by: Henry Wilkinson <henry@wilkinson.graphics>	2023-10-03 21:02:32 -07:00
Henry Wilkinson	99ccdf2de8	Browser Profile Warning & Dialog Style Updates (#1243 ) * Give protocol selection box smaller max-width * Add warning and docs link to browser profile creation - Updates dialog styling to btrix dialog - Updates button sizes - Updates button placement in dialog - Updates button labels for consistency with other buttons in app - Updates docs page with new button labels * Update browser profile edit metadata dialog. Matches updated dialog shown on profile creation * Open docs page in new tab	2023-10-03 18:59:19 -07:00
Tessa Walsh	bbdb7f8ce5	Require that all passwords are between 8 and 64 characters (#1239 ) - Require that all passwords are between 8 and 64 characters - Fixes account settings password reset form to only trigger logged-in event after successful password change. - Password validation can be extended within the UserManager's validate_password method to add or modify requirements. - Add tests for password validation	2023-10-03 18:57:46 -07:00
Tessa Walsh	b1ead614ee	Add --failOnFailedSeed checkbox to URL list workflows (#1236 ) - If set, and any of the seeds fail, the entire crawl is marked as a failure. - Add checkbox which adds --failOnFailedSeed to URL list workflows - Add 'Fail Crawl On Failed URL' to crawl workflow setup docs	2023-10-03 18:46:09 -07:00
sua yoo	4f36a94bc6	Update local dev docs (#1246 ) Suggest uncommenting backend_image and frontend_image to use local images	2023-10-03 17:05:21 -04:00
Tessa Walsh	e9bac4c088	API delete endpoint improvements (#1232 ) - Applies user permissions check before deleting anything in all /delete endpoints - Shuts down running crawls before deleting anything in /all-crawls/delete as well as /crawls/delete - Splits delete_list.crawl_ids into crawls and upload lists at same time as checks in /all-crawls/delete - Updates frontend notification message to "Only org owners can delete other users' archived items." when a crawler user attempts to delete another user's archived items	2023-10-03 13:05:00 -07:00
sua yoo	df190e12b9	Show running workflow error logs (#1224 ) - Adds "Logs" tab to workflow detail - Shows error logs in expandable section in "Watch" tab - Show corresponding message (no logs yet or logs temporarily unavailable) when `/errors` returns 503 based on crawl state - text tweaks: use error logs instead of logs, change 'crawl start' -> 'crawl continue' in log message --------- Co-authored-by: Ilya Kreymer <ikreymer@gmail.com>	2023-10-03 00:03:21 -07:00
Anish Lakhwara	a2dbad35c3	feat: use is_bool to check EMAIL_SMTP_USE_TLS (#1231 ) - use is_bool to check EMAIL_SMTP_USE_TLS - use is_bool for yaml values that are boolean	2023-10-02 21:29:36 -07:00
sua yoo	3fea4cabe2	Show storage meter even with no quota (#1240 ) - Displays how much storage items and browser profiles take up even when quota is not specified	2023-10-02 20:01:39 -07:00
sua yoo	941a75ef12	Separate seeds into new endpoints (#1217 ) - Remove config.seeds from workflow and crawl detail endpoints - Add new paginated GET /crawls/{crawl_id}/seeds and /crawlconfigs/{cid}/seeds endpoints to retrieve seeds for a crawl or workflow - Include firstSeed in GET /crawlconfigs/{cid} endpoint (was missing before) - Modify frontend to fetch seeds from new /seeds endpoints with loading indicator --------- Co-authored-by: Tessa Walsh <tessa@bitarchivist.net>	2023-10-02 10:56:12 -07:00
Anish Lakhwara	1bf531e1ec	Fix: Make Collections Public on Creation (#1213 ) - Add isPublic to Add Collection endpoint, send isPublic from frontend - Fixes #1212	2023-09-29 12:08:10 -07:00
sua yoo	90e3a300cc	"Add new" dialog for all resources (#1202 ) - Replaces individual "New" buttons in home page with dropdown button in header (includes Crawl Workflow, Upload Collection, Browser Profile) - Refactors required step of new workflow and new collection into dialog	2023-09-29 09:11:24 -07:00
Anish Lakhwara	037396f3d9	Fix: Stream log downloading from WACZ (#1225 ) * Fix(backend): Stream logs without causing OOM Also be smarter about when to use `heapq.merge` and when to use `itertools.chain`: If all the logs are coming from the same instance we `chain` them, otherwise we'll `merge` them iterator fixes: - group wacz files by instance by suffix, eg. -0.wacz, -1.wacz, -2.wacz - sort wacz files, and all logs within each wacz file - chain log iterators for all log files within wacz group - merge log iterators across wacz files in different groups - add type hints to help keep track of iterator helper functions - add iter_lines() from botocore, use that for line parsing for simplicity --------- Co-authored-by: Ilya Kreymer <ikreymer@gmail.com>	2023-09-28 18:54:52 -07:00
Ilya Kreymer	d6bc467c54	improvements to redis pod: (#1219 ) - add liveness check/fix readiness check - ensure 'redis-cli ping' actually returns 'PONG', as exit code is 0 even on errors; this detects situations where redis is not available, such as due to max clients being reached - bump redis memory/cpu for now (until autoscaling/automatic adjustment is available)	2023-09-28 13:00:31 -07:00
Ilya Kreymer	7eac0fdf95	optimization: convert all uses of 'async for' to use iterator directly (#1229 ) - optimization: convert all uses of 'async for' to use iterator directly instead of converting to list to avoid unbounded size lists - additional cursor.to_list() to async for conversions for stats computation, simplify crawlconfigs stats computation --------- Co-authored-by: Tessa Walsh <tessa@bitarchivist.net>	2023-09-28 12:31:08 -07:00
Vinzenz Sinapius	cabf4ccc21	Disable `smtp_use_tls` with `false` instead of empty string (#1184 ) `smtp_use_tls = bool(os.environ.get("EMAIL_SMTP_USE_TLS", True))` would only disable tls when `EMAIL_SMTP_USE_TLS` is set to an empty string which is not intuitive	2023-09-28 12:10:20 -07:00
Henry Wilkinson	e93f195d59	fix: Right Align Copy Buttons & `<btrix-desc-list>` vertical `width: 100%` (#1177 ) * Reorders actions, adds tooltip - All copy buttons on the collection share dialog are now on the right side - Adds a tooltip to tell the user the button opens the link in a new tab * Make vertical `desc-list` items fill 100% width of their parent container - Allows for better placement of items within the container - Adds horizontal padding to info bars * Right align copy button in item details page	2023-09-28 12:08:27 -07:00
Ilya Kreymer	86a424af93	migration improvements: (#1228 ) * migration improvements + rerunning migrations: (fixes #1227) - avoid starting some workers while migration is still running - ensure workers that aren't performing migration await for migration to complete - backend will not be valid until migration is run * allow rerunning migration from specified version via --set rerun_from_migration=<VERSION> (replaces rerun_last_migration)	2023-09-28 12:04:19 -07:00
Tessa Walsh	1f74f03447	Recalculate Organization.storedBytes in migration 0017 (#1220 )	2023-09-28 11:22:10 -07:00
Vinzenz Sinapius	9b125bc2c6	Passthrough X-Forwarded-Proto header in frontend nginx (#1226 ) If X-Forwarded-Proto header is already set, pass that through instead of setting to current scheme.	2023-09-28 10:58:57 -07:00
Tessa Walsh	7a56fa23f5	Remove username lookups for crawls and workflows by storing usernames in db (#1199 ) * store usernames (createdByName, modifiedByName, startedByName) in db for workflows * store userName for userid for crawls in db * update output models to return usernames * add migration 0018 to add usernames to existing crawls and crawlconfigs * updated tests for crawl and config usernames * use async for to iterate over crawls and crawlconfigs --------- Co-authored-by: Ilya Kreymer <ikreymer@gmail.com>	2023-09-28 09:37:23 -07:00
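
The log-streaming fix (037396f3d9) describes chaining log iterators within a group of WACZ files from the same crawler instance and merging across groups. The sketch below illustrates that idea only; it is not the project's code. The names `instance_suffix` and `combined_log_lines` are hypothetical, and it assumes each log line sorts lexicographically in time order (for example, because it begins with an ISO timestamp) and that each instance's logs are already in order.

```python
import heapq
import itertools
from typing import Dict, Iterable, Iterator, List


def instance_suffix(wacz_name: str) -> str:
    """Return the instance suffix of a WACZ filename, e.g. '0' for 'crawl-0.wacz'."""
    return wacz_name.rsplit(".wacz", 1)[0].rsplit("-", 1)[-1]


def combined_log_lines(logs_by_wacz: Dict[str, List[Iterable[str]]]) -> Iterator[str]:
    """Lazily combine log lines from many WACZ files (hypothetical helper).

    logs_by_wacz maps a WACZ filename to its log files, each an iterable of lines.
    """
    # Group the sorted WACZ names by instance suffix (-0.wacz, -1.wacz, ...).
    groups: Dict[str, List[str]] = {}
    for name in sorted(logs_by_wacz):
        groups.setdefault(instance_suffix(name), []).append(name)

    # Within one instance the logs are sequential, so chaining is enough.
    per_instance = [
        itertools.chain.from_iterable(
            log for name in names for log in logs_by_wacz[name]
        )
        for names in groups.values()
    ]

    # A single instance needs no merge; several instances are interleaved
    # with heapq.merge, which holds only one pending line per iterator.
    if len(per_instance) == 1:
        return per_instance[0]
    return heapq.merge(*per_instance)
```

Both `itertools.chain.from_iterable` and `heapq.merge` consume their inputs lazily, so memory use stays bounded by the number of open iterators rather than the total log size.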

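The `smtp_use_tls` commit (cabf4ccc21) notes that `bool(os.environ.get("EMAIL_SMTP_USE_TLS", True))` disables TLS only when the variable is an empty string. A minimal sketch of why, with one possible alternative: `env_flag` is a hypothetical helper written for illustration, and the set of accepted "true" spellings is an assumption, not the project's actual behavior.

```python
import os


def env_flag(name: str, default: bool = True) -> bool:
    """Parse a boolean environment variable (hypothetical helper).

    Unset variables fall back to `default`; otherwise only a small,
    assumed set of spellings counts as true.
    """
    value = os.environ.get(name)
    if value is None:
        return default
    return value.strip().lower() in ("1", "true", "yes", "on")


os.environ["EMAIL_SMTP_USE_TLS"] = "false"

# Any non-empty string is truthy, so the original expression keeps TLS on.
naive = bool(os.environ.get("EMAIL_SMTP_USE_TLS", True))  # True

# Comparing the text itself gives the intuitive result.
parsed = env_flag("EMAIL_SMTP_USE_TLS")  # False
```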

972 Commits