browsertrix

Author	SHA1	Message	Date
sua yoo	8c4348b9f8	Show exclusion editor when creating & editing crawl templates (#353 )	2022-11-14 19:34:15 -08:00
sua yoo	d41b582ef6	Remove exclusion from running crawl (#352 )	2022-11-14 10:58:33 -08:00
Ilya Kreymer	793611e5bb	add exclusion api, fixes #311 (#349 ) * add new apis: `POST crawls/{crawl_id}/exclusion?regex=...` and `DELETE crawls/{crawl_id}/exclusion?regex=...` which will: - create a new config with 'regex' added as an exclusion (deleting or deactivating the previous config) or removed as an exclusion - update crawl to point to new config - update statefulset to point to new config, causing crawler pods to restart - filter out urls matching 'regex' from both queue and seen list (currently a bit slow) (when adding only) - return 400 if the exclusion already exists when adding, or doesn't exist when removing - api reads redis list in reverse to match how exclusion queue is used	2022-11-12 17:24:30 -08:00
sua yoo	95ec1599ef	Add exclusion to running crawl (#347 )	2022-11-08 18:09:11 -06:00
sua yoo	baacbbdc52	Highlight regular expression syntax in Exclusions Table (#341 )	2022-11-01 15:31:01 -07:00
Ilya Kreymer	d340bceb39	style pass: normalize docstring spacing	2022-10-19 21:47:34 -07:00
Ilya Kreymer	dde4c5ee68	k8s chart: - ingress: use separate ingress for authsign to allow ssl-redirect true on main ingress - mongo: local: disable readiness check for now due to issues with eval command	2022-10-15 13:46:31 -07:00
lasztoth	5ccacd8a16	Fixed issue with missing variables affecting deployments on Podman (#338 )	2022-10-14 18:06:51 -07:00
sua yoo	97eb17784d	Display exclusions & list of URLs in crawl queue (#337 ) - including pagination of queue results (30 results per page currently) - show numbering on paginated results - allow user navigation to each result page	2022-10-12 20:19:13 -07:00
Brian Grinstead	cbc0876184	Remove duplicate step 2.5 (#339 )	2022-10-12 19:57:33 -07:00
Ilya Kreymer	f7836c345d	Crawl Queue API (#342 ) * crawl queue api work: (#329) - add /crawls/{crawl_id}/queue api to get crawl queue, with offset, count, and optional regex. returns results and regex matches within the results, along with total urls in queue. - add /crawls/{crawl_id}/queueMatch api to match entire crawl queue, with query 'regex' arg, which processes entire crawl queue on backend and returns a list of matches (more experimental) - if crawl not yet started / redis not available, return empty queue - only supported for k8s deployment at the moment	2022-10-12 19:56:13 -07:00
sua yoo	8708c24a74	Improve crawl elapsed time UX (#323 ) Smoother elapsed crawl timer: - Crawls list: show seconds increment up to 2 minutes, then show minutes only - Crawls detail: show seconds increment up to one day	2022-10-05 21:12:31 -04:00
sua yoo	2bfbeab55f	build: copy ts declaration file	2022-10-04 13:38:12 -07:00
sua yoo	0bbb7905bd	Add crawl queue editor UI components (#331 ) WIP #304	2022-10-04 13:13:40 -07:00
Ilya Kreymer	ef7a7e538c	backend: consider crawl complete if pages crawled exceeds pages found, due to retries, should fix #306 (will look at cause separately in crawler)	2022-09-28 11:32:08 -07:00
sua yoo	e696104ffa	Update crawl template copy (#325 )	2022-09-27 19:49:24 -07:00
sua yoo	709936dfa7	hotfix: decrease size of running crawl action button	2022-09-27 19:09:49 -07:00
sua yoo	63ada3e5b3	Update base fonts and text sizes (#327 )	2022-09-27 14:32:57 -07:00
sua yoo	94e3dff27f	update sentry CDN script	2022-09-27 12:29:02 -07:00
sua yoo	20bd8ceecb	Fix browser profile table alignment (#322 )	2022-09-26 17:14:08 -07:00
Ed Summers	6e9fd96a64	Allow for "custom" scopeType (#324 ) At the moment picking "custom" yields a UI error: ``` scopeType: value is not a valid enumeration member; permitted: 'page', 'page-spa', 'prefix', 'host', 'domain', 'any' ```	2022-09-22 19:17:32 -07:00
Ilya Kreymer	447b0bf9b9	k8s chart + values tweak: (#317 ) - mongo chart to avoid requiring username/password if passing db_url - tweaks to default values (set registration enabled by default, longer) add missing options	2022-09-21 12:45:08 -07:00
sua yoo	2ebd1eb2f6	Continue to watch crawl while stopping (#316 ) * show when running * redirect after done * show banner that crawl is stopping	2022-09-21 12:39:00 -07:00
Ilya Kreymer	6b63b72a13	backend config tweaks: - send SIGUSR2 instead of SIGUSR1 for scale down - chart: move persistentVolumeClaimRetentionPolicy to correct place in chart	2022-09-16 16:28:31 -07:00
Ilya Kreymer	2531a03e41	fix stopping crawls + profiles: (fixes #298 ) (#309 ) - regression fix: ensure correct signals are set to stop crawl (SIGUSR1 + SIGTERM) - crawl stop: if crawl is still running after 60 seconds, allow signal to be resent - regression fix: ensure crawling with profile is working in k8s	2022-09-09 18:31:43 -07:00
Ilya Kreymer	1216f6cb66	K8s: update chart for local minio + mongo default (#301 ) * k8s chart fixes: mongo: pin to 5.0.11 version for now minio: create empty dir for local storage for now instead of using mc, use 'btrix-data' as bucket name	2022-09-02 13:07:47 -07:00
Ilya Kreymer	f0c079dc1b	k8s: update default images to dev images in values.yaml	2022-09-01 16:18:15 -07:00
Francis Kayiwa	487110eca3	Deployment: Add Ansible setup to deploy with microk8s (#296 ) - adds an ansible/ directory for management deployments, starting with microk8s - has a microk8s role we will need to add workers - has a playbook with variables that can install most places	2022-08-19 12:49:21 -07:00
Ilya Kreymer	2842ca6a06	quick fix: add back USER_ID to crawlconfig config map, needed by crawler pod	2022-08-11 17:33:12 -07:00
Ilya Kreymer	3859be009a	k8s: don't add entire crawl config as env var from configmap, add only specified env vars from configmap fix issue with crawls with large number of seeds failing due to unusually large env var	2022-08-11 15:33:22 -07:00
sua yoo	319a8a3c07	make clearer that profile selection is optional and that a default profile is used by default (#290 ) - Rename 'Select Profile' -> 'Default Profile' - Rename 'No Profiles' -> 'No Additional Profiles'	2022-08-10 15:54:39 -07:00
sua yoo	ee6161ad43	Frontend browser profile editor enhancements (#288 ) - add button to duplicate profile from main view - add save / cancel button when editing - change location of 'full screen' button	2022-08-10 15:51:34 -07:00
Ilya Kreymer	b11a5f136a	profile browser deletion/removal: - ensure profile browser DELETE command is working - ensure profile browser job expires if no initial ping - logging: print exception for base job if init fails	2022-08-02 18:31:33 -07:00
Ilya Kreymer	df905682a5	backend: fix scaling api response, return error details if available	2022-06-29 18:37:04 -07:00
sua yoo	9606d59c3d	Improve format of crawl template config error from server (#281 ) * better display of api errors, such as fields missing or invalid urls, addresses #280	2022-06-29 17:57:03 -07:00
sua yoo	301b05ff4e	Refactor screencast websocket connection and retry (#276 ) * replace ip with index and retry connection, fixes #252	2022-06-29 17:55:32 -07:00
Ilya Kreymer	2717a60763	improvements / bug fixes for stop/cancel handling: (#279 ) - only send signal if stopping, no need for canceling as pods/containers will be removed - refactor stop/cancel handling to be unified in manager, separate in job - when stopping / graceful shutdown, return false if sending signal fails - return success=true in json response if and only if stop/cancel actually succeeds, return 'error' message in error, should fix #270 - allow canceling after stopping / if stopping fails - ensure finished time is set in case of cancelation before crawl starts, should fix #273	2022-06-29 17:47:25 -07:00
sua yoo	1c52902ea0	Update crawl scale label for UI consistency (#275 ) closes #254	2022-06-29 16:14:03 -07:00
Ilya Kreymer	50c525853f	validation: ensure seed urls, and other url properties, are validated on POST by using pydantic HttpUrl type, fixes #277 (#278 )	2022-06-29 16:09:32 -07:00
Ilya Kreymer	3fec2a9f82	dev server: also proxy /data directory for testing replay from a remote instance locally (#266 ) API_BASE_URL will need to be set to 'http://remote.example.com/' instead of 'http://remote.example.com/api/'	2022-06-29 15:47:20 -07:00
sua yoo	92292591ad	Re-run crawl from detail view + handle inactive crawl template (#268 ) closes #253	2022-06-29 14:17:09 -07:00
sua yoo	d144591dbf	Display & edit crawl schedule in user local time (#271 ) closes #255	2022-06-27 13:01:20 -07:00
sua yoo	c2aa4e6319	Fix AM/PM toggle (#272 )	2022-06-23 16:35:47 -07:00
sua yoo	c2be1a27ce	Handle stopping state in UI (#269 ) closes #262	2022-06-23 16:35:03 -07:00
Ilya Kreymer	c6d9e7d612	config sample: switch back to browsertrix-crawler:latest for now	2022-06-17 13:39:45 -07:00
Ilya Kreymer	37ea3ed2af	config/scripts: - additional fixes to signing.yml config - add missing 'set -o allexport' before 'source'	2022-06-16 22:36:44 -07:00
Ilya Kreymer	b9d7907ab3	Single config and env vars (#267 ) * simplify back to single config.env! - back to good ole env vars! - remove shared secret, which made it difficult to have scheduled crawls, since secrets are immutable, so could not update config if a scheduled crawl existed :/ - all env vars unified in configs/config.env - run-swarm.sh and run-pod.sh 'source' this config - remove config.sample.yaml - customize minio volume dir via config.env - customize redis port via config.env - include authsign ports in debug-ports config	2022-06-16 21:50:03 -07:00
raffaele messuti	70767f0ac2	small fixes on docker swarm deployment (#265 ) * fix COPY with multiple files * Update Deployment.md	2022-06-16 19:56:40 -07:00
sua yoo	b40765134c	Re-run crawl from crawls list view (#264 ) * run crawl from crawls list, and show link to newly started crawl * if crawl is already running, show link to previously running crawl	2022-06-15 18:54:57 -07:00
sua yoo	a8757e2e50	Screencast UX enhancements (#251 ) * animate starting state * consistent fixed-size slots for each browser (url + screencast) * add tooltip for expected number of browsers (workers x scale)	2022-06-15 18:50:14 -07:00


558 Commits