Commit Graph

  • 5b8440f295
    Fix frontend test mocks (#23) sua yoo 2021-11-22 16:31:35 -0800
  • 5722909157
    Frontend responsive UI tweaks (#20) sua yoo 2021-11-22 10:25:34 -0800
  • 14f2d13a73
    Add frontend localization support (#18) sua yoo 2021-11-20 07:44:21 -0800
  • 76e5ceb864
    Replace daisy UI with shoelace (#16) sua yoo 2021-11-19 19:38:58 -0800
  • 316a91f612
    Switch frontend to use Typescript (#12) sua yoo 2021-11-19 14:07:13 -0800
  • 29a3c29b2c
    Configure API base URL in frontend (#14) sua yoo 2021-11-19 13:30:51 -0800
  • 0f97724ad0
    Set up frontend dev tooling (#6) sua yoo 2021-11-18 17:26:10 -0800
  • 57a4b6b46f
    add collections api:
    - collections defined by name per archive
    - can update collections with additional metadata (currently just description)
    - crawl config api accepts a list of collections by name, resolved to collection uids and stored in config
    - finished crawls also associated with collection list
    - /archives/{aid}/collections/{name} can list all crawl artifacts (wacz files) from a named collection (in frictionless data package-ish format)
    - /archives/{aid}/collections/$all lists all crawled artifacts for the archive
    Ilya Kreymer 2021-10-27 09:25:58 -0700
  • 666becdb65
    initial pass on frontend: using tailwindcss + daisyui + litelement with webpack build + dev server
    Ilya Kreymer 2021-10-10 12:22:36 -0700
  • c38e0b7bf7
    use redis-based queue instead of url for crawl done webhook
    - update docker setup to support redis webhook
    - add consistent CRAWL_ARGS, additional fixes
    Ilya Kreymer 2021-10-10 12:18:28 -0700
  • 4ae4005d74
    add ingress + nginx container for better routing
    - support screencasting to dynamically created service via nginx (k8s only thus far)
    - add crawl /watch endpoint to enable watching; creates service if it doesn't exist
    - add crawl /running endpoint to check if crawl is running
    - nginx auth check in place, but not yet enabled
    - add k8s nginx.conf, add missing chart files
    - file reorg: move docker config to configs/
    - k8s: add readiness check for nginx and api containers for smoother reloading
    - ensure service deleted along with job
    todo: update dockerman with screencast support
    Ilya Kreymer 2021-10-09 22:41:01 -0700
  • 19879fe349
    Storage + Data Model Refactor (fixes #3):
    - Add default vs custom (s3) storage
    - K8S: All storages correspond to secrets
    - K8S: Default storages inited via helm
    - K8S: Custom storage results in custom secret (per archive)
    - K8S: Don't add secret per crawl config
    - API for changing storage per archive
    - Docker: default storage just hard-coded from env vars (only one for now)
    - Validate custom storage via aiobotocore before confirming
    - Data Model: remove usage from users
    - Data Model: support adding multiple files per crawl for parallel crawls
    - Data Model: track completions for parallel crawls
    - Data Model: initial support for tags per crawl, add collection as 'coll' tag
    Ilya Kreymer 2021-10-09 16:51:19 -0700
  • b6d1e492d7
    add redis for storing crawl state data!
    - supported in both docker and k8s
    - additional pods with same job id automatically use same crawl state in redis
    - support dynamic scaling (#2) via /scale endpoint
    - k8s job parallelism adjusted dynamically for running job (only supported in k8s so far)
    Ilya Kreymer 2021-09-17 15:02:11 -0700
  • 223658cfa2
    misc tweaks:
    - better error handling for not-found resources, ensure 404
    - fix typo in k8smanager
    - add pylintrc
    - ensure manual jobs are deleted when complete
    - fix typos, reformat
    Ilya Kreymer 2021-08-25 18:20:17 -0700
  • f1a816be48
    add README + docker-restart.sh
    Ilya Kreymer 2021-08-25 16:27:22 -0700
  • 9a3356ad0d
    add missing scheduler!
    Ilya Kreymer 2021-08-25 16:18:53 -0700
  • 3b956dd537
    add sample config for docker
    Ilya Kreymer 2021-08-25 16:17:37 -0700
  • ee20c659e9
    add basic README
    Ilya Kreymer 2021-08-25 16:13:06 -0700
  • 36fb01cbdf
    docker-compose: use fixed network name
    Ilya Kreymer 2021-08-25 16:04:34 -0700
  • 60b48ee8a6
    dockermanager + scheduler:
    - run as child process using aioprocessing
    - cleanup: support cleanup of orphaned containers
    - timeout: support crawlTimeout via check in cleanup loop
    - support crawl listing + crawl stopping
    Ilya Kreymer 2021-08-25 15:28:57 -0700
  • b417d7c185
    docker manager: support scheduling with apscheduler and separate 'scheduler' process
    Ilya Kreymer 2021-08-25 12:21:03 -0700
  • 91e9fc8699
    dockerman: initial pass
    - support for creating, deleting crawlconfigs, running crawls on-demand
    - config stored in volume
    - listen to docker events and clean up containers when they exit
    Ilya Kreymer 2021-08-24 22:49:06 -0700
  • 20b19f932f
    make crawlTimeout a per-crawlconfig property
    - allow crawl complete/partial complete to update existing crawl state, e.g. timeout
    - enable handling backofflimitexceeded / deadlineexceeded failure, with possible success able to override the failure state
    - filter to only active jobs in running crawls listing
    Ilya Kreymer 2021-08-24 11:27:34 -0700
  • ed27f3e3ee
    job handling:
    - job watch: add watch loop for job failure (backofflimitexceeded)
    - set job retries + job timeout via chart values
    - sigterm starts graceful shutdown by default, including for timeout
    - use sigusr1 to switch to instant shutdown
    - update stop_crawl() to use new semantics
    Ilya Kreymer 2021-08-23 21:19:21 -0700
  • 7146e054a4
    crawls work (#1):
    - support listing existing crawls
    - add 'schedule' and 'manual' annotations to jobs, store in Crawl obj
    - ensure manual jobs are deleted when completed
    - support deleting crawls by id (but not data)
    - rename running crawl delete to '/cancel'
    Ilya Kreymer 2021-08-23 17:57:16 -0700
  • 66c4e618eb
    crawls work (#1), support for:
    - canceling a crawl (via sigterm)
    - stopping a crawl gracefully (via custom exec sigint)
    Ilya Kreymer 2021-08-23 12:25:04 -0700
  • a8255a76b2
    crawljob:
    - support run once on existing crawl job
    - support updating/patching existing crawl job with new crawl config, new schedule, and run once
    Ilya Kreymer 2021-08-21 22:10:31 -0700
  • ea9010bf9a
    add completed crawls to crawls table
    Ilya Kreymer 2021-08-20 23:52:24 -0700
  • 4b08163ead
    support usage counters per archive, per user -- handle crawl completion
    Ilya Kreymer 2021-08-20 23:05:42 -0700
  • 170958be37
    rename crawls -> crawlconfigs.py; add crawls for crawl api management
    Ilya Kreymer 2021-08-20 12:09:28 -0700
  • f2d9d7ba6a
    new features:
    - sending email for validation + invites, configured via env vars
    - inviting new users to join an existing archive
    - /crawldone webhook to track/verify crawl id (next: store crawl complete entry)
    Ilya Kreymer 2021-08-20 11:02:29 -0700
  • 627e9a6f14
    cleanup crawl config, add separate 'runNow' field
    - crawler: add cpu/memory limits
    - minio: auto-create bucket for local minio
    Ilya Kreymer 2021-08-19 14:15:21 -0700
  • eaa87c8b43
    support for user roles (owner, crawler, viewer); owner users can issue invites by email to other existing users to join existing archives
    Ilya Kreymer 2021-08-18 20:34:24 -0700
  • 61a608bfbe
    update models:
    - replace storages with archives, which have a single storage (for now)
    - crawls associated with archives
    - users belong to archive, with one admin user (if archive created by default)
    - update crawlconfig for latest browsertrix-crawler (0.4.4)
    - k8s: fix permissions for crawler role
    - k8s: fix minio service (now requiring two ports)
    Ilya Kreymer 2021-08-18 16:49:41 -0700
  • f77eaccf41
    support committing to s3 storage
    - move mongo into separate optional deployment along with minio
    - support for configuring storages
    - support for deleting crawls, associated config and secrets
    Ilya Kreymer 2021-07-02 15:56:24 -0700
  • a111bacfb5
    add k8s support
    - working apis for adding crawls, removing crawls in mongo, mapped to k8s cronjobs
    - more complete crawl spec
    - option to start on-demand job from cronjobs
    - optional minio in separate deployment/service
    Ilya Kreymer 2021-06-30 21:48:44 -0700
  • c3143df0a2
    rename archives -> storages
    - add crawlconfig apis
    - run lint pass, prep for k8s / docker crawl manager support
    Ilya Kreymer 2021-06-29 20:30:33 -0700
  • b08a188fea
    initial commit!
    Ilya Kreymer 2021-06-28 15:48:59 -0700