Commit Graph

146 Commits

Author SHA1 Message Date
Ilya Kreymer
d2d7240455
background jobs fix: ensure bucket is parsed correctly (#1359)
Follow-up to #1321 
- correctly parse the endpoint_url into prefix and bucket path
- also add region and s3 provider type to storage secrets
2023-11-08 15:08:23 -08:00
Ilya Kreymer
5530ca92e1
Move backend app templates to be installed from configmap volume (#1331)
Instead of adding the app templates launched from the backend via
`backend/btrixcloud/templates`, add them to a configmap and mount the
configmap in the same location.

This allows these templates to be updated, like other values in
charts/... without having to rebuild any of the images, speeding up dev
and maintenance time.

Changes include:
- move backend/btrixcloud/templates -> chart/app-templates/
- add app-templates/*.yaml to app-templates configmap
- mount app-templates configmap to /app/btrixcloud/templates/ in api and op containers
2023-11-06 09:37:48 -08:00
Francesco Servida
0b8bbcf8e6
Allow User to specify custom cluster-issuer (#1332)
Implemented variable and defaults for cluster-issuer to allow users to
specify, if needed, their own cluster issuer. (eg. installations with
only outbound traffic that cannot solve ACME https challenge)
2023-11-04 13:29:17 -07:00
Francesco Servida
4998274ab0
correctly suffix Auth-Signer url when running in custom namespace (#1335) 2023-11-04 10:34:05 -07:00
Ilya Kreymer
fb3d88291f
Background Jobs Work (#1321)
Fixes #1252 

Supports a generic background job system, with two background jobs,
CreateReplicaJob and DeleteReplicaJob.
- CreateReplicaJob runs on new crawls, uploads, profiles and updates the
`replicas` array with the info about the replica after the job succeeds.
- DeleteReplicaJob deletes the replica.
- Both jobs are created from the new `replica_job.yaml` template. The
CreateReplicaJob sets secrets for primary storage + replica storage,
while DeleteReplicaJob only needs the replica storage.
- The job is processed in the operator when the job is finalized
(deleted), which should happen immediately when the job is done, either
because it succeeds or because the backoffLimit is reached (currently
set to 3).
- /jobs/ api lists all jobs using a paginated response, including filtering and sorting
- /jobs/<job id> returns details for a particular job
- tests: nightly tests updated to check create + delete replica jobs for crawls as well as uploads, job api endpoints
- tests: also fixes to timeouts in nightly tests to avoid crawls finishing too quickly.

---------
Co-authored-by: Tessa Walsh <tessa@bitarchivist.net>
2023-11-02 13:02:17 -07:00
Ilya Kreymer
6dc452ebad
Storage Refactor: Replication + Custom Storage Support (#1296)
- Refactors storage to support replicas + custom storages on the Org.
- There is a default primary + replica storage, while an Org can also have
primary and replica storages.
- StorageRef object is used to store references to default and custom
storage.

- CrawlFile has been updated to contain a StorageRef instead of a
def_storage_name, which references
either a default storage (in StorageOps) or custom storage (in
Organization)
- There is also a 'replicas' Optional[List[StorageRef]] which contains
replicas, if any.
- CrawlFileOut contain a numReplicas for how many replicas exist for
a given file.
- Migration: migration 0020 added to migrate existing Orgs, CrawlFile and ProfileFile objects to new storage system (CrawlFile and ProfileFile now extend BaseFile)


Part of #1262

---------
Co-authored-by: Tessa Walsh <tessa@bitarchivist.net>
2023-10-26 21:44:09 -07:00
Tessa Walsh
d58747dfa2
Provide full resources in archived items finished webhooks (#1308)
Fixes #1306 

- Include full `resources` with expireAt (as string) in crawlFinished
and uploadFinished webhook notifications rather than using the
`downloadUrls` field (this is retained for collections).
- Set default presigned duration to one minute short of 1 week and enforce
maximum supported by S3
- Add 'storage_presign_duration_minutes' commented out to helm values.yaml
- Update tests

---------
Co-authored-by: Ilya Kreymer <ikreymer@gmail.com>
2023-10-23 19:01:58 -07:00
Anish Lakhwara
834fa72baf
Refactor microk8s playbook to follow "new" structure (#1264)
* Refactor microk8s playbook to follow structure with shared roles

- Integrates with btrix/deploy role for deploying
- Seperated RedHat and Debian into seperate roles
- Created Common role

- allow running remotely by default
- use 'browsertrix_cloud_home' for charts path
- add additional customizable options to btrix_values.j2 (todo: unify all the templates)
- docs: update to new playbook path

---------
Co-authored-by: Ilya Kreymer <ikreymer@gmail.com>
2023-10-11 19:33:30 -07:00
Ilya Kreymer
16e7a1d0a2
Storage Ops Refactor (#1257)
* storage ops refactor:
- create StorageOps class similar to other ops classes
- init storages list in StorageOps, no longer require lookup up default storages via CrawlManager
- convert all storage functions to members, add storageops to operator
- remove unused params, ensure crawl exists for rollover restart
- add env var to determine if using local minio to use correct endpoint URL

* crawls /seeds endpoint: just return empty list if not a crawl (eg. upload)

* crawlmanager: remove unused code, rename check_storage -> has_storage
2023-10-10 15:04:23 -07:00
Ilya Kreymer
fa86555eed
Track pod resource usage, detect OOM crashes, handle auto-scaling (#1235)
* keep track of per pod status on crawljob:
- crashes time, and reason
- 'used' vs 'allocated' resources 
- 'percent' used / allocated

* crawl log errors: log error when crawler crashes via OOM, either via redis error log
or to console

* add initial autoscaling support!
- detect if metrics server is available via K8SApi.is_pod_metrics_available()
- if available, use metrics for 'used' fields
- if no metrics, set memory used for redis only (using redis apis)
- allow overriding memory and cpu via newMemory and newCpu settings on pod status
- scale memory / cpu based on newMemory and newCpu setting
- templates: update jinja templates to allow restarting crawler and redis with new resources
- ci: enable metrics-server on k3d, microk8s and nightly k3d ci runs

* roles: cleanup unused roles, add permissions for listing metrics

* stats for running crawls:
- update in db via operator
- avoids losing stats if redis pod happens to be done
- tradeoff is more db access in operator, but less extra connections to redis + already
loading from db in backend
- size stat: ensure size of previous files is added to the stats

* crawler deployment tweaks:
- adjust cpu/mem per browser
- add --headless flag to configmap to use new headless mode by default!
2023-10-05 20:41:18 -07:00
Ilya Kreymer
86a424af93
migration improvements: (#1228)
* migration improvements + rerunning migrations: (fixes #1227)
- avoid starting some workers while migration is still running
- ensure workers that aren't performing migration await for migration to complete
- backend will not be valid until migration is run
* allow rerunning migration from specified version via --set rerun_from_migration=<VERSION> (replaces rerun_last_migration)
2023-09-28 12:04:19 -07:00
Ilya Kreymer
c9c39d47b7
Scheduled Crawl Refactor: Handle via Operator + Add Skipped Crawls on Quota Reached (#1162)
* use metacontroller's decoratorcontroller to create CrawlJob from Job
* scheduled job work:
- use existing job name for scheduled crawljob
- use suspended job, set startTime, completionTime and succeeded status on job when crawljob is done
- simplify cronjob template: remove job_image, cron_namespace, using same namespace as crawls,
placeholder job image for cronjobs

* move storage quota check to crawljob handler:
- add 'skipped_quota_reached' as new failed status type
- check for storage quota before checking if crawljob can be started, fail if not (check before any pods/pvcs created)

* frontend:
- show all crawls in crawl workflow, no need to filter by status
- add 'skipped_quota_reached' status, show as 'Skipped (Quota Reached)', render same as failed

* migration: make release namespace available as DEFAULT_NAMESPACE, delete old cronjobs in DEFAULT_NAMESPACE and recreate in crawlers namespace with new template
2023-09-12 13:05:43 -07:00
Ilya Kreymer
ad9bca2e92
Operator refactor to control pods + pvcs directly instead of statefulsets (#1149)
- Ability for pod to be Completed, unlike in Statefulset - eg. if 3 pods are running and first one finishes, all 3 must be running until all 3 are done. With this setup, the first finished pod can remain in Completed state.
- Fixed shutdown order - crawler pods now correctly shutdown first before redis pods, by switching to background deletion.
- Pod priority decreases with scale: 1st instance of a new crawl can preempt 3rd or 2nd instance of another crawl
- Create priority classes upto 'max_crawl_scale, configured in values.yaml
- Improved scale change reconciliation: if increasing scale, immediately scale up. If decreasing scale,
graceful stop scaled-down instance to complete via redis 'stopone' key, wait until they exit with Completed state
before adjust status.scale / removing scaled down pods. Ensures unaccepted interrupts don't cause scaled down data to be deleted.
- Redis pod remains inactive until crawler is first active, or after no crawl pods are active for 60 seconds
- Configurable Redis storage with 'redis_storage' value, set to 3Gi by default
- CrawlJob deletion starts as soon as post-finish crawl operations are run
- Post-crawl operations get their own redis instance, since one during response is being cleaned up in finalizer
- Finalizer ignores request with incorrect state (returns 400 if reported as not finished while crawl is finished)
- Current resource usage added to status
- Profile browser: also manage single pod directly without statefulset for consistency.
- Restart pods via restartTime value: if spec.restartTime != status.restartTime, clear out pods and update status.restartTime (using OnDelete policy to avoid recreate loops in edge cases).
- Update to latest metacontroller (v4.11.0)
- Add --restartOnError flag for crawler (for browsertrix-crawler 0.11.0)
- Failed crawl logging: dd 'fail_crawl()' to be used for failing a crawl, which prints logs for default container (if enabled) as well as pod status
- tests: check other finished states to avoid stuck in infinite loop if crawl fails
- tests: disable disk utilization check, which adds unpredictability to crawl testing!
fixes #1147 

---------
Co-authored-by: Tessa Walsh <tessa@bitarchivist.net>
2023-09-11 10:38:04 -07:00
Anish Lakhwara
e57148d0e9
feat: add SMTP {port, use_tls} config (#1142)
* feat: add SMTP {port, use_tls} config
* If `password` is None don't attempt to log in
* remove 'can be omitted' comment

---------
Co-authored-by: Ilya Kreymer <ikreymer@users.noreply.github.com>
2023-09-08 08:18:36 -07:00
Ilya Kreymer
2967f1e320
ingress: simplify ingress config: (fixes #1135) (#1146)
* ingress: simplify ingress config: (fixes #1135)
- use standard Prefix pathTypes
- remove nginx-specific rewriting
- remove 'scheme', use https/http based on 'tls' setting (in ingress and configmap)
- fix signing ingress to use ingressClassName
2023-09-07 09:51:48 -07:00
Ilya Kreymer
68bc053ba0
Print crawl log to operator log (mostly for testing) (#1148)
* log only if 'log_failed_crawl_lines' value is set to number of last lines to log
from failed container

---------
Co-authored-by: Tessa Walsh <tessa@bitarchivist.net>
2023-09-06 17:53:02 -07:00
Ilya Kreymer
38f596fd81
chart: move minio credentials to separate secret, part of #490 (#1143) 2023-09-06 17:35:30 -07:00
Ilya Kreymer
dce1ae6129
better resources scaling by number of browsers per crawler container (#1103)
- set crawler cpu / memory with fixed base + incremental bumps based on number of browsers
- allow parsing k8s quantities with parse_quantity, compute in operator
- set 'crawler_cpu = crawler_cpu_base + crawler_extra_cpu_per_browser * (num_browsers - 1)'
and same for memory
2023-09-06 01:42:44 -04:00
Ilya Kreymer
6dca2f1c03
supports overriding the replayweb.page version without having to be r… (#1122)
* supports overriding the replayweb.page version without having to be rebuild frontend image:
- ensures 'rwp_base_url' from helm chart is passed to nginx
- ensures both ui.js and sw.js are loaded based on nginx environment variable, not hard-coded
- ui.js loaded via redirect from new /replay/ui.js path
- pin RWP to known working release in default values.yaml
- remove RWP_BASE_URL from Dockerfile, no longer needed, set via chart env var
- set default RWP_BASE_URL for devserver to use CDN
- set RWP version to 1.8.11
2023-09-05 20:10:21 -04:00
Ilya Kreymer
a9ab17fc61
publish helm chart on release (fixes #1114) (#1117) (#1123)
- no longer using :latest by default in values.yaml, instead updating version with each release
- set chart version to match app version in Chart.yaml
- update version in helm chart and values.yaml as part of update-version.sh script
- update test.yaml and local-config.yaml to enable using :latest tag images
- ci: add ci script for packaging current helm chart
- docs: updates docs to indicate deploying directly from GitHub release
- docs: add script to fill in latest version for 'VERSION' using custom script
- chart: set local_service_port to 30870 by default, but use only if no ingress.
- default values.yaml set up for local deployment, local-config.yaml contains additional commented out examples
- ci draft: add deployment info to draft with helm install command for current version
- test: fix password check test
2023-08-30 12:02:02 -07:00
Ilya Kreymer
989ed2a8da
Use Shared Services for Crawling, Redis, Profile Browsers (#1088)
* refactor to use shared role-based service shared across pods:
- 'crawler' service for all crawler screencasting, scales 0 .. N with crawler-<ID>-N.crawl
- 'redis' service for all redis access, redis-<ID>-0.redis
- 'browser' service for all browser access (profile browsers), browser-<ID>-0.browser
- don't create a new service per crawl/profile at all
- enable 'publishNotReadyAddresses' for potentially faster resolving, esp for redis
- remove service as type managed by operator as no longer creating services dynamically
- remove frontend var CRAWLER_SVC_SUFFIX, suffix always '.crawler' to match crawler service name
2023-08-24 20:08:53 -07:00
Ilya Kreymer
63b776bce8
ingress: minor tweaks to ingress to update to latest spec: (#1096)
- use pathType ImplementationSpecific for regexes
- use ingressClassName instead of annotation
2023-08-23 11:36:52 -07:00
Ilya Kreymer
9553115bbe
helm chart tweaks: (#1067)
* helm chart tweaks:
- lower mem requirements for backend and crawler
- disable cors in ingress to pass through cors headers from backend
- crawler statefulset: use ordered instead of parallel scaling policy to avoid single crawl taking up all crawling capacity quickly
2023-08-14 16:43:12 -07:00
Ilya Kreymer
7ea6d76f10
Resource Constraints Cleanup: (fixes #895) (#1019)
* resource constraints: (fixes #895)
- for cpu, only set cpu requests
- for memory, set mem requests == mem limits
- add missing resource constraints for minio and scheduled job
- for crawler, set mem and cpu constraints per browser, scale based on browser instances per crawler
- add comments in values.yaml for crawler values being multiplied
- default values: bump crawler to 650 millicpu per browser instance just in case

cleanup: remove unused entries from main backend configmap
2023-08-01 00:11:16 -07:00
Vinzenz Sinapius
5807507f29
Add proxy settings for crawler and profilebrowser (#997) 2023-07-26 16:11:10 -07:00
Ilya Kreymer
4bea7565bc
load handling: scale up redis only when crawler pods running (#1009)
Operator: Modified init behavior to only load redis when at least one crawler pod available:
- waits for at least one crawler pod to be available before starting redis pod, to avoid situation where many crawler pods are in pending mode, but redis pods are still running.
- redis statefulset starts at scale of 0
- once crawler pod becomes available, redis sts is scaled to 1 (via `initRedis==true` status)
- crawl remains in 'starting' or 'waiting_capacity' state until pod becomes available without redis pod running
- set to 'running' state only after redis and at least one crawler pod is available
- if no crawler pods available after running, or, if stuck in starting for >60 seconds, switch to 'waiting_capacity' state
- when switching to 'waiting_capacity', also scale down redis to 0, wait for crawler pod to become available, only then scale up redis to 1, and get back to 'running'

other tweaks:
- add new status field 'initRedis', default to false, not displayed
- crawler pod: consider 'ContainerCreating' state as available, as container will not be blocked by resource limits
- add a resync after 3 seconds when waiting for crawler pod or redis pod to become available, configurable via 'operator_fast_resync_secs'
- set_state: if not updating state, ensure state reflects actual value in db
2023-07-26 08:40:05 -07:00
Ilya Kreymer
d7cb47390e
readd support for passing in 'crawler_extra_args' for additional/custom (#957)
options not covered by standard crawler opts (removed setting all args this way in #889)
2023-07-07 12:08:40 -07:00
Ilya Kreymer
00eb62214d
Uploads API: BaseCrawl refactor + Initial support for /uploads endpoint (#937)
* basecrawl refactor: make crawls db more generic, supporting different types of 'base crawls': crawls, uploads, manual archives
- move shared functionality to basecrawl.py
- create a base BaseCrawl object, which contains start / finish time, metadata and files array
- create BaseCrawlOps, base class for CrawlOps, which supports base crawl deletion, querying and collection add/remove

* uploads api: (part of #929)
- new UploadCrawl object which extends BaseCrawl, has name and description
- support multipart form data data upload to /uploads/formdata
- support streaming upload of a single file via /uploads/stream, using botocore multipart upload to upload to s3-endpoint in parts
- require 'filename' param to set upload filename for streaming uploads (otherwise use form data names)
- sanitize filename, place uploads in /uploads/<uuid>/<sanitized-filename>-<random>.wacz
- uploads have internal id 'upload-<uuid>'
- create UploadedCrawl object with CrawlFiles pointing to the newly uploaded files, set state to 'complete'
- handle upload failures, abort multipart upload
- ensure uploads added within org bucket path
- return id / added when adding new UploadedCrawl
- support listing, deleting, and patch /uploads
- support upload details via /replay.json to support for replay
- add support for 'replaceId=<id>', which would remove all previous files in upload after new upload succeeds. if replaceId doesn't exist, create new upload. (only for stream endpoint so far).
- support patching upload metadata: notes, tags and name on uploads (UpdateUpload extends UpdateCrawl and adds 'name')

* base crawls api: Add /all-crawls list and delete endpoints for all crawl types (without resources)
- support all-crawls/<id>/replay.json with resources
- Use ListCrawlOut model for /all-crawls list endpoint
- Extend BaseCrawlOut from ListCrawlOut, add type
- use 'type: crawl' for crawls and 'type: upload' for uploads
- migration: ensure all previous crawl objects / missing type are set to 'type: crawl'
- indexes: add db indices on 'type' field and with 'type' field and oid, cid, finished, state

* tests: add test for multipart and streaming upload, listing uploads, deleting upload
- add sample WACZ for upload testing: 'example.wacz' and 'example-2.wacz'

* collections: support adding and remove both crawls and uploads via base crawl
- include collection_ids in /all-crawls list
- collections replay.json can include both crawls and uploads

bump version to 1.6.0-beta.2
---------

Co-authored-by: Tessa Walsh <tessa@bitarchivist.net>
2023-07-07 09:13:26 -07:00
Ilya Kreymer
4c8de3160b typo fix: fix extra trailing quote on CRAWL_ARGS in configmap.yaml 2023-06-16 18:55:21 -07:00
Ilya Kreymer
4428184aea
frontend: configure running with a fixed 'replay.json', auth headers passed via separate config (#899)
wabac.js will reload the replay.json on 403 with new token (will be in next version of wabac.js)
presign urls: make presign timeout configurable (in minutes), defaults to 60 mins
dockerfile: fix configuring RWP_BASE_URL
2023-06-08 11:26:26 -07:00
Ilya Kreymer
dd757961fc
config: add overridable 'user_agent_suffix' and 'user_agent' to values.yaml, (#910)
passed to crawler --userAgentSuffix and --userAgent params, respectively, using
'quote' to support spaces in user-agent.
config: re-order settings to put 'Crawler Settings' section first, followed by 'Cluster Settings'
fixes #787
2023-06-07 12:01:12 -07:00
Ilya Kreymer
f2b7b6bcd5
Nightly Tests Fix (#905)
* tests: fix nightly test to account for 'waiting_capacity' state

* readd missing --logErrorsToRedis flag
2023-06-02 21:47:41 -07:00
Tessa Walsh
0284903b34
Cleanup carwler args (#889)
* crawler args cleanup:
- move crawler args command line entirely to configmap
- add required settings like --generateWACZ and --waitOnDone to configmap to not be overridable
- values files can configure individual settings, assembled in configmap
- move disk_utilization_threshold to configmap
- add 'crawler_logging_opts' and 'crawler_extract_full_text' options to values.yaml to more easily set these options

---------

Co-authored-by: Ilya Kreymer <ikreymer@gmail.com>
2023-05-30 19:29:07 -04:00
Ilya Kreymer
70319594c2
crawlconfig: fix default filename template, make configurable (#835)
* crawlconfig: fix default filename template, make configurable
- make default crawl file template configurable with 'default_crawl_filename_template' value in values.yaml
- set to '@ts-@hostsuffix.wacz' by default
- allow updating via 'crawlFilenameTemplate' in crawlconfig patch, which updates configmap
- tests: add test for custom 'default_crawl_filename_template'
2023-05-08 14:03:27 -07:00
Ilya Kreymer
aae0e6590e
Ensure Volumes are deleted when crawl is canceled (#828)
* operator:
- ensures crawler pvcs are always deleted before crawl object is finalized (fixes #827)
- refactor to ensure finalizer handler always run when finalizing
- remove obsolete config entries
2023-05-05 12:05:54 -07:00
Ilya Kreymer
60ba9e366f
Refactor to use new operator on backend (#789)
* Btrixjobs Operator - Phase 1 (#679)

- add metacontroller and custom crds
- add main_op entrypoint for operator

* Btrix Operator Crawl Management (#767)

* operator backend:
- run operator api in separate container but in same pod, with WEB_CONCURRENCY=1
- operator creates statefulsets and services for CrawlJob and ProfileJob
- operator: use service hook endpoint, set port in values.yaml

* crawls working with CrawlJob
- jobs start with 'crawljob-' prefix
- update status to reflect current crawl state
- set sync time to 10 seconds by default, overridable with 'operator_resync_seconds'
- mark crawl as running, failed, complete when finished
- store finished status when crawl is complete
- support updating scale, forcing rollover, stop via patching CrawlJob
- support cancel via deletion
- requires hack to content-length for patching custom resources
- auto-delete of CrawlJob via 'ttlSecondsAfterFinished'
- also delete pvcs until autodelete supported via statefulset (k8s >1.27)
- ensure filesAdded always set correctly, keep counter in redis, add to status display
- optimization: attempt to reduce automerging, by reusing volumeClaimTemplates from existing children, as these may have additional props added
- add add_crawl_errors_to_db() for storing crawl errors from redis '<crawl>:e' key to mongodb when crawl is finished/failed/canceled
- add .status.size to display human-readable crawl size, if available (from webrecorder/browsertrix-crawler#291)
- support new page size, >0.9.0 and old page size key (changed in webrecorder/browsertrix-crawler#284)

* support for scheduled jobs!
- add main_scheduled_job entrypoint to run scheduled jobs
- add crawl_cron_job.yaml template for declaring CronJob
- CronJobs moved to default namespace

* operator manages ProfileJobs:
- jobs start with 'profilejob-'
- update expiry time by updating ProfileJob object 'expireTime' while profile is active

* refactor/cleanup:
- remove k8s package
- merge k8sman and basecrawlmanager into crawlmanager
- move templates, k8sapi, utils into root package
- delete all *_job.py files
- remove dt_now, ts_now from crawls, now in utils
- all db operations happen in crawl/crawlconfig/org files
- move shared crawl/crawlconfig/org functions that use the db to be importable directly,
including get_crawl_config, add_new_crawl, inc_crawl_stats

* role binding: more secure setup, don't allow crawler namespace any k8s permissions
- move cronjobs to be created in default namespace
- grant default namespace access to create cronjobs in default namespace
- remove role binding from crawler namespace

* additional tweaks to templates:
- templates: split crawler and redis statefulset into separate yaml file (in case need to load one or other separately)

* stats / redis optimization:
- don't update stats in mongodb on every operator sync, only when crawl is finished
- for api access, read stats directly from redis to get up-to-date stats
- move get_page_stats() to utils, add get_redis_url() to k8sapi to unify access

* Add migration for operator changes
- Update configmap for crawl configs with scale > 1 or
crawlTimeout > 0 and schedule exists to recreate CronJobs
- add option to rerun last migration, enabled via env var and by running helm with --set=rerun_last_migration=1

* subcharts: move crawljob and profilejob crds to separate subchart, as this seems best way to guarantee proper install order with + update on upgrade with helm, add built btrix-crds-0.1.0.tgz subchart
- metacontroller: use release from ghcr, add metacontroller-helm-v4.10.1.tgz subchart

* backend api fixes
- ensure changing scale of crawl also updates it in the db
- crawlconfigs: add 'currCrawlSize' and 'lastCrawlSize' to crawlconfig api

---------

Co-authored-by: D. Lee <leepro@gmail.com>
Co-authored-by: Tessa Walsh <tessa@bitarchivist.net>
2023-04-24 18:30:52 -07:00
Ilya Kreymer
f6dc26eeb5
nginx: enable worker processes autotune to correctly set the number of processes for nginx, possible fix for #780 (#785) 2023-04-21 18:13:22 -07:00
Ilya Kreymer
85b6a05419
Upgrade to mongo 6 and use sortArray for workflow crawls (#764) (#765)
fixes from 1.4.1:
* Upgrade to mongo 6 and use  for workflow crawls

* update readiness probe with timeouts doubled, and failure threshold increased for slower 'mongosh' readiness check

update versions to 1.5.0-beta.0 in backend and frontend

Co-authored-by: Tessa Walsh <tessa@bitarchivist.net>
2023-04-11 18:22:07 -07:00
Ilya Kreymer
7f757d396a
config: add 'pageLoadTimeout' and 'pageExtraDelay' options to backend… (#742)
* config: add 'pageLoadTimeout' and 'pageExtraDelay' options to backend config
- add 'default_page_load_timeout_seconds' to values.yaml, defaulting to 120, for pageLoadTimeout
- add 'defaultPageLoadTimeSeconds ' to /api/settings, update tests for /api/settings
addresses issue in #636
2023-04-04 19:52:23 -07:00
Ilya Kreymer
1c47a648a9
Max page limit override (#737)
* more page limit: update to #717, instead of setting --limit in each crawlconfig,
apply override --maxPageLimit setting, implemented in crawler, to override individually configured page limit

* update tests, no longer returning 'crawl_page_limit_exceeds_allowed'
2023-04-03 14:01:32 -07:00
Ilya Kreymer
887cb16146
Allow configurable max pages per crawl in deployment settings (#717)
* backend: max pages per crawl limit, part of fix for #716:
- set 'max_pages_crawl_limit' in values.yaml, default to 100,000
- if set/non-0, automatically set limit if none provided
- if set/non-0, return 400 if adding config with limit exceeding max limit
- return limit as 'maxPagesPerCrawl' in /api/settings
- api: /all/crawls - add runningOnly=0 to show all crawls, default to 1/true (for more reliable testing)

tests: add test for 'max_pages_per_crawl' setting
- ensure 'limit' can not be set higher than max_pages_per_crawl
- ensure pages crawled is at the limit
- set test limit to max 2 pages
- add settings test
- check for pages.jsonl and extraPages.jsonl when crawling 2 pages
2023-03-28 16:26:29 -07:00
Ilya Kreymer
413fd8d7ea
Chart: split Crawl args into separate variables (#639)
* chart crawl args cleanup:
- move configurable settings out of 'crawler_args'
- add 'crawler_session_size_limit_bytes' and 'crawler_session_time_limit_seconds' for --timeLimit and --sizeLimit option for crawler
- remove hard-coded 'timeout' to allow configuring via crawl config
- set liveness check port from existing config value
- add comments that requests hd must be at least double the size limit
- defaults: set crawler_requests_hd to 22GB, default crawl session size limit to 10GB
2023-02-24 17:24:04 -08:00
Ilya Kreymer
3df6e0f146
crawler arguments fixes: (#621)
- partial fix to #321, don't hard-code behavior limit into crawler args
- allow setting number of crawler browser instances via 'crawler_browser_instances' to avoid having to override the full crawler args
2023-02-22 13:23:19 -08:00
Tessa Walsh
14b349443f
Make pending invites expire via TTL index (#568)
* Make invites expire after configurable window

The value can be set in EXPIRE_AFTER_SECONDS env var and via
helm chart values, and defaults to 7 days.

* Create nightly test CI and add invite expiration test to it

* Update 404 error message for missing or expired invite

---------

Co-authored-by: sua yoo <sua@suayoo.com>
2023-02-14 16:07:14 -05:00
Ilya Kreymer
21745fb6f8
health readiness check: add /healthz endpoint for nginx readiness check, set failure threshold to 3 (similar to ingress) (#562) 2023-02-06 15:08:05 -08:00
Ilya Kreymer
ccd87e0dff
Rename api / nginx settings -> backend / frontend, set pull policy job images (#504)
* rename config values
- api -> backend
- nginx -> frontend

* job pods:
- set job_pull_policy from api_pull_policy (same as backend image)
- default to Always, but can be overridden for local deployment (same as backend image)

typo fix: CRAWL_NAMESPACE -> CRAWLER_NAMESPACE (part of #491)
ansible: set default label to :latest instead of :dev for
2023-01-18 20:21:36 -08:00
Ilya Kreymer
1dfa494210
backend: add default behavior time to /api/settings (part of #321) (#499) 2023-01-18 14:52:15 -08:00
Ilya Kreymer
827b643262
backend: add 'allow_dupe_invites' option to allow re-inviting users. if not set (default), duplicate invites will result in errors (#471) 2023-01-12 23:25:48 -08:00
Ilya Kreymer
4dbca8c421
email sending tweaks: (#470)
- support 'reply-to' email field in values, and in ansible-based values
- set 'subject' for different types of messages
2023-01-12 23:25:23 -08:00
Tessa Walsh
49460bb070
Add default organization + invite to default org (#465), #455
- Add default switch to Archive (org) model
- Set default org name via values.yaml
- Add check to ensure only one org with default org name exists
- Stop creating new orgs for new users
- Add new API endpoints for creating and renaming orgs (part of #457)
- Make Archive.name unique via index
- Wait for db connection on init, log if waiting
- Make archive-less invites invite user to default org with Owner role
- Rename default org from chart value if changed
- Don't create new org for invited users
2023-01-12 16:44:18 -08:00
Ilya Kreymer
30bda8c75d
VNC-Based Profile Browser (#433)
* profile browser vnc support + fixes:
- switch profile browser rendering to use VNC
- frontend: add @novnc/novnc as dependency, create separate bundle novnc.js to load into vnc browser (to avoid loading from each container)
- frontend: update proxy paths to proxy websocket, index page to crawler
- frontend: allow browser profiles in all browsers, remove browser compatibility check
- frontend: update webpack dev config, apply prettier
- frontend: node version fix
- backend: get vncpassword, build new URL for proxying to crawler iframe
- backend: fix profile / crawl job pull policy from 'Always' -> 'Never', should use existing image for job
- backend: fix kill signal to use bash -c to work with latest backend image
- backend/chart: add 'profile_browser_timeout_seconds' to chart values to control how long profile browser to remain when idle (default to 60)
- backend: remove utils.py, now using secret.token_hex() for random suffix
Co-authored-by: sua yoo <sua@suayoo.com>
2023-01-10 14:42:42 -08:00
DongWoo Lee
b03b848ec2 have ingress for signer only when it is enabled 2023-01-06 14:06:33 -08:00
Ilya Kreymer
82ffc0dfbc
Local Deployment Work: Support running locally + test cluster on CI (#396)
* k8s local deployment work:
- make it easier to deploy w/o ingress by setting 'local_service_port' (suggested port 30870)
- if using local minio, ensure file endpoints set to /data/ and /data/ proxies correctly to local bucket
- if not using minio, ensure file endpoints point to correct access / endpoint url.
- setup should work with docker desktop, minikube, microk8s and k3s!
- nginx chart: bump nginx memory limit to 20Mi
- nginx image: 00-default-override-resolver-config -> 00-browsertrix-nginx-init for clarity
- nginx image: use default nginx.conf, pin to nginx 1.23.2
- mongo: readd readiness probe, bump connect wait timeout (needed for ci)
- config: set superadmin username to 'admin'
- config schema: set 'name' as required 
- add sample chart values overrides:
- chart values: local-config.yaml for running locally with 'local_service_port'
- chart values: add microk8s-hosted.yaml for configuring a hosted microk8s setup
- chart values: add microk8s-ci.yaml for ci tests
- ci: remove docker swarm tests
- ci: add microk8s integration tests: launching cluster, logging in, running a crawl of example.com, downloading/checking WACZ
- bump to 1.1.0-beta.2
2022-12-02 19:58:34 -08:00
Ilya Kreymer
aabb0b2a92
chart / deployment fixes to run on microk8s: (fixes #385) (#387)
- ingress: fix proxying /data to minio, use another ingress which proxies correct host to ensure presigned urls work
- presigning: determine if signing endpoint url (minio) or access endpoint (cloud bucket) based on if access endpoint is provided, set bool on storage object
- chart: fix indent on incorrect storageClassName configs
- ingress: make 'ingress_class' configurable (set to 'public' for microk8s, default to 'nginx')
- minio: use older minio image which supports legacy fs based setup (for now)
- nginx service: add 'nginx_service_use_node_port' config setting: if true, will use NodePort for frontend,
other will use default (ClusterIP) and only for the frontend / nginx
- chart: remove changing service type for other services
2022-11-30 09:21:58 -08:00
Ilya Kreymer
dde4c5ee68 k8s chart:
ingress: use separate ingress for authsign to allow ssl-redirect true on main ingress
mongo: local: disable readiness check for now due to issues with eval command (for now)
2022-10-15 13:46:31 -07:00
Ilya Kreymer
447b0bf9b9
k8s chart + values tweak: (#317)
- mongo chart to avoid requiring username/password if passing db_url
- tweaks to default values (set registration enabled by default, longer) add missing options
2022-09-21 12:45:08 -07:00
Ilya Kreymer
1216f6cb66
K8s: update chart for local minio + mongo default (#301)
* k8s chart fixes:
mongo: pin to 5.0.11 version for now
minio: create empty dir for local storage for now instead of using mc, use 'btrix-data' as bucket name
2022-09-02 13:07:47 -07:00
Ilya Kreymer
68ec582f73
nginx simplify: (#259)
- add custom init script for ./docker-entrypoint.d/ to setup resolver from local /etc/resolv.conf
- custom init script also removes default.conf, and removes minio route if NO_MINIO_ROUTE=1 is set
- assign template vars to nginx vars to avoid conflicts when interpolating
- k8s: remove initContainers and volumes, now handled via custom init script in image
2022-06-13 11:53:15 -07:00
Ilya Kreymer
5b6aa3bc95
Affinity + Tolerations + Cleanup Crawl Job (#256)
* k8s: add tolerations for 'nodeType=crawling:NoSchedule' to allow scheduling crawling on designated nodes for crawler and profiles jobs and statefulsets
* add affinity for 'nodeType=crawling' on crawling and profile browser statefulsets
* refactor crawljob: combine crawl_updater logic into base crawl_job
* increment new 'crawlAttemptCount' counter crawlconfig when crawl is started, not necessarily finished, to avoid deleting configs that had attempted but not finished crawls.
* better external mongodb support: use MONGO_DB_URL to set custom url directly, otherwise build from username, password and mongo host
2022-06-10 19:21:37 -07:00
Ilya Kreymer
dee354f252 affinity: add affinity for k8s crawl deployments:
- prefer deploy crawler, redis and job to same zone
- prefer deploying crawler and job together via crawler node type, redis via redis node type (all optional)
2022-06-07 21:52:04 -07:00
Ilya Kreymer
0c8a5a49b4 refactor to use docker swarm for local alternative to k8s instead of docker compose (#247):
- use python-on-whale to use docker cli api directly, creating docker stack for each crawl or profile browser
- configure storages via storages.yaml secret
- add crawl_job, profile_job, splitting into base and k8s/swarm implementations
- split manager into base crawlmanager and k8s/swarm implementations
- swarm: load initial scale from db to avoid modifying fixed configs, in k8s, load from configmap
- swarm: support scheduled jobs via swarm-cronjob service
- remove docker dependencies (aiodocker, apscheduler, scheduling)
- swarm: when using local minio, expose via /data/ route in nginx via extra include (in k8s, include dir is empty and routing handled via ingress)
- k8s: cleanup minio chart: move init containers to minio.yaml
- swarm: stateful set implementation to be consistent with k8s scaling:
  - don't use service replicas,
  - create a unique service with '-N' appended and allocate unique volume for each replica
  - allows crawl containers to be restarted w/o losing data
- add volume pruning background service, as volumes can be deleted only after service shuts down fully
- watch: fully simplify routing, route via replica index instead of ip for both k8s and swarm
- rename network btrix-cloud-net -> btrix-net to avoid conflict with compose network
2022-06-05 10:37:17 -07:00
Ilya Kreymer
bf79959a5a refactoring to use statefulsets + job (#245)
- use statefulsets instead of deployments for mongo, redis, signer
- use k8s job + statefulset for running crawls
- use separate statefulset for crawl (scaled) and single-replica redis stateful set
- move crawl job update login to crawl_updater
- remove shared redis chart

package refactor:
- move to shared code to 'btrixcloud'
- move k8s to 'btrixcloud.k8s'
- move docker to 'btrixcloud.docker'
2022-06-05 10:37:17 -07:00
Ilya Kreymer
c023fe7c9a
Backend API prefix (#240)
* apply /api prefix consistently, both directly through backend and when accessing via frontend, fixes #236

* docs: update local deployment docs to use 9871 instead of 8000, don't expose 8000 by default

* schemas: don't include /openapi.json as /healthz in documentation, keep /healthz at root

* k8s: route backend to /api without additional rewriting
2022-05-31 19:29:20 -07:00
Ilya Kreymer
3df310ee4f
Backend: Crawls with Multiple WACZ files + Profile + Misc Fixes (#232)
* backend: k8s:
- support crawls with multiple wacz files, don't assume crawl complete after first wacz uploaded
- if crawl is running and has wacz file, still show as running
- k8s: allow configuring node selector for main pods (eg. nodeType=main) and for crawlers (eg. nodeType=crawling)
- profiles: support uploading to alternate storage specified via 'shared_profile_storage' value is set
- misc fixes for profiles

* backend: ensure docker run_profile api matches k8s
k8s chart: don't delete pvc and pv in helm chart

* dependency: bump authsign to 0.4.0
docker: disable public redis port

* profiles: fix path, profile browser return value

* fix typo in presigned url cacheing
2022-05-19 18:40:41 -07:00
Ilya Kreymer
aa83d51f7a
k8s backend improvements: (#205)
- add liveness probe for crawls, configurable via 'crawler_liveness_port'
- add User system:anonymous permissions
- treat jobs that have exceeded total as 'partial_complete' (experimental)
2022-03-30 14:39:06 -07:00
Ilya Kreymer
4b2f89db91 k8s: support for using a pre-made persistent volume/claim for crawling, configurable via CRAWLER_PV_CLAIM, otherwise using emptyDir
k8s: ability to set deployment scale for frontend as well
2022-03-15 11:18:23 -07:00
Ilya Kreymer
fb51f8e33e
Mongo auth fix (#190)
* backend: makes mongo auth configurable!
use mongo_auth secret in k8s and set env vars in docker
fixes #177 
* docker: update config.sample.env: use ws screencast by default, add NO_DELETE_ON_FAIL option, extend default login lifetime
2022-03-04 15:04:33 -08:00
Ilya Kreymer
cdd0ab34a3
Watch Stream Directly from Browsertrix Crawler (#189)
* watch work: proxy directly to crawls instead of redis pubsub
- add 'watchIPs' to crawl detail output
- cache crawl ips for quick access for auth
- add '/ipaccess/{ip}' endpoint for watch ws connection to ensure ws has access to the specified container ip
- enable 'auth_request' in nginx frontend
- requirements: update to latest redis-py
remaining fixes for #134
2022-03-04 14:55:11 -08:00
Ilya Kreymer
51a573ef1f backend prod settings:
- set WEB_CONCURRENCY env var to configure number of backend api workers for both docker and k8s
- set via 'backend_workers' in values.yaml
- also add 'rwp_base_url' to values.yaml
- update containers to use public webrecorder/browsertrix-backend and webrecorder/browsertrix-frontend containers
- make liveness, readiness and startup health checks more tolerant
2022-02-28 18:09:13 -08:00
Ilya Kreymer
9bd402fa17
New WS Endpoint for Watching Crawl (#152)
* backend support for new watch system (#134):
- support for watch via redis pubsub and websocket connection to backend
- can support watch from any number of crawler instances to support scaled crawls
- use /archives/{aid}/crawls/{crawl_id}/watch/ws websocket endpoint
- ws: ignore graceful connectionclosedok exception, log other exceptions
- set logging to info to instead of debug for now (debug logs all ws traffic)
- remove old watch apis in backend
- remove old websocket routing to crawler instance for old watch system
- oauth bearer check: support websockets, use websocket object if no request object
- crawler args: replace --screencastPort with --screencastRedis
2022-02-22 10:33:10 -08:00
Ilya Kreymer
ca626f3c0a k8s chart: add permissions for pod exec and logs 2022-02-20 09:39:11 -08:00
Ilya Kreymer
57e5b9fceb k8s charts: update default resource usage in values.yaml
add liveness probe for backend pod
2022-02-14 18:49:56 -08:00
Ilya Kreymer
ca85edc8b3 backend: resource limits:
- set resource mem and cpu requests/limits for all used services (not minio for now)
- add readiness proble to redis, mongo
- adjust crawler limits, set via configmap
2022-02-08 19:53:41 -08:00
Ilya Kreymer
71842be94a backend: k8s setup minor tweaks:
- add 'emptyDir' volume for crawl directory (to allow any pod restarts to have access to the data)
- rename minio and redis volumes to avoid any confusion
- add pod termination grace-period (default to 600 secs)
2022-02-08 15:52:57 -08:00
Ilya Kreymer
523b557eac replay route: (prepare for replay, #124)
- add support for /replay/sw.js
- ensure route works in both k8s and docker (routed via main nginx)
2022-01-31 11:18:10 -08:00
Ilya Kreymer
542680daf7
backend fixes: fix graceful stop + stats (#122)
* backend fixes: fix graceful stop + stats
- use redis to track stopping state, to be overwritten when finished
- also include stats in completed crawls
- docker: use short container id for crawl id
- graceful stop returns 'stopping_gracefully' instead of 'stopped_gracefully'
- don't set stopping state when complete!
- beginning files support: resolve absolute urls for crawl detail (not pre-signing yet)
2022-01-30 18:58:47 -08:00
Ilya Kreymer
2e2b8b329d
Add signing server via authsign (k8s only) (#107)
- add k8s deployment of signing server, if 'signer.enabled' chart value if set
- update ingress to provide access for 'signer.host' if signing server enabled to verify domain, run signing server itself on different port (also turn off ssl redirects to support signing server)
- set WACZ_SIGN_URL and WACZ_SIGN_TOKEN (supported in browesertrix-crawler 0.5.0)
- authsign deployment uses a volume to store current certs
- add sample signer block, with signing disabled by default
2022-01-26 23:27:13 -08:00
Ilya Kreymer
b3ca501a19 helm chart: support cloud-based persistent volumes if values 'volume_storage_class' is specified.
use PersistentVolumeClaim to create a persistent volume for each local service (mongo, minio, redis) when running in a cloud setup
if cloud-specified volume storage class not specified, create default hostPath volume (eg. for minikube)
lint: add default icon for chart
2022-01-15 19:34:31 -08:00
Ilya Kreymer
53beb84c01
Config superuser (#59)
* backend: automatically create super user, fixes #57
- if SUPERUSER_EMAIL is set, superuser is created with `is_superuser` and `is_verified` settings, if user doesn't already exist.
- if SUPERUSER_PASSWORD if set, the password for superuser is set, otherwise a random password is generated
update sample SUPERUSER_EMAIL and SUPERUSER_PASSWORD in config file and chart.
- ensure verification email is not sent if user already verified
2021-12-05 14:12:42 -08:00
Ilya Kreymer
eaf8055063
Support unified docker + k8s deployment (#58)
- adapt nginx config to work both in docker and k8s, using env vars to set urls

backend: additional fixes:
- use env vars with nginx config
- fix settings api route
- when sending e-mail, use the Host header for verification urls when available
- prepare Dockerfile with full build from scratch in image, (disabled 'yarn install' for faster builds for now)
- fix accept invite api for existing user to /archives/accept-invite/{token}
2021-12-05 13:02:26 -08:00
Ilya Kreymer
87c5505c43
Backend Invite System Refactor (#53)
* backend:
- refactor invite system, move to separate InviteOps object, used by archives and user
- supporting three invite use cases:
1) superuser invites any user not registered, not added to any archive
2) archive admin invites any user not registered, add to one of their archives
3) archive admin invites existing registered user, add to one of their archives

- support superadmin invite via /users/invite (fixes #37)
- superadmin invite has no archive set and does not add user to archive

- don't send verification email when accepting from invite, fixes #50
- use different email template / accept url for existing user invite, eg, `/invite/accept/`

- fix default token value in chart
2021-12-04 12:14:28 -08:00
Ilya Kreymer
11b797d535
Add global settings endpoint (#52)
* backend:
- add /api/settings endpoint for misc system-wide settings
- setting 'registrationEnabled' if open registration should be enabled, set via REGISTRATION_ENABLED=1 env var
- setting 'jwtTokenLifetimeMinutes' returns the jwt token expiry in seconds, configured in minutes via JWT_TOKEN_LIFETIME_MINUTES env var (default: 60)
2021-12-03 10:56:57 -08:00
Ilya Kreymer
05c1129fb8
Frontend + Backend Integrated Deployment (K8s only) (#45)
* support running backend + frontend together on k8s
* split nginx container into separate frontend service, which uses nignx-base image and the static frontend files
* add nginx-based frontend image to docker-compose build (for building only, docker-based combined deployment not yet supported)

* backend:
- fix paths for email templates
- chart: support '--set backend_only=1' and '--set frontend_only=1' to only force deploy one or the other
- run backend from root /api in uvicorn
2021-12-03 10:17:22 -08:00
Ilya Kreymer
d0b54dd752 Enable sending emails in K8S, trigger verification e-mail on registration. (#38)
* k8s: support email configuration
support sending reset password email
fix for #32

* fastapi users: update to latest (8.1.2)
send verification email upon registration

* update to latest fastapi-users(8.1.2), refactor to use UserManager class
ensure verification e-mail sent upon registration, w/o requiring separate apicall
fixes #32

* add email options to default chart/values.yaml

* separate usermanager init from fastapi users init, fix for sending invite emails
2021-11-30 23:50:38 -08:00
Ilya Kreymer
3d4d7049a2
Misc backend fixes for cloud deployment (#26)
* misc backend fixes:
- fix running w/o local minio
- ensure crawler image pull policy is configurable, loaded via chart value
- use digitalocean repo for main backend image (for now)
- add bucket_name to config only if using default bucket

* enable all behaviors, support 'access_endpoint_url' for default storages

* debugging: add 'no_delete_jobs' setting for k8s and docker to disable deletion of completed jobs
2021-11-25 11:58:26 -08:00
Ilya Kreymer
57a4b6b46f add collections api:
- collections defined by name per archive
- can update collections with additional metadata (currently just description)
- crawl config api accepts a list of collections by name, resolved to collection uids and stored in config
- finished crawls also associated with collection list
- /archives/{aid}/collections/{name} can list all crawl artifacts (wacz files) from a named collection (in frictionless data package-ish format)
- /archives/{aid}/collections/$all lists all crawled artifacts for the archive

readiness check: add /healthz endpoints for app and nginx
ingress: add /data/ route to local bucket

storage improvements:
- for default storages, store path only, and prepend default storage access endpoint
- collections api returns the paths using the storage access endpoint
- define default storages as secrets in k8s (can support multiple), hard-coded in docker (only one for now)
2021-10-27 09:39:14 -07:00
Ilya Kreymer
c38e0b7bf7 use redis based queue instead of url for crawl done webhook
update docker setup to support redis webhook, add consistent CRAWL_ARGS, additional fixes
2021-10-10 12:18:28 -07:00
Ilya Kreymer
4ae4005d74 add ingress + nginx container for better routing
support screencasting to dynamically created service via nginx (k8s only thus far)
add crawl /watch endpoint to enable watching, creates service if doesn't exist
add crawl /running endpoint to check if crawl is running
nginx auth check in place, but not yet enabled
add k8s nginx.conf
add missing chart files
file reorg: move docker config to configs/
k8s: add readiness check for nginx and api containers for smoother reloading
ensure service deleted along with job
todo: update dockerman with screencast support
2021-10-09 23:47:29 -07:00
Ilya Kreymer
19879fe349 Storage + Data Model Refactor (fixes #3):
- Add default vs custom (s3) storage
 - K8S: All storages correspond to secrets
 - K8S: Default storages inited via helm
 - K8S: Custom storage results in custom secret (per archive)
 - K8S: Don't add secret per crawl config
 - API for changing storage per archive
 - Docker: default storage just hard-coded from env vars (only one for now)
 - Validate custom storage via aiobotocore before confirming
 - Data Model: remove usage from users
 - Data Model: support adding multiple files per crawl for parallel crawls
 - Data Model: track completions for parallel crawls
 - Data Model: initial support for tags per crawl, add collection as 'coll' tag

README fixes
2021-10-09 18:58:40 -07:00
Ilya Kreymer
b6d1e492d7 add redis for storing crawl state data!
- supported in both docker and k8s
- additional pods with same job id automatically use same crawl state in redis
- support dynamic scaling (#2) via /scale endpoint - k8s job parallelism adjusted dynamically for running job (only supported in k8s so far)
2021-09-17 15:02:11 -07:00
Ilya Kreymer
ed27f3e3ee job handling:
- job watch: add watch loop for job failure (backofflimitexceeded)
- set job retries + job timeout via chart values
- sigterm starts graceful shutdown by default, including for timeout
- use sigusr1 to switch to instant shutdown
- update stop_crawl() to use new semantics
2021-08-23 21:22:01 -07:00
Ilya Kreymer
7146e054a4 crawls work (#1):
- support listing existing crawls
- add 'schedule' and 'manual' annotations to jobs, store in Crawl obj
- ensure manual jobs are deleted when completed
- support deleting crawls by id (but not data)
- rename running crawl delete to '/cancel'

change paths for local minio/mongo to /tmp
2021-08-23 18:01:29 -07:00
Ilya Kreymer
627e9a6f14 cleanup crawl config, add separate 'runNow' field
crawler: add cpu/memory limits
minio: auto-create bucket for local minio
2021-08-19 14:15:21 -07:00
Ilya Kreymer
61a608bfbe update models:
- replace storages with archives, which have a single storage (for now)
- crawls associated with archives
- users below to archive, with one admin user (if archive created by default)
- update crawlconfig for latest browsertrix-crawler (0.4.4)
- k8s: fix permissions for crawler role
- k8s: fix minio service (now requiring two ports)
2021-08-18 16:53:49 -07:00
Ilya Kreymer
f77eaccf41 support committing to s3 storage
move mongo into separate optional deployment along with minio
support for configuring storages
support for deleting crawls, associated config and secrets
2021-07-02 15:56:24 -07:00
Ilya Kreymer
a111bacfb5 add k8s support
- working apis for adding crawls, removing crawls in mongo, mapped to k8s cronjobs
- more complete crawl spec
- option to start on-demand job from cronjobs
- optional minio in separate deployment/service
2021-06-30 21:48:44 -07:00