Commit Graph

105 Commits

Author SHA1 Message Date
Ilya Kreymer
feb7ab7652
Improved type checking for backend with mypy (#1174)
* add mypy type check
- run type check on backend fix ambiguous typing issues
- add mypy to lint gh action + precommit hook
- add mypy.ini
2023-09-13 19:40:26 -07:00
Tessa Walsh
7cf2b11eb7
Add event webhook tests (#1155)
* Add success filter to webhook list GET endpoint

* Add sorting to webhooks list API and add event filter

* Test webhooks via echo server

* Set address to echo server on host from CI env var for k3d and microk8s

* Add -s back to pytest command for k3d ci

* Change pytest test path to avoid hanging on collecting tests

* Revert microk8s to only run on push to main
2023-09-12 22:08:40 -07:00
Ilya Kreymer
ad9bca2e92
Operator refactor to control pods + pvcs directly instead of statefulsets (#1149)
- Ability for pod to be Completed, unlike in Statefulset - eg. if 3 pods are running and first one finishes, all 3 must be running until all 3 are done. With this setup, the first finished pod can remain in Completed state.
- Fixed shutdown order - crawler pods now correctly shutdown first before redis pods, by switching to background deletion.
- Pod priority decreases with scale: 1st instance of a new crawl can preempt 3rd or 2nd instance of another crawl
- Create priority classes upto 'max_crawl_scale, configured in values.yaml
- Improved scale change reconciliation: if increasing scale, immediately scale up. If decreasing scale,
graceful stop scaled-down instance to complete via redis 'stopone' key, wait until they exit with Completed state
before adjust status.scale / removing scaled down pods. Ensures unaccepted interrupts don't cause scaled down data to be deleted.
- Redis pod remains inactive until crawler is first active, or after no crawl pods are active for 60 seconds
- Configurable Redis storage with 'redis_storage' value, set to 3Gi by default
- CrawlJob deletion starts as soon as post-finish crawl operations are run
- Post-crawl operations get their own redis instance, since one during response is being cleaned up in finalizer
- Finalizer ignores request with incorrect state (returns 400 if reported as not finished while crawl is finished)
- Current resource usage added to status
- Profile browser: also manage single pod directly without statefulset for consistency.
- Restart pods via restartTime value: if spec.restartTime != status.restartTime, clear out pods and update status.restartTime (using OnDelete policy to avoid recreate loops in edge cases).
- Update to latest metacontroller (v4.11.0)
- Add --restartOnError flag for crawler (for browsertrix-crawler 0.11.0)
- Failed crawl logging: dd 'fail_crawl()' to be used for failing a crawl, which prints logs for default container (if enabled) as well as pod status
- tests: check other finished states to avoid stuck in infinite loop if crawl fails
- tests: disable disk utilization check, which adds unpredictability to crawl testing!
fixes #1147 

---------
Co-authored-by: Tessa Walsh <tessa@bitarchivist.net>
2023-09-11 10:38:04 -07:00
Tessa Walsh
89e44e5cd6
Add operator logs to nightly tests (#1150) 2023-09-07 09:15:47 -07:00
Ilya Kreymer
7d0cfa93e2 quick fix: fix typo in publish-helm-chart specifying version 2023-09-05 15:51:10 -04:00
Anish Lakhwara
3bfa69b98a
fix: add "v" to helm chart release filename (#1141)
* fix: add "v" to helm chart release filename, fixes #1134 

* add 'v' to helm chart version and update-version.sh

---------
Co-authored-by: Ilya Kreymer <ikreymer@gmail.com>
2023-09-05 15:47:39 -04:00
Ilya Kreymer
a9ab17fc61
publish helm chart on release (fixes #1114) (#1117) (#1123)
- no longer using :latest by default in values.yaml, instead updating version with each release
- set chart version to match app version in Chart.yaml
- update version in helm chart and values.yaml as part of update-version.sh script
- update test.yaml and local-config.yaml to enable using :latest tag images
- ci: add ci script for packaging current helm chart
- docs: updates docs to indicate deploying directly from GitHub release
- docs: add script to fill in latest version for 'VERSION' using custom script
- chart: set local_service_port to 30870 by default, but use only if no ingress.
- default values.yaml set up for local deployment, local-config.yaml contains additional commented out examples
- ci draft: add deployment info to draft with helm install command for current version
- test: fix password check test
2023-08-30 12:02:02 -07:00
Ilya Kreymer
38f67a6cc0
Optimize Frontend Image Build on CI (#1057)
* Always run yarn only on build platform with --platform=$BUILDPLATFORM
* Remove optional dependencies (playwright + chromium) from build with --ignore-optional and move some devDependencies to be optional
* Disable husky pre-commit hook checks on frontend

Co-authored-by: sua yoo <sua@suayoo.com>
2023-08-09 12:06:20 -07:00
Anish Lakhwara
b5a9c42df1
feat: add pre-commit to check we don't have real passwords in yml files (#990)
* feat: use existing pre-commit framework

* feat(ci): add github action for password_check

* feat: add some simple tests to password_check.py

* fix: set `backend_password_secret` in default values.yaml to an allowed password
2023-07-26 13:29:37 -07:00
Tessa Walsh
577416024b
Fix pull_request syntax in ansible lint GH Action (#995)
* Fix pull_request syntax in ansible lint GH Action

* Only lint Digital Ocean playbook for now

* fix: pass ansible lint

---------

Co-authored-by: Anish Lakhwara <anish+git@lakhwara.com>
2023-07-20 12:13:52 +02:00
Anish Lakhwara
bc82f562dc feat: ansible lint github action 2023-07-10 17:58:47 -07:00
Ilya Kreymer
00fb8ac048
Concurrent Crawl Limit (#874)
concurrent crawl limits: (addresses #866)
- support limits on concurrent crawls that can be run within a single org
- change 'waiting' state to 'waiting_org_limit' for concurrent crawl limit and 'waiting_capacity' for capacity-based
limits

orgs:
- add 'maxConcurrentCrawl' to new 'quotas' object on orgs
- add /quotas endpoint for updating quotas object

operator:
- add all crawljobs as related, appear to be returned in creation order
- operator: if concurrent crawl limit set, ensures current job is in the first N set of crawljobs (as provided via 'related' list of crawljob objects) before it can proceed to 'starting', otherwise set to 'waiting_org_limit'
- api: add org /quotas endpoint for configuring quotas
- remove 'new' state, always start with 'starting'
- crawljob: add 'oid' to crawljob spec and label for easier querying
- more stringent state transitions: add allowed_from to set_state()
- ensure state transitions only happened from allowed states, while failed/canceled can happen from any state
- ensure finished and state synched from db if transition not allowed
- add crawl indices by oid and cid

frontend: 
- show different waiting states on frontend: 'Waiting (Crawl Limit) and 'Waiting (At Capacity)'
- add gear icon on orgs admin page
- and initial popup for setting org quotas, showing all properties from org 'quotas' object

tests:
- add concurrent crawl limit nightly tests
- fix state waiting -> waiting_capacity
- ci: add logging of operator output on test failure
2023-05-30 15:38:03 -07:00
Ilya Kreymer
12f7db3ae2
tests: fixes for crawl cancel + crawl stopped (#864)
* tests:
- fix cancel crawl test by ensuring state is not running or waiting
- fix stop crawl test by ensuring stop is only initiated after at least one page has been crawled,
otherwise result may be failed, as no crawl data has been crawled yet (separate fix in crawler to avoid loop if stopped
before any data written webrecorder/browsertrix-crawler#314)
- bump page limit to 4 for tests to ensure crawl is partially complete, not fully complete when stopping
- allow canceled or partial_complete due to race condition

* chart: bump frontend limits in default, not just for tests (addresses #780)

* crawl stop before starting:
- if crawl stopped before it started, mark as canceled
- add test for stopping immediately, which should result in 'canceled' crawl
- attempt to increase resync interval for immediate failure
- nightly tests: increase page limit to test timeout

* backend:
- detect stopped-before-start crawl as 'failed' instead of 'done'
- stats: return stats counters as int instead of string
2023-05-22 20:17:29 -07:00
Sara Tavares
07fb7317fe
Delete proofread-action.yaml (#760)
Resulting in a lot of false positives (to revisit later)
2023-04-11 15:49:27 -07:00
Ilya Kreymer
63be81d835 ci: make playwright integration tests run only on PRs involving frontend 2023-04-05 09:57:34 -07:00
Ilya Kreymer
2b0d5ff8b3
misc frontend build fixes: playwright version + chunking (#740)
* misc frontend build fixes:
- fix playwright version to be consistent to fix playwright test
- chunking: set max number of chunks generated

* lock playwright version

* remove intl polyfill

---------

Co-authored-by: sua yoo <sua@suayoo.com>
2023-04-03 21:27:44 -07:00
Sara Tavares
b61592b5ed
CI: Add Playwright UI e2e tests + CI (#614)
Adds Playwright for UI tests.
Basic Playwright test to login.
Playwright Github Action.

---------

Co-authored-by: sua yoo <sua@suayoo.com>
2023-03-22 16:23:22 -07:00
sua yoo
e8f88a797b
Remove new issue project automation config (#718) 2023-03-21 13:49:34 -07:00
Sara Tavares
3fa93b01b8
ci: Create proofread-action.yaml (#714) 2023-03-20 21:08:56 -07:00
sua yoo
934ee18044
chore: switch actions for issue assign automation
addresses #658
2023-03-08 10:01:00 -08:00
sua yoo
3b61266eed
chore: switch to issue node ID
proposed fix for update-project-column
2023-03-06 12:32:08 -08:00
sua yoo
ba2d8db413
chore: fix update-project-column org 2023-03-06 12:27:05 -08:00
sua yoo
0007e9bf0b
chore: remove operation from gh action
see: https://github.com/github/update-project-action/pull/50
2023-03-06 12:24:45 -08:00
sua yoo
1e3b384e31
chore: update assign issue automation action 2023-03-06 12:18:28 -08:00
sua yoo
31dc5c56c9
chore: update add-to-project action version 2023-03-06 11:40:28 -08:00
sua yoo
18abc84484
chore: update project automation action 2023-03-06 11:38:10 -08:00
Ilya Kreymer
a86a3b470a ci: add tokens to fix project automation (to be able to write to shared project) 2023-03-02 09:57:52 -08:00
sua yoo
29f31cd462
ci: add workflows for adding issues to project (#660) 2023-02-28 18:37:01 -08:00
Tessa Walsh
cbab425fec
Make nightly tests run nightly, not monthly (#624) 2023-02-22 17:54:16 -05:00
Tessa Walsh
14b349443f
Make pending invites expire via TTL index (#568)
* Make invites expire after configurable window

The value can be set in EXPIRE_AFTER_SECONDS env var and via
helm chart values, and defaults to 7 days.

* Create nightly test CI and add invite expiration test to it

* Update 404 error message for missing or expired invite

---------

Co-authored-by: sua yoo <sua@suayoo.com>
2023-02-14 16:07:14 -05:00
Ilya Kreymer
3261e7d666 ci:
- run k3d-ci on all changes to charts or backend, including PRs (faster)
- run microk8s and k3d-log-ci only on commits to main that change charts or backend (slower)
- run lint on all changes to backend, including PRs

fix
2023-02-08 11:24:54 -08:00
sua yoo
d128525e4e
Run unit tests in frontend PR check (#569) 2023-02-06 17:47:15 -08:00
Ilya Kreymer
da60403e4b ci: set env vars for deploy script 2023-02-03 10:54:03 -08:00
Ilya Kreymer
6866383c6f ci: fix deploy script typo, ensure version is set 2023-02-03 10:49:17 -08:00
Ilya Kreymer
e8b90b7c3e
Deploy to Dev Cluster Fixes (#542)
fix typos for on-demand k8s cluster deployment via github action trigger
2023-01-31 16:33:56 -08:00
Ilya Kreymer
e98df10dad
CI: Setup manual workflow for dev deployment (#540)
* deployment: add initial manual workflow for deploying to dev cluster, addresses #428
* opt: only run k3d-log-ci tests on backend or chart changes
2023-01-31 15:42:50 -08:00
D. Lee
be4f918149
Merge pull request #442 from webrecorder/admin-logging-service
Add logging service
2023-01-19 22:15:16 -08:00
sua yoo
24a7b14d63
Add path filter to GH workflows (#500)
* update docs path

* update lint check

* cluster runs only on backend changes
2023-01-18 15:02:21 -08:00
sua yoo
f7892d7f2f
Add frontend build check (#498) 2023-01-18 13:06:33 -08:00
Ilya Kreymer
edfb1bd513
quickfix: pydantic / lint fix (#452)
* backend: use latest pydantic again, fix pylint with custom .pylintrc (as suggested in pydantic/pydantic#1961)
2023-01-10 18:54:11 -08:00
Ilya Kreymer
56a6d7a5d8
Backend lint check (#451)
- apply lint + format fixes to backend
- add ci for lint + format fixes for backend
- use fixed version of pydantic
2023-01-10 16:17:06 -08:00
Tessa Walsh
d1b59c9bd0
Use archive_viewer_dep permissions to GET crawls (#443)
* Use archive_viewer_dep permissions to GET crawls

* Add is_viewer check to archive_dep

* Add API endpoint to add new user to archive directly (/archive/<id>/add-user)

* Add tests

* Refactor tests to use fixtures

* And remove login test that duplicates fixtures
2023-01-09 19:11:53 -08:00
DongWoo Lee
7dbad5c606 add ingress test 2023-01-05 18:44:50 -08:00
DongWoo Lee
65da349caa add more time for pods 2023-01-04 23:54:58 -08:00
DongWoo Lee
947bf6e65d add ingress test 2023-01-04 23:32:32 -08:00
DongWoo Lee
539a4556aa add logging service CI with k3d 2023-01-04 22:52:41 -08:00
Ilya Kreymer
2d93cef966
CI: Add K3D CI test (#405)
- add testing with K3D cluster
- bump backend image to python 3.10-slim for newer python, smaller image.
- bump to 1.2.0-beta.0
2022-12-07 23:26:16 -08:00
Ilya Kreymer
ba3e7d048f
build: increase network timeout for yarn for frontend build for arm64 build (#401)
- build: increase network timeout for frontend build for arm64 build (see nodejs/docker-node#1335)
- also bump microk8s action version to fix warning
2022-12-06 10:19:27 -08:00
Ilya Kreymer
e6f99bf2be ci: ensure qemu is setup for multiarch build 2022-12-05 18:38:26 -08:00
Ilya Kreymer
829548af0f doc tweaks:
- fix typos in docs
- update prod deployment info
- update minikube info
- add info on how to run with local images
- bump version to 1.1.0-beta.3 for testing multiarch build
2022-12-05 18:14:19 -08:00
Henry Wilkinson
a74d88dcda
mkdocs setup (deploy, dev, user-guide) (#375)
* Initial docs move
* Setup mkdocs
* Adds instructions for building docs
* add new deployment docs, local and prod
* set up three sections: deployment, dev and user guide
* remove old deployment docs
* ci: mkdocs gh-pages publish

Co-authored-by: sua yoo <sua@suayoo.com>
Co-authored-by: Ilya Kreymer <ikreymer@gmail.com>
2022-12-05 16:41:37 -08:00
Ilya Kreymer
82ffc0dfbc
Local Deployment Work: Support running locally + test cluster on CI (#396)
* k8s local deployment work:
- make it easier to deploy w/o ingress by setting 'local_service_port' (suggested port 30870)
- if using local minio, ensure file endpoints set to /data/ and /data/ proxies correctly to local bucket
- if not using minio, ensure file endpoints point to correct access / endpoint url.
- setup should work with docker desktop, minikube, microk8s and k3s!
- nginx chart: bump nginx memory limit to 20Mi
- nginx image: 00-default-override-resolver-config -> 00-browsertrix-nginx-init for clarity
- nginx image: use default nginx.conf, pin to nginx 1.23.2
- mongo: readd readiness probe, bump connect wait timeout (needed for ci)
- config: set superadmin username to 'admin'
- config schema: set 'name' as required 
- add sample chart values overrides:
- chart values: local-config.yaml for running locally with 'local_service_port'
- chart values: add microk8s-hosted.yaml for configuring a hosted microk8s setup
- chart values: add microk8s-ci.yaml for ci tests
- ci: remove docker swarm tests
- ci: add microk8s integration tests: launching cluster, logging in, running a crawl of example.com, downloading/checking WACZ
- bump to 1.1.0-beta.2
2022-12-02 19:58:34 -08:00
Ilya Kreymer
d6386b7051
Release Build + Versioning (#373)
- Adds version to version.txt in root
- adds update-version.sh which updates version in frontend/package.json and backend/btrixcloud/version.py
- frontend: loads version from $VERSION env var, ../version.txt or package.json
- ci: on new github release, pushes webrecorder/browsertrix-backend and webrecorder/browsertrix-frontend images to Dockerhub with current version, as well as latest.
- version set to 1.1.0-beta.0
- closes #357
2022-11-18 17:15:25 -08:00
Ilya Kreymer
418c07bf0d
Local swarm + podman support (#261)
* backend: refactor swarm support to also support podman (#260)
- implement podman support as subclass of swarm deployment
- podman is used when 'RUNTIME=podman' env var is set
- podman socket is mapped instead of docker socket
- podman-compose is used instead of docker-compose (though docker-compose works with podman, it does not support secrets, but podman-compose does)
- separate cli utils into SwarmRunner and PodmanRunner which extends it
- using config.yaml and config.env, both copied from sample versions
- work on simplifying config: add docker-compose.podman.yml and docker-compose.swarm.yml and signing and debug configs in ./configs
- add {build,run,stop}-{swarm,podman}.sh in scripts dir
- add init-configs, only copy if configs don't exist
- build local image use current version of podman, to support both podman 3.x and 4.x
- additional fixes for after testing podman on centos
- docs: update Deployment.md to cover swarm, podman, k8s deployment
2022-06-14 00:13:49 -07:00
Ilya Kreymer
e3f268a2e8
CI setup for new swarm mode (#248)
- build backend and frontend with cacheing using GHA cache)
- streamline frontend image to reduce layers
- setup local swarm with test/setup.sh script, wait for containers to init
- copy sample config files as default (add storages.sample.yaml)
- add initial backend test for logging in with default superadmin credentials via 127.0.0.1:9871
- must use 127.0.0.1 instead of localhost for accessing frontend container within action
2022-06-06 09:34:02 -07:00