browsertrix/.github/workflows/k3d-ci.yaml
Ilya Kreymer ad9bca2e92
Operator refactor to control pods + pvcs directly instead of statefulsets (#1149)
- Ability for a pod to be Completed, unlike in a StatefulSet: with a StatefulSet, if 3 pods are running and the first one finishes, all 3 must keep running until all 3 are done. With this setup, the first finished pod can remain in Completed state.
- Fixed shutdown order - crawler pods now correctly shut down first, before redis pods, by switching to background deletion.
- Pod priority decreases with scale: 1st instance of a new crawl can preempt 3rd or 2nd instance of another crawl
- Create priority classes up to 'max_crawl_scale', configured in values.yaml
- Improved scale change reconciliation: if increasing scale, immediately scale up. If decreasing scale,
gracefully stop the scaled-down instances via the redis 'stopone' key, and wait until they exit with Completed state
before adjusting status.scale / removing the scaled-down pods. Ensures unaccepted interrupts don't cause scaled-down data to be deleted.
- Redis pod remains inactive until crawler is first active, or after no crawl pods are active for 60 seconds
- Configurable Redis storage with 'redis_storage' value, set to 3Gi by default
- CrawlJob deletion starts as soon as post-finish crawl operations are run
- Post-crawl operations get their own redis instance, since the one used during the response is being cleaned up in the finalizer
- Finalizer ignores request with incorrect state (returns 400 if reported as not finished while crawl is finished)
- Current resource usage added to status
- Profile browser: also manage single pod directly without statefulset for consistency.
- Restart pods via restartTime value: if spec.restartTime != status.restartTime, clear out pods and update status.restartTime (using OnDelete policy to avoid recreate loops in edge cases).
- Update to latest metacontroller (v4.11.0)
- Add --restartOnError flag for crawler (for browsertrix-crawler 0.11.0)
- Failed crawl logging: add 'fail_crawl()' to be used for failing a crawl, which prints logs for the default container (if enabled) as well as pod status
- tests: check other finished states to avoid getting stuck in an infinite loop if the crawl fails
- tests: disable disk utilization check, which adds unpredictability to crawl testing!
fixes #1147 

---------
Co-authored-by: Tessa Walsh <tessa@bitarchivist.net>
2023-09-11 10:38:04 -07:00


name: Cluster Run (K3d)
on:
  push:
    paths:
      - 'backend/**'
      - 'chart/**'
  pull_request:
    paths:
      - 'backend/**'
      - 'chart/**'
jobs:
  btrix-k3d-test:
    runs-on: ubuntu-latest
    steps:
      - name: Create k3d Cluster
        uses: AbsaOSS/k3d-action@v2
        with:
          cluster-name: btrix-1
          args: >-
            -p "30870:30870@agent:0:direct"
            --agents 1
            --no-lb
            --k3s-arg "--no-deploy=traefik,servicelb,metrics-server@server:*"
      - name: Checkout
        uses: actions/checkout@v3
      - name: Set up Docker Buildx
        uses: docker/setup-buildx-action@v2
        with:
          driver-opts: network=host
      - name: Build Backend
        uses: docker/build-push-action@v3
        with:
          context: backend
          load: true
          #outputs: type=tar,dest=backend.tar
          tags: webrecorder/browsertrix-backend:latest
          cache-from: type=gha,scope=backend
          cache-to: type=gha,scope=backend,mode=max
      - name: Build Frontend
        uses: docker/build-push-action@v3
        with:
          context: frontend
          load: true
          #outputs: type=tar,dest=frontend.tar
          tags: webrecorder/browsertrix-frontend:latest
          cache-from: type=gha,scope=frontend
          cache-to: type=gha,scope=frontend,mode=max
      - name: 'Import Images'
        run: |
          k3d image import webrecorder/browsertrix-backend:latest -m direct -c btrix-1 --verbose
          k3d image import webrecorder/browsertrix-frontend:latest -m direct -c btrix-1 --verbose
      - name: Install Kubectl
        uses: azure/setup-kubectl@v3
      - name: Install Helm
        uses: azure/setup-helm@v3
        with:
          version: 3.10.2
      - name: Start Cluster with Helm
        run: |
          helm upgrade --install -f ./chart/values.yaml -f ./chart/test/test.yaml btrix ./chart/
      - name: Install Python
        uses: actions/setup-python@v3
        with:
          python-version: '3.9'
      - name: Install Python Libs
        run: pip install pytest requests
      - name: Wait for all pods to be ready
        run: kubectl wait --for=condition=ready pod --all --timeout=240s
      - name: Run Tests
        run: pytest -s -vv ./backend/test/*.py
      - name: Print Backend Logs (API)
        if: ${{ failure() }}
        run: kubectl logs svc/browsertrix-cloud-backend -c api
      - name: Print Backend Logs (Operator)
        if: ${{ failure() }}
        run: kubectl logs svc/browsertrix-cloud-backend -c op
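      # Optional sketch, not part of the original workflow: when a run fails,
      # a pod overview may help interpret the backend logs printed by the
      # failure-only steps, e.g. which crawler or redis pods reached Completed
      # and which are still Pending. The step name is illustrative; the
      # command is plain kubectl against the default namespace.
      - name: Print Pod Status
        if: ${{ failure() }}
        run: kubectl get pods -o wide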