browsertrix/backend/btrixcloud/templates/replica_job.yaml
Ilya Kreymer fb3d88291f
Background Jobs Work (#1321)
Fixes #1252 

Adds a generic background job system with two background jobs,
CreateReplicaJob and DeleteReplicaJob.
- CreateReplicaJob runs on new crawls, uploads, and profiles, and updates the
`replicas` array with info about the replica after the job succeeds.
- DeleteReplicaJob deletes the replica.
- Both jobs are created from the new `replica_job.yaml` template. The
CreateReplicaJob sets secrets for primary storage + replica storage,
while DeleteReplicaJob only needs the replica storage.
- The job is processed in the operator when the job is finalized
(deleted), which should happen immediately when the job is done, either
because it succeeds or because the backoffLimit is reached (currently
set to 3).
- The `/jobs/` API lists all jobs in a paginated response, with filtering and sorting.
- `/jobs/<job id>` returns details for a particular job.
- tests: nightly tests updated to check create and delete replica jobs for crawls as well as uploads, plus the job API endpoints.
- tests: also fixes timeouts in nightly tests to keep crawls from finishing too quickly.
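
The split of storage secrets between the two job types can be sketched as below. This is a minimal illustration, not the backend's actual code: the `BgJobType` member names come from the template, while the string values and the `required_remotes` helper are assumptions made for the example.

```python
from enum import Enum
from typing import List


class BgJobType(str, Enum):
    # Member names appear in replica_job.yaml; the string values are assumed.
    CREATE_REPLICA = "create-replica"
    DELETE_REPLICA = "delete-replica"


def required_remotes(job_type: BgJobType) -> List[str]:
    """Return the rclone remotes whose storage secrets a job needs (hypothetical helper)."""
    if job_type == BgJobType.CREATE_REPLICA:
        # Copying reads from primary storage and writes to the replica.
        return ["primary", "replica"]
    if job_type == BgJobType.DELETE_REPLICA:
        # Deleting only touches replica storage.
        return ["replica"]
    raise ValueError(f"unsupported job type: {job_type!r}")
```

A delete job therefore never receives the primary storage credentials, which keeps its access as narrow as the task allows.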

---------
Co-authored-by: Tessa Walsh <tessa@bitarchivist.net>
2023-11-02 13:02:17 -07:00


apiVersion: batch/v1
kind: Job
metadata:
  name: "{{ id }}"
  labels:
    role: "background-job"
    job_type: {{ job_type }}
    btrix.org: {{ oid }}
spec:
  ttlSecondsAfterFinished: 0
  backoffLimit: 3
  template:
    spec:
      restartPolicy: Never
      priorityClassName: bg-jobs
      podFailurePolicy:
        rules:
        - action: FailJob
          onExitCodes:
            containerName: rclone
            operator: NotIn
            values: [0]
      containers:
        - name: rclone
          image: rclone/rclone:latest
          env:
{% if job_type == BgJobType.CREATE_REPLICA %}
          - name: RCLONE_CONFIG_PRIMARY_TYPE
            value: "s3"
          - name: RCLONE_CONFIG_PRIMARY_ACCESS_KEY_ID
            valueFrom:
              secretKeyRef:
                name: "{{ primary_secret_name }}"
                key: STORE_ACCESS_KEY
          - name: RCLONE_CONFIG_PRIMARY_SECRET_ACCESS_KEY
            valueFrom:
              secretKeyRef:
                name: "{{ primary_secret_name }}"
                key: STORE_SECRET_KEY
          - name: RCLONE_CONFIG_PRIMARY_ENDPOINT
            value: "{{ primary_endpoint }}"
            #valueFrom:
            #  secretKeyRef:
            #    name: "{{ primary_secret_name }}"
            #    key: STORE_ENDPOINT_URL
{% endif %}
          - name: RCLONE_CONFIG_REPLICA_TYPE
            value: "s3"
          - name: RCLONE_CONFIG_REPLICA_ACCESS_KEY_ID
            valueFrom:
              secretKeyRef:
                name: "{{ replica_secret_name }}"
                key: STORE_ACCESS_KEY
          - name: RCLONE_CONFIG_REPLICA_SECRET_ACCESS_KEY
            valueFrom:
              secretKeyRef:
                name: "{{ replica_secret_name }}"
                key: STORE_SECRET_KEY
          - name: RCLONE_CONFIG_REPLICA_ENDPOINT
            value: "{{ replica_endpoint }}"
            #valueFrom:
            #  secretKeyRef:
            #    name: "{{ replica_secret_name }}"
            #    key: STORE_ENDPOINT_URL
{% if job_type == BgJobType.CREATE_REPLICA %}
          command: ["rclone", "-vv", "copyto", "--checksum", "primary:{{ primary_file_path }}", "replica:{{ replica_file_path }}"]
{% elif job_type == BgJobType.DELETE_REPLICA %}
          command: ["rclone", "-vv", "delete", "replica:{{ replica_file_path }}"]
{% endif %}
          resources:
            limits:
              memory: "100Mi"
            requests:
              memory: "100Mi"
              cpu: "50m"
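
The template relies on rclone's convention of defining a remote purely through environment variables named `RCLONE_CONFIG_<REMOTE>_<OPTION>`, and it picks the container command from the job type. The Python sketch below mirrors that logic so it can be checked outside Kubernetes. The helper names `rclone_env_names` and `rclone_command`, the `BgJobType` string values, and the sample paths in the usage note are assumptions for illustration and are not taken from the backend.

```python
from enum import Enum
from typing import List, Optional


class BgJobType(str, Enum):
    # Member names appear in the template; the string values are assumed.
    CREATE_REPLICA = "create-replica"
    DELETE_REPLICA = "delete-replica"


def rclone_env_names(remote: str) -> List[str]:
    """Env var names that configure one rclone remote, as used in the template."""
    prefix = f"RCLONE_CONFIG_{remote.upper()}_"
    options = ("TYPE", "ACCESS_KEY_ID", "SECRET_ACCESS_KEY", "ENDPOINT")
    return [prefix + option for option in options]


def rclone_command(
    job_type: BgJobType,
    replica_file_path: str,
    primary_file_path: Optional[str] = None,
) -> List[str]:
    """Build the container command the template renders for each job type."""
    if job_type == BgJobType.CREATE_REPLICA:
        if primary_file_path is None:
            raise ValueError("CREATE_REPLICA needs a primary file path")
        # copyto copies a single file; --checksum compares checksums
        # rather than modification times.
        return [
            "rclone", "-vv", "copyto", "--checksum",
            f"primary:{primary_file_path}",
            f"replica:{replica_file_path}",
        ]
    if job_type == BgJobType.DELETE_REPLICA:
        return ["rclone", "-vv", "delete", f"replica:{replica_file_path}"]
    raise ValueError(f"unsupported job type: {job_type!r}")
```

For example, `rclone_command(BgJobType.DELETE_REPLICA, "bucket/file.wacz")` yields a command that references only the `replica:` remote, which matches the template skipping the `PRIMARY` variables for delete jobs.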