browsertrix/chart/app-templates/background_job.yaml
Ilya Kreymer 8a507f0473
Consolidate list page endpoints + better QA sorting + optimize pages fix (#2417)
- consolidate list_pages() and list_replay_query_pages() into
list_pages()
- to keep backwards compatibility, add <crawl>/pagesSearch that does not
include page totals, keep <crawl>/pages with page total (slower)
- qa frontend: add default 'Crawl Order' sort order, to better show
pages in QA view
- bgjob: account for parallelism in bgjobs, add logging if the succeeded
count does not match parallelism
- QA sorting: use 'crawl order' by default to get better results.
- Optimize pages job: also cover crawls that may not have any pages but have pages listed in done stats
- Bgjobs: give custom op jobs more memory
2025-02-21 13:47:20 -08:00


apiVersion: batch/v1
kind: Job
metadata:
  name: "{{ id }}"
  labels:
    role: "background-job"
    job_type: {{ job_type }}
{% if oid %}
    btrix.org: {{ oid }}
{% endif %}
spec:
  ttlSecondsAfterFinished: 90
  backoffLimit: 3
{% if scale %}
  parallelism: {{ scale }}
{% endif %}
  template:
    spec:
      restartPolicy: Never
      priorityClassName: bg-job
      podFailurePolicy:
        rules:
          - action: FailJob
            onExitCodes:
              containerName: btrixbgjob
              operator: NotIn
              values: [0]
      volumes:
        - name: ops-configs
          secret:
            secretName: ops-configs
      containers:
        - name: btrixbgjob
          image: {{ backend_image }}
          imagePullPolicy: {{ pull_policy }}
          env:
            - name: BG_JOB_TYPE
              value: {{ job_type }}
{% if oid %}
            - name: OID
              value: {{ oid }}
{% endif %}
            - name: CRAWL_TYPE
              value: {{ crawl_type }}
{% if crawl_id %}
            - name: CRAWL_ID
              value: {{ crawl_id }}
{% endif %}
          envFrom:
            - configMapRef:
                name: backend-env-config
            - secretRef:
                name: mongo-auth
          volumeMounts:
            - name: ops-configs
              mountPath: /ops-configs/
          command: ["python3", "-m", "btrixcloud.main_bg"]
          resources:
{% if larger_resources %}
            limits:
              memory: "1200Mi"
            requests:
              memory: "500Mi"
              cpu: "200m"
{% else %}
            limits:
              memory: "200Mi"
            requests:
              memory: "200Mi"
              cpu: "50m"
{% endif %}
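
For illustration only, here is a sketch of how the conditional parts of the template above might render. It assumes hypothetical values that do not come from the source: id set to "example-bg-job", job_type set to "example-op", scale set to 2, larger_resources truthy, and oid unset (so the btrix.org label is dropped). Unrelated sections such as volumes, env, and envFrom are left out for brevity, and podFailurePolicy is left out because its position in the original file is unclear.

```yaml
# Hypothetical rendering; every concrete value below is a placeholder assumption.
apiVersion: batch/v1
kind: Job
metadata:
  name: "example-bg-job"
  labels:
    role: "background-job"
    job_type: example-op
    # no btrix.org label, because oid is assumed unset
spec:
  ttlSecondsAfterFinished: 90
  backoffLimit: 3
  parallelism: 2            # emitted only when scale is set
  template:
    spec:
      restartPolicy: Never
      priorityClassName: bg-job
      containers:
        - name: btrixbgjob
          resources:
            # larger_resources branch: higher memory limit for custom op jobs
            limits:
              memory: "1200Mi"
            requests:
              memory: "500Mi"
              cpu: "200m"
```

With larger_resources unset, the else branch would apply instead, giving a 200Mi memory limit and requests of 200Mi memory and 50m CPU.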