browsertrix/chart/templates/role.yaml
Ilya Kreymer fa86555eed
Track pod resource usage, detect OOM crashes, handle auto-scaling (#1235)
* keep track of per pod status on crawljob:
- crashes time, and reason
- 'used' vs 'allocated' resources 
- 'percent' used / allocated

* crawl log errors: log error when crawler crashes via OOM, either via redis error log
or to console

* add initial autoscaling support!
- detect if metrics server is available via K8SApi.is_pod_metrics_available()
- if available, use metrics for 'used' fields
- if no metrics, set memory used for redis only (using redis apis)
- allow overriding memory and cpu via newMemory and newCpu settings on pod status
- scale memory / cpu based on newMemory and newCpu setting
- templates: update jinja templates to allow restarting crawler and redis with new resources
- ci: enable metrics-server on k3d, microk8s and nightly k3d ci runs

* roles: cleanup unused roles, add permissions for listing metrics

* stats for running crawls:
- update in db via operator
- avoids losing stats if redis pod happens to be done
- tradeoff is more db access in operator, but less extra connections to redis + already
loading from db in backend
- size stat: ensure size of previous files is added to the stats

* crawler deployment tweaks:
- adjust cpu/mem per browser
- add --headless flag to configmap to use new headless mode by default!
2023-10-05 20:41:18 -07:00

43 lines
1.2 KiB
YAML

---
kind: Role
apiVersion: rbac.authorization.k8s.io/v1
metadata:
namespace: {{ .Values.crawler_namespace }}
name: crawler-run
rules:
- apiGroups: [""]
resources: ["pods", "pods/exec", "pods/log", "services", "configmaps", "secrets", "events", "persistentvolumeclaims"]
verbs: ["get", "list", "watch", "create", "update", "patch", "delete", "deletecollection"]
- apiGroups: ["batch", "extensions", "apps"]
resources: ["jobs", "cronjobs", "statefulsets"]
verbs: ["get", "list", "watch", "create", "update", "patch", "delete", "deletecollection"]
- apiGroups: ["btrix.cloud"]
resources: ["crawljobs", "profilejobs"]
verbs: ["get", "list", "watch", "create", "update", "patch", "delete", "deletecollection"]
- apiGroups: ["metrics.k8s.io"]
resources: ["pods"]
verbs: ["list"]
---
kind: RoleBinding
apiVersion: rbac.authorization.k8s.io/v1
metadata:
name: crawler-role
namespace: {{ .Values.crawler_namespace }}
subjects:
- kind: ServiceAccount
name: default
namespace: {{ .Release.Namespace }}
- kind: User
name: system:anonymous
namespace: {{ .Release.Namespace }}
roleRef:
kind: Role
name: crawler-run
apiGroup: rbac.authorization.k8s.io