grafana-terraform/environments/dev/adibrov/alerts/node/node_high_load.yaml
Alexandr 5af763ebb1
Some checks failed
terraform-dev / validate (push) Successful in 6s
terraform-dev / plan (push) Failing after 11s
terraform-dev / apply (push) Has been skipped
feat: add postgres/gitea/blackbox alerts and more node alerts
2026-04-03 11:34:08 +03:00

name: "DEV ADIBROV - High System Load"
expression: |
  node_load5{job="node_exporter"} / on(instance) machine_cpu_cores{job="cadvisor"}
threshold: 2
for: "10m"
condition_type: "gt"
need_reduce: true
reducer_type: "max"
no_data_state: "OK"
exec_err_state: "Error"
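# Worked example of the settings above, as a rough sketch. The numbers are
# illustrative and not taken from this environment.
#   node_load5 = 9.0 on a host where machine_cpu_cores = 4
#   9.0 / 4 = 2.25 load per core, which is "gt" threshold 2
#   if the max-reduced value stays above 2 for the whole 10m window,
#   the rule is expected to fire
#   at node_load5 = 7.0 the ratio is 1.75 and the rule stays OK
# Assumption: the node_exporter and cadvisor series carry the same `instance`
# label value. If they differ (for example, different ports), on(instance)
# would match nothing, the query would return no series, and with
# no_data_state "OK" the rule would stay OK without any signal.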
labels:
  service: "system"
  severity: "warning"
  team: "infra"
summary: |
  High load average on {{ $labels.instance }}: {{ printf "%.2f" $values.B.Value }} per core
description: |
  The 5-minute load average on {{ $labels.instance }} exceeds 2x the number of CPU cores.
  The system is overloaded: processes are waiting in the run queue.