feat: add postgres/gitea/blackbox alerts and more node alerts
Some checks failed
terraform-dev / validate (push) Successful in 6s
terraform-dev / plan (push) Failing after 11s
terraform-dev / apply (push) Has been skipped

Alexandr
2026-04-03 11:34:08 +03:00
parent 03dfa99400
commit 5af763ebb1
18 changed files with 393 additions and 1 deletions


@@ -0,0 +1,19 @@
name: "DEV ADIBROV - PostgreSQL Down"
expression: |
  pg_up{job="postgres"}
threshold: 1
for: "2m"
condition_type: "lt"
need_reduce: true
reducer_type: "min"
no_data_state: "Alerting"
exec_err_state: "Alerting"
labels:
  service: "postgres"
  severity: "critical"
  team: "infra"
summary: |
  PostgreSQL is unavailable on {{ $labels.instance }}
description: |
  The exporter cannot connect to PostgreSQL on {{ $labels.instance }}.
  The database is either down or unreachable over the network.
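
The fields `need_reduce`, `reducer_type`, `condition_type`, and `threshold` appear to describe a reduce-then-compare step. The Terraform module that consumes this YAML is not part of the diff, so the following is only a minimal Python sketch of the assumed semantics; `is_breaching`, `REDUCERS`, and `CONDITIONS` are hypothetical names introduced for illustration.

```python
from statistics import mean

# Hypothetical helpers. The field names mirror the YAML above, but how the
# module interprets them is an assumption, not taken from its source.
REDUCERS = {"min": min, "max": max, "mean": mean, "last": lambda xs: xs[-1]}
CONDITIONS = {
    "lt": lambda value, threshold: value < threshold,
    "gt": lambda value, threshold: value > threshold,
}


def is_breaching(samples, reducer_type, condition_type, threshold):
    """Reduce a series to a single number, then compare it with the threshold."""
    reduced = REDUCERS[reducer_type](samples)
    return CONDITIONS[condition_type](reduced, threshold)


# pg_up dropped to 0 once inside the query window: min is 0, and 0 < 1 holds.
down = is_breaching([1, 1, 0, 1], "min", "lt", 1)
# pg_up stayed at 1 for the whole window: min is 1, and 1 < 1 does not hold.
healthy = is_breaching([1, 1, 1, 1], "min", "lt", 1)
print(down, healthy)
```

With `reducer_type: "min"` and `condition_type: "lt"`, a single failed scrape inside the window is enough to breach the condition; the `for: "2m"` setting then decides how long the breach must persist.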


@@ -0,0 +1,21 @@
name: "DEV ADIBROV - PostgreSQL Long Running Transactions"
expression: |
  max by(instance) (
    pg_stat_activity_max_tx_duration{job="postgres", state="active"}
  )
threshold: 300
for: "5m"
condition_type: "gt"
need_reduce: true
reducer_type: "max"
no_data_state: "OK"
exec_err_state: "Error"
labels:
  service: "postgres"
  severity: "warning"
  team: "infra"
summary: |
  Long-running transaction in PostgreSQL on {{ $labels.instance }}: {{ printf "%.0f" $values.B.Value }}s
description: |
  There is a transaction on {{ $labels.instance }} that has been running for more than 5 minutes.
  Long-running transactions block vacuum and can cause bloat to accumulate.
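
The `for: "5m"` setting adds a pending period on top of the 300-second threshold. Below is a simplified Python sketch of that behavior, assuming one evaluation per minute and a breach that must hold continuously; `alert_states` is a hypothetical helper, not the alerting engine's actual state machine.

```python
def alert_states(breaching, interval_s, for_s):
    """Return one state per evaluation: Normal, Pending, or Alerting."""
    states = []
    breach_started = None  # time of the evaluation at which the current breach began
    for i, hit in enumerate(breaching):
        now = i * interval_s
        if not hit:
            # Any non-breaching evaluation resets the pending timer.
            breach_started = None
            states.append("Normal")
            continue
        if breach_started is None:
            breach_started = now
        elapsed = now - breach_started
        states.append("Alerting" if elapsed >= for_s else "Pending")
    return states


# Seven consecutive breaching evaluations, 60 s apart, with for = 300 s.
steady = alert_states([True] * 7, interval_s=60, for_s=300)
# A single healthy evaluation in the middle restarts the pending period.
flapping = alert_states([True, True, False, True], interval_s=60, for_s=300)
print(steady)
print(flapping)
```

Under this model, a transaction is already at least 300 seconds old when the breach starts, and the rule waits a further five minutes before alerting.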


@@ -0,0 +1,22 @@
name: "DEV ADIBROV - PostgreSQL Too Many Connections"
expression: |
  (
    sum by (instance) (pg_stat_activity_count{job="postgres"})
    / max by (instance) (pg_settings_max_connections{job="postgres"})
  ) * 100
threshold: 80
for: "5m"
condition_type: "gt"
need_reduce: true
reducer_type: "max"
no_data_state: "OK"
exec_err_state: "Error"
labels:
  service: "postgres"
  severity: "warning"
  team: "infra"
summary: |
  PostgreSQL: {{ printf "%.0f" $values.B.Value }}% of connections in use on {{ $labels.instance }}
description: |
  {{ printf "%.0f" $values.B.Value }}% of PostgreSQL max_connections is in use on {{ $labels.instance }}.
  Once the limit is reached, new connections will be rejected with an error.
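
The percentage reported by this alert is the number of open connections divided by `max_connections`, per instance. Here is a minimal Python sketch of that arithmetic, assuming the exporter reports connection counts as several rows per instance (for example one per database and state) that have to be summed before dividing. The instance name and sample values are hypothetical.

```python
from collections import defaultdict


def connection_usage_percent(activity, max_connections):
    """Sum connection counts per instance and express them as a percentage of the limit.

    activity: list of (labels, value) pairs, where labels is a dict with an "instance" key.
    max_connections: dict mapping instance -> configured connection limit.
    """
    totals = defaultdict(float)
    for labels, value in activity:
        totals[labels["instance"]] += value
    return {
        instance: totals[instance] * 100 / limit
        for instance, limit in max_connections.items()
        if instance in totals and limit > 0
    }


# Hypothetical samples: two per-state rows for one instance with a limit of 100.
activity = [
    ({"instance": "db1:9187", "state": "active"}, 70),
    ({"instance": "db1:9187", "state": "idle"}, 15),
]
usage = connection_usage_percent(activity, {"db1:9187": 100})
print(usage)
```

Here 85 of 100 connections are in use, which is above the `threshold: 80` configured for this alert.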