feat: add postgres/gitea/blackbox alerts and more node alerts
Some checks failed
terraform-dev / validate (push) Successful in 6s
terraform-dev / plan (push) Failing after 11s
terraform-dev / apply (push) Has been skipped

Alexandr
2026-04-03 11:34:08 +03:00
parent 03dfa99400
commit 5af763ebb1
18 changed files with 393 additions and 1 deletions


@@ -0,0 +1,19 @@
name: "DEV ADIBROV - Endpoint Down (Blackbox)"
expression: |
  probe_success{job="blackbox"}
threshold: 1
for: "3m"
condition_type: "lt"
need_reduce: true
reducer_type: "min"
no_data_state: "Alerting"
exec_err_state: "Alerting"
labels:
  service: "blackbox"
  severity: "critical"
  team: "infra"
summary: |
  Endpoint unavailable: {{ $labels.instance }}
description: |
  Blackbox exporter cannot reach {{ $labels.instance }}.
  The service has been unreachable from outside for more than 3 minutes.
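
For reviewers: a minimal Python sketch of how the fields above presumably combine. It assumes the consuming Terraform module reduces the samples of `probe_success` with `reducer_type`, compares the result to `threshold` using `condition_type`, and treats missing data according to `no_data_state`. The function name `endpoint_down` is hypothetical, and the `for: "3m"` pending period is not modelled.

```python
def endpoint_down(samples, threshold=1.0):
    """Return True when the rule condition would be met (assumed semantics)."""
    if not samples:
        # no_data_state: "Alerting" -> missing data is treated as firing
        return True
    reduced = min(samples)      # reducer_type: "min"
    return reduced < threshold  # condition_type: "lt"


if __name__ == "__main__":
    print(endpoint_down([1, 1, 1]))  # every probe in the window succeeded
    print(endpoint_down([1, 0, 1]))  # one failed probe pulls the minimum to 0
    print(endpoint_down([]))         # no data at all
```

Because the reducer is `min`, a single failed probe in the evaluated window is enough to satisfy the condition; the alert then fires only if that stays true for the whole `for` period.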


@@ -0,0 +1,19 @@
name: "DEV ADIBROV - SSL Certificate Expiring Soon"
expression: |
  (probe_ssl_earliest_cert_expiry{job="blackbox"} - time()) / 86400
threshold: 14
for: "1h"
condition_type: "lt"
need_reduce: true
reducer_type: "min"
no_data_state: "OK"
exec_err_state: "Error"
labels:
  service: "blackbox"
  severity: "warning"
  team: "infra"
summary: |
  SSL certificate expires in {{ printf "%.0f" $values.B.Value }} days: {{ $labels.instance }}
description: |
  The SSL certificate for {{ $labels.instance }} expires in less than 14 days.
  Renew the certificate before it expires.
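
For reviewers: the expression converts the certificate's expiry timestamp (Unix seconds) into days remaining, and the rule compares that to `threshold: 14`. The same arithmetic as a small Python sketch; the function names are hypothetical, and the `for: "1h"` pending period is not modelled.

```python
import time

SECONDS_PER_DAY = 86400


def days_until_expiry(expiry_ts, now=None):
    """Mirror (probe_ssl_earliest_cert_expiry - time()) / 86400."""
    if now is None:
        now = time.time()
    return (expiry_ts - now) / SECONDS_PER_DAY


def cert_expiring_soon(expiry_ts, now=None, threshold_days=14):
    """condition_type: "lt" against threshold: 14."""
    return days_until_expiry(expiry_ts, now) < threshold_days


if __name__ == "__main__":
    now = 1_000_000
    expiry = now + 10 * SECONDS_PER_DAY
    print(days_until_expiry(expiry, now))   # 10.0
    print(cert_expiring_soon(expiry, now))  # True
```

One thing to keep in mind: the summary formats the value with `%.0f`, which rounds. A certificate with 13.6 days left satisfies the condition but is reported as "14 days".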