Prometheus Alertmanager configuration

Category: prometheus | Author: 小爱 | Posted: 07-31

Rules
Noise reduction, silencing, inhibition, templates, API
Noise reduction
Reduction by grouping (group)

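A new group waits 30s (the default) so that its first notification is sent as one batch: group_wait: 30s

When new alerts join an existing group, the next batch is sent after 5m: group_interval: 5m

An alert that has already been notified successfully is sent again after 3h: repeat_interval: 3h

By default, an alert that has not been refreshed for 5m is marked as resolved: resolve_timeout: 5m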
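The interaction of the three timers can be sketched with a small simulation. This is a simplified model written for illustration, not Alertmanager's own code: the function name `notification_times` and its parameters are assumptions, times are in seconds, and the model assumes that one group is flushed first after `group_wait` and then once per `group_interval`, sending only when new alerts have arrived or `repeat_interval` has elapsed.

```python
def notification_times(arrivals, horizon,
                       group_wait=30, group_interval=300,
                       repeat_interval=3 * 3600):
    """Return the times (in seconds) at which one alert group would notify.

    Simplified model (an assumption): the first flush happens at
    first arrival + group_wait, later flushes happen every group_interval,
    and a flush sends only if new alerts arrived since the last notification
    or repeat_interval has passed since it.
    """
    arrivals = sorted(arrivals)
    if not arrivals:
        return []
    sent = []
    notified_count = 0
    flush = arrivals[0] + group_wait
    while flush <= horizon:
        seen = sum(1 for t in arrivals if t <= flush)
        has_new = seen > notified_count
        repeat_due = bool(sent) and flush - sent[-1] >= repeat_interval
        if has_new or repeat_due:
            sent.append(flush)
            notified_count = seen
        flush += group_interval
    return sent


if __name__ == "__main__":
    # Alerts at t=0s and t=10s are batched; the alert at t=100s joins later.
    print(notification_times([0, 10, 100], horizon=4 * 3600))
```

With the values from this article, the two alerts that arrive within the first 30s share one notification, the third alert waits for the next group_interval flush, and the repeat only goes out at the first flush that falls at least 3h after the last notification.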

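Reduction by inhibition (inhibit_rules)

Alerts that share the same dimensions (the same label values) can be suppressed according to severity, or any other condition, through inhibit_rules.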
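As a rough illustration of what an inhibit rule checks, the sketch below evaluates one rule against a list of active alerts. It is a simplified model, not Alertmanager's implementation: `is_inhibited` and the dict layout of `rule` are assumptions for the example, alerts are plain label dicts, and a missing label is treated as an empty value.

```python
def _matches(alert, wanted):
    """True if every label in `wanted` has exactly that value on the alert."""
    return all(alert.get(k, "") == v for k, v in wanted.items())


def is_inhibited(target, active_alerts, rule):
    """True if `target` is muted by some active source alert under `rule`."""
    if not _matches(target, rule["target_match"]):
        return False
    for source in active_alerts:
        if source is target:
            continue  # an alert does not inhibit itself
        if not _matches(source, rule["source_match"]):
            continue
        # The labels listed in `equal` must agree between source and target.
        if all(source.get(l, "") == target.get(l, "") for l in rule["equal"]):
            return True
    return False


RULE = {
    "equal": ["alertname", "cluster", "service"],
    "source_match": {"severity": "critical"},
    "target_match": {"severity": "warning"},
}

if __name__ == "__main__":
    crit = {"alertname": "DiskFull", "cluster": "a", "service": "db", "severity": "critical"}
    warn = {"alertname": "DiskFull", "cluster": "a", "service": "db", "severity": "warning"}
    print(is_inhibited(warn, [crit, warn], RULE))
```

The warning is muted only while a critical alert with the same alertname, cluster, and service is active; a warning from a different cluster is still delivered.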
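Reduction by silencing (silence)

Configured in the web UI: which conditions (label matchers) to silence, and for which time window.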
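A silence can also be created without the web UI. The sketch below builds the JSON body that the Alertmanager v2 API accepts at /api/v2/silences and shows one way to post it. The helper names `build_silence` and `post_silence`, the base URL, and the example matcher values are assumptions for illustration; only `build_silence` runs without a live Alertmanager.

```python
import json
import urllib.request
from datetime import datetime, timedelta, timezone


def build_silence(matchers, hours, created_by, comment, now=None):
    """Build a silence body: which labels to match and for how long."""
    start = now or datetime.now(timezone.utc)
    end = start + timedelta(hours=hours)
    return {
        "matchers": [
            {"name": name, "value": value, "isRegex": False}
            for name, value in matchers.items()
        ],
        "startsAt": start.isoformat(),
        "endsAt": end.isoformat(),
        "createdBy": created_by,
        "comment": comment,
    }


def post_silence(silence, base_url="http://127.0.0.1:9093"):
    """POST the silence to a running Alertmanager (address is an assumption)."""
    req = urllib.request.Request(
        base_url + "/api/v2/silences",
        data=json.dumps(silence).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(req, timeout=5) as resp:
        return json.loads(resp.read().decode("utf-8"))


if __name__ == "__main__":
    body = build_silence({"alertname": "DiskFull", "cluster": "a"},
                         hours=2, created_by="ops", comment="planned maintenance")
    print(json.dumps(body, indent=2))
```

Building the body separately from sending it makes it easy to inspect the matchers and the time window before anything is silenced.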
install
wget https://github.com/prometheus/alertmanager/releases/download/v0.20.0/alertmanager-0.20.0.linux-amd64.tar.gz
tar xzf alertmanager-0.20.0.linux-amd64.tar.gz
cd alertmanager-0.20.0.linux-amd64

./alertmanager --config.file=alertmanager.yml --storage.path=data --data.retention=24h
configuration
Command-line flags set the system parameters; the configuration file defines the rules.

command-line
# General
-h
--config.file=simple.yml
--storage.path=./data
--data.retention=120h
--log.level=debug|info|warn|error
--web.listen-address="0.0.0.0:9093"
# High availability
--cluster.listen-address="0.0.0.0:9094"
--cluster.peer="192.168.1.1:9094"
--cluster.peer-timeout="15s"
configure-file
global:
  # email
  smtp_smarthost: 'smtp.exmail.qq.com:25'
  smtp_from: 'xxxx'
  smtp_auth_username: 'xxx'
  smtp_auth_password: 'xxx'
  # wechat
  wechat_api_url: "https://qyapi.weixin.qq.com/cgi-bin/"
  wechat_api_secret: "xxx"
  wechat_api_corp_id: "xxx"
  # An alert that is not refreshed within 5m is treated as resolved
  resolve_timeout: 5m

# Custom notification templates
templates:
- 'D:\\gohome\\src\\github.com\\prometheus\\alertmanager\\template\\my.tmpl'

# Routing tree: root node
route:
  receiver: 'default-receiver'
  # Grouping dimensions
  group_by: ['alertname', 'cluster', 'service']

  # How long to initially wait to send a notification for a group
  # of alerts. Allows to wait for an inhibiting alert to arrive or collect
  # more initial alerts for the same group. (Usually ~0s to few minutes.)
  # A new group waits 30s before its first notification
  group_wait: 30s

  # How long to wait before sending a notification about new alerts that
  # are added to a group of alerts for which an initial notification has
  # already been sent. (Usually ~5m or more.)
  # New alerts joining an existing group are sent after 5m
  group_interval: 5m

  # How long to wait before sending a notification again if it has already
  # been sent successfully for an alert. (Usually ~3h or more).
  # An alert that was sent successfully is repeated after 3h
  repeat_interval: 3h

  # Level 1
  routes:
  - receiver: 'team-X-pager'
    group_wait: 10s # overrides the parent's group_wait
    match:
      severity: 'critical'

    # Level 2
    routes:
    - receiver: 'team-Y-pager'
      match_re:
        service: '^(foo1|foo2|baz)$'

      routes:
      - receiver: 'team-X-pager'
        match:
          owner: 'team-X'
        continue: true
      - receiver: 'team-Y-pager'
        match:
          owner: 'team-Y'

  - receiver: 'team-Y-pager'
    group_by: ['alertname', 'cluster', 'database']
    match_re:
      service: ^(foo1|foo2|baz)$

# Receivers
receivers:
- name: 'default-receiver'
  webhook_configs:
  - url: 'http://127.0.0.1:5001/'

- name: 'team-X-mails'
  email_configs:
  - to: 'abc@abc.com'

- name: 'team-X-pager'
  email_configs:
  - to: 'abc@abc.com'
  webhook_configs:
  - url: 'http://127.0.0.1:5001/'

- name: 'team-Y-pager'
  webhook_configs:
  - url: 'http://127.0.0.1:5001/'
  email_configs:
  - to: 'abc@abc.com'
    # The HTML body of the email notification.
    # Uses the custom template.
    # my.tmpl contains:
    # {{ define "email.my.html" }}test html template{{ .CommonAnnotations.SortedPairs.Values | join " " }}{{ end }}
    html: '{{ template "email.my.html" . }}'

# Inhibition
# For alerts with the same alertname, cluster and service:
# while a critical alert exists, the warning alert is inhibited
inhibit_rules:
- equal: ['alertname', 'cluster', 'service']
  source_match:
    severity: 'critical'
  target_match:
    severity: 'warning'
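To make the routing tree in the route section easier to reason about, here is a small sketch of how an alert's labels could be resolved to receivers. It is an illustrative model, not Alertmanager's code: `resolve` and `node_matches` are hypothetical helpers, the tree is written as Python dicts, and the three-level nesting of the owner routes under the service route is an assumption about the intended indentation.

```python
import re


def node_matches(node, labels):
    """A node matches when all of its match and match_re entries hold."""
    exact = all(labels.get(k, "") == v
                for k, v in node.get("match", {}).items())
    regex = all(re.fullmatch(p, labels.get(k, "")) is not None
                for k, p in node.get("match_re", {}).items())
    return exact and regex


def resolve(node, labels):
    """Return the receivers for `labels`, walking children in order."""
    if not node_matches(node, labels):
        return []
    found = []
    for child in node.get("routes", []):
        hits = resolve(child, labels)
        found.extend(hits)
        # Stop at the first matching child unless it sets continue: true.
        if hits and not child.get("continue", False):
            break
    # If no child matched, the node itself handles the alert.
    return found or [node["receiver"]]


ROUTE = {
    "receiver": "default-receiver",
    "routes": [
        {
            "receiver": "team-X-pager",
            "match": {"severity": "critical"},
            "routes": [
                {
                    "receiver": "team-Y-pager",
                    "match_re": {"service": "^(foo1|foo2|baz)$"},
                    "routes": [
                        {"receiver": "team-X-pager",
                         "match": {"owner": "team-X"}, "continue": True},
                        {"receiver": "team-Y-pager",
                         "match": {"owner": "team-Y"}},
                    ],
                }
            ],
        },
        {"receiver": "team-Y-pager",
         "match_re": {"service": "^(foo1|foo2|baz)$"}},
    ],
}

if __name__ == "__main__":
    print(resolve(ROUTE, {"severity": "critical", "service": "foo1", "owner": "team-X"}))
    print(resolve(ROUTE, {"severity": "warning", "service": "other"}))
```

An alert that matches no child route falls back to the receiver of the deepest node that did match, which is why the root receiver acts as the default.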
API
# Reload the configuration
curl -XPOST http://127.0.0.1:9093/-/reload
# Health check (the endpoint is read with GET)
curl http://127.0.0.1:9093/-/healthy
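The same two management endpoints can be called from a script. The sketch below is a minimal example using only the standard library; the names `ENDPOINTS`, `build_request`, and `call` are assumptions for illustration, and it assumes that the reload endpoint is triggered with POST while the health endpoint is read with GET.

```python
import urllib.request

# Management endpoints and the HTTP method each one is called with.
ENDPOINTS = {
    "reload": ("POST", "/-/reload"),
    "healthy": ("GET", "/-/healthy"),
}


def build_request(action, base_url="http://127.0.0.1:9093"):
    """Create the HTTP request for a management action."""
    method, path = ENDPOINTS[action]
    return urllib.request.Request(base_url + path, method=method)


def call(action, base_url="http://127.0.0.1:9093"):
    """Send the request to a running Alertmanager and return the status code."""
    with urllib.request.urlopen(build_request(action, base_url), timeout=5) as resp:
        return resp.status


if __name__ == "__main__":
    req = build_request("reload")
    print(req.get_method(), req.full_url)
```

Only `call` needs a running Alertmanager; `build_request` can be checked offline.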
amtool
# Install
go get github.com/prometheus/alertmanager/cmd/amtool

# Validate the configuration file
amtool --alertmanager.url=http://127.0.0.1:9093 check-config cfg/alertmanager.yml

# Query the alerts that have been sent
amtool --alertmanager.url=http://127.0.0.1:9093 alert
webhook

# Install
go get github.com/prometheus/alertmanager/examples/webhook

# Test
cat send-alert.sh
alerts1='[{
  "labels": {
    "alertname": "'$1'test-disk",
    "dev": "sda1"
  },
  "annotations": { "info": "The disk sda1 is running full" }
},
{
  "labels": {
    "alertname": "'$1'test-disk",
    "dev": "sdb1",
    "instance": "example3",
    "severity": "critical"
  }
}]'

curl -XPOST -d"$alerts1" http://127.0.0.1:9093/api/v1/alerts

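# Verify: run the example webhook receiver and watch its output
./webhook

# High-availability test
# Check that only one notification is output; stop one instance and check that notifications are still sent
# Send to both instances and check that only one notification is delivered in the end
sh -x send-alert.sh 999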
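The test alerts from send-alert.sh can also be generated programmatically, which avoids hand-written JSON mistakes such as trailing commas. This is a sketch under stated assumptions: `build_alerts` and `post_alerts` are hypothetical helpers, and the URL is the same api/v1 endpoint the script uses against a local v0.20.0 instance.

```python
import json
import urllib.request


def build_alerts(prefix):
    """Return the two test alerts from send-alert.sh with a name prefix."""
    name = f"{prefix}test-disk"
    return [
        {
            "labels": {"alertname": name, "dev": "sda1"},
            "annotations": {"info": "The disk sda1 is running full"},
        },
        {
            "labels": {
                "alertname": name,
                "dev": "sdb1",
                "instance": "example3",
                "severity": "critical",
            },
        },
    ]


def post_alerts(alerts, base_url="http://127.0.0.1:9093"):
    """POST the alerts to one Alertmanager instance; returns the status code."""
    req = urllib.request.Request(
        base_url + "/api/v1/alerts",
        data=json.dumps(alerts).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(req, timeout=5) as resp:
        return resp.status


if __name__ == "__main__":
    # json.dumps always emits valid JSON, so the payload can be inspected first.
    print(json.dumps(build_alerts("999"), indent=2))
```

For the high-availability check, `post_alerts` could be called once per instance with a different `base_url`, while the webhook output is watched for a single notification.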

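Author: _JustDoIT
Link: https://www.jianshu.com/p/90e392b6fbdd
Source: Jianshu (简书)
Copyright belongs to the author. For commercial reprints, contact the author for authorization; for non-commercial reprints, cite the source.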

