Learn Zabbix monitoring in one hour: a complete, detailed tutorial for production use
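Prometheus only generates alerts; it does not handle them. Handling alerts is the job of the Alertmanager component.
Alertmanager defines the notification mechanism and sends alerts through channels such as email and WeChat.
Alertmanager also supports grouping (merging related alerts into one notification), inhibition (suppressing some alerts while others are firing), and silences (muting matching alerts for a set period).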
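To make this division of labour concrete, here is a minimal sketch of the Prometheus side: one alerting rule, plus the `alerting` block that forwards firing alerts to Alertmanager. The file name `rules/node.yml`, the group and alert names, the `up == 0` expression, and the `localhost:9093` address are assumptions for illustration and are not taken from the original setup. The `severity` label and the `summary` and `description` annotations are the fields the notification templates later in this article read.

```yaml
# rules/node.yml (hypothetical file name)
groups:
  - name: node-alerts              # hypothetical group name
    rules:
      - alert: InstanceDown        # becomes the `alertname` label
        expr: up == 0              # target failed its last scrape
        for: 1m                    # must hold for 1 minute before firing
        labels:
          severity: critical       # read as .Labels.severity in the templates
        annotations:
          summary: "Instance {{ $labels.instance }} is down"
          description: "{{ $labels.instance }} (job {{ $labels.job }}) has been unreachable for more than 1 minute."
---
# prometheus.yml (fragment)
rule_files:
  - "rules/*.yml"
alerting:
  alertmanagers:
    - static_configs:
        - targets: ['localhost:9093']   # assumed Alertmanager address
```

Prometheus evaluates the rule and pushes the resulting alert; everything after that (grouping, routing, notification) is decided by Alertmanager.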
[root@10-0-0-93 alertmanager]# vi alertmanager.yml
global:
  resolve_timeout: 5m
  smtp_from: "jiankong123@zw.cn"
  smtp_smarthost: 'smtp.mxhichina.com:465'
  smtp_auth_username: "jiankong123@zw.cn"
  smtp_auth_password: "Dasdfghjkl"
  smtp_require_tls: false
# Email templates
templates:
  - '/etc/alertmanager/alertmanager-tmpl/*.tmpl'
# Routing and grouping
route:
  group_by: ['alertname']
  group_wait: 10s
  group_interval: 5m
  repeat_interval: 15m
  receiver: 'ops'
# Receivers
receivers:
  - name: 'ops'
    # Email settings
    email_configs:
      - to: '{{ template "email.to" . }}'
        html: '{{ template "email.to.html" . }}'
        send_resolved: true
    webhook_configs:
      - url: http://172.22.2.8:8060/dingtalk/ops/send
        send_resolved: true
## Inhibition rules
# Example: when a cluster becomes unavailable, many identical alerts fire at once, but the recipient only wants to receive one
inhibit_rules:
  # While an alert matching the source labels (severity: 'critical') is firing,
  # alerts matching the target labels (severity: 'warning') are suppressed
  - source_match:
      severity: 'critical'
    target_match:
      severity: 'warning'
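The inhibition rule above has no `equal` list. As the Alertmanager documentation describes it, `equal` names the labels that must have the same value in the source and target alerts, so without it any firing `critical` alert suppresses every `warning` alert, whichever host or check it came from. A narrower variant might look like the sketch below; the choice of `alertname` and `instance` is an assumption and should be adjusted to the labels your own rules set.

```yaml
inhibit_rules:
  - source_match:
      severity: 'critical'
    target_match:
      severity: 'warning'
    # Inhibit only when source and target alerts agree on these labels
    # (label choice is an assumption, not part of the original config)
    equal: ['alertname', 'instance']
```

With this variant, a critical alert on one host no longer hides warnings raised for a different host.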
[root@10-0-0-193 alertmanager-tmpl]# cat email.tmpl
{{ define "email.from" }}jiankong123@wf.cn{{ end }}
{{ define "email.to" }}wf123@wf.cn{{ end }}
{{ define "email.to.html" }}
{{- if gt (len .Alerts.Firing) 0 -}}{{ range .Alerts.Firing }}
========================
Monitoring alert notification
Alert source: test environment monitoring
Severity: {{ .Labels.severity }}
Alert type: {{ .Labels.alertname }}
Affected host: {{ .Labels.instance }}
Summary: {{ .Annotations.summary }}
Details: {{ .Annotations.description }}
Fired at: {{ .StartsAt.Local.Format "2006-01-02 15:04:05" }}
========================
{{ end }}{{ end -}}
{{- if gt (len .Alerts.Resolved) 0 -}}{{ range .Alerts.Resolved }}
========================
Monitoring alert resolved
Alert source: test environment monitoring
Alert type: {{ .Labels.alertname }}
Affected host: {{ .Labels.instance }}
Summary: {{ .Annotations.summary }}
Details: {{ .Annotations.description }}
Fired at: {{ .StartsAt.Local.Format "2006-01-02 15:04:05" }}
Resolved at: {{ .EndsAt.Local.Format "2006-01-02 15:04:05" }}
========================
{{ end }}{{ end -}}
{{- end }}
DingTalk alerts are delivered through prometheus-webhook-dingtalk.
[root@10-0-0-193 webhook]# cat dingding.tmpl
{{ define "__subject" }}
[{{ .Status | toUpper }}{{ if eq .Status "firing" }}:{{ .Alerts.Firing | len }}{{ end }}]
{{ end }}
{{ define "__alert_list" }}{{ range . }}
---
{{ if .Labels.owner }}@{{ .Labels.owner }}{{ end }}
**Alert type**: {{ .Labels.alertname }}
**Severity**: {{ .Labels.severity }}
**Host**: {{ .Labels.instance }}
**Summary**: {{ index .Annotations "summary" }}
**Details**: {{ index .Annotations "description" }}
**Fired at**: {{ dateInZone "2006.01.02 15:04:05" (.StartsAt) "Asia/Shanghai" }}
{{ end }}{{ end }}
{{ define "__resolved_list" }}{{ range . }}
---
{{ if .Labels.owner }}@{{ .Labels.owner }}{{ end }}
**Alert type**: {{ .Labels.alertname }}
**Severity**: {{ .Labels.severity }}
**Host**: {{ .Labels.instance }}
**Summary**: {{ index .Annotations "summary" }}
**Details**: {{ index .Annotations "description" }}
**Fired at**: {{ dateInZone "2006.01.02 15:04:05" (.StartsAt) "Asia/Shanghai" }}
**Resolved at**: {{ dateInZone "2006.01.02 15:04:05" (.EndsAt) "Asia/Shanghai" }}
{{ end }}{{ end }}
{{ define "default.title" }}
{{ template "__subject" . }}
{{ end }}
{{ define "default.content" }}
{{ if gt (len .Alerts.Firing) 0 }}
**==== Monitoring detected {{ .Alerts.Firing | len }} firing alert(s) ====**
{{ template "__alert_list" .Alerts.Firing }}
---
{{ end }}
{{ if gt (len .Alerts.Resolved) 0 }}
**==== {{ .Alerts.Resolved | len }} alert(s) resolved ====**
{{ template "__resolved_list" .Alerts.Resolved }}
{{ end }}
{{ end }}
{{ define "ding.link.title" }}{{ template "default.title" . }}{{ end }}
{{ define "ding.link.content" }}{{ template "default.content" . }}{{ end }}
{{ template "default.title" . }}
{{ template "default.content" . }}
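For the template above to be used, prometheus-webhook-dingtalk needs its own configuration that loads the template file and defines a target. The webhook URL in alertmanager.yml ends in `/dingtalk/ops/send`, so the target would need to be named `ops`. The following is a minimal sketch rather than the original author's file: the template path and the access token are placeholders, and the `message` block simply points at the `ding.link.title` and `ding.link.content` templates defined above.

```yaml
# config.yml for prometheus-webhook-dingtalk (sketch)
templates:
  - /path/to/webhook/dingding.tmpl      # hypothetical path to the template file
targets:
  ops:                                   # must match /dingtalk/ops/send in alertmanager.yml
    # Placeholder token; use the one issued for your DingTalk robot
    url: https://oapi.dingtalk.com/robot/send?access_token=YOUR_ACCESS_TOKEN
    message:
      title: '{{ template "ding.link.title" . }}'
      text: '{{ template "ding.link.content" . }}'
```

Keeping the target name in sync with the URL path is the part most likely to go wrong: a mismatch means Alertmanager posts to a target the service does not know.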
If anything here is missing or incorrect, corrections are welcome.
Page updated: 2024-03-06