
Prometheus + Grafana + WeCom (企业微信) Robot Alerting

Date: 2022-12-02 01:31:28

Contents

1.Prometheus configuration and installation
    1.1.node_exporter
    1.2.process_exporter
    1.3.mysqld_exporter
    1.4.nginx_exporter
    1.5.redis_exporter
    1.6.Monitoring SpringBoot-2.x
    1.7.alertmanager
        1.7.1.Adding alert rules
        1.7.2.Installing docker
        1.7.3.Configuring and starting the WeCom robot
2.Grafana

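1.Prometheus configuration and installation

Prometheus download link

Upload the archive to the server.

Extract it:

    tar -zxvf prometheus-2.44.0.linux-amd64.tar.gz

Go into the directory and start Prometheus (change the paths to your own Prometheus installation directory):

    nohup /usr/local/prometheus-2.44.0/prometheus --config.file="/usr/local/prometheus-2.44.0/prometheus.yml" > ./prometheus.log 2>&1 &

Open the web UI to test: http://localhost:9090 (the Prometheus default port)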

Prometheus has started successfully.

Enable start on boot

    vim /usr/lib/systemd/system/prometheus.service

    [Unit]
    Description=prometheus
    After=network.target

    [Service]
    Restart=on-failure
    ExecStart=/usr/local/prometheus-2.44.0/prometheus --config.file="/usr/local/prometheus-2.44.0/prometheus.yml"

    [Install]
    WantedBy=multi-user.target

Reload the systemd configuration, then manage the service:

    systemctl daemon-reload        # reload unit files
    systemctl start prometheus     # test start
    systemctl status prometheus    # check status
    systemctl enable prometheus    # enable start on boot
    systemctl restart prometheus   # restart

1.1.node_exporter

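node_exporter-1.5.0 download link

You can also pick a version to download from the official website.

Upload the archive to the server.

Extract it:

    tar -zxvf node_exporter-1.5.0.linux-amd64.tar.gz

Go into the directory and start node_exporter:

    nohup /usr/local/node_exporter-1.5.0/node_exporter > ./node_exporter.log 2>&1 &

Open in a browser: http://localhost:9100/metrics (the node_exporter default port)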

node_exporter has started successfully.

Enable start on boot

    vi /usr/lib/systemd/system/node_exporter.service

    [Unit]
    Description=node_exporter
    Documentation=/prometheus/node_exporter
    After=network.target

    [Service]
    Restart=on-failure
    ExecStart=/usr/local/node_exporter-1.5.0/node_exporter

    [Install]
    WantedBy=multi-user.target

Reload the systemd configuration, then manage the service:

    systemctl daemon-reload           # reload unit files
    systemctl start node_exporter     # test start
    systemctl status node_exporter    # check status
    systemctl enable node_exporter    # enable start on boot
    systemctl restart node_exporter   # restart
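The unit files in this guide all share one layout and differ only in the description and the ExecStart line. As a rough sketch of that pattern, the helper below prints such a unit to stdout. The function name render_unit is hypothetical (it is not part of systemd or Prometheus), and the sketch deliberately writes nothing to disk.

```shell
#!/usr/bin/env bash
# Sketch: print a systemd unit in the layout used throughout this guide.
# render_unit is a hypothetical helper name, not a systemd or Prometheus tool.
# Usage: render_unit NAME EXEC_START
render_unit() {
  local name="$1" exec_start="$2"
  cat <<EOF
[Unit]
Description=${name}
After=network.target

[Service]
Restart=on-failure
ExecStart=${exec_start}

[Install]
WantedBy=multi-user.target
EOF
}

# Preview the unit for node_exporter; nothing is written to disk.
render_unit node_exporter /usr/local/node_exporter-1.5.0/node_exporter
```

If the preview looks right, the output could be redirected to /usr/lib/systemd/system/node_exporter.service, followed by systemctl daemon-reload as described above.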

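Go into the Prometheus directory and edit the prometheus.yml configuration file:

        static_configs:
          - targets: ["localhost:9090"]

      # add the node_exporter job
      - job_name: 'node_exporter'
        static_configs:
          - targets: ['localhost:9100']   # change to your own IP address; 9100 is the default port

Save and exit, then restart Prometheus:

    systemctl restart prometheus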

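Open the Prometheus page in a browser.

If the node_exporter target is listed, the configuration succeeded.

Query the collected data.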
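Besides the Prometheus web UI, the raw samples can be read straight from an exporter's /metrics page, which is plain text with one sample per line. The sketch below pulls one value out of that text. The helper name metric_value and the inline sample are illustrative assumptions; the helper only handles samples without labels.

```shell
#!/usr/bin/env bash
# Sketch: extract one sample from Prometheus text-format metrics.
# metric_value is a hypothetical helper; it matches only unlabelled samples
# (lines of the form "name value").
# Usage: <metrics text on stdin> | metric_value NAME
metric_value() {
  awk -v name="$1" '$1 == name { print $2; exit }'
}

# Inline sample so the sketch runs without a server (values are made up).
sample='# HELP node_load1 1m load average.
# TYPE node_load1 gauge
node_load1 0.42
node_load5 0.3'

printf '%s\n' "$sample" | metric_value node_load1   # prints 0.42
```

Against a running node_exporter, the same helper could be fed with `curl -s http://localhost:9100/metrics | metric_value node_load1`.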

1.2.process_exporter

process_exporter download link

Upload the archive to the server.

Extract it:

    tar -zxvf process-exporter-0.7.10.linux-amd64.tar.gz

Go into the directory and create the configuration file:

    vi process-exporter.yaml

    process_names:
      - name: '{{.Comm}}'
        cmdline:
          - '.+'

Test start:

    nohup /usr/local/process-exporter-0.7.10/process-exporter -config.path=/usr/local/process-exporter-0.7.10/process-exporter.yaml > ./process_exporter.log 2>&1 &

Open in a browser: http://localhost:9256/metrics

process-exporter has started successfully.

Enable start on boot

    vim /usr/lib/systemd/system/process-exporter.service

    [Unit]
    Description=process-exporter
    After=network.target

    [Service]
    Restart=on-failure
    ExecStart=/usr/local/process-exporter-0.7.10/process-exporter -config.path=/usr/local/process-exporter-0.7.10/process-exporter.yaml

    [Install]
    WantedBy=multi-user.target

Reload the systemd configuration, then manage the service:

    systemctl daemon-reload              # reload unit files
    systemctl start process-exporter     # test start
    systemctl status process-exporter    # check status
    systemctl enable process-exporter    # enable start on boot
    systemctl restart process-exporter   # restart

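Go into the Prometheus directory and add the job to prometheus.yml:

      - job_name: process_exporter
        static_configs:
          - targets: ['localhost:9256']   # 9256 is the default port

Restart Prometheus and open the Prometheus page in a browser.

If the process_exporter target is listed, the configuration succeeded.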

1.3.mysqld_exporter

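mysqld_exporter-0.14.0 download link

You can also pick a version to download from the official website.

Upload the archive to the server.

Extract it:

    tar -zxvf mysqld_exporter-0.14.0.linux-amd64.tar.gz

Go into the directory and add a configuration file:

    vi /usr/local/mysqld_exporter-0.14.0/.f

    [client]
    user=root       #MySQL account
    password=root   #MySQL password

If you cannot log in to MySQL (5.7) and have also forgotten the password:

    vim /etc/f
    # add under [mysqld]
    skip-grant-tables   # skip privilege checks at login

    # restart the MySQL service
    sudo systemctl restart mysqld

    # test login; just press Enter at the password prompt
    mysql -uroot -p

    # set a new password
    set password for 'root'@'localhost'=password('root');

If that statement fails with "ERROR 1290 (HY000): The MySQL server is running with the --skip-grant-tables option so it cannot execute this statement", reload the grant tables and try again:

    # reload the grant tables
    flush privileges;

    # set the password again
    set password for 'root'@'localhost'=password('root');

    # grant all privileges; for a real deployment, create a new account with only the privileges it needs
    GRANT ALL PRIVILEGES ON *.* TO 'root'@'%' IDENTIFIED BY 'root' WITH GRANT OPTION;
    flush privileges;

    # quit
    exit;

Finally, comment out skip-grant-tables in my.ini again.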

Go into the directory and start mysqld_exporter:

    nohup /usr/local/mysqld_exporter-0.14.0/mysqld_exporter --config.my-cnf=/usr/local/mysqld_exporter-0.14.0/.f > ./mysqld_exporter.log 2>&1 &

Open in a browser: http://localhost:9104/metrics

mysqld_exporter has started successfully.

Enable start on boot

    vim /usr/lib/systemd/system/mysqld_exporter.service

    [Unit]
    Description=mysqld_exporter
    After=network.target

    [Service]
    Restart=on-failure
    ExecStart=/usr/local/mysqld_exporter-0.14.0/mysqld_exporter --config.my-cnf=/usr/local/mysqld_exporter-0.14.0/.f

    [Install]
    WantedBy=multi-user.target

Reload the systemd configuration, then manage the service:

    systemctl daemon-reload             # reload unit files
    systemctl start mysqld_exporter     # test start
    systemctl status mysqld_exporter    # check status
    systemctl enable mysqld_exporter    # enable start on boot
    systemctl restart mysqld_exporter   # restart

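Go into the Prometheus directory and add the job to prometheus.yml:

      - job_name: 'mysqld_exporter'       # collect MySQL metrics
        static_configs:
          - targets: ['localhost:9104']   # IP and port of the mysqld_exporter service

Restart Prometheus and open the Prometheus page in a browser.

If the mysqld_exporter target is listed, the configuration succeeded.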

1.4.nginx_exporter

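nginx_exporter download link

Go into the nginx source directory, then recompile and install with the stub_status module:

    ./configure --prefix=/usr/local/nginx/ --with-http_stub_status_module --add-module=../nginx-http-flv-module
    make
    sudo make install

Start nginx, then check that the module was compiled in:

    ./nginx -V 2>&1 | grep -o with-http_stub_status_module

If the terminal prints with-http_stub_status_module, nginx has the stub_status module enabled.

Edit the nginx.conf configuration file:

    server {
        listen 80;   # change to the port you use
        location /nginx_status {
            stub_status on;
            access_log off;
            allow 127.0.0.1;   # allow takes an address or CIDR range, not a host name
            deny all;
        }
    }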
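Before wiring up the exporter, it can help to confirm that the /nginx_status page returns the stub_status text the exporter will scrape. The sketch below reads the active connection count from that text. The helper name active_connections and the sample counters are illustrative assumptions, not output from a real server.

```shell
#!/usr/bin/env bash
# Sketch: read the active connection count from stub_status output.
# active_connections is a hypothetical helper name.
# Usage: <stub_status text on stdin> | active_connections
active_connections() {
  awk '/^Active connections:/ { print $3; exit }'
}

# Inline sample in the stub_status layout (counters are made up).
sample='Active connections: 3
server accepts handled requests
 12 12 27
Reading: 0 Writing: 1 Waiting: 2'

printf '%s\n' "$sample" | active_connections   # prints 3
```

On the nginx host itself, the helper could be fed with `curl -s http://localhost/nginx_status | active_connections`, adjusting the port to the one nginx listens on.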

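Upload the nginx_exporter archive to the server.

Extract it:

    tar -zxvf nginx_exporter-0.11.0.tar.gz

Start nginx_exporter (the scrape URI must point at the host and port where your nginx serves /nginx_status):

    nohup /usr/local/nginx_exporter-0.11.0/nginx-prometheus-exporter -nginx.scrape-uri http://localhost:8080/nginx_status > ./nginx_exporter.log 2>&1 &

Open in a browser: http://localhost:9113/metrics (the nginx_exporter default port)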

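nginx_exporter has started successfully.

Enable start on boot

    vim /usr/lib/systemd/system/nginx_exporter.service

    [Unit]
    Description=nginx_exporter
    After=network.target

    [Service]
    Restart=on-failure
    ExecStart=/usr/local/nginx_exporter-0.11.0/nginx-prometheus-exporter -nginx.scrape-uri http://172.16.11.10:7006/nginx_status

    [Install]
    WantedBy=multi-user.target

Replace the address in -nginx.scrape-uri with your own nginx host and port. Then reload the systemd configuration and manage the service:

    systemctl daemon-reload            # reload unit files
    systemctl start nginx_exporter     # test start
    systemctl status nginx_exporter    # check status
    systemctl enable nginx_exporter    # enable start on boot
    systemctl restart nginx_exporter   # restart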

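Go into the Prometheus directory and add the job to the prometheus.yml configuration file:

      - job_name: 'nginx_status'          # collect nginx metrics
        metrics_path: '/metrics'          # path of the metrics endpoint
        scrape_interval: 5s               # how often to scrape
        static_configs:
          - targets: ['localhost:9113']   # IP and port of the nginx-prometheus-exporter service

Restart Prometheus. If the nginx_status target is listed on the Prometheus page, the configuration succeeded.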

1.5.redis_exporter

redis_exporter download link

Upload the archive to the server.

Extract it:

    tar -zxvf redis_exporter-1.50.0.tar.gz

Start redis_exporter:

    nohup /usr/local/redis_exporter-v1.50.0/redis_exporter > ./redis_exporter.log 2>&1 &

Open in a browser: http://localhost:9121/metrics

redis_exporter has started successfully.

Enable start on boot

    vim /usr/lib/systemd/system/redis_exporter.service

    [Unit]
    Description=redis_exporter
    After=network.target

    [Service]
    Restart=on-failure
    ExecStart=/usr/local/redis_exporter-v1.50.0/redis_exporter

    [Install]
    WantedBy=multi-user.target

Reload the systemd configuration, then manage the service:

    systemctl daemon-reload           # reload unit files
    systemctl start redis_exporter    # test start
    systemctl status redis_exporter   # check status
    systemctl enable redis_exporter   # enable start on boot

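Go into the Prometheus directory and add the job to prometheus.yml. The job name must be unique, so it cannot reuse mysqld_exporter:

      - job_name: 'redis_exporter'        # collect Redis metrics
        static_configs:
          - targets: ['localhost:9121']   # IP and port of the redis_exporter service

Restart Prometheus and open the Prometheus page in a browser.

If the redis_exporter target is listed, the configuration succeeded.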
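The exporter jobs added so far all have the same shape: a job name plus a list of static targets. As a small sketch of that shape, the helper below prints a job entry indented for pasting under scrape_configs. The function name render_scrape_job is hypothetical, and the two-space base indentation assumes the layout of the default prometheus.yml.

```shell
#!/usr/bin/env bash
# Sketch: print a static scrape job for prometheus.yml.
# render_scrape_job is a hypothetical helper name; the indentation assumes
# the entry is pasted under a top-level "scrape_configs:" key.
# Usage: render_scrape_job JOB_NAME TARGET [TARGET...]
render_scrape_job() {
  local job="$1" sep="" target
  shift
  printf "  - job_name: '%s'\n" "$job"
  printf "    static_configs:\n"
  printf "      - targets: ["
  for target in "$@"; do
    printf "%s'%s'" "$sep" "$target"
    sep=", "
  done
  printf "]\n"
}

# Preview the job for redis_exporter on its default port.
render_scrape_job redis_exporter localhost:9121
```

After editing prometheus.yml, the promtool binary shipped in the Prometheus tarball can validate the file before a restart, for example `/usr/local/prometheus-2.44.0/promtool check config /usr/local/prometheus-2.44.0/prometheus.yml`.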

1.6.Monitoring SpringBoot-2.x

Add the dependencies to pom.xml:

    <!-- spring-boot-actuator dependency -->
    <dependency>
        <groupId>org.springframework.boot</groupId>
        <artifactId>spring-boot-starter-actuator</artifactId>
    </dependency>
    <!-- /artifact/io.micrometer/micrometer-registry-prometheus -->
    <dependency>
        <groupId>io.micrometer</groupId>
        <artifactId>micrometer-registry-prometheus</artifactId>
    </dependency>

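Add the yml configuration:

    management:
      endpoints:
        # configuration properties for web endpoints
        web:
          # URL prefix for endpoint access; defaults to /actuator
          base-path: /actuator
          exposure:
            # IDs of the endpoints to expose (eg: ['health','info','beans','env']); '*' exposes all.
            # For security, exposing only prometheus and health is recommended.
            include: '*'
      metrics:
        tags:
          application: ${spring.application.name}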

Test address: http://localhost:80/actuator/prometheus (change the port to the one your application uses)

Add the job to the Prometheus configuration file:

      - job_name: 'java'
        metrics_path: '/actuator/prometheus'
        scrape_interval: 5s
        static_configs:
          - targets: ['localhost:8080','localhost:8081','localhost:8082','localhost:8083']   # several services in one job

Restart Prometheus and open the Prometheus page in a browser.

1.7.alertmanager

alertmanager-0.25.0 download link

You can also pick a version to download from the official website.

Edit the Prometheus configuration file prometheus.yml:

    # Alertmanager configuration
    # change to the address of your Alertmanager
    alerting:
      alertmanagers:
        - static_configs:
            - targets:
                - localhost:9093

    # Load rules once and periodically evaluate them according to the global 'evaluation_interval'.
    # rule files to load
    rule_files:
      - "/usr/local/prometheus-2.44.0/rules/*.yml"

Create the rules directory inside the Prometheus directory:

    mkdir rules

1.7.1.Adding alert rules

Create the following four files in the rules directory.

vi node_alived.yml

    groups:
      - name: Instance liveness alert rules
        rules:
          - alert: Instance liveness alert
            expr: up == 0
            for: 1m
            labels:
              user: prometheus
              severity: warning
            annotations:
              summary: "Host is down !!!"
              description: "This instance has been down for more than one minute."

vi memory_over.yml

    groups:
      - name: Memory alert rules
        rules:
          - alert: Memory usage alert
            expr: (1 - (node_memory_MemAvailable_bytes / (node_memory_MemTotal_bytes))) * 100 > 50
            for: 1m
            labels:
              severity: warning
            annotations:
              summary: "Server is running low on available memory."
              description: "Memory usage is above 50% (current value: {{ $value }}%)"

vi cpu_over.yml

    groups:
      - name: CPU alert rules
        rules:
          - alert: CPU usage alert
            expr: 100 - (avg by (instance)(irate(node_cpu_seconds_total{mode="idle"}[1m]) )) * 100 > 50
            for: 1m
            labels:
              severity: warning
            annotations:
              summary: "CPU usage is spiking."
              description: "CPU usage is above 50% (current value: {{ $value }}%)"

vi disk_over.yml

    groups:
      - name: Disk usage alert rules
        rules:
          - alert: Disk usage alert
            expr: 100 - node_filesystem_free_bytes{fstype=~"xfs|ext4"} / node_filesystem_size_bytes{fstype=~"xfs|ext4"} * 100 > 80
            for: 20m
            labels:
              severity: warning
            annotations:
              summary: "Disk partition usage is too high"
              description: "Partition usage is above 80% (current value: {{ $value }}%)"
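To make the memory rule's arithmetic concrete, the sketch below evaluates the same expression, (1 - available / total) * 100, outside Prometheus for a pair of byte counts. The helper names mem_usage_pct and rule_fires and the example figures are illustrative assumptions; Prometheus itself evaluates the PromQL expression, not this script.

```shell
#!/usr/bin/env bash
# Sketch: the arithmetic behind the memory usage rule, done with awk.
# mem_usage_pct and rule_fires are hypothetical helper names.

# Usage: mem_usage_pct AVAILABLE_BYTES TOTAL_BYTES  -> usage in percent
mem_usage_pct() {
  awk -v avail="$1" -v total="$2" \
    'BEGIN { printf "%.2f\n", (1 - avail / total) * 100 }'
}

# Usage: rule_fires AVAILABLE_BYTES TOTAL_BYTES THRESHOLD
# Exit status 0 when usage is strictly above the threshold, like "> 50".
rule_fires() {
  awk -v avail="$1" -v total="$2" -v limit="$3" \
    'BEGIN { exit !((1 - avail / total) * 100 > limit) }'
}

# 2 GiB available out of 8 GiB total.
mem_usage_pct 2147483648 8589934592   # prints 75.00
if rule_fires 2147483648 8589934592 50; then
  echo "condition true: the alert would fire after the 1m 'for' period"
fi
```

The rule files themselves can be checked for syntax with the promtool binary from the Prometheus tarball, for example `/usr/local/prometheus-2.44.0/promtool check rules /usr/local/prometheus-2.44.0/rules/*.yml`.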

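Upload the Alertmanager archive to the server.

Extract it:

    tar -zxvf alertmanager-0.25.0.linux-amd64.tar.gz

Go into the extracted directory and edit the configuration file:

    vi alertmanager.yml

    global:
      resolve_timeout: 5m
    route:
      group_by: ['alertname']
      group_wait: 10s
      group_interval: 10s
      repeat_interval: 1h
      receiver: 'web.hook'
    receivers:
      - name: 'web.hook'
        webhook_configs:
          - url: 'http://localhost:8089/adapter/wx'   # must match the host port the webhook adapter is published on
            send_resolved: true
    inhibit_rules:
      - source_match:
          severity: 'critical'
        target_match:
          severity: 'warning'
        equal: ['alertname', 'dev', 'instance']

Check the version:

    /usr/local/alertmanager-0.25.0/alertmanager --version

Validate the configuration file:

    /usr/local/alertmanager-0.25.0/amtool check-config /usr/local/alertmanager-0.25.0/alertmanager.yml

Start Alertmanager, following the same nohup pattern as the exporters:

    nohup /usr/local/alertmanager-0.25.0/alertmanager --config.file /usr/local/alertmanager-0.25.0/alertmanager.yml > ./alertmanager.log 2>&1 &

Open in a browser: http://localhost:9093/metrics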

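Alertmanager has started successfully.

Go into the Prometheus directory and add the job to prometheus.yml:

      - job_name: 'alertmanager_exporter'
        static_configs:
          - targets: ['localhost:9093']

Restart Prometheus and open the Prometheus page in a browser.

If the alertmanager_exporter target is listed, the configuration succeeded.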

Enable start on boot

    vim /usr/lib/systemd/system/alertmanager.service

    [Unit]
    Description=https://prometheus.io

    [Service]
    Restart=on-failure
    ExecStart=/usr/local/alertmanager-0.25.0/alertmanager --config.file /usr/local/alertmanager-0.25.0/alertmanager.yml --storage.path="/usr/local/alertmanager-0.25.0/data/" --data.retention=120h

    [Install]
    WantedBy=multi-user.target

After saving:

    systemctl daemon-reload
    systemctl enable alertmanager
    systemctl start alertmanager
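With Alertmanager running, its routing can be exercised without waiting for a real rule to fire by posting a hand-made alert to its HTTP API (POST /api/v2/alerts). The sketch below only builds the JSON body. The helper name test_alert_payload, the alert name ManualTest and the summary text are illustrative assumptions, and the helper assumes its arguments contain no characters that need JSON escaping.

```shell
#!/usr/bin/env bash
# Sketch: build a JSON body for Alertmanager's POST /api/v2/alerts endpoint.
# test_alert_payload is a hypothetical helper; arguments are assumed to be
# plain strings without quotes or backslashes.
# Usage: test_alert_payload ALERTNAME SEVERITY INSTANCE
test_alert_payload() {
  printf '[{"labels":{"alertname":"%s","severity":"%s","instance":"%s"},"annotations":{"summary":"manual test alert"}}]\n' \
    "$1" "$2" "$3"
}

# Print the payload; sending it is left as a separate, explicit step:
#   curl -s -X POST -H 'Content-Type: application/json' \
#     -d "$(test_alert_payload ManualTest warning localhost:9100)" \
#     http://localhost:9093/api/v2/alerts
test_alert_payload ManualTest warning localhost:9100
```

An alert posted this way should appear in the Alertmanager web UI and be routed to the web.hook receiver like any alert coming from Prometheus.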

1.7.2.Installing docker

    yum install -y yum-utils

    # set the package repository address
    yum-config-manager \
        --add-repo \
        /docker-ce/linux/centos/docker-ce.repo

    yum makecache fast
    yum install docker-ce docker-ce-cli containerd.io

    # start docker
    systemctl start docker

    # check the docker version
    docker version

    # test
    docker run hello-world
    docker images
    docker ps

    systemctl enable docker

Add the json configuration:

    vi /etc/docker/daemon.json

    {
      "registry-mirrors": ["https://78q96cy9."]
    }

Then reload and manage the docker service:

    systemctl daemon-reload    # reload the configuration
    systemctl start docker     # start the docker service
    systemctl stop docker      # stop the docker service
    systemctl restart docker   # restart the docker service

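1.7.3.Configuring and starting the WeCom robot

Open WeCom and add a robot to the group.

Copy the robot's webhook address.

Run the webhook adapter, replacing xxxx with your own robot key:

    docker run -d --name wechat \
        --restart always -p 8080:80 \
        guyongquan/webhook-adapter \
        --adapter=/app/prometheusalert/wx.js=/wx=https://qyapi./cgi-bin/webhook/send?key=xxxx

The -p 8080:80 option publishes the adapter on host port 8080, so the webhook url in alertmanager.yml has to use that same host port.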
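To check the robot on its own, independent of Alertmanager and the adapter, a plain text message can be posted directly to the webhook address copied above. The sketch below builds the JSON body for such a text message. The helper names json_escape and wecom_text_payload are hypothetical, and the helper assumes a single-line message (newlines are not escaped).

```shell
#!/usr/bin/env bash
# Sketch: build the JSON body of a WeCom robot text message.
# json_escape and wecom_text_payload are hypothetical helper names.
# Assumption: the message is a single line; only \ and " are escaped.

json_escape() {
  printf '%s' "$1" | sed 's/\\/\\\\/g; s/"/\\"/g'
}

# Usage: wecom_text_payload MESSAGE
wecom_text_payload() {
  printf '{"msgtype":"text","text":{"content":"%s"}}\n' "$(json_escape "$1")"
}

# Print the payload; sending it is a separate, explicit step, where
# WEBHOOK_URL holds the full webhook address copied from the robot:
#   curl -s -H 'Content-Type: application/json' \
#     -d "$(wecom_text_payload 'Prometheus test message')" "$WEBHOOK_URL"
wecom_text_payload 'Prometheus test message'
```

If the group receives the test message, the key is valid and any remaining problem lies between Alertmanager and the adapter.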

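Once the adapter is running, lower the threshold in one of the alert rules configured earlier (for example the memory rule) to > 10 and restart Prometheus. After a short wait, the alert message arrives in the WeCom group.

Change the threshold back to 50 and the resolved message arrives.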

2.Grafana

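Download Grafana:

    wget /enterprise/release/grafana-enterprise_9.5.2_amd64.deb

Alternatively, open the official website and download the version you need.

Extract it, go into the directory, and start Grafana:

    nohup ./bin/grafana-server web > ./grafana.log 2>&1 &

Open the web UI: http://localhost:3000. The default user name and password are admin/admin.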
