k8s 监控之路:从平台到应用 · 2019. 8. 23. · grafana web ui 查询、展示 ... app:...

19
K8s 控之路:从平台到黄广喆 2019.8.22

Upload: others

Post on 02-Jan-2021

12 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: K8s 监控之路:从平台到应用 · 2019. 8. 23. · Grafana Web UI 查询、展示 ... app: mysql-exporter Prometheus Operator Prometheus pod pod match watch watch reload ns:

K8s 监控之路:从平台到应用

黄广喆

2019.8.22

Page 2: K8s 监控之路:从平台到应用 · 2019. 8. 23. · Grafana Web UI 查询、展示 ... app: mysql-exporter Prometheus Operator Prometheus pod pod match watch watch reload ns:

立体化监控

1. 多维度、立体化

快速定位系统故障、更细粒度监控

2. 从宏观角度,三个层次

● 基础设施层面

● 应用层面

● 业务层面基础设施

应用

业务

主机、网络设备、存储、k8s 集群

数据库、中间件、Web

访问量、请求时延、订单量

Page 3: K8s 监控之路:从平台到应用 · 2019. 8. 23. · Grafana Web UI 查询、展示 ... app: mysql-exporter Prometheus Operator Prometheus pod pod match watch watch reload ns:

可观察性

Page 4: K8s 监控之路:从平台到应用 · 2019. 8. 23. · Grafana Web UI 查询、展示 ... app: mysql-exporter Prometheus Operator Prometheus pod pod match watch watch reload ns:

可观察性

https://landscape.cncf.io/

Page 5: K8s 监控之路:从平台到应用 · 2019. 8. 23. · Grafana Web UI 查询、展示 ... app: mysql-exporter Prometheus Operator Prometheus pod pod match watch watch reload ns:

基础设施层面 应用层面 业务层面

主机 CPU 使用率 应用资源 Pod CPU 业务相关 交易量

CPU 负载 Pod Mem 在线用户数

Mem 使用率 数据库 连接数

磁盘 I/O 慢查询

inode 吞吐量

网络传输速率 缓冲池

k8s 集群 资源配额 Web Server 访问量

etcd 提案状态 响应时间

etcd WAL日志同步 状态码监控

etcd 库同步 JVM FGC调度器延迟 TGC

线程数

Page 6: K8s 监控之路:从平台到应用 · 2019. 8. 23. · Grafana Web UI 查询、展示 ... app: mysql-exporter Prometheus Operator Prometheus pod pod match watch watch reload ns:

Prometheus

● 2016 年入 CNCF 基金会,2018年继 k8s 后第二个毕业项目

● 为什么 Prometheus 更适合?

○ 高效的时序数据存储引擎

○ 强大的查询能力:PromQL

○ 面向服务的架构:pull vs. push

○ 与 k8s 天然集成

○ 逐步完善的生态:OpenMetrics, prometheus operator, exporter, thanos

云原生领域监控事实标准

Page 7: K8s 监控之路:从平台到应用 · 2019. 8. 23. · Grafana Web UI 查询、展示 ... app: mysql-exporter Prometheus Operator Prometheus pod pod match watch watch reload ns:

Prometheus 不适用

1. 日志监控

2. 全链路追踪

Elasticsearch Fluent Bit Kibana

Jaeger

Page 8: K8s 监控之路:从平台到应用 · 2019. 8. 23. · Grafana Web UI 查询、展示 ... app: mysql-exporter Prometheus Operator Prometheus pod pod match watch watch reload ns:

Prometheus 架构

Host

Kube state

Mysql

exporter

exporter

exporter

Custom Appclient lib

Custom Appclient lib

Targets

数据源

Node HDD/SSD

Retrieval TSDB HTTPServer

Prometheus Server

采集、存储、处理

Grafana

Web UI

查询、展示

Kubernetes

Service Discovery

File

Page 9: K8s 监控之路:从平台到应用 · 2019. 8. 23. · Grafana Web UI 查询、展示 ... app: mysql-exporter Prometheus Operator Prometheus pod pod match watch watch reload ns:

KubeSphere 架构

Page 10: K8s 监控之路:从平台到应用 · 2019. 8. 23. · Grafana Web UI 查询、展示 ... app: mysql-exporter Prometheus Operator Prometheus pod pod match watch watch reload ns:

抓取 Node Exporter /metrics 接口:

# HELP node_cpu Seconds the cpus spent in each mode. # TYPE node_cpu counter node_cpu_seconds_total{cpu="0", mode="guest"} 0 node_cpu_seconds_total{cpu="0", mode="idle"} 2.03442237e+06 node_cpu_seconds_total{cpu="0", mode="iowait"} 3522.37 node_cpu_seconds_total{cpu="0", mode="irq"} 0.48 node_cpu_seconds_total{cpu="0", mode="nice"} 515.56 node_cpu_seconds_total{cpu="0", mode="softirq"} 953.06 node_cpu_seconds_total{cpu="0", mode="steal"} 0 node_cpu_seconds_total{cpu="0", mode="system"} 6605.46 node_cpu_seconds_total{cpu="0", mode="user"} 23343.01 node_cpu_seconds_total{cpu="1", mode="guest"} 0 node_cpu_seconds_total{cpu="1", mode="idle"} 2.03471439e+06 node_cpu_seconds_total{cpu="1", mode="iowait"} 3633.5 node_cpu_seconds_total{cpu="1", mode="irq"} 0.58 node_cpu_seconds_total{cpu="1", mode="nice"} 542.05 node_cpu_seconds_total{cpu="1", mode="softirq"} 880.49 node_cpu_seconds_total{cpu="1", mode="steal"} 0 node_cpu_seconds_total{cpu="1", mode="system"} 6581.92node_cpu_seconds_total{cpu="1",mode="user"} 23171.06

Page 11: K8s 监控之路:从平台到应用 · 2019. 8. 23. · Grafana Web UI 查询、展示 ... app: mysql-exporter Prometheus Operator Prometheus pod pod match watch watch reload ns:

Metrics 类型

● Counter:只增不减的计数器

● Gauge:可增可减的仪表盘

● Histogram:分组统计

● Summary:分位点统计

Page 12: K8s 监控之路:从平台到应用 · 2019. 8. 23. · Grafana Web UI 查询、展示 ... app: mysql-exporter Prometheus Operator Prometheus pod pod match watch watch reload ns:

Prometheus Exporter

Page 13: K8s 监控之路:从平台到应用 · 2019. 8. 23. · Grafana Web UI 查询、展示 ... app: mysql-exporter Prometheus Operator Prometheus pod pod match watch watch reload ns:

Prometheus SDK

Page 14: K8s 监控之路:从平台到应用 · 2019. 8. 23. · Grafana Web UI 查询、展示 ... app: mysql-exporter Prometheus Operator Prometheus pod pod match watch watch reload ns:

● Operator 模式:CustomResourceDefinition(CRD) + Custom Controller

● 自定义 Custom Resource Definitions (CRD) 管理 Prometheus:

○ ServiceMonitor:负责管理监控目标配置;

○ Prometheus:负责创建和管理 Prometheus Server 实例;

○ PrometheusRules:记录 Recording Rules

● 如何工作?

○ 在 K8s 集群上部署 Operator;

○ Operator 根据 CRD 配置自动创建出 Prometheus Pods

Prometheus Operator

Page 15: K8s 监控之路:从平台到应用 · 2019. 8. 23. · Grafana Web UI 查询、展示 ... app: mysql-exporter Prometheus Operator Prometheus pod pod match watch watch reload ns:

Prometheus Operator 架构

CRDExporter 1

Exporter 2

Exporter 3

Exporter 1

Exporter 2

Page 16: K8s 监控之路:从平台到应用 · 2019. 8. 23. · Grafana Web UI 查询、展示 ... app: mysql-exporter Prometheus Operator Prometheus pod pod match watch watch reload ns:

应用自定义监控

Elements already in place

Custom elements to be deployed by the user

kind: Prometheusname: k8s…spec: serviceMonitorSelector: matchExpressions: - key: k8s-app operator: In values: - redis-exporter - mysql-exporter

kind: ServiceMonitormetadata: labels: k8s-app: mysql-exportername: exporter-smon…spec: endpoints:: - interval: 30s port: metrics namespaceSelector: matchNames: - my-app selector: matchLabels: app: mysql-exporter

PrometheusOperator Prometheus

pod pod

match

watchwatch

reload

ns: my-app

kind: Servicemetadata: labels: app: mysql-exportername: exporter-smonnamespace: my-app…spec: ports:: - name: metrics port: 9104 selector: app: mysql-exporter

mysql-exporter

pod

match

match

scrape

ns: monitoring

Page 17: K8s 监控之路:从平台到应用 · 2019. 8. 23. · Grafana Web UI 查询、展示 ... app: mysql-exporter Prometheus Operator Prometheus pod pod match watch watch reload ns:

应用自定义监控

Elements already in place

Custom elements to be deployed by the user

kind: Prometheusname: k8s…spec: serviceMonitorSelector: matchExpressions: - key: k8s-app operator: In values: - redis-exporter - mysql-exporter

kind: ServiceMonitormetadata: labels: k8s-app: mysql-exportername: exporter-smon…spec: endpoints:: - interval: 30s port: metrics namespaceSelector: matchNames: - my-app selector: matchLabels: app: mysql-exporter

PrometheusOperator Prometheus

pod pod

match

watchwatch

reload

ns: my-app

kind: Servicemetadata: labels: app: mysql-exportername: exporter-smonnamespace: my-app…spec: ports:: - name: metrics port: 9104 selector: app: mysql-exporter

mysql-exporter

pod

match

match

scrape

ns: monitoring

Page 18: K8s 监控之路:从平台到应用 · 2019. 8. 23. · Grafana Web UI 查询、展示 ... app: mysql-exporter Prometheus Operator Prometheus pod pod match watch watch reload ns:
Page 19: K8s 监控之路:从平台到应用 · 2019. 8. 23. · Grafana Web UI 查询、展示 ... app: mysql-exporter Prometheus Operator Prometheus pod pod match watch watch reload ns:

Thanks