Prometheus AlertManager And Grafana

  1. The Prometheus Alertmanager is a component that groups alerts, reliably deduplicates them, and sends the grouped alerts as notifications.

  2. Grouping categorizes alerts of a similar nature into a single notification. This is especially useful during larger outages, when many systems fail at once and hundreds to thousands of alerts may be firing simultaneously.

  3. ResolveTimeout (resolve_timeout in the Alertmanager configuration) is the time after which an alert is declared resolved if it has not been updated.

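  4. The three states in an alert's lifecycle:

    1) inactive: the alert is neither firing nor pending.

    2) pending: the alert condition has been true for less than the configured threshold duration.

    3) firing: the alert condition has been true for longer than the configured threshold duration.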

  5. rate() over a 5-minute range (for example rate(http_requests_total[5m])) returns the per-second rate of HTTP requests as measured over the last 5 minutes. irate() calculates the per-second instant rate of increase of the time series in the range vector, based on the last two data points.

  6. stdvar calculates the population variance over dimensions; stddev calculates the population standard deviation over dimensions. Related Alertmanager terms: inhibition and silences.

  7. We will call them timespans as they span many values over a time range.
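As a rough illustration of the grouping and deduplication described in the list above, the following Python sketch buckets alerts by a chosen set of label names (similar in spirit to Alertmanager's group_by setting) and drops alerts whose full label set has already been seen. The function name group_alerts and the sample labels are hypothetical; this models the behaviour only and is not how Alertmanager is implemented.

```python
from collections import defaultdict


def group_alerts(alerts, group_by):
    """Group alerts by the values of the `group_by` labels, dropping exact duplicates.

    Each alert is a dict mapping label name -> value. Illustrative model only.
    """
    groups = defaultdict(list)
    seen = set()
    for labels in alerts:
        # Treat the complete label set as the identity of an alert.
        fingerprint = tuple(sorted(labels.items()))
        if fingerprint in seen:
            continue  # deduplicate: this label set was already recorded
        seen.add(fingerprint)
        key = tuple(labels.get(name, "") for name in group_by)
        groups[key].append(labels)
    return dict(groups)


# Hypothetical sample alerts; the third one repeats the first.
alerts = [
    {"alertname": "InstanceDown", "cluster": "prod", "instance": "db-1"},
    {"alertname": "InstanceDown", "cluster": "prod", "instance": "db-2"},
    {"alertname": "InstanceDown", "cluster": "prod", "instance": "db-1"},
    {"alertname": "HighLatency", "cluster": "prod", "instance": "api-1"},
]

grouped = group_alerts(alerts, group_by=["alertname", "cluster"])
for key, members in grouped.items():
    print(key, len(members))
```

Each key in the result would correspond to one notification, so four incoming alerts collapse into two notifications here.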
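The inactive, pending and firing states listed above can be sketched as a small decision function. In a Prometheus alerting rule the threshold duration is set with the for clause; the function and parameter names below are hypothetical and chosen only for illustration.

```python
def alert_state(condition_true_since, now, for_duration):
    """Return 'inactive', 'pending', or 'firing' for a single alert.

    condition_true_since: timestamp in seconds at which the alert expression
        first became true, or None if it is currently false.
    now: current timestamp in seconds.
    for_duration: how long (seconds) the condition must hold before firing.
    """
    if condition_true_since is None:
        return "inactive"
    if now - condition_true_since < for_duration:
        return "pending"
    return "firing"


print(alert_state(None, now=100, for_duration=300))  # inactive
print(alert_state(0, now=120, for_duration=300))     # pending
print(alert_state(0, now=300, for_duration=300))     # firing
```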
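To make the difference between a windowed per-second rate and an instant rate concrete, here is a simplified Python sketch. It is an assumption-laden model: the real PromQL rate() also handles counter resets and extrapolates to the window boundaries, which this sketch deliberately leaves out. The function names and sample data are hypothetical.

```python
def simple_rate(samples):
    """Average per-second increase across the whole window.

    samples: list of (timestamp_seconds, counter_value), oldest first.
    Simplification: no counter-reset handling and no extrapolation.
    """
    (t_first, v_first), (t_last, v_last) = samples[0], samples[-1]
    return (v_last - v_first) / (t_last - t_first)


def simple_irate(samples):
    """Per-second increase based only on the last two samples."""
    (t_prev, v_prev), (t_last, v_last) = samples[-2], samples[-1]
    return (v_last - v_prev) / (t_last - t_prev)


# A counter scraped every 60 s over a 5-minute window, with a burst at the end.
window = [(0, 100), (60, 160), (120, 220), (180, 280), (240, 340), (300, 940)]

print(simple_rate(window))   # 2.8
print(simple_irate(window))  # 10.0
```

The windowed rate smooths the final burst across five minutes, while the instant rate reflects only the last scrape interval. This is why irate suits graphs of fast-moving counters and rate suits alerting rules.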
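The "population" in population variance and population standard deviation means dividing by the number of values N rather than N - 1. A minimal sketch using Python's standard library, with made-up sample values standing in for the series being aggregated:

```python
import statistics

# Hypothetical values of one metric across eight series (for example, eight instances).
values = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]

variance = statistics.pvariance(values)  # population variance: divides by N
deviation = statistics.pstdev(values)    # population standard deviation

print(variance)   # 4.0
print(deviation)  # 2.0
```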

Grafana

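  1. Install a Grafana plugin (here grafana-piechart-panel) by running grafana-cli inside the Grafana pod:
    kubectl exec -it <grafana-pod-id> -n <namespace> -- grafana-cli plugins install <plugin-id>

    For example:
    kubectl -n monitoring exec -it grafana-8d87fdbfc-w4pqr -- grafana-cli plugins install grafana-piechart-panel

    Finally, delete the pod to restart the server:
    kubectl delete pod <grafana-pod-id> -n <namespace>

    For example:
    kubectl delete pod grafana-8d87fdbfc-w4pqr -n monitoring

    Note: the plugin is likely to survive the restart only if Grafana's plugin directory is on persistent storage.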

  2. Grafana is the leading graph and dashboard builder for visualizing time series infrastructure and application metrics.

  3. ETCD related dashboard: https://grafana.com/grafana/dashboards/10322 (etcd-clusters-as-service)

AlertManager References

  1. Understanding Prometheus AlertManager
  2. Introduction to Alertmanager and its mechanisms (email and WeChat configuration)
  3. Alertmanager (Prometheus) notification configuration in Kubernetes (alertmanager-main secret)
  4. Alertmanager alerting rules explained (how to monitor Kubernetes gracefully)
  5. ETCD Alerting Rule
  6. kubernetes-monitoring/kubernetes-mixin (Great)

  7. Prometheus QL Example
  8. PromQL / How to return 0 instead of ‘no data’
  9. Prometheus Counters and how to deal with them

Grafana References

  1. How To Add a Prometheus Dashboard to Grafana
