Kubernetes Monitoring

  1. Goals for Monitoring
    1. The first and foremost goal of monitoring is reliability.
    2. It is important to have proper alerting in place.
    3. In addition to reliability, another significant feature of a monitoring system is providing observability into the Kubernetes cluster.
    4. Another important use case for cluster monitoring is providing users with insight into the operation of the cluster.
  2. Resource metrics are collected by the lightweight, short-term, in-memory metrics-server and are exposed via the metrics.k8s.io API. metrics-server discovers all nodes on the cluster and queries each node's kubelet for CPU and memory usage.
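  3. Metrics-Server is the aggregator of the cluster's core monitoring data.
  4. Before Kubernetes 1.7.3, cAdvisor metrics were integrated into the kubelet's metrics and could be fetched through port 4194 on each node. Since Kubernetes 1.7.3, cAdvisor metrics have been separated from the kubelet's metrics, so Prometheus scrapes them as two separate jobs. Many documents online still say that nodes expose port 4194 for retrieving cAdvisor metrics. In newer kubelet versions, cAdvisor no longer exposes port 4194, so the metrics are obtained from the kubelet's /metrics/cadvisor endpoint, typically through the proxy API provided by the apiserver.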
  5. Node service discovery is useful for monitoring the infrastructure of and under Kubernetes, but it is of little use for monitoring your applications running on Kubernetes.
  6. It is common to **have separate Prometheus servers for network, infrastructure, and application monitoring**. This is known as vertical sharding, and it is the best way to scale Prometheus.
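
The notes above on the metrics.k8s.io API and on reaching cAdvisor through the apiserver can be illustrated with a small sketch that only builds the request paths. The helper names (`node_metrics_path`, `pod_metrics_path`, `cadvisor_proxy_path`) are hypothetical and exist only for this example. The sketch assumes the `v1beta1` version of the resource metrics API and a cluster with metrics-server installed.

```python
from urllib.parse import quote

# API group/version served by metrics-server (assumed: v1beta1).
METRICS_API_PREFIX = "/apis/metrics.k8s.io/v1beta1"


def node_metrics_path() -> str:
    """Path that lists CPU and memory usage for every node."""
    return f"{METRICS_API_PREFIX}/nodes"


def pod_metrics_path(namespace: str) -> str:
    """Path that lists CPU and memory usage for the pods in one namespace."""
    return f"{METRICS_API_PREFIX}/namespaces/{quote(namespace, safe='')}/pods"


def cadvisor_proxy_path(node: str) -> str:
    """Path that reaches a node's cAdvisor metrics through the apiserver proxy."""
    return f"/api/v1/nodes/{quote(node, safe='')}/proxy/metrics/cadvisor"


if __name__ == "__main__":
    # "node-1" and "default" are placeholder names, not values from the notes.
    print(node_metrics_path())
    print(pod_metrics_path("default"))
    print(cadvisor_proxy_path("node-1"))
```

Each path can be tried against a live cluster with `kubectl get --raw <path>`, which sends the request through the apiserver with your current credentials.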

PromQL

  1. PromQL can join different metrics together and perform arithmetic operations across them.
  2. In a range vector, the values for each timestamp are the samples recorded in the time series going back from that timestamp for the length of time given in the range duration.
  3. A range-vector is typically generated in order to then apply a function to it to get an instant-vector, which can be graphed (only instant vectors can be graphed).
  4. The group_left or group_right keywords convert the match into a many-to-one or one-to-many matching respectively. The left and right indicate the side that has the higher cardinality. So a group_left means that multiple series on the left side can match a single series on the right. The result of this is that the returned instant-vector contains all of the labels from the side with the higher cardinality, even if they don't match any label on the right.
  5. A summary metric sometimes also includes a time series with no suffix and a quantile label.
  6. Aggregation operators work only on instant vectors, and they also output instant vectors.
  7. When a PromQL operator or function could change the value or meaning of a time series, the metric name is removed.
  8. The main use of the standard deviation in monitoring is to detect outliers.
  9. PromQL allows classes of analysis that few other metrics systems offer.
  10. All the logical operators (and, or, unless) work in a many-to-many fashion, and they are the only operators that work many-to-many.
  11. Prometheus works entirely in UTC, and has no notion of time zones.
  12. A time series is assumed to continue up to the boundary of the range if the first or last sample is within 110% of the average sample interval from that boundary. If this is not the case, the time series is presumed to exist for 50% of an interval beyond the samples you have, but without the value going below zero.
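
The extrapolation rule in the last item can be made concrete with a simplified sketch. This is not Prometheus's own code. The function name `extrapolated_increase` and the `(timestamp, value)` sample layout are illustrative assumptions, and the sketch assumes a counter with no resets inside the range.

```python
from typing import List, Tuple

Sample = Tuple[float, float]  # (timestamp in seconds, counter value)


def extrapolated_increase(samples: List[Sample],
                          range_start: float,
                          range_end: float) -> float:
    """Estimate a counter's increase over [range_start, range_end].

    Simplified sketch: assumes samples are sorted by time, lie inside the
    range, and that the counter never resets.
    """
    if len(samples) < 2:
        raise ValueError("need at least two samples")
    first_t, first_v = samples[0]
    last_t, last_v = samples[-1]
    sampled_interval = last_t - first_t
    if sampled_interval <= 0:
        raise ValueError("timestamps must be strictly increasing")

    raw_increase = last_v - first_v
    avg_interval = sampled_interval / (len(samples) - 1)

    gap_start = first_t - range_start  # uncovered time before the first sample
    gap_end = range_end - last_t       # uncovered time after the last sample

    # Do not extrapolate backwards past the point where the counter
    # would have been zero.
    if raw_increase > 0 and first_v >= 0:
        time_to_zero = sampled_interval * (first_v / raw_increase)
        gap_start = min(gap_start, time_to_zero)

    # Within 110% of the average interval: extend to the boundary.
    # Otherwise: extend by half an average interval only.
    threshold = avg_interval * 1.1
    covered = sampled_interval
    covered += gap_start if gap_start < threshold else avg_interval / 2
    covered += gap_end if gap_end < threshold else avg_interval / 2

    return raw_increase * (covered / sampled_interval)


if __name__ == "__main__":
    # Samples every 10s, first and last sample 5s from the range boundaries.
    dense = [(5, 100), (15, 110), (25, 120), (35, 130), (45, 140), (55, 150)]
    print(extrapolated_increase(dense, 0, 60))
    # Samples that start late and stop early inside the same range.
    sparse = [(20, 100), (30, 110), (40, 120)]
    print(extrapolated_increase(sparse, 0, 60))
```

In the first case the series is extended to both boundaries. In the second, each side is extended by only half an average interval, because the gaps are larger than 110% of the sample spacing.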

Blogs

  1. Utilizing and monitoring kubernetes cluster resources more effectively
  2. Kubernetes monitoring with Prometheus in 15 minutes
  3. Monitoring Your Apps in Kubernetes Environment with Prometheus
  4. Prometheus监控实践:Kubernetes集群监控 (Prometheus Monitoring in Practice: Kubernetes Cluster Monitoring)
  5. How To Setup Prometheus Monitoring On Kubernetes Cluster
  6. Kubernetes Monitoring 101 - Core pipeline & Services Pipeline
  7. A Deep Dive into Kubernetes Metrics (Series)
  8. Kubernetes Metrics and Monitoring (2019)
  9. Kubernetes in Production: The Ultimate Guide to Monitoring Resource Metrics with Prometheus
  10. Horizontal Pod Autoscale with Custom Prometheus Metrics
  11. Kubernetes Monitoring with Prometheus - The ultimate guide (part 1) (great)
  12. Kubernetes Cluster Monitoring Using Prometheus
  13. A Journey into Scaling a Prometheus Deployment
  14. Introduction to Kubernetes Monitoring Architecture

Logging

  1. Centralized logging in Kubernetes
  2. Logging in Kubernetes: From container to visualization
  3. Collecting Application Logs On Kubernetes
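  4. 在 Kubernetes 上搭建 EFK 日志收集系统 (Setting Up an EFK Log Collection System on Kubernetes)
  5. Kubernetes日志收集(一)——Pod进程日志收集 (Kubernetes Log Collection, Part 1: Pod Process Log Collection) (great)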

PromQL (Instrumentation)

  1. Prometheus Querying - Breaking Down PromQL (Vector Match Example)
  2. Prometheus Counters and how to deal with them (Spring boot example)

Netflix Mantis Monitoring System
