Tail Latency
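- Tail latency: once real-time monitoring is added to a system, a small fraction of responses will always show a latency well above the mean. These slow responses are what we call tail latency.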
P99 Latency : The upper bound on the latency experienced by 99% of flows (e.g., TCP flows, HTTP requests, RPCs). In other words, 99% of flows see a latency below the p99 latency (also called the 99th percentile; a percentile is a kind of quantile).
Why is it used?
In most applications, we want to minimize tail latencies, which correspond to the worst user experience. Given that roughly 1% of our measurements are noise (network congestion, outages, service degradation), the p99 latency is a good representative of the practical worst case. Reducing the p99 latency is therefore almost always a goal.
A 99th percentile latency of 30 ms means that 99 out of 100 requests complete within 30 ms, and roughly 1 in 100 takes 30 ms or longer.
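As a rough illustration of the definition above, here is a minimal sketch that computes percentiles with the nearest-rank method. The function name `percentile_nearest_rank` and the sample latencies are made up for this note; real monitoring systems often use interpolation or approximate sketches instead, so their numbers may differ slightly.

```python
import math

def percentile_nearest_rank(samples, p):
    """Return the p-th percentile (0 < p <= 100) using the nearest-rank method."""
    if not samples:
        raise ValueError("samples must not be empty")
    if not 0 < p <= 100:
        raise ValueError("p must be in (0, 100]")
    ordered = sorted(samples)
    # 1-based rank of the smallest value that covers p% of the samples.
    rank = math.ceil(p * len(ordered) / 100)
    return ordered[rank - 1]

# Hypothetical latencies in ms: 95 fast requests, 4 slower ones, 1 outlier.
latencies_ms = [10] * 95 + [20] * 4 + [500]

mean_ms = sum(latencies_ms) / len(latencies_ms)
p50 = percentile_nearest_rank(latencies_ms, 50)
p99 = percentile_nearest_rank(latencies_ms, 99)
p100 = percentile_nearest_rank(latencies_ms, 100)

print(f"mean={mean_ms} ms, p50={p50} ms, p99={p99} ms, max={p100} ms")
```

In this made-up data set, 99 of the 100 requests finish within the p99 value, while the single 500 ms outlier only shows up in the maximum and pulls the mean above the median.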
Network Outage : an interruption of network service
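The 90th percentile is simply another way of saying that about 10% scored above and 90% scored below. (In an exam, if your score is at the 90th percentile, roughly 90% of test takers scored lower than you.) For example, if a TOEFL score of 118 (out of 120) is at the 99th percentile, that score is better than the scores of 99% of test takers.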
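The exam analogy can be sketched the other way around: given a score, find what percentage of scores lie strictly below it. The helper name `percentile_rank` and the score list are hypothetical, and "strictly below" is only one of several common conventions for percentile rank.

```python
from bisect import bisect_left

def percentile_rank(scores, value):
    """Percentage of scores strictly below `value`."""
    if not scores:
        raise ValueError("scores must not be empty")
    ordered = sorted(scores)
    below = bisect_left(ordered, value)  # count of scores < value
    return 100.0 * below / len(ordered)

# Hypothetical exam: 100 test takers with scores 1..100.
scores = list(range(1, 101))
rank_of_91 = percentile_rank(scores, 91)
print(rank_of_91)
```

Here a score of 91 has 90 of the 100 scores below it, which places it at the 90th percentile under this convention.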
Long Tail Latency (computing percentiles)
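Keeping the tail of the latency distribution low for an interactive service becomes very challenging. Transient high latency, which has little impact at moderate scale, can severely degrade overall service performance at large scale.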
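One way to see why scale makes the tail matter: if a user request fans out to many backend servers and must wait for all of them, the chance that at least one reply is slow grows quickly with the fan-out. The sketch below assumes that each server is slow independently with the same probability (1% here, matching p99), which is a simplification. The function name `prob_any_slow` is made up for this note.

```python
def prob_any_slow(fanout, slow_fraction=0.01):
    """Probability that at least one of `fanout` independent calls is slow.

    Assumes every call is slow with the same probability `slow_fraction`
    and that calls are independent of each other.
    """
    if fanout < 0:
        raise ValueError("fanout must be non-negative")
    if not 0.0 <= slow_fraction <= 1.0:
        raise ValueError("slow_fraction must be in [0, 1]")
    return 1.0 - (1.0 - slow_fraction) ** fanout

for n in (1, 10, 100):
    print(f"fan-out {n:>3}: {prob_any_slow(n):.1%} of requests hit a slow call")
```

Under these assumptions, a single call is slow 1% of the time, but a request that waits on 100 such calls is held up by at least one slow reply about 63% of the time.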
References
Google: Taming the Long Latency Tail - When More Machines Equals Worse Results
Approximate Algorithms in Apache Spark
第95个百分位(95th percentile)是什么概念 (in Chinese; "What does the 95th percentile mean?")