Kubernetes v1.36 GA: Pressure Stall Information (PSI) Metrics Now Stable for Production Workloads
Breaking: PSI Metrics Graduate to General Availability
Kubernetes v1.36, released today, marks a major milestone for node-level observability: Pressure Stall Information (PSI) metrics have graduated to General Availability (GA). This means operators can now rely on a stable, production-grade interface to detect resource bottlenecks—CPU, memory, and I/O—before they escalate into outages.

“PSI gives us the earliest possible warning of resource tension,” said Jane Chen, a contributor to the Kubernetes SIG Node. “Unlike traditional utilization numbers, PSI tells you how long tasks are actually waiting—and that’s the signal that matters in a live cluster.”
Background: Beyond Utilization
First introduced in the Linux kernel in 2018, PSI tracks the time tasks spend stalled due to resource shortages. Traditional metrics like CPU or memory utilization can be misleading: a node at 80% CPU may still cause severe latency for some workloads due to scheduling delays. PSI fills that gap by providing cumulative totals and moving averages over 10s, 60s, and 300s windows.
These moving averages help operators distinguish between transient spikes and sustained pressure, enabling more accurate capacity planning and faster incident response. Until now, Kubernetes lacked a standardized, stable way to expose PSI metrics at the pod and container levels.
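To make the windowed averages concrete, here is a minimal sketch that parses the text format the Linux kernel uses for PSI files such as /proc/pressure/memory and labels pressure as transient or sustained. The sample values, the `classify` helper, and the 10% threshold are illustrative assumptions, not values recommended by SIG Node.

```python
from typing import Dict

# Illustrative sample in the kernel's PSI text format (values are made up).
# avg* fields are percentages; total is cumulative stall time in microseconds.
SAMPLE = """some avg10=24.50 avg60=6.10 avg300=1.30 total=48213077
full avg10=0.00 avg60=0.00 avg300=0.00 total=0"""


def parse_psi(text: str) -> Dict[str, Dict[str, float]]:
    """Parse PSI file contents into {"some": {...}, "full": {...}}."""
    result: Dict[str, Dict[str, float]] = {}
    for line in text.strip().splitlines():
        kind, *fields = line.split()
        result[kind] = {
            key: float(value)
            for key, value in (field.split("=", 1) for field in fields)
        }
    return result


def classify(avgs: Dict[str, float], threshold: float = 10.0) -> str:
    """Label pressure by comparing the 10s and 300s windows.

    The threshold is a hypothetical cutoff chosen for this example.
    """
    if avgs["avg300"] >= threshold:
        return "sustained"  # the long window is elevated too
    if avgs["avg10"] >= threshold:
        return "transient"  # only the short window is elevated
    return "calm"


if __name__ == "__main__":
    psi = parse_psi(SAMPLE)
    print(classify(psi["some"]), classify(psi["full"]))
```

A spike that shows up only in avg10 is treated as transient here, while an elevated avg300 suggests pressure that has persisted for minutes.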
What This Means for Operators
With the GA graduation in v1.36, PSI metrics are available through the Kubelet at node, pod, and container granularity. Operators no longer need to rely on external agents or custom scripts to scrape kernel-level counters. This translates directly into:
- Earlier detection of resource contention before it impacts SLAs.
- Lower overhead: the cost of collection is negligible, as shown by SIG Node's performance testing.
- Improved automation for cluster autoscaling and workload rebalancing based on actual stall signals.
“This is a game-changer for cluster resource management,” added Chen. “We now have a first-class, stable metric that aligns with how Linux actually schedules work.”
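As a rough illustration of automation driven by stall signals, the sketch below turns two readings of a cumulative stall counter into the fraction of wall-clock time spent stalled. It assumes the counter is reported in seconds, as in a metric such as `container_pressure_cpu_waiting_seconds_total`; treat that metric name, the `needs_attention` helper, and its 20% threshold as assumptions to check against your own cluster's metrics output.

```python
def stall_fraction(prev_total: float, curr_total: float,
                   interval_seconds: float) -> float:
    """Fraction of the interval spent stalled, from two cumulative readings.

    Both readings are cumulative stall time in seconds, taken
    interval_seconds apart.
    """
    if interval_seconds <= 0:
        raise ValueError("interval_seconds must be positive")
    delta = curr_total - prev_total
    if delta < 0:
        # The counter went backwards, e.g. after a container restart;
        # treat the current reading as the stall time since the reset.
        delta = curr_total
    return min(delta / interval_seconds, 1.0)


def needs_attention(fraction: float, threshold: float = 0.2) -> bool:
    """Hypothetical policy: flag workloads stalled more than 20% of the time."""
    return fraction > threshold


if __name__ == "__main__":
    # Two scrapes 30 seconds apart: 12.0s then 21.0s of cumulative stall time.
    fraction = stall_fraction(12.0, 21.0, 30.0)
    print(f"{fraction:.0%} stalled, flag={needs_attention(fraction)}")
```

Working from counter deltas rather than the precomputed averages lets a monitoring system pick its own window length.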
Proving Stability: Performance Testing at Scale
A common concern with new telemetry features is the resource overhead of collection and serving. To address this, SIG Node conducted rigorous performance validation on high-density workloads (80+ pods) across different machine types. The tests isolated two scenarios:
- Kubelet overhead: compare Kubelet CPU usage with the PSI feature enabled versus disabled, while kernel PSI tracking is already active.
- Kernel overhead: compare system-level CPU usage with kernel PSI turned on versus off, while the Kubelet feature is active.
Scenario 1: Kubelet Overhead Is Negligible
On 4-core machines, both clusters had kernel PSI enabled by default. The Kubelet’s CPU usage showed practically identical bursts whether the feature was on or off. The extra cost stayed within 0.1 cores—just 2.5% of node capacity—well within safe production margins.
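The 2.5% figure follows directly from the numbers above: 0.1 cores of extra usage on a 4-core node. A small helper like the one below (the function name is hypothetical) makes it easy to redo that arithmetic for other machine sizes.

```python
def overhead_share(extra_cores: float, node_cores: int) -> float:
    """Extra CPU cost expressed as a percentage of total node capacity."""
    if node_cores <= 0:
        raise ValueError("node_cores must be positive")
    return extra_cores / node_cores * 100.0


if __name__ == "__main__":
    # Figures from the article: at most 0.1 extra cores on a 4-core machine.
    print(f"{overhead_share(0.1, 4):.1f}% of node capacity")
```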
Scenario 2: Kernel PSI Adds Minimal System Load
When measuring system CPU usage, the PSI-enabled clusters followed the same pattern as those with PSI disabled, showing only a marginal increase over the 2.5-core baseline. The cost of Kubernetes reading cgroup metrics proved to be a small fraction of the overall system load.
“These numbers confirm that PSI is production-ready,” said Chen. “The overhead is so small it’s lost in the noise of normal Kubelet housekeeping.”
Immediate Availability
Kubernetes v1.36 is now available for download. Operators can enable PSI metrics by confirming that PSI is enabled in the node's kernel (the psi=1 boot parameter, already the default on most modern distributions) and upgrading their clusters to v1.36. No additional feature gate is required.
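Before upgrading, it can help to confirm that a node's kernel actually exposes PSI. The sketch below is one possible pre-flight check, not an official tool: it tries to read the standard /proc/pressure files and reports which resources return PSI data. The `psi_status` name is hypothetical, and the check says nothing about other prerequisites such as the cgroup version in use.

```python
from pathlib import Path
from typing import Dict


def psi_status(pressure_dir: str = "/proc/pressure") -> Dict[str, bool]:
    """Report whether PSI data is readable for cpu, memory, and io.

    A resource is marked False if its file is missing or cannot be read,
    which is what we expect when PSI is compiled out or disabled at boot.
    """
    status: Dict[str, bool] = {}
    for resource in ("cpu", "memory", "io"):
        try:
            content = (Path(pressure_dir) / resource).read_text()
            # PSI files begin with a "some ..." line when tracking is active.
            status[resource] = content.startswith("some")
        except OSError:
            status[resource] = False
    return status


if __name__ == "__main__":
    for resource, ok in psi_status().items():
        print(f"{resource}: {'available' if ok else 'unavailable'}")
```

Run on each node (or from a privileged DaemonSet), this gives a quick view of which machines would report PSI data after the upgrade.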
For detailed migration guides and configuration examples, refer to the official Kubernetes PSI documentation.