Kubernetes adds in‑place pod scaling
- Kubernetes v1.36’s big infrastructure change, covered in a feature blog on April 30: in-place vertical scaling for Pod-level CPU and memory moved to beta and is now turned on by default.
- The key detail is the scope change: Kubernetes 1.35 already made container-level in-place resize stable, and 1.36 extends that to Pod-level `spec.resources`.
- That matters because live resource tuning no longer has to mean killing a Pod first, a real reliability win for stateful or latency-sensitive workloads.

Kubernetes just made one of its most annoying operational tradeoffs a little less painful. If a running workload needed more CPU or memory, the old answer was often “replace the Pod,” which is fine until that restart is exactly what you were trying to avoid. In Kubernetes v1.36, Pod-level in-place vertical scaling moved to beta and is enabled by default. That means you can raise or lower Pod-level CPU and memory on a live Pod without recreating it.

### What actually changed?

The new piece is Pod-level resources, not the basic idea of in-place resize itself. Kubernetes 1.35 already took container-level in-place Pod resize to stable, so changing `resources.requests` and `resources.limits` for individual containers without rebuilding the Pod was already real. Kubernetes 1.36 adds the beta step for Pod-level resources in `spec.resources`, which act as an upper bound for the combined resource use of containers inside that Pod.

### Why is Pod-level scaling different?

Because it changes the control point. Container-level resizing says “give this specific container more headroom.” Pod-level resizing says “give this Pod a bigger shared envelope.” That matters for multi-container Pods where sidecars and app containers share a budget and the exact split can move around over time. Kubernetes had already pushed Pod-level resources to beta in v1.34; now the in-place vertical scaling part for that model is beta too.

### How do you use it?

The mechanism is the `/resize` subresource. You update the Pod’s requested CPU and memory instead of deleting the Pod and letting a controller create a replacement. Kubernetes docs frame the feature as a way to avoid application disruption while changing allocations on running Pods. That sounds small, but for operators it removes a whole class of “safe but noisy” maintenance actions.

### Why was the old way so annoying?

Because “just restart it” is cheap only for stateless, disposable workloads.
For anything stateful, latency-sensitive, or warm-cache heavy, a restart can mean dropped connections, cold starts, rebalancing, or noisy failover behavior. Kubernetes was very good at horizontal scaling (add more Pods with HPA), but vertical tuning on live workloads lagged behind. In-place resize closes part of that gap.

### Does this replace autoscaling?

Not really. Horizontal Pod Autoscaler still answers “how many Pods should exist?” Vertical resizing answers “how big should this running Pod be?” Those are different knobs. In practice, platform teams use both: horizontal scaling for demand swings, vertical tuning for efficiency, bin-packing, and keeping critical workloads stable while their resource profile changes.

### What’s the catch?

The catch is that “in place” does not mean “magic.” The node still has to have spare capacity, the kubelet still has to apply the change, and CPU and memory have different enforcement behavior under Linux cgroups. If the node cannot satisfy the new request right now, the resize can be deferred until capacity frees up, and a request the node could never fit is not going to succeed at all. So this is not infinite elasticity; it is a safer way to make vertical changes when the cluster can actually honor them.

### Why are people talking about it now?

Because Kubernetes 1.36.0 was released on April 22, 2026, and the feature blog landed on April 30. Social posts are bundling the news with cheat sheets about Pods, Services, storage, RBAC, HPA, VPA, and Helm, but the real story is narrower and more useful: one more production-grade way to tune live workloads without turning every resource change into a restart event.

### Bottom line?

This is not a flashy end-user feature. It is plumbing. But it is the kind of plumbing that makes Kubernetes feel more mature: less “cattle only,” more capable of careful live adjustments when uptime and stability matter.
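
### What might a resize request look like?

To make the `/resize` mechanism from “How do you use it?” a bit more concrete, here is a minimal Python sketch that builds a Pod-level patch body and the matching `kubectl patch --subresource resize` command. It is an illustration under assumptions, not an official recipe: the Pod name `checkout-api`, the namespace `shop`, and the CPU and memory values are hypothetical, requests and limits are set to the same values only to keep the example short, and it assumes a cluster and kubectl version where Pod-level in-place resize is available. The script only prints the command, so it runs without a cluster.

```python
import json
import shlex


def build_pod_resize_patch(cpu: str, memory: str) -> dict:
    """Return a patch body that sets Pod-level requests and limits.

    Assumption: requests and limits use the same values to keep the
    example small; real workloads often set them separately.
    """
    return {
        "spec": {
            "resources": {
                "requests": {"cpu": cpu, "memory": memory},
                "limits": {"cpu": cpu, "memory": memory},
            }
        }
    }


def kubectl_resize_args(pod: str, namespace: str, patch: dict) -> list:
    """Build the kubectl argument list that sends the patch to /resize."""
    return [
        "kubectl", "patch", "pod", pod,
        "--namespace", namespace,
        "--subresource", "resize",
        "--patch", json.dumps(patch),
    ]


if __name__ == "__main__":
    # Hypothetical Pod, namespace, and sizes, chosen only for illustration.
    patch = build_pod_resize_patch(cpu="2", memory="4Gi")
    args = kubectl_resize_args("checkout-api", "shop", patch)
    # Print a copy-pasteable command instead of executing it.
    print(shlex.join(args))
```

After running the printed command against a real cluster, it is worth inspecting the Pod’s status to confirm the new values were actually applied, since the node may not be able to honor the request immediately.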