The release of Kubernetes 1.35, officially designated as “Timbernetes,” represents a definitive shift in the architectural philosophy of cloud-native orchestration. This version marks the graduation of In-Place Pod Vertical Scaling (KEP-1287) to General Availability (GA), a transition that fundamentally alters the lifecycle management of containerized workloads. For more than six years, the Kubernetes community has grappled with the limitations of an immutable resource model where changing the compute power of a running pod necessitated its destruction and recreation. The “restart-to-scale” paradigm, while consistent with early microservices doctrine, imposed a significant operational burden on complex, stateful, and latency-sensitive applications. By making container resources mutable within the pod spec, Kubernetes 1.35 empowers the kubelet to adjust resource limits on the fly, directly modifying the underlying kernel control groups (cgroups) without signaling a process restart. This advancement is particularly transformative for high-stakes environments, such as those hosting Java Virtual Machine (JVM) applications or large-scale stateful sets, where the “restart tax” was previously measured in performance degradation, lost cache states, and service downtime.
Why Is the Graduation of In-Place Pod Scaling the Definitive Feature of the Kubernetes 1.35 Release?
The maturation of In-Place Pod Vertical Scaling into a stable feature in Kubernetes 1.35 is not merely an incremental update; it is the resolution of a foundational friction point in the platform’s history. Since its inception as an alpha feature in version 1.27, this capability has been designed to address the inefficiency of static resource allocation. In the traditional model, developers were forced into a binary choice: overprovision resources to ensure peak performance, leading to wasted node utilization, or underprovision and risk pod evictions or Out-Of-Memory (OOM) failures during traffic surges. The 1.35 release solves this by allowing the Vertical Pod Autoscaler (VPA) and other controllers to tune container resources dynamically.
The symbolism of the “Timbernetes” name, as noted by the release team, reflects a project that is growing deep roots and expanding its branches to support the most demanding modern workloads, including artificial intelligence (AI) training and edge computing. These workloads often exhibit volatile resource requirements that do not align well with the overhead of pod recreation. In-place scaling allows a pod to expand its resource envelope as a training job intensifies or as a game server experiences a sudden influx of players, ensuring that the infrastructure responds at the speed of the application rather than the speed of the orchestration layer.
| Feature | Legacy “Restart-to-Scale” Model | Kubernetes 1.35 Dynamic Update Model |
|---|---|---|
| Pod Specification | Immutable resource fields for CPU and memory. | Mutable resource requests and limits. |
| Operational Impact | Pod recreation required; IP address and UID changes. | Pod remains running; identity and network remain intact. |
| Scaling Velocity | Limited by image pull time and startup probes. | Immediate cgroup modification via the kubelet. |
| State Retention | In-memory caches and JIT optimizations are lost. | Full retention of in-memory state and optimized code. |
| Risk Profile | High; new pod may fail to schedule on the node. | Low; scaling is performed on the already admitted node. |
How Does the Shift from Restart-to-Scale to Dynamic Updates Change Pod Lifecycle Management?
The transition to a dynamic update model fundamentally redefines the relationship between the pod spec and the running container. Historically, the resources field within a pod’s container definition was considered a fixed contract established at the moment of scheduling. Under the new model in Kubernetes 1.35, the spec.containers[*].resources field represents a “desired state” that can be modified via the new resize subresource. Because resize is a distinct subresource, permissions for it can be granted through RBAC separately from general pod update rights, ensuring that only authorized entities such as the VPA or a cluster administrator can trigger a vertical scale event.
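As a minimal sketch of how a resize is submitted, the patch below targets the resize subresource rather than the main pod object; the pod and container names are hypothetical, and the kubectl invocation in the comment assumes a client version that exposes the resize subresource as described in the upstream resize task documentation.

```yaml
# Hypothetical patch body (resize.yaml) raising the CPU request of container
# "app" in pod "web-0". Applied against the resize subresource, for example:
#   kubectl patch pod web-0 --subresource resize --patch-file resize.yaml
spec:
  containers:
  - name: app
    resources:
      requests:
        cpu: "750m"
        memory: "1Gi"
      limits:
        cpu: "1"
        memory: "1Gi"
```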
When an update is initiated, the kubelet on the node takes responsibility for reconciling the desired state with the actual hardware allocation. This process is asynchronous, meaning the API server accepts the change and then waits for the kubelet to report back whether the resize was successful, deferred, or infeasible. The use of cgroups v2 is a prerequisite for this functionality, as it provides the unified hierarchy and improved resource isolation necessary for reliable on-the-fly adjustments. Kubernetes 1.35 draws a hard line here, deprecating cgroup v1 support to ensure that the platform can fully leverage these modern kernel features.
| Resource Field | Role in Kubernetes 1.35 | Visibility/Location |
|---|---|---|
| Desired Resources | The user-requested CPU and memory values. | spec.containers[*].resources |
| Admitted Resources | Resources the node has committed to the pod. | status.containerStatuses[*].allocatedResources |
| Configured Resources | The values currently enforced by the container runtime. | status.containerStatuses[*].resources |
This three-way tracking ensures that the control plane always understands the gap between what is requested and what is currently running. If a node lacks the capacity to fulfill a resize request, the kubelet sets the PodResizePending condition on the pod with a reason of Deferred or Infeasible, preventing the cluster from over-committing resources while keeping the original pod running at its current capacity.
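A hedged sketch of how this gap shows up in a pod's status follows; the field values and condition message are purely illustrative, and the container name is a placeholder.

```yaml
# Illustrative status fragment for a pod whose requested resize cannot
# currently be satisfied by the node.
status:
  conditions:
  - type: PodResizePending
    status: "True"
    reason: Infeasible              # Deferred instead, if capacity may free up later
    message: "node has insufficient cpu for the requested resize"  # illustrative text
  containerStatuses:
  - name: app
    allocatedResources:             # what the node has admitted for this container
      cpu: "1"
      memory: "1Gi"
    resources:                      # what the runtime is currently enforcing
      requests:
        cpu: "1"
        memory: "1Gi"
      limits:
        cpu: "1"
        memory: "1Gi"
```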
What Are the Practical Benefits of In-Place Scaling for JVM Applications and High-Stakes Services?
Java Virtual Machine (JVM) applications are among the primary beneficiaries of in-place scaling due to their sensitivity to restarts. When a JVM restarts, it loses its Just-In-Time (JIT) compilation optimizations. The JIT compiler is responsible for analyzing execution patterns and translating frequently used bytecode into highly optimized machine code. A restart forces the JVM back to an interpreted state, causing a “cold start” where the application consumes significantly more CPU to re-warm its cache. In high-traffic environments, these cold starts can trigger a cascade of failures as the unoptimized application struggles to meet its latency Service Level Agreements (SLAs).
Furthermore, many Java applications require significant memory for their heap space. In a traditional Kubernetes environment, if the VPA recommended a memory increase, the resulting restart would kill active transactions and sever database connections. With Kubernetes 1.35, the pod can increase its memory limit in place. While the JVM may still require specific flags to recognize and use the newly available memory without a restart, the infrastructure no longer forces a hard termination.
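For illustration, the container fragment below (name and image are hypothetical) sizes the heap as a percentage of the container memory limit through JAVA_TOOL_OPTIONS. Because -XX:MaxRAMPercentage is evaluated when the JVM starts, an in-place memory increase raises the cgroup limit immediately, but the heap ceiling only moves once the container restarts, which is exactly the case the resizePolicy field discussed later is designed to express.

```yaml
# Sketch of a JVM container whose heap tracks the container memory limit.
# The percentage is read at JVM startup, so a larger limit applied in place
# becomes usable heap only after the container (not the whole pod) restarts.
containers:
- name: orders-service              # hypothetical name
  image: example.com/orders:1.0     # hypothetical image
  env:
  - name: JAVA_TOOL_OPTIONS
    value: "-XX:MaxRAMPercentage=75.0"
  resources:
    requests:
      cpu: "500m"
      memory: "2Gi"
    limits:
      memory: "2Gi"
```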
| Application Type | Restart Penalty | In-Place Scaling Advantage |
|---|---|---|
| JVM (Java/Kotlin) | Loss of JIT optimizations; high startup CPU spike. | Maintains optimized machine code; smooth CPU scaling. |
| PostgreSQL / MySQL | WAL replay; termination of active queries. | Continuous query processing during memory expansion. |
| Redis / Memcached | Cache flush; immediate hit to backend databases. | Caches remain hot; no backend storm upon scaling. |
| AI / ML Training | Loss of training epoch progress; GPU re-initialization. | Continuous training; dynamic resource adjustment per phase. |
For stateful sets, such as databases and distributed caches, the benefits are equally profound. These services often maintain large in-memory data structures. Restarting a database instance often requires replaying logs or re-populating caches, a process that can take minutes for large datasets. In-place scaling allows these services to expand their resource requests as load grows, ensuring that the application remains responsive without the risk of data inconsistency or connection storms during a rolling restart.
How Does the Kubelet Manage Resource Limits and Node Utilization During an In-Place Resize?
The kubelet acts as the ultimate authority on whether a resize can occur on a given node. When the desired resource update reaches the node, the kubelet calculates the remaining unallocated capacity. This calculation accounts for the resources already allocated to every other pod admitted to the node. If the node has sufficient space, the kubelet issues a call to the container runtime (containerd or CRI-O) to update the cgroup settings.
A key innovation in the 1.35 release is the “Memory Shrink Hazard” protection. Decreasing a memory limit is inherently risky; if an application’s current usage exceeds the new, lower limit, the kernel will immediately trigger an OOM-kill. To prevent this, the kubelet in 1.35 performs a best-effort check of the current memory usage before applying a decrease. If the usage is too high, the resize enters a PodResizeInProgress state with an error message, and the limit is not lowered until the application naturally reduces its memory footprint.
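Illustratively, and with the caveat that the exact reason and wording vary between versions, a blocked memory decrease surfaces through the pod's conditions rather than as an OOM event:

```yaml
# Illustrative condition for a memory-limit decrease the kubelet has not yet
# applied because current usage still exceeds the requested lower limit.
status:
  conditions:
  - type: PodResizeInProgress
    status: "True"
    reason: Error                   # assumed reason; message text is illustrative
    message: "memory limit cannot be decreased below current usage"
```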
| Node Utilization Factor | Role of In-Place Scaling |
|---|---|
| Bin-Packing | Allows pods to be sized tightly; expands only when needed. |
| Resource Reclamation | Idle CPU/memory can be reclaimed without restarting pods. |
| Scheduling Pressure | Reduces the number of scheduling events for scaling. |
| OOM Prevention | Limits can be increased proactively as usage spikes. |
This dynamic management significantly improves node utilization. In traditional clusters, administrators often leave a “buffer” of unused resources on each node to accommodate pod restarts and spikes. With in-place scaling, clusters can run closer to their physical capacity because the system can reallocate resources between running pods as demand shifts. This moves Kubernetes closer to a true “fluid” infrastructure model where hardware is shared optimally in real-time.
What Is the Role of the Vertical Pod Autoscaler (VPA) in the New 1.35 Resource Landscape?
The Vertical Pod Autoscaler has long been the “missing piece” for automated Kubernetes scaling, often overshadowed by the Horizontal Pod Autoscaler (HPA) because of its disruptive nature. In version 1.35, the VPA’s integration with in-place scaling transitions it from a recommendation engine into a primary scaling tool. The new InPlaceOrRecreate update mode, which graduated to beta in this release, allows the VPA to apply its recommendations with minimal impact.
When a VPA object is configured with updateMode: InPlaceOrRecreate, it periodically analyzes the historical resource consumption of its target pods. If it determines that a pod is over-provisioned or under-provisioned, it attempts to patch the running pod. This prevents the “vicious cycle” where a VPA would evict a pod to increase its resources, only for the new pod to fail to schedule because the node was already full.
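A minimal VPA object using this mode might look like the sketch below; the target workload and resource bounds are hypothetical, and the mode assumes a VPA release that ships InPlaceOrRecreate.

```yaml
apiVersion: autoscaling.k8s.io/v1
kind: VerticalPodAutoscaler
metadata:
  name: checkout-vpa                # hypothetical name
spec:
  targetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: checkout                  # hypothetical workload
  updatePolicy:
    updateMode: "InPlaceOrRecreate" # try an in-place resize before evicting
  resourcePolicy:
    containerPolicies:
    - containerName: "*"
      minAllowed:
        cpu: "250m"
        memory: "256Mi"
      maxAllowed:
        cpu: "2"
        memory: "4Gi"
```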
| VPA Update Mode | Behavioral Logic | Disruption Level |
|---|---|---|
| Off | Recommender calculates values; no action taken. | None |
| Initial | Recommendations applied only during pod creation. | Minimal |
| Recreate | Evicts pod to apply new resource values. | High |
| InPlaceOrRecreate | Attempts in-place resize; falls back to eviction if needed. | Minimal |
One significant advantage of the 1.35 release is the removal of the blocker for running VPA and HPA simultaneously. Historically, using both could lead to a conflict where one would scale horizontally while the other scaled vertically, causing cluster instability. With in-place scaling, architects can configure VPA to manage the specific resource requests (right-sizing) while HPA manages the number of replicas based on custom metrics or external demand, leading to a more holistic scaling strategy.
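One way to divide that responsibility, sketched under the assumption that replica count is driven by CPU utilization while the VPA right-sizes memory only (all names are placeholders), is to scope each autoscaler to different resources so they never issue conflicting signals:

```yaml
# VPA restricted to memory so it does not fight the CPU-driven HPA below.
apiVersion: autoscaling.k8s.io/v1
kind: VerticalPodAutoscaler
metadata:
  name: api-vpa                     # hypothetical name
spec:
  targetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: api
  updatePolicy:
    updateMode: "InPlaceOrRecreate"
  resourcePolicy:
    containerPolicies:
    - containerName: "*"
      controlledResources: ["memory"]
---
# HPA scaling replica count on CPU utilization only.
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: api-hpa                     # hypothetical name
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: api
  minReplicas: 3
  maxReplicas: 20
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 70
```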
How Do Pod Specification Changes and Resize Policies Ensure Application Stability?
The introduction of mutable resources in the pod spec is accompanied by a new field called resizePolicy. This field allows developers to tell Kubernetes how their specific application handles resource changes. Not every application can utilize extra memory immediately; for example, some legacy databases allocate a buffer pool at startup that cannot be resized without a restart.
The resizePolicy field offers two options per resource: NotRequired and RestartContainer. Both default to NotRequired; CPU can safely stay that way, as the Linux kernel seamlessly grants more cycles to a running process, while memory is often set to RestartContainer in production templates to ensure that the application environment is re-initialized to recognize the new limits. If a single resize touches both CPU and memory and the memory policy requires a restart, the entire container restarts to keep its state consistent.
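A container-level sketch of both policies follows (name and image are placeholders): CPU changes are applied live to the cgroup, while a memory change restarts only this container so the process re-reads its limits.

```yaml
containers:
- name: legacy-java                 # hypothetical name
  image: example.com/legacy:2.3     # hypothetical image
  resizePolicy:
  - resourceName: cpu
    restartPolicy: NotRequired      # cgroup CPU changes applied without a restart
  - resourceName: memory
    restartPolicy: RestartContainer # container restarts so fixed heap settings are re-read
  resources:
    requests:
      cpu: "500m"
      memory: "1Gi"
    limits:
      cpu: "1"
      memory: "1Gi"
```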
| Policy Field | Description | Common Use Case |
|---|---|---|
| NotRequired | Changes are applied to cgroups; no process signal sent. | CPU scaling; modern memory-aware apps (e.g., Go). |
| RestartContainer | Container is terminated and restarted with new limits. | Legacy Java apps with fixed -Xmx settings. |
Another critical guardrail in Kubernetes 1.35 is the immutability of the Quality of Service (QoS) class. A pod’s QoS class (Guaranteed, Burstable, or BestEffort) is a fundamental attribute used for scheduling and eviction decisions. Allowing a pod to change from “Guaranteed” to “Burstable” during its life would invalidate the scheduler’s initial decision to place it on a specific node. Consequently, if a resize request would result in a QoS class change, the API server will reject the update. This ensures that high-priority workloads maintain their performance guarantees throughout their lifecycle.
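For example, a pod admitted as Guaranteed has requests equal to limits for every container, as in the placeholder fragment below; a resize that set a request below its limit would push the pod toward Burstable, so the API server rejects it.

```yaml
# Guaranteed QoS: requests equal limits for every resource in every container.
# A resize that breaks this equality would change the QoS class and is rejected.
containers:
- name: payments                    # hypothetical name
  image: example.com/payments:1.8   # hypothetical image
  resources:
    requests:
      cpu: "2"
      memory: "4Gi"
    limits:
      cpu: "2"
      memory: "4Gi"
```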
What Advanced Capabilities Does Kubernetes 1.35 Introduce for Resource Efficiency?
While in-place pod scaling is the headline feature, version 1.35 includes several alpha-stage enhancements that point toward the future of resource management. One of the most significant is the introduction of Pod-Level Resource Specifications (KEP-2837). In the current Kubernetes model, resources are defined at the container level, and the pod’s total is merely the sum of its parts. This often leads to “internal waste” in multi-container pods, where a sidecar might be sitting idle while the main application is throttled.
The new pod-level resource feature allows operators to set an aggregate request and limit for the entire pod. This creates a shared “resource bucket” that all containers in the pod can draw from dynamically. When combined with the alpha InPlacePodLevelResourcesVerticalScaling feature gate, this allows for the vertical scaling of the entire pod’s bucket without service disruption. This is a major leap forward for complex pods running sidecars for service mesh, logging, or security, as it allows for fluid resource sharing within the pod boundary.
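Under those alpha gates, the shared bucket is expressed directly on the pod spec. The fragment below is a sketch of the shape described in KEP-2837, with placeholder names, and may change before the feature stabilizes.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: web-with-sidecar            # hypothetical name
spec:
  resources:                        # pod-level bucket shared by all containers (alpha)
    requests:
      cpu: "1"
      memory: "2Gi"
    limits:
      cpu: "2"
      memory: "3Gi"
  containers:
  - name: app
    image: example.com/app:1.0      # hypothetical image
  - name: mesh-proxy
    image: example.com/proxy:1.0    # hypothetical image
```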
| Feature Name | Release Status in 1.35 | Primary Benefit |
|---|---|---|
| In-Place Pod Resize | General Availability (GA) | Production-ready zero-downtime vertical scaling. |
| Pod-Level Resources | Alpha | Aggregate resource sharing between containers. |
| Native Gang Scheduling | Alpha | Ensures all pods in a group start together (AI training). |
| Node Declared Features | Alpha | Prevents version skew issues during pod scheduling. |
Additionally, “Native Gang Scheduling” (KEP-4671) enters alpha, providing a critical tool for AI and High-Performance Computing (HPC) workloads. This feature ensures “all-or-nothing” scheduling, where a group of pods is only admitted to the cluster if resources are available for every member. This prevents scenarios where half of a training job burns expensive GPU cycles while waiting for its peers to be scheduled, a common pain point in distributed machine learning.
Frequently Asked Questions
Does in-place resizing require a specific container runtime version?
Yes, the feature relies on the Container Runtime Interface (CRI) to communicate resource changes to the underlying kernel. Kubernetes 1.35 requires a compatible version of either containerd (v1.6.9 or later) or CRI-O (v1.24.2 or later). These runtimes are designed to receive the update from the kubelet and apply the changes to the container’s cgroups without stopping the process.
What happens if a resize request exceeds the capacity of the node where the pod is running?
If a resize request exceeds the available resources on the node where the pod is running, the kubelet will not apply the change. The request will be marked as Infeasible or PodResizePending in the pod’s status. The pod will continue to run at its original resource levels. If you are using the VPA in InPlaceOrRecreate mode, the VPA may eventually decide to evict the pod so it can be rescheduled on a node with more capacity.
Can decreasing a memory limit still trigger an OOM-kill?
Kubernetes 1.35 introduces a “best-effort” protection for memory shrinks. Before lowering the limit, the kubelet checks the container’s current memory usage. If the usage is higher than the new desired limit, the resize is deferred to prevent an immediate OOM-kill. However, this is not a perfect guarantee; if the application’s memory usage spikes immediately after the check but before the limit is applied, an OOM-kill can still occur.
How does in-place scaling help with cost optimization?
In-place scaling is a cornerstone of “FinOps” for Kubernetes. By allowing pods to be sized based on their current needs rather than their peak historical usage, it enables tighter bin-packing. This means you can fit more pods onto fewer nodes, reducing the overall cloud provider bill. Furthermore, it allows for the reclamation of idle resources from one pod to give to another on the same node without the churn of restarts.
Is In-Place Pod Vertical Scaling enabled by default in Kubernetes 1.35?
Yes, as a GA feature, In-Place Pod Vertical Scaling is enabled by default in version 1.35. However, it requires that your nodes are running on cgroup v2-enabled Linux distributions. If your cluster relies on legacy cgroup v1 nodes, the kubelet will not support this feature and, in some cases, may fail to start in 1.35 without specific configuration.
Can in-place scaling be used with StatefulSets?
Absolutely. In fact, stateful sets are one of the primary targets for this feature. By using in-place scaling, you can vertically scale the members of a StatefulSet (such as a database cluster) one by one without triggering the traditional rolling update that causes temporary loss of availability or leader elections.
Does in-place resizing work for GPUs or other resource types?
No. As of Kubernetes 1.35, the In-Place Pod Resize feature is strictly limited to CPU and memory resources. Other resource types, including GPUs managed via Device Plugins or ephemeral storage, remain immutable and still require a pod restart to modify.
Conclusion
The graduation of In-Place Pod Vertical Scaling to General Availability in Kubernetes 1.35 marks a high-water mark for the platform’s operational maturity. By dismantling the “restart-to-scale” barrier, the Kubernetes project has addressed one of the most persistent challenges in cloud-native architecture: the efficient management of dynamic, stateful, and performance-sensitive workloads. For organizations running mission-critical JVM applications or large-scale databases, version 1.35 offers a path to zero-downtime scaling and vastly improved hardware efficiency.
As the “Timbernetes” release settles into the ecosystem, the focus shifts from simply managing pod existence to optimizing pod performance in real-time. The integration of VPA, the safeguards for QoS and memory shrinking, and the introduction of pod-level sharing all point toward a future where the infrastructure is truly invisible, responding elastically to the heartbeat of the application. For architects and operators, the message of Kubernetes 1.35 is clear: the era of the “restart tax” is over, and the era of fluid, high-utilization clusters has arrived.
References
- Kubernetes 1.35: In-Place Pod Resize Graduates to Stable, accessed on December 29, 2025, https://kubernetes.io/blog/2025/12/19/kubernetes-v1-35-in-place-pod-resize-ga/
- Resize CPU and Memory Resources assigned to Containers - Kubernetes, accessed on December 29, 2025, https://kubernetes.io/docs/tasks/configure-pod-container/resize-container-resources/
- Vertical Pod Autoscaling - Kubernetes, accessed on December 29, 2025, https://kubernetes.io/docs/concepts/workloads/autoscaling/vertical-pod-autoscale/
- Kubernetes 1.35 “Timbernetes” Introduces Vertical Scaling - The New Stack, accessed on December 29, 2025, https://thenewstack.io/kubernetes-1-35-timbernetes-introduces-vertical-scaling/
- Kubernetes 1.35 - Kubesimplify, accessed on December 29, 2025, https://blog.kubesimplify.com/kubernetes-v135-whats-new-whats-changing-and-what-you-should-know?source=more_articles_bottom_blogs
- Kubernetes In-Place Pod Resizing - Vertical Scaling Without Downtime, accessed on December 29, 2025, https://builder.aws.com/content/30MCh47Lw54JPCwGh0TKZgJU8d7/kubernetes-in-place-pod-resizing-vertical-scaling-without-downtime
- In-place Pod resizing in Kubernetes: How it works and how to use it | Tech blog - Palark, accessed on December 29, 2025, https://palark.com/blog/in-place-pod-resizing-kubernetes/
- Kubernetes 1.35 enables zero-downtime resource scaling for production cloud workloads, accessed on December 29, 2025, https://www.networkworld.com/article/4107891/kubernetes-1-35-enables-zero-downtime-resource-scaling-for-production-cloud-workloads.html
- KEP-2837: Pod Level Resource Specifications - GitHub, accessed on December 29, 2025, https://github.com/kubernetes/enhancements/blob/master/keps/sig-node/2837-pod-level-resource-spec/README.md
- In-Place Vertical Pod Scaling: The Future of Resource Management, accessed on December 29, 2025, https://superorbital.io/blog/in-place-vertical-pod-scaling/
- Kubernetes In-Place Pod Vertical Scaling - ScaleOps, accessed on December 29, 2025, https://scaleops.com/blog/kubernetes-in-place-pod-vertical-scaling/
- Best Practices for configuring the JVM in a Kubernetes environment | AWS Builder Center, accessed on December 29, 2025, https://builder.aws.com/content/2y4rrgs9rUiiyBZsR2TVLmGhNS3/best-practices-for-configuring-the-jvm-in-a-kubernetes-environment
Best Practices for configuring the JVM in a Kubernetes environment | AWS Builder Center, accessed on December 29, 2025, https://builder.aws.com/content/2y4rrgs9rUiiyBZsR2TVLmGhNS3/best-practices-for-configuring-the-jvm-in-a-kubernetes-environment