
Kubernetes v1.35 (Timbernetes): Why This Release Actually Matters for Production & AI Workloads

Khushi


Introduction

AI workloads are exploding.

Enterprises are now scaling distributed training across hundreds of GPUs while battling flaky scheduling and constant Pod restarts. The relentless growth of AI infrastructure has turned Kubernetes into a mission-critical battleground—where zero-downtime scaling and reliable gang scheduling make or break production SLAs.

Kubernetes v1.35 (Timbernetes) delivers exactly that.

This release ships 60 enhancements focused on security, scalability, and pruning legacy code to strengthen real-world cloud-native operations. The emphasis is clear: operational maturity. From in-place Pod resizing to AI-focused scheduling and ruthless deprecations, these features directly address the pain points SREs face in production.

If you run stateful systems, distributed AI jobs, or long-running services on Kubernetes, this is one of the most practical releases in recent years.

What's New in Kubernetes v1.35

Production Game-Changers

In-place Pod Resource Updates

CPU and memory scaling without Pod restarts is now GA. This is critical for databases, AI training jobs, and stateful services that can't tolerate disruptions. You can now tune resources live during traffic spikes — no redeployments required.
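As a rough illustration, an in-place resize is submitted as a patch against the Pod's resize subresource. The sketch below only builds the patch body. The container name `app` and the resource values are made-up placeholders, and the helper `build_resize_patch` is hypothetical.

```python
import json


def build_resize_patch(container, cpu, memory):
    """Return a patch body that changes one container's CPU and memory."""
    resources = {
        "requests": {"cpu": cpu, "memory": memory},
        "limits": {"cpu": cpu, "memory": memory},
    }
    return {"spec": {"containers": [{"name": container, "resources": resources}]}}


# Hypothetical container name and sizes, for illustration only.
patch = build_resize_patch("app", "800m", "512Mi")
print(json.dumps(patch, indent=2))
```

Saved to a file, a body like this could be applied with something along the lines of `kubectl patch pod <pod-name> --subresource resize --patch-file patch.json`. Whether the container restarts still depends on its resize policy and on what the node can accommodate.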

Pod Generation Tracking

The .metadata.generation and .status.observedGeneration fields on Pods are now stable. Together they give a reliable signal that the kubelet has processed a spec change, which is essential for monitoring in-place resizes in production.
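A monitoring check built on these two fields can be very small. The following is a minimal sketch that works on a Pod object already decoded into a Python dict. The helper name `spec_observed` is an assumption for illustration.

```python
def spec_observed(pod):
    """Return True when the kubelet has observed the latest Pod spec generation."""
    generation = pod.get("metadata", {}).get("generation")
    observed = pod.get("status", {}).get("observedGeneration")
    if generation is None or observed is None:
        return False  # fields not populated, so nothing can be concluded
    return observed >= generation


pod = {"metadata": {"generation": 3}, "status": {"observedGeneration": 2}}
print(spec_observed(pod))  # False: generation 3 has not been observed yet
```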

Topology Manager NUMA Enhancements

The max-allowable-numa-nodes option stabilizes support for servers beyond 8 NUMA nodes—unlocking large multi-GPU systems commonly used for AI and HPC workloads.

Operational Maturity Improvements

Native Pod Certificates

Built-in certificates for mTLS, with automatic rotation, reduce the need for an external certificate manager and sidecars. This simplifies workload identity in zero-trust environments.

StatefulSet "maxUnavailable"

Roll out StatefulSet updates in parallel using a number or percentage while still maintaining availability SLAs.
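To see what a percentage means in practice, the sketch below converts a maxUnavailable value into a Pod count. It assumes percentages are rounded up and that the result is never below 1, which matches my reading of the API documentation but should be checked for your version. The function name is hypothetical.

```python
def resolve_max_unavailable(value, replicas):
    """Convert a maxUnavailable setting (int or "N%") into a Pod count."""
    if isinstance(value, str) and value.endswith("%"):
        percent = int(value[:-1])
        # Integer ceiling division; assumes percentages round up.
        count = (replicas * percent + 99) // 100
    else:
        count = int(value)
    # Assumption: at least one Pod may always be updated at a time.
    return max(1, count)


print(resolve_max_unavailable("25%", 10))  # 3
print(resolve_max_unavailable(2, 10))      # 2
```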

User Namespaces

Run containers as root inside Pods while mapping them to unprivileged users on the host—dramatically reducing privilege-escalation risks in multi-tenant clusters.
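The mapping idea can be shown with a few lines of arithmetic. A Pod opts in with `hostUsers: false` in its spec, and the kernel then translates container UIDs through a range map similar to the one exposed in /proc/<pid>/uid_map. The range used below is an invented example, not what any particular cluster will assign.

```python
def to_host_uid(container_uid, id_map):
    """Translate a container UID to a host UID.

    id_map is a list of (container_start, host_start, length) ranges.
    """
    for container_start, host_start, length in id_map:
        if container_start <= container_uid < container_start + length:
            return host_start + (container_uid - container_start)
    return None  # UID is not mapped inside this namespace


# Hypothetical mapping: container UIDs 0..65535 map to host UIDs 65536..131071.
id_map = [(0, 65536, 65536)]
print(to_host_uid(0, id_map))  # 65536: root in the container is unprivileged on the host
```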

Opportunistic Batching

The scheduler batches identical Pods using scheduling signatures, reducing latency during large AI job bursts.
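The idea of a scheduling signature can be sketched as hashing the parts of a Pod spec that affect placement and grouping Pods whose hashes match. The field list below is an assumption chosen for illustration. The real scheduler computes its signature differently.

```python
import hashlib
import json
from collections import defaultdict

# Assumed set of placement-relevant fields; not the scheduler's actual list.
SIGNATURE_FIELDS = ("nodeSelector", "tolerations", "affinity", "resources")


def scheduling_signature(pod_spec):
    """Hash the placement-relevant fields of a Pod spec."""
    relevant = {field: pod_spec.get(field) for field in SIGNATURE_FIELDS}
    encoded = json.dumps(relevant, sort_keys=True).encode("utf-8")
    return hashlib.sha256(encoded).hexdigest()


def batch_by_signature(pods):
    """Group (name, spec) pairs so identical Pods can be handled together."""
    batches = defaultdict(list)
    for name, spec in pods:
        batches[scheduling_signature(spec)].append(name)
    return list(batches.values())


worker = {"nodeSelector": {"gpu": "true"}, "resources": {"cpu": "4"}}
pods = [("w-0", worker), ("w-1", dict(worker)), ("api-0", {"resources": {"cpu": "1"}})]
print(batch_by_signature(pods))
```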

Native Storage Version Migration

Storage version migration is now in-tree, beta, and enabled by default, reducing upgrade risk for long-lived clusters by removing external tooling dependencies.

The Future of AI Infrastructure

Gang Scheduling

The new Workload API and PodGroup (Alpha) enable all-or-nothing scheduling for distributed AI and HPC jobs. Either every Pod in the group is placed or none are, which avoids the partial-placement deadlocks that waste GPU hours.
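All-or-nothing placement is easy to picture with a toy model. The sketch below is not the scheduler's algorithm. It simply tries to place every Pod on nodes with free slots and discards the whole attempt if fewer than `min_count` Pods fit. The function name, the `min_count` parameter, and the slot model are assumptions for illustration.

```python
def place_gang(pod_names, free_slots, min_count):
    """Place all Pods or none.

    free_slots maps node name -> number of free slots.
    Returns a dict of pod -> node, or an empty dict if the gang does not fit.
    """
    remaining = dict(free_slots)  # work on a copy so a failed attempt changes nothing
    placement = {}
    for pod in pod_names:
        node = next((n for n, free in remaining.items() if free > 0), None)
        if node is None:
            break  # no capacity left for this Pod
        remaining[node] -= 1
        placement[pod] = node
    if len(placement) < min_count:
        return {}  # partial placement is rejected
    return placement


workers = ["trainer-0", "trainer-1", "trainer-2", "trainer-3"]
print(place_gang(workers, {"gpu-a": 2, "gpu-b": 1}, min_count=4))  # {}
print(place_gang(workers, {"gpu-a": 2, "gpu-b": 2}, min_count=4))
```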

Container-level Restart Policies

Define independent restart rules per container. Sidecar failures no longer trigger full Pod restarts—ideal for ML pipelines.
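The decision logic can be sketched as follows. The field names `restartPolicyRules`, `exitCodes`, `operator`, and `values` follow my understanding of the container-level API and should be verified against the release documentation. The evaluation order shown here (rules first, then the container policy, then the Pod policy) is likewise an assumption.

```python
def should_restart(container, exit_code, pod_restart_policy="Always"):
    """Decide whether a single container should be restarted after exiting."""
    # Rules are checked in order; the first matching rule decides.
    for rule in container.get("restartPolicyRules", []):
        codes = rule.get("exitCodes", {})
        values = codes.get("values", [])
        operator = codes.get("operator")
        matched = (operator == "In" and exit_code in values) or (
            operator == "NotIn" and exit_code not in values
        )
        if matched:
            return rule.get("action") == "Restart"
    # No rule matched: fall back to the container policy, then the Pod policy.
    policy = container.get("restartPolicy", pod_restart_policy)
    if policy == "Always":
        return True
    if policy == "OnFailure":
        return exit_code != 0
    return False


# Hypothetical training container: restart only on the retriable exit code 42.
trainer = {
    "name": "trainer",
    "restartPolicy": "Never",
    "restartPolicyRules": [
        {"action": "Restart", "exitCodes": {"operator": "In", "values": [42]}}
    ],
}
print(should_restart(trainer, 42), should_restart(trainer, 1))
```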

Node Declared Features

Nodes can now advertise capabilities via .status.declaredFeatures, allowing the scheduler to avoid incompatible placements automatically.
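Conceptually, the scheduler-side check is a subset test. The sketch below assumes `.status.declaredFeatures` is a list of feature names. The feature names and the helper `compatible_nodes` are invented for illustration.

```python
def compatible_nodes(required_features, nodes):
    """Return names of nodes that declare every required feature."""
    required = set(required_features)
    names = []
    for node in nodes:
        declared = set(node.get("status", {}).get("declaredFeatures", []))
        if required <= declared:
            names.append(node["metadata"]["name"])
    return names


# Hypothetical nodes and feature names.
nodes = [
    {"metadata": {"name": "node-a"}, "status": {"declaredFeatures": ["FeatureX", "FeatureY"]}},
    {"metadata": {"name": "node-b"}, "status": {"declaredFeatures": ["FeatureX"]}},
    {"metadata": {"name": "node-c"}, "status": {}},
]
print(compatible_nodes(["FeatureX", "FeatureY"], nodes))  # ['node-a']
```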

Extended Toleration Operators

Numeric taint comparisons enable SLA-aware scheduling based on node reliability scores.
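A simplified matcher shows how numeric operators could work alongside the existing ones. It assumes the new operators are named `Gt` and `Lt` and that `Gt` means "the taint's value is greater than the toleration's value". Both points should be confirmed in the release notes. Effects and empty keys are ignored to keep the sketch short.

```python
def tolerates(toleration, taint):
    """Simplified check of one toleration against one taint (effects ignored)."""
    if toleration.get("key") != taint.get("key"):
        return False
    operator = toleration.get("operator", "Equal")
    if operator == "Exists":
        return True
    if operator == "Equal":
        return toleration.get("value") == taint.get("value")
    if operator in ("Gt", "Lt"):
        try:
            taint_value = int(taint["value"])
            threshold = int(toleration["value"])
        except (KeyError, TypeError, ValueError):
            return False  # non-numeric values never match a numeric operator
        # Assumed semantics: compare the taint value against the toleration value.
        return taint_value > threshold if operator == "Gt" else taint_value < threshold
    return False


# Hypothetical reliability score published as a taint value.
taint = {"key": "example.com/sla", "value": "950", "effect": "NoSchedule"}
print(tolerates({"key": "example.com/sla", "operator": "Gt", "value": "900"}, taint))  # True
```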

Critical Platform Breaking Changes

Deprecations and Removals

Change | Impact | Action Required
cgroup v1 Removed | Older Linux nodes fail kubelet startup | Upgrade all nodes to cgroup v2
containerd v1.x Final | Last release supporting legacy CRI | Migrate to containerd 2.0+
IPVS kube-proxy Deprecated | Warning logs, nftables recommended | Plan nftables migration
Ingress NGINX Retired | Best-effort support until March 2026 | Migrate to Gateway API

These deprecations are deliberate. Kubernetes v1.35 enforces modernization rather than carrying forward legacy risk.
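Before upgrading, it helps to audit which nodes still run cgroup v1. One common heuristic is that a cgroup v2 (unified) hierarchy exposes a `cgroup.controllers` file at its root. The sketch below applies that check to a path. The default mount point `/sys/fs/cgroup` is the usual location but is still an assumption about your nodes.

```python
from pathlib import Path


def uses_cgroup_v2(cgroup_root="/sys/fs/cgroup"):
    """Heuristic: the cgroup v2 unified hierarchy has cgroup.controllers at its root."""
    return (Path(cgroup_root) / "cgroup.controllers").is_file()


if __name__ == "__main__":
    status = "cgroup v2" if uses_cgroup_v2() else "cgroup v1 or unknown, upgrade before v1.35"
    print(status)
```

Run on each node (for example from a DaemonSet or a configuration-management job), this gives a quick list of hosts that need attention.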

Security and Identity Enhancements

Kubernetes v1.35 introduces production-grade security primitives that eliminate common multi-tenant vulnerabilities:

  • User Namespaces (Beta): Reduce privilege-escalation risk without rewriting applications.
  • Kubelet Cached Image Verification (Beta): Enforces image pull permissions even for cached images—critical for shared GPU clusters.
  • CSI Token Security: ServiceAccount tokens are now delivered via the secrets field, preventing accidental credential leaks.
  • Constrained Impersonation (Alpha): Granular authorization prevents service accounts from escalating privileges.

Workload & Observability Upgrades

  • Suspended Job Resource Tuning (Alpha): Adjust resources post-OOM without recreating Jobs.
  • Deployment terminatingReplicas: Real-time visibility into Pods being cleaned up during rollouts.
  • OCI Artifact Volumes (Beta): Pull ML models and configs directly into volumes—no init containers needed.
  • Service TrafficDistribution: PreferSameNode and PreferSameZone replace ambiguous PreferClose semantics for low-latency inference.
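The difference between the two traffic distribution values can be pictured as a preference order over endpoints. The sketch below is a simplified model, not kube-proxy's implementation. It assumes that PreferSameNode falls back to the same zone and then to all endpoints, and the endpoint dict shape is invented for illustration.

```python
def pick_endpoints(endpoints, client_node, client_zone, policy):
    """Return the endpoints a client would prefer under a traffic distribution policy."""
    if policy == "PreferSameNode":
        same_node = [e for e in endpoints if e["node"] == client_node]
        if same_node:
            return same_node
    if policy in ("PreferSameNode", "PreferSameZone"):
        same_zone = [e for e in endpoints if e["zone"] == client_zone]
        if same_zone:
            return same_zone
    return list(endpoints)  # no preference satisfied: use every endpoint


# Hypothetical inference endpoints.
endpoints = [
    {"ip": "10.0.0.1", "node": "node-a", "zone": "zone-1"},
    {"ip": "10.0.0.2", "node": "node-b", "zone": "zone-1"},
    {"ip": "10.0.0.3", "node": "node-c", "zone": "zone-2"},
]
print([e["ip"] for e in pick_endpoints(endpoints, "node-a", "zone-1", "PreferSameNode")])
```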

CLI, Configuration & API Improvements

KYAML (Beta, enabled by default) introduces a safer YAML subset that prevents common configuration errors while remaining compatible with existing kubectl workflows. This significantly reduces manifest drift in GitOps pipelines.

kuberc Credential Plugin Policies (Beta) allow fine-grained control over authentication plugin execution, eliminating surprise credential usage in CI/CD.

Comparable Resource Versions (Stable) define resourceVersion values so that two versions of the same resource can be compared numerically, which makes controller and informer watch patterns more reliable.
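For client code, this means a comparison helper becomes possible. The sketch below assumes resourceVersion values are strings of decimal digits that may be arbitrarily long, and that only versions of the same resource are compared. Treat both as assumptions to confirm for your API server.

```python
def compare_resource_versions(a, b):
    """Return -1, 0, or 1 depending on whether a is older than, equal to, or newer than b."""
    for value in (a, b):
        if not (value.isascii() and value.isdigit()):
            raise ValueError(f"not a comparable resourceVersion: {value!r}")
    # Python integers have arbitrary precision, so long values are safe.
    x, y = int(a), int(b)
    return (x > y) - (x < y)


print(compare_resource_versions("1200", "999"))  # 1: "1200" is newer, unlike a string comparison
```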

Production impact: /flagz and /statusz endpoints now support structured, versioned JSON output (alpha), replacing fragile grep-based health checks with real observability integrations.

Conclusion

Kubernetes v1.35 (Timbernetes) marks a significant step forward for AI/ML workloads. In-place Pod resizing removes the restarts that CPU and memory changes used to require. Gang Scheduling lays the groundwork for reliable AI training at scale. Strategic deprecations force long-overdue modernization.

With this release, Kubernetes continues to be the right choice for building the platforms of the future.

If you are interested in reading the full changelog, you can find it here.

Planning a Kubernetes upgrade?

Talk to a CloudRaft expert to get your Kubernetes upgrade done right.
