Kubernetes v1.34: Pod Replacement Policy for Jobs Goes GA
https://kubernetes.io/blog/2025/09/05/kubernetes-v1-34-pod-replacement-policy-for-jobs-goes-ga/
In Kubernetes v1.34, the Pod replacement policy feature has reached general availability (GA). This blog post describes the Pod replacement policy feature and how to use it in your Jobs.
About Pod Replacement Policy
By default, the Job controller recreates Pods as soon as they either fail or begin terminating (that is, when they have a deletion timestamp).
As a result, while some Pods are terminating, the total number of running Pods for a Job can temporarily exceed the specified parallelism. For Indexed Jobs, this can even mean multiple Pods running for the same index at the same time.
This behavior works fine for many workloads, but it can cause problems in certain cases.
For example, popular machine learning frameworks like TensorFlow and JAX expect exactly one Pod per worker index. If two Pods run at the same time, you might encounter errors such as:
/job:worker/task:4: Duplicate task registration with task_name=/job:worker/replica:0/task:4
Additionally, starting replacement Pods before the old ones fully terminate can lead to:
Scheduling delays, because kube-scheduler cannot place the replacement Pods while the old Pods still occupy their nodes.
Unnecessary cluster scale-ups to accommodate the replacement Pods.
Temporary bypassing of quota checks by workload orchestrators like Kueue.
With Pod replacement policy, Kubernetes gives you control over when the control plane replaces terminating Pods, helping you avoid these issues.
How Pod Replacement Policy works
With this enhancement, the Job spec gains an optional field, .spec.podReplacementPolicy.
You can choose one of two policies:
TerminatingOrFailed (default): Replaces Pods as soon as they start terminating.
Failed: Replaces Pods only after they fully terminate and transition to the Failed phase.
Setting the policy to Failed ensures that a new Pod is only created after the previous one has completely terminated.
For Jobs with a Pod Failure Policy, the default podReplacementPolicy is Failed, and no other value is allowed. See Pod Failure Policy to learn more about Pod Failure Policies for Jobs.
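For illustration, here is a minimal sketch of a Job that combines a Pod Failure Policy with the (required) Failed replacement policy; the Job name, container name, and exit code are illustrative assumptions, not taken from the original example:

apiVersion: batch/v1
kind: Job
metadata:
  name: example-job-with-pfp  # illustrative name
spec:
  completions: 2
  parallelism: 2
  podReplacementPolicy: Failed  # the default (and only allowed value) when podFailurePolicy is set
  podFailurePolicy:
    rules:
    - action: FailJob  # fail the whole Job if the container exits with code 42 (illustrative)
      onExitCodes:
        containerName: worker
        operator: In
        values: [42]
  template:
    spec:
      restartPolicy: Never  # a Pod Failure Policy requires restartPolicy: Never
      containers:
      - name: worker
        image: your-image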
You can check how many Pods are currently terminating by inspecting the Job’s .status.terminating field:
kubectl get job myjob -o=jsonpath='{.status.terminating}'
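For example, to print the active and terminating counts side by side for the same Job (a small convenience, assuming the Job is still named myjob):

kubectl get job myjob -o=jsonpath='{.status.active} active, {.status.terminating} terminating{"\n"}'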
Example
Here’s a Job example that executes a task two times (spec.completions: 2) in parallel (spec.parallelism: 2) and replaces Pods only after they fully terminate (spec.podReplacementPolicy: Failed):
apiVersion: batch/v1
kind: Job
metadata:
  name: example-job
spec:
  completions: 2
  parallelism: 2
  podReplacementPolicy: Failed
  template:
    spec:
      restartPolicy: Never
      containers:
      - name: worker
        image: your-image
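Assuming you save this manifest as job.yaml, you can create the Job with:

kubectl apply -f job.yaml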
When a Pod receives a SIGTERM signal (for example, due to deletion, eviction, or preemption), it begins terminating. If the container handles termination gracefully, cleanup may take some time.
When the Job starts, we will see two Pods running:
kubectl get pods
NAME                READY   STATUS    RESTARTS   AGE
example-job-qr8kf   1/1     Running   0          2s
example-job-stvb4   1/1     Running   0          2s
Let's delete one of the Pods (example-job-qr8kf):
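kubectl delete pod example-job-qr8kf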
With the TerminatingOrFailed policy, as soon as one Pod (example-job-qr8kf) starts terminating, the Job controller immediately creates a new Pod (example-job-b59zk) to replace it.
kubectl get pods
NAME                READY   STATUS        RESTARTS   AGE
example-job-b59zk   1/1     Running       0          1s
example-job-qr8kf   1/1     Terminating   0          17s
example-job-stvb4   1/1     Running       0          17s
With the Failed policy, the new Pod (example-job-b59zk) is not created while the old Pod (example-job-qr8kf) is terminating.
kubectl get pods
NAME                READY   STATUS        RESTARTS   AGE
example-job-qr8kf   1/1     Terminating   0          17s
example-job-stvb4   1/1     Running       0          17s
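While the old Pod is terminating, the Job's .status.terminating field reflects this; checking it at this point should report one terminating Pod (the exact value depends on timing):

kubectl get job example-job -o=jsonpath='{.status.terminating}'
1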
When the terminating Pod has fully transitioned to the Failed phase, a new Pod is created:
kubectl get pods
NAME                READY   STATUS    RESTARTS   AGE
example-job-b59zk   1/1     Running   0          1s
example-job-stvb4   1/1     Running   0          25s
How can you learn more?
Read the user-facing documentation for Pod Replacement Policy, Backoff Limit per Index, and Pod Failure Policy.
Read the KEPs for Pod Replacement Policy, Backoff Limit per Index, and Pod Failure Policy.
Acknowledgments
As with any Kubernetes feature, multiple people contributed to getting this done, from testing and filing bugs to reviewing code.
As this feature moves to stable after 2 years, we would like to thank the following people:
Kevin Hannon - for writing the KEP and the initial implementation.
Michał Woźniak - for guidance, mentorship, and reviews.
Aldo Culquicondor - for guidance, mentorship, and reviews.
Maciej Szulik - for guidance, mentorship, and reviews.
Dejan Zele Pejchev - for taking over the feature and promoting it from Alpha through Beta to GA.
Get involved
This work was sponsored by the Kubernetes batch working group in close collaboration with the SIG Apps community.
If you are interested in working on new features in this space, we recommend subscribing to our Slack channel and attending the regular community meetings.