
Suggested Reads
Ep30 - Ask Me Anything About Anything with Scott Rosenberg
There are no restrictions in this AMA session. You can ask anything about DevOps, AI, Cloud, Kubernetes, Platform Engineering, containers, or anything else. Scott Rosenberg, regular guest, will be here to help us out.
▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬ Sponsor: Codefresh 🔗 GitOps Argo CD Certifications: https://learning.codefresh.io (use "viktor" for a 50% discount) ▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬
▬▬▬▬▬▬ 👋 Contact me 👋 ▬▬▬▬▬▬ ➡ BlueSky: https://vfarcic.bsky.social ➡ LinkedIn: https://www.linkedin.com/in/viktorfarcic/
▬▬▬▬▬▬ 🚀 Other Channels 🚀 ▬▬▬▬▬▬ 🎤 Podcast: https://www.devopsparadox.com/ 💬 Live streams: https://www.youtube.com/c/DevOpsParadox
via YouTube https://www.youtube.com/watch?v=Z_l6oj4OGzE
Can AI Replace Your Terraform Modules? Infrastructure's New Future?
Watch AI agents redefine infrastructure management by learning from their mistakes and building complex setups from scratch—no golden paths required. In this video, you'll witness an AI first deploy an application using a pre-built abstraction with ease, then tackle the daunting task of assembling a database entirely from raw cloud APIs. Through trial, error, and adaptive learning, the AI evolves from making basic mistakes to successfully provisioning intricate resources—all without pre-built templates or human-crafted abstractions.
Explore the implications for platform engineering as we ask crucial questions: Do we still need meticulously crafted golden paths, or can AI build better solutions on the fly? How can Kubernetes and API discovery empower AI to independently manage infrastructure? And most importantly, are we ready for a future where AI agents autonomously handle complex infrastructure challenges, learning and adapting in real-time?
▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬ Sponsor: TestSprite 🔗 https://testsprite.com ▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬
#PlatformEngineering #AIInfrastructure #kubernetes
Consider joining the channel: https://www.youtube.com/c/devopstoolkit/join
▬▬▬▬▬▬ 🔗 Additional Info 🔗 ▬▬▬▬▬▬ ➡ Transcript and commands: https://devopstoolkit.live/ai/can-ai-replace-your-terraform-modules-infrastructures-new-future 🔗 Claude Code: https://anthropic.com/claude-code 🎬 Forget CLIs and GUIs: AI is the New Interface for Developer Platforms: https://youtu.be/ApjnCa-a2xI
▬▬▬▬▬▬ 💰 Sponsorships 💰 ▬▬▬▬▬▬ If you are interested in sponsoring this channel, please visit https://devopstoolkit.live/sponsor for more information. Alternatively, feel free to contact me over Twitter or LinkedIn (see below).
▬▬▬▬▬▬ 👋 Contact me 👋 ▬▬▬▬▬▬ ➡ BlueSky: https://vfarcic.bsky.social ➡ LinkedIn: https://www.linkedin.com/in/viktorfarcic/
▬▬▬▬▬▬ 🚀 Other Channels 🚀 ▬▬▬▬▬▬ 🎤 Podcast: https://www.devopsparadox.com/ 💬 Live streams: https://www.youtube.com/c/DevOpsParadox
▬▬▬▬▬▬ ⏱ Timecodes ⏱ ▬▬▬▬▬▬ 00:00 Introduction 01:58 TestSprite (sponsor) 03:20 The Testing Ground 05:55 AI Deploys Apps 10:15 Database Without Safety Nets 14:48 The AI Prompts 16:07 Platform Engineering's Future
via YouTube https://www.youtube.com/watch?v=F2Qis5cmwT8
Kubernetes v1.34 Sneak Peek
https://kubernetes.io/blog/2025/07/28/kubernetes-v1-34-sneak-peek/
Kubernetes v1.34 is coming at the end of August 2025. This release will not include any removal or deprecation, but it is packed with an impressive number of enhancements. Here are some of the features we are most excited about in this cycle!
Please note that this information reflects the current state of v1.34 development and may change before release.
Featured enhancements of Kubernetes v1.34
The following list highlights some of the notable enhancements likely to be included in the v1.34 release, but is not an exhaustive list of all planned changes. This is not a commitment and the release content is subject to change.
The core of DRA targets stable
Dynamic Resource Allocation (DRA) provides a flexible way to categorize, request, and use devices like GPUs or custom hardware in your Kubernetes cluster.
Since the v1.30 release, DRA has been based around claiming devices using structured parameters that are opaque to the core of Kubernetes. The relevant enhancement proposal, KEP-4381, took inspiration from dynamic provisioning for storage volumes. DRA with structured parameters relies on a set of supporting API kinds: ResourceClaim, DeviceClass, ResourceClaimTemplate, and ResourceSlice API types under resource.k8s.io, while extending the .spec for Pods with a new resourceClaims field. The core of DRA is targeting graduation to stable in Kubernetes v1.34.
With DRA, device drivers and cluster admins define device classes that are available for use. Workloads can claim devices from a device class within device requests. Kubernetes allocates matching devices to specific claims and places the corresponding Pods on nodes that can access the allocated devices. This framework provides flexible device filtering using CEL, centralized device categorization, and simplified Pod requests, among other benefits.
Once this feature has graduated, the resource.k8s.io/v1 APIs will be available by default.
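To make the API shapes concrete, here is a minimal sketch of a device class, a claim template, and a Pod that consumes the claim. It uses the pre-GA resource.k8s.io/v1beta1 layout; the driver name, class name, and image are placeholders, and the exact field layout of the graduated v1 API may differ slightly.

apiVersion: resource.k8s.io/v1beta1
kind: DeviceClass
metadata:
  name: example-gpu
spec:
  selectors:
  - cel:
      expression: device.driver == "gpu.example.com"   # CEL filter over advertised devices
---
apiVersion: resource.k8s.io/v1beta1
kind: ResourceClaimTemplate
metadata:
  name: gpu-claim-template
spec:
  spec:
    devices:
      requests:
      - name: gpu
        deviceClassName: example-gpu
---
apiVersion: v1
kind: Pod
metadata:
  name: gpu-pod
spec:
  resourceClaims:
  - name: gpu
    resourceClaimTemplateName: gpu-claim-template
  containers:
  - name: app
    image: registry.example.com/app:1.0.0               # placeholder image
    resources:
      claims:
      - name: gpu                                       # references the claim defined above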
ServiceAccount tokens for image pull authentication
The ServiceAccount token integration for kubelet credential providers is likely to reach beta and be enabled by default in Kubernetes v1.34. This allows the kubelet to use these tokens when pulling container images from registries that require authentication.
That support already exists as alpha, and is tracked as part of KEP-4412.
The existing alpha integration allows the kubelet to use short-lived, automatically rotated ServiceAccount tokens (that follow OIDC-compliant semantics) to authenticate to a container image registry. Each token is scoped to one associated Pod; the overall mechanism replaces the need for long-lived image pull Secrets.
Adopting this new approach reduces security risks, supports workload-level identity, and helps cut operational overhead. It brings image pull authentication closer to modern, identity-aware good practice.
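As a rough sketch of the kubelet-side wiring, a credential provider configuration might look like the following. The tokenAttributes field names are taken from KEP-4412 and may still change before beta; the plugin name, image pattern, and audience are placeholders.

apiVersion: kubelet.config.k8s.io/v1
kind: CredentialProviderConfig
providers:
- name: example-registry-credential-provider         # placeholder plugin binary name
  apiVersion: credentialprovider.kubelet.k8s.io/v1
  matchImages:
  - "registry.example.com/*"
  defaultCacheDuration: "0s"                          # rely on the short-lived tokens rather than caching
  tokenAttributes:                                    # KEP-4412 fields; subject to change
    serviceAccountTokenAudience: "registry.example.com"
    requireServiceAccount: true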
Pod replacement policy for Deployments
After a change to a Deployment, terminating pods may stay up for a considerable amount of time and may consume additional resources. As part of KEP-3973, the .spec.podReplacementPolicy field will be introduced (as alpha) for Deployments.
If your cluster has the feature enabled, you'll be able to select one of two policies:
TerminationStarted
Creates new pods as soon as old ones start terminating, resulting in faster rollouts at the cost of potentially higher resource consumption.
TerminationComplete
Waits until old pods fully terminate before creating new ones, resulting in slower rollouts but ensuring controlled resource consumption.
This feature makes Deployment behavior more predictable by letting you choose when new pods should be created during updates or scaling. It's beneficial when working in clusters with tight resource constraints or with workloads with long termination periods.
It's expected to be available as an alpha feature and can be enabled using the DeploymentPodReplacementPolicy and DeploymentReplicaSetTerminatingReplicas feature gates in the API server and kube-controller-manager.
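Assuming those feature gates are enabled, a Deployment opting into the more conservative policy could look roughly like this sketch (the name and image are placeholders, and the alpha field may still change before release):

apiVersion: apps/v1
kind: Deployment
metadata:
  name: example-app
spec:
  replicas: 3
  podReplacementPolicy: TerminationComplete   # alpha: wait for old pods to finish terminating
  selector:
    matchLabels:
      app: example-app
  template:
    metadata:
      labels:
        app: example-app
    spec:
      containers:
      - name: app
        image: registry.example.com/app:1.0.0   # placeholder image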
Production-ready tracing for kubelet and API Server
To address the longstanding challenge of debugging node-level issues by correlating disconnected logs, KEP-2831 provides deep, contextual insights into the kubelet.
This feature instruments critical kubelet operations, particularly its gRPC calls to the Container Runtime Interface (CRI), using the vendor-agnostic OpenTelemetry standard. It allows operators to visualize the entire lifecycle of events (for example: a Pod startup) to pinpoint sources of latency and errors. Its most powerful aspect is the propagation of trace context; the kubelet passes a trace ID with its requests to the container runtime, enabling runtimes to link their own spans.
This effort is complemented by a parallel enhancement, KEP-647, which brings the same tracing capabilities to the Kubernetes API server. Together, these enhancements provide a more unified, end-to-end view of events, simplifying the process of pinpointing latency and errors from the control plane down to the node. These features have matured through the official Kubernetes release process. KEP-2831 was introduced as an alpha feature in v1.25, while KEP-647 debuted as alpha in v1.22. Both enhancements were promoted to beta together in the v1.27 release. Looking forward, Kubelet Tracing (KEP-2831) and API Server Tracing (KEP-647) are now targeting graduation to stable in the upcoming v1.34 release.
PreferSameZone and PreferSameNode traffic distribution for Services
The spec.trafficDistribution field within a Kubernetes Service allows users to express preferences for how traffic should be routed to Service endpoints.
KEP-3015 deprecates PreferClose and introduces two additional values: PreferSameZone and PreferSameNode. PreferSameZone is equivalent to the current PreferClose. PreferSameNode prioritizes sending traffic to endpoints on the same node as the client.
This feature was introduced in v1.33 behind the PreferSameTrafficDistribution feature gate. It is targeting graduation to beta in v1.34 with its feature gate enabled by default.
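For illustration, a Service preferring same-node endpoints would look something like the sketch below (the selector and ports are placeholders; on clusters where the feature is still alpha, the PreferSameTrafficDistribution feature gate must be enabled):

apiVersion: v1
kind: Service
metadata:
  name: example-app
spec:
  selector:
    app: example-app
  ports:
  - port: 80
    targetPort: 8080
  trafficDistribution: PreferSameNode   # prefer endpoints on the client's node, fall back otherwise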
Support for KYAML: a Kubernetes dialect of YAML
KYAML aims to be a safer and less ambiguous YAML subset, and was designed specifically for Kubernetes. Whatever version of Kubernetes you use, you'll be able to use KYAML for writing manifests and/or Helm charts. You can write KYAML and pass it as an input to any version of kubectl, because all KYAML files are also valid as YAML. With kubectl v1.34, we expect you'll also be able to request KYAML output from kubectl (as in kubectl get -o kyaml …). If you prefer, you can still request the output in JSON or YAML format.
KYAML addresses specific challenges with both YAML and JSON. YAML's significant whitespace requires careful attention to indentation and nesting, while its optional string-quoting can lead to unexpected type coercion (for example: "The Norway Bug"). Meanwhile, JSON lacks comment support and has strict requirements for trailing commas and quoted keys.
KEP-5295 introduces KYAML, which tries to address the most significant problems by:
Always double-quoting value strings
Leaving keys unquoted unless they are potentially ambiguous
Always using {} for mappings (associative arrays)
Always using [] for lists
This might sound a lot like JSON, because it is! But unlike JSON, KYAML supports comments, allows trailing commas, and doesn't require quoted keys.
We're hoping to see KYAML introduced as a new output format for kubectl v1.34. As with all these features, none of these changes are 100% confirmed; watch this space!
As a format, KYAML is and will remain a strict subset of YAML, ensuring that any compliant YAML parser can parse KYAML documents. Kubernetes does not require you to provide input specifically formatted as KYAML, and we have no plans to change that.
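Based on the rules above, a ConfigMap rendered as KYAML might look roughly like the following. This is an illustration of the stated rules rather than the exact output kubectl will produce.

{
  apiVersion: "v1",
  kind: "ConfigMap",
  metadata: {
    name: "example-config",
    labels: {
      app: "example",        # comments are allowed, unlike JSON
    },
  },
  data: {
    log-level: "debug",
    "true": "quoted because the bare key would be ambiguous",
  },
}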
Fine-grained autoscaling control with HPA configurable tolerance
KEP-4951 introduces a new feature that allows users to configure autoscaling tolerance on a per-HPA basis, overriding the default cluster-wide 10% tolerance setting that often proves too coarse-grained for diverse workloads. The enhancement adds an optional tolerance field to the HPA's spec.behavior.scaleUp and spec.behavior.scaleDown sections, enabling different tolerance values for scale-up and scale-down operations, which is particularly valuable since scale-up responsiveness is typically more critical than scale-down speed for handling traffic surges.
Released as alpha in Kubernetes v1.33 behind the HPAConfigurableTolerance feature gate, this feature is expected to graduate to beta in v1.34. This improvement helps to address scaling challenges with large deployments, where for scaling in, a 10% tolerance might mean leaving hundreds of unnecessary Pods running. Using the new, more flexible approach would enable workload-specific optimization for both responsive and conservative scaling behaviors.
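As a sketch of the per-HPA override (the tolerance values are illustrative, and the field may still change while it is in beta):

apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: example-app
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: example-app
  minReplicas: 2
  maxReplicas: 50
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 75
  behavior:
    scaleUp:
      tolerance: 0.05    # react to small metric deviations when scaling up
    scaleDown:
      tolerance: 0.15    # be more conservative when scaling down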
Want to know more?
New features and deprecations are also announced in the Kubernetes release notes. We will formally announce what's new in Kubernetes v1.34 as part of the CHANGELOG for that release.
The Kubernetes v1.34 release is planned for Wednesday 27th August 2025. Stay tuned for updates!
Get involved
The simplest way to get involved with Kubernetes is to join one of the many Special Interest Groups (SIGs) that align with your interests. Have something you'd like to broadcast to the Kubernetes community? Share your voice at our weekly community meeting, and through the channels below. Thank you for your continued feedback and support.
Follow us on Bluesky @kubernetes.io for the latest updates
Join the community discussion on Discuss
Join the community on Slack
Post questions (or answer questions) on Server Fault or Stack Overflow
Share your Kubernetes story
Read more about what's happening with Kubernetes on the blog
Learn more about the Kubernetes Release
Week Ending July 20, 2025
https://lwkd.info/2025/20250723
Developer News
Code Freeze and Test Freeze for the Kubernetes v1.34 release begins at 02:00 UTC on Friday, July 25, 2025 (7:00 PM PDT on Thursday, July 24, 2025). Developers should ensure that all pull requests for KEPs and major changes targeting v1.34 are merged by the deadline.
Release Schedule
Next Deadline: Code and Test Freeze, July 24/25
Code and Test Freeze starts this week at 0200 UTC on Friday, July 25. Your PRs should all be merged by then. If you think you may miss the deadline, file an exception request.
Featured PRs
51630: Add Hugo Segments for Faster Local Website Builds
This PR introduces support for Hugo segments, allowing users to render specific parts of the Kubernetes website locally. For example, the build can be limited to English (en) or Persian (fa) content instead of rendering the entire site, which significantly reduces build time and resource usage when previewing documentation changes.
The default method make container-serve continues to build the whole site.
To build a specific segment, users can use the following commands:
make container-serve segments=en     # To build individual segments
make container-serve segments=en,fa  # To build multiple segments
131700: Add Support for CEL Extended Lists Library
This PR adds support for the CEL extended lists library in Kubernetes by integrating the upstream implementation from cel-go. It adds new list functions that allow more advanced list operations in CEL expressions. These functions can improve how conditions are written in features that use CEL-based evaluation, such as admission control and CRD validations.
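As a hedged example of where this could surface, a CRD validation rule might use the extension's distinct() function to reject duplicate list entries. Whether a given extended function is available to CRD validation depends on the compatibility version the API server enables, so treat the group, kind, and rule below as illustrative.

apiVersion: apiextensions.k8s.io/v1
kind: CustomResourceDefinition
metadata:
  name: widgets.example.com
spec:
  group: example.com
  names:
    kind: Widget
    plural: widgets
    singular: widget
  scope: Namespaced
  versions:
  - name: v1
    served: true
    storage: true
    schema:
      openAPIV3Schema:
        type: object
        properties:
          spec:
            type: object
            properties:
              ports:
                type: array
                items:
                  type: integer
            x-kubernetes-validations:
            - rule: "!has(self.ports) || self.ports == self.ports.distinct()"
              message: "spec.ports must not contain duplicate entries"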
KEP of the Week
KEP-5080: Ordered Namespace Deletion
This KEP introduces a secure and deterministic mechanism for deleting Kubernetes namespaces. The motivation comes from security and operational concerns with the current semi-random deletion order — for example, pods might continue running after their protecting NetworkPolicy is removed. This KEP ensures that all pods are deleted first and only then are the remaining resources removed, reducing the risk of exposed workloads. It is implemented through a feature gate OrderedNamespaceDeletion that enforces this opinionated deletion order during namespace cleanup.
This KEP is tracked as stable in v1.34.
Other Merges
DRA: fixes watch handling on apiserver restart when conversion is needed
CSR declarative validation enabled for /status and /approval
e2e test added for DRA Admin Access
LIST request estimation accounts for maximum object size and caching
APF max seats set to 100 for LIST requests
deviceplugin and podresources kubelet APIs migrated from gogo to protoc
InPlacePodVerticalScaling kubelet_container_resize_requests_total metric to include all resize-related updates
Jitter added to periodic storage processes to reduce synchronized execution
InPlacePodVerticalScaling to retry pending resizes only if aggregated requests decrease
kubeadm: generate default etcd command based on etcd version
Optional listMapKeys supported in server-side apply for associative lists
In kubectl describe pod, port names are now included alongside port numbers when specified in the pod spec
kubelet_credential_provider_config_info metric reports credential provider config hash
CSR.status.conditions in v1 and v1beta1 enforce approved/denied exclusivity with declarative validation tags
Support reducing memory limits via NotRequired restart policy, with safeguards against OOM kills
e2e test for batch pod deletion in kubelet
Union validation rule tags added and +k8s:item chaining enabled in validation-gen
PodCPUAndMemoryStats added to the stats.Provider interface for fetching the CPU & memory stats for a single pod
apiserver_storage_objects metric is deprecated and replaced by apiserver_resource_objects with consistent labels
claimsToAllocate is passed through Allocate instead of NewAllocator
Memory tracking functionality added to the scheduler performance tests
kubelet: Instrumentation for in-place pod resize
Test coverage increased for pkg/kubelet/types
Fix for CPUManager non-regression test to handle CPU quota edge cases
InPlacePodVerticalScaling adds an event for pod resize completion
Fix for incorrect label key used in PodTopologyLabelAdmission, blocking beta graduation
kubelet supports contextual logging, and components including apis, kubeletconfig, nodeshutdown, pod, preemption, and memory manager have been migrated to use it
kuberuntime migrated to contextual logging
Image pull credential verification enabled for service account–based credential providers
Mirror pods test for generation and observedGeneration
More complex e2e test created for deferred resizes
DRA filter plugin times out after 10s to avoid long scheduling delays, configurable via FilterTimeout
Pause version updated to registry.k8s.io/pause:3.10.1
kube-apiserver support for PodCertificateRequest and PodCertificate projected volumes enabled
Warnings added for headless service using loadBalancerIP, externalIPs, or sessionAffinity
last_config_info metric added for authn, authz and encryption config
Promotions
PodLifecycleSleepAction to GA
NodeSwap to GA
Recovery feature to GA
PodObservedGenerationTracking to beta
WatchList to beta
API Server Tracing to GA
KubeletServiceAccountTokenForCredentialProviders to beta
ListFromCacheSnapshot to beta
Version Updates
Bumped cel-go to v0.26.0
Subprojects and Dependency Updates
cluster-api v1.11.0-beta.0: releases beta version for testing
via Last Week in Kubernetes Development https://lwkd.info/
July 23, 2025 at 05:38AM
Ep30 - Ask Me Anything About Anything with Scott Rosenberg
There are no restrictions in this AMA session. You can ask anything about DevOps, AI, Cloud, Kubernetes, Platform Engineering, containers, or anything else. Scott Rosenberg, regular guest, will be here to help us out.
▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬ Sponsor: Codefresh 🔗 GitOps Argo CD Certifications: https://learning.codefresh.io (use "viktor" for a 50% discount) ▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬
▬▬▬▬▬▬ 👋 Contact me 👋 ▬▬▬▬▬▬ ➡ BlueSky: https://vfarcic.bsky.social ➡ LinkedIn: https://www.linkedin.com/in/viktorfarcic/
▬▬▬▬▬▬ 🚀 Other Channels 🚀 ▬▬▬▬▬▬ 🎤 Podcast: https://www.devopsparadox.com/ 💬 Live streams: https://www.youtube.com/c/DevOpsParadox
via YouTube https://www.youtube.com/watch?v=DHsb-Q1i_A4
The End of Infrastructure-as-Code: AI Changes Everything
Infrastructure-as-Code tools like Terraform and Pulumi have revolutionized how we manage cloud and Kubernetes resources, yet their days may be numbered. AI agents, capable of handling complexity without the cognitive limitations of humans, don't need the same abstractions we've built our industry around. This video examines how API-driven infrastructure has led to tools designed for human convenience and why the rise of AI agents will render these tools obsolete.
Discover how each generation of infrastructure tools, from Chef and Puppet to Terraform and Crossplane, failed to adapt to paradigm shifts, and why AI agents represent the next big transformation. Will today's industry leaders adapt to AI-first operations, or will they repeat the mistakes of history by clinging to outdated paradigms? The AI revolution is coming, and the future belongs to those bold enough to rethink everything.
#InfrastructureAsCode #AI #DevOps
Consider joining the channel: https://www.youtube.com/c/devopstoolkit/join
▬▬▬▬▬▬ 🔗 Additional Info 🔗 ▬▬▬▬▬▬ ➡ Transcript and commands: https://devopstoolkit.live/infrastructure-as-code/the-end-of-infrastructure-as-code-ai-changes-everything
▬▬▬▬▬▬ 💰 Sponsorships 💰 ▬▬▬▬▬▬ If you are interested in sponsoring this channel, please visit https://devopstoolkit.live/sponsor for more information. Alternatively, feel free to contact me over Twitter or LinkedIn (see below).
▬▬▬▬▬▬ 👋 Contact me 👋 ▬▬▬▬▬▬ ➡ BlueSky: https://vfarcic.bsky.social ➡ LinkedIn: https://www.linkedin.com/in/viktorfarcic/
▬▬▬▬▬▬ 🚀 Other Channels 🚀 ▬▬▬▬▬▬ 🎤 Podcast: https://www.devopsparadox.com/ 💬 Live streams: https://www.youtube.com/c/DevOpsParadox
▬▬▬▬▬▬ ⏱ Timecodes ⏱ ▬▬▬▬▬▬ 00:00 Introduction 01:02 The API Foundation 03:44 Human-Centric Tools 05:24 The Kubernetes Revolution 07:25 The Pattern of Tool Death 11:39 AI Today 15:13 The AI Revolution 18:48 Tools Are Dead
via YouTube https://www.youtube.com/watch?v=xwDQh3gAec0
914 Days Later — My Journey Off Klonopin
https://chrisshort.net/914-days-later-my-journey-off-klonopin/
This is a significant accomplishment for me. I’m taking a moment to celebrate it while at the same time trying to warn others about the highly addictive drug, Klonopin (clonazepam).
Mental healthcare in the United States can be great, once you figure out how to get it. It’s not talked about enough, it’s stigmatized, and it requires incredible courage to take the first step. But, it’s 200% worth it.
However, some doctors may not be keeping up with the latest knowledge about which drugs are effective for specific symptoms as new ones emerge and the industry replaces older ones. It happens. I knew this going in. But, I made one critical mistake that I want to highlight: I didn’t ask enough questions before my doctor prescribed it.
via Chris Short https://chrisshort.net/
July 20, 2025
Post-Quantum Cryptography in Kubernetes
https://kubernetes.io/blog/2025/07/18/pqc-in-k8s/
The world of cryptography is on the cusp of a major shift with the advent of quantum computing. While powerful quantum computers are still largely theoretical for many applications, their potential to break current cryptographic standards is a serious concern, especially for long-lived systems. This is where Post-Quantum Cryptography (PQC) comes in. In this article, I'll dive into what PQC means for TLS and, more specifically, for the Kubernetes ecosystem. I'll explain what the (surprising) state of PQC in Kubernetes is and what the implications are for current and future clusters.
What is Post-Quantum Cryptography
Post-Quantum Cryptography refers to cryptographic algorithms that are thought to be secure against attacks by both classical and quantum computers. The primary concern is that quantum computers, using algorithms like Shor's Algorithm, could efficiently break widely used public-key cryptosystems such as RSA and Elliptic Curve Cryptography (ECC), which underpin much of today's secure communication, including TLS. The industry is actively working on standardizing and adopting PQC algorithms. One of the first to be standardized by NIST is the Module-Lattice Key Encapsulation Mechanism (ML-KEM), formerly known as Kyber, and now standardized as FIPS-203 (PDF download).
It is difficult to predict when quantum computers will be able to break classical algorithms. However, it is clear that we need to start migrating to PQC algorithms now, as the next section shows. To get a feeling for the predicted timeline, we can look at a NIST report covering the transition to post-quantum cryptography standards. It declares that systems using classical crypto should be deprecated after 2030 and disallowed after 2035.
Key exchange vs. digital signatures: different needs, different timelines
In TLS, there are two main cryptographic operations we need to secure:
Key Exchange: This is how the client and server agree on a shared secret to encrypt their communication. If an attacker records encrypted traffic today, they could decrypt it in the future, if they gain access to a quantum computer capable of breaking the key exchange. This makes migrating KEMs to PQC an immediate priority.
Digital Signatures: These are primarily used to authenticate the server (and sometimes the client) via certificates. The authenticity of a server is verified at the time of connection. While important, the risk of an attack today is much lower, because the decision of trusting a server cannot be abused after the fact. Additionally, current PQC signature schemes often come with significant computational overhead and larger key/signature sizes compared to their classical counterparts.
Another significant hurdle in the migration to PQ certificates is the upgrade of root certificates. These certificates have long validity periods and are installed in many devices and operating systems as trust anchors.
Given these differences, the focus for immediate PQC adoption in TLS has been on hybrid key exchange mechanisms. These combine a classical algorithm (such as Elliptic Curve Diffie-Hellman Ephemeral (ECDHE)) with a PQC algorithm (such as ML-KEM). The resulting shared secret is secure as long as at least one of the component algorithms remains unbroken. The X25519MLKEM768 hybrid scheme is the most widely supported one.
State of PQC key exchange mechanisms (KEMs) today
Support for PQC KEMs is rapidly improving across the ecosystem.
Go: The Go standard library's crypto/tls package introduced support for X25519MLKEM768 in version 1.24 (released February 2025). Crucially, it's enabled by default when there is no explicit configuration, i.e., Config.CurvePreferences is nil.
Browsers & OpenSSL: Major browsers like Chrome (version 131, November 2024) and Firefox (version 135, February 2025), as well as OpenSSL (version 3.5.0, April 2025), have also added support for the ML-KEM based hybrid scheme.
Apple is also rolling out support for X25519MLKEM768 in version 26 of their operating systems. Given the proliferation of Apple devices, this will have a significant impact on global PQC adoption.
For a more detailed overview of the state of PQC in the wider industry, see this blog post by Cloudflare.
Post-quantum KEMs in Kubernetes: an unexpected arrival
So, what does this mean for Kubernetes? Kubernetes components, including the API server and kubelet, are built with Go.
As of Kubernetes v1.33, released in April 2025, the project uses Go 1.24. A quick check of the Kubernetes codebase reveals that Config.CurvePreferences is not explicitly set. This leads to a fascinating conclusion: Kubernetes v1.33, by virtue of using Go 1.24, supports hybrid post-quantum X25519MLKEM768 for TLS connections by default!
You can test this yourself. If you set up a Minikube cluster running Kubernetes v1.33.0, you can connect to the API server using a recent OpenSSL client:
$ minikube start --kubernetes-version=v1.33.0
$ kubectl cluster-info
Kubernetes control plane is running at https://127.0.0.1:<PORT>
$ kubectl config view --minify --raw -o jsonpath='{.clusters[0].cluster.certificate-authority-data}' | base64 -d > ca.crt
$ openssl version
OpenSSL 3.5.0 8 Apr 2025 (Library: OpenSSL 3.5.0 8 Apr 2025)
$ echo -n "Q" | openssl s_client -connect 127.0.0.1:<PORT> -CAfile ca.crt
[...]
Negotiated TLS1.3 group: X25519MLKEM768
[...]
DONE
Lo and behold, the negotiated group is X25519MLKEM768! This is a significant step towards making Kubernetes quantum-safe, seemingly without a major announcement or dedicated KEP (Kubernetes Enhancement Proposal).
The Go version mismatch pitfall
An interesting wrinkle emerged with Go versions 1.23 and 1.24. Go 1.23 included experimental support for a draft version of ML-KEM, identified as X25519Kyber768Draft00. This was also enabled by default if Config.CurvePreferences was nil. Kubernetes v1.32 used Go 1.23. However, Go 1.24 removed the draft support and replaced it with the standardized version X25519MLKEM768.
What happens if a client and server are using mismatched Go versions (one on 1.23, the other on 1.24)? They won't have a common PQC KEM to negotiate, and the handshake will fall back to classical ECC curves (e.g., X25519). How could this happen in practice?
Consider a scenario:
A Kubernetes cluster is running v1.32 (using Go 1.23 and thus X25519Kyber768Draft00). A developer upgrades their kubectl to v1.33, compiled with Go 1.24, only supporting X25519MLKEM768. Now, when kubectl communicates with the v1.32 API server, they no longer share a common PQC algorithm. The connection will downgrade to classical cryptography, silently losing the PQC protection that has been in place. This highlights the importance of understanding the implications of Go version upgrades, and the details of the TLS stack.
Limitations: packet size
One practical consideration with ML-KEM is the size of its public keys: the encoded key is around 1.2 kilobytes for ML-KEM-768. This can cause the initial TLS ClientHello message not to fit inside a single TCP/IP packet, given typical networking constraints (most commonly, the standard Ethernet frame size limit of 1500 bytes). Some TLS libraries or network appliances might not handle this gracefully, assuming the ClientHello always fits in one packet. This issue has been observed in some Kubernetes-related projects and networking components, potentially leading to connection failures when PQC KEMs are used. More details can be found at tldr.fail.
State of Post-Quantum Signatures
While KEMs are seeing broader adoption, PQC digital signatures are further behind in terms of widespread integration into standard toolchains. NIST has published standards for PQC signatures, such as ML-DSA (FIPS-204) and SLH-DSA (FIPS-205). However, implementing these in a way that's broadly usable (e.g., for PQC Certificate Authorities) presents challenges:
Larger Keys and Signatures: PQC signature schemes often have significantly larger public keys and signature sizes compared to classical algorithms like Ed25519 or RSA. For instance, Dilithium2 keys can be 30 times larger than Ed25519 keys, and certificates can be 12 times larger.
Performance: Signing and verification operations can be substantially slower. While some algorithms are on par with classical algorithms, others may have a much higher overhead, sometimes on the order of 10x to 1000x worse performance. To improve this situation, NIST is running a second round of standardization for PQC signatures.
Toolchain Support: Mainstream TLS libraries and CA software do not yet have mature, built-in support for these new signature algorithms. The Go team, for example, has indicated that ML-DSA support is a high priority, but the soonest it might appear in the standard library is Go 1.26 (as of May 2025).
Cloudflare's CIRCL (Cloudflare Interoperable Reusable Cryptographic Library) library implements some PQC signature schemes like variants of Dilithium, and they maintain a fork of Go (cfgo) that integrates CIRCL. Using cfgo, it's possible to experiment with generating certificates signed with PQC algorithms like Ed25519-Dilithium2. However, this requires using a custom Go toolchain and is not yet part of the mainstream Kubernetes or Go distributions.
Conclusion
The journey to a post-quantum secure Kubernetes is underway, and perhaps further along than many realize, thanks to the proactive adoption of ML-KEM in Go. With Kubernetes v1.33, users are already benefiting from hybrid post-quantum key exchange in many TLS connections by default.
However, awareness of potential pitfalls, such as Go version mismatches leading to downgrades and issues with ClientHello packet sizes, is crucial. While PQC for KEMs is becoming a reality, PQC for digital signatures and certificate hierarchies is still in earlier stages of development and adoption for mainstream use.
Blog: Post-Quantum Cryptography in Kubernetes
https://www.kubernetes.dev/blog/2025/07/18/pqc-in-k8s/
Week Ending July 13, 2025
https://lwkd.info/2025/20250717
Developer News
SIG-Network proposed a new AI Gateway Working Group, dedicated to exploring the intersection of AI and networking. The WG will focus on standardizing how Kubernetes manages AI-specific traffic, with particular attention to routing, filters, and policy requirements for AI workloads.
The KubeCon North America 2025 Maintainer Summit CFP is open and closes soon on July 20th. Make sure to submit your talks before the deadline!
LFX Mentorship 2025 Term 3 is now open for SIGs to submit mentorship project ideas. To propose a project, submit a PR to the project_ideas repository by July 29th, 2025. If you have any questions about the LFX mentorship program, feel free to ask in the #sig-contribex channel on Slack.
Release Schedule
Next Deadline: Code and Test Freeze, July 24/25
Code and Test Freeze starts at 0200 UTC on Friday, July 25. Your PRs should all be merged by then.
Kubernetes v1.34.0-beta.0 has been built and pushed using Golang version 1.24.5.
Patch releases 1.32.7 and 1.31.11 have been released. These releases include bug fixes for Jobs and etcd member promotion in kubeadm.
Featured PRs
132832: add SuccessCriteriaMet status for kubectl get job
This PR updates the kubectl get job output by adding a new SuccessCriteriaMet column. The column indicates whether the job has met its success criteria, based on the Job's successPolicy, making it easier for users to see whether a job has satisfied its configured success conditions.
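For context, the new column reflects a Job's successPolicy. An Indexed Job using one might look like the sketch below (the name, counts, and image are placeholders); once applied, kubectl get job reports whether the criteria have been met.

apiVersion: batch/v1
kind: Job
metadata:
  name: example-indexed-job
spec:
  completions: 5
  parallelism: 5
  completionMode: Indexed       # successPolicy requires an Indexed Job
  successPolicy:
    rules:
    - succeededIndexes: "0-2"   # treat the Job as successful once indexes 0-2 finish
  template:
    spec:
      restartPolicy: Never
      containers:
      - name: worker
        image: registry.example.com/worker:1.0.0   # placeholder image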
132838: Drop Deprecated Etcd Flags in Kubeadm
This PR removes the usage of two long-deprecated etcd flags in Kubeadm:
--experimental-initial-corrupt-check
--experimental-watch-progress-notify-interval
These flags were deprecated in etcd v3.6.0 and removed in v3.7.0. The corresponding functionality is now available via the InitialCorruptCheck=true feature gate and the renamed --watch-progress-notify-interval flag (without the experimental prefix).
KEP of the Week
KEP-4427: Relaxed DNS search string validation
This KEP proposes relaxing Kubernetes’ strict DNS validation rules for dnsConfig.searches in Pod specs. It allows underscores (_) and a single dot (.), which are commonly used in real-world DNS use cases like SRV records or to bypass Kubernetes’ internal DNS search paths. Without this change, such configurations are rejected due to RFC-1123 hostname restrictions, making it difficult to support some legacy or external systems.
This KEP is tracked as stable in v1.34.
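For illustration, a Pod using the previously rejected search strings might look like the sketch below (the domains and image are placeholders):

apiVersion: v1
kind: Pod
metadata:
  name: legacy-dns-client
spec:
  dnsConfig:
    searches:
    - _sip._tcp.example.com   # underscores, as used by SRV-style records
    - "."                     # a single dot, to short-circuit the cluster's DNS search paths
  containers:
  - name: app
    image: registry.example.com/app:1.0.0   # placeholder image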
Other Merges
Remaining strPtr replaced with ptr.To
SizeBasedListCostEstimate feature gate added which assigns 1 APF seat per 100KB for LIST requests
Reflector detects unsupported meta.Table GVKs for LIST+WATCH
boolPtrFn replaced with k8s.io/utils/ptr
Service IP processing delayed by 5s during recreate to avoid race conditions
Egress selector support to JWT authenticator
ReplicaSet to ReplicationController conversion test added
DetectCacheInconsistency enabled to compare apiserver cache with etcd and purge inconsistent snapshots
Compactor test added
local-up-cluster cleaned up and support for automated upgrade/downgrade testing added
Compaction revision exposed from compactor
Verbosity of frequent logs in volume binding plugin lowered from V(4) to V(5)
validation-gen adds k8s:enum validators
Kubelet token cache made UID-aware to prevent stale tokens after service account recreation
kubeadm uses named port probe-port for probes in static pod manifests
unschedulablePods struct moved to a separate file
Internal LoadBalancer port uses EndpointSlice container port when targetPort is unspecified
scheduler_perf logs added to report failures in measuring SchedulingThroughput
ServiceAccountTokenCacheType support added to credential provider plugin
Validation error messages simplified by removing redundant field names
validation-gen enhanced with new rules and core refactoring
PreBindPreFlight added and implemented in in-tree plugins
Implications of using hostNetwork with ports documented
kube-proxy considers timeouts when fetching Node objects or NodeIPs as fatal
Inconsistencies reset cache snapshots and block new ones until the cache is marked consistent again
Allocation manager AddPod() unit tests added
Duplicate DaemonSet update validations removed to avoid redundant checks
kube-proxy in nftables mode drops traffic to Services with no endpoints using filter chains at priority 0
In-place pod vertical scaling prioritizes resize requests based on priorityClass and QoS when resources are limited
PodResources API includes only active Pods
CPUManager aligns uncore cache for odd-numbered CPUs
Flag registration moved into kube-apiserver to eliminate global state
Metrics for MutatingAdmissionPolicy
DRA: Improves allocator with better backtracking
Linux masks thermal interrupt info in /proc and /sys
observedGeneration in pod resize conditions fixed under InPlacePodVerticalScaling feature gate
RelaxedEnvironmentVariableValidation test to Conformance
OrderedNamespaceDeletion test to Conformance
Two EndpointSlice e2e tests to Conformance
Promotions
ConsistentListFromCache to GA
KubeletTracing to GA
Version Updates
Bumped dependencies and images to Go 1.24.5 and distroless iptables
Bumped kube-openapi to SHA f3f2b991d03b and updated structured-merge-diff from v4 to v6
Shoutouts
Drew Hagen: Big thanks to @Matteo, @satyampsoni, @Angelos Kolaitis for hovering around late in the day in your time zones to help me cut my first Kubernetes release, v1.34.0-alpha.3!!
via Last Week in Kubernetes Development https://lwkd.info/
July 17, 2025 at 12:35PM
Ep29 - Ask Me Anything About Anything with Scott Rosenberg
There are no restrictions in this AMA session. You can ask anything about DevOps, AI, Cloud, Kubernetes, Platform Engineering, containers, or anything else. Scott Rosenberg, regular guest, will be here to help us out.
▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬ Sponsor: Codefresh 🔗 GitOps Argo CD Certifications: https://learning.codefresh.io (use "viktor" for a 50% discount) ▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬
▬▬▬▬▬▬ 👋 Contact me 👋 ▬▬▬▬▬▬ ➡ BlueSky: https://vfarcic.bsky.social ➡ LinkedIn: https://www.linkedin.com/in/viktorfarcic/
▬▬▬▬▬▬ 🚀 Other Channels 🚀 ▬▬▬▬▬▬ 🎤 Podcast: https://www.devopsparadox.com/ 💬 Live streams: https://www.youtube.com/c/DevOpsParadox
via YouTube https://www.youtube.com/watch?v=4bZgHXrMCmU