
1_r/devopsish
Post-Quantum Cryptography in Kubernetes
https://kubernetes.io/blog/2025/07/18/pqc-in-k8s/
The world of cryptography is on the cusp of a major shift with the advent of quantum computing. While powerful quantum computers are still largely theoretical for many applications, their potential to break current cryptographic standards is a serious concern, especially for long-lived systems. This is where Post-Quantum Cryptography (PQC) comes in. In this article, I'll dive into what PQC means for TLS and, more specifically, for the Kubernetes ecosystem. I'll explain what the (surprising) state of PQC in Kubernetes is and what the implications are for current and future clusters.
What is Post-Quantum Cryptography?
Post-Quantum Cryptography refers to cryptographic algorithms that are thought to be secure against attacks by both classical and quantum computers. The primary concern is that quantum computers, using algorithms like Shor's Algorithm, could efficiently break widely used public-key cryptosystems such as RSA and Elliptic Curve Cryptography (ECC), which underpin much of today's secure communication, including TLS. The industry is actively working on standardizing and adopting PQC algorithms. One of the first to be standardized by NIST is the Module-Lattice Key Encapsulation Mechanism (ML-KEM), formerly known as Kyber, and now standardized as FIPS-203 (PDF download).
It is difficult to predict when quantum computers will be able to break classical algorithms. However, it is clear that we need to start migrating to PQC algorithms now, as the next section shows. To get a feeling for the predicted timeline, we can look at a NIST report covering the transition to post-quantum cryptography standards. It declares that systems using classical cryptography should be deprecated after 2030 and disallowed after 2035.
Key exchange vs. digital signatures: different needs, different timelines
In TLS, there are two main cryptographic operations we need to secure:
Key Exchange: This is how the client and server agree on a shared secret to encrypt their communication. If an attacker records encrypted traffic today, they could decrypt it in the future once they gain access to a quantum computer capable of breaking the key exchange (a "harvest now, decrypt later" attack). This makes migrating KEMs to PQC an immediate priority.
Digital Signatures: These are primarily used to authenticate the server (and sometimes the client) via certificates. The authenticity of a server is verified at the time of connection. While important, the risk of an attack today is much lower, because a trust decision made at connection time cannot be abused after the fact. Additionally, current PQC signature schemes often come with significant computational overhead and larger key/signature sizes compared to their classical counterparts.
Another significant hurdle in the migration to PQ certificates is the upgrade of root certificates. These certificates have long validity periods and are installed in many devices and operating systems as trust anchors.
Given these differences, the focus for immediate PQC adoption in TLS has been on hybrid key exchange mechanisms. These combine a classical algorithm (such as Elliptic Curve Diffie-Hellman Ephemeral (ECDHE)) with a PQC algorithm (such as ML-KEM). The resulting shared secret is secure as long as at least one of the component algorithms remains unbroken. The X25519MLKEM768 hybrid scheme is the most widely supported one.
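To make the KEM building block more concrete, here is a minimal sketch of an ML-KEM-768 encapsulation and decapsulation round trip using the crypto/mlkem package that ships with Go 1.24. It is shown in isolation for illustration; in TLS the resulting shared secret is combined with the ECDHE secret in the handshake key schedule rather than used directly.

package main

import (
	"bytes"
	"crypto/mlkem"
	"fmt"
	"log"
)

func main() {
	// The server generates a key pair and publishes the encapsulation (public) key.
	dk, err := mlkem.GenerateKey768()
	if err != nil {
		log.Fatal(err)
	}
	ek := dk.EncapsulationKey()

	// The client derives a fresh shared secret plus a ciphertext from that public key.
	sharedSecret, ciphertext := ek.Encapsulate()

	// The server recovers the same secret from the ciphertext.
	recovered, err := dk.Decapsulate(ciphertext)
	if err != nil {
		log.Fatal(err)
	}

	fmt.Println("secrets match:", bytes.Equal(sharedSecret, recovered))
	fmt.Println("encapsulation key size:", len(ek.Bytes()), "bytes") // roughly 1.2 kB for ML-KEM-768
}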
State of PQC key exchange mechanisms (KEMs) today
Support for PQC KEMs is rapidly improving across the ecosystem.
Go: The Go standard library's crypto/tls package introduced support for X25519MLKEM768 in version 1.24 (released February 2025). Crucially, it's enabled by default when there is no explicit configuration, i.e., Config.CurvePreferences is nil.
Browsers & OpenSSL: Major browsers like Chrome (version 131, November 2024) and Firefox (version 135, February 2025), as well as OpenSSL (version 3.5.0, April 2025), have also added support for the ML-KEM based hybrid scheme.
Apple is also rolling out support for X25519MLKEM768 in version 26 of their operating systems. Given the proliferation of Apple devices, this will have a significant impact on global PQC adoption.
For a more detailed overview of the state of PQC in the wider industry, see this blog post by Cloudflare.
Post-quantum KEMs in Kubernetes: an unexpected arrival
So, what does this mean for Kubernetes? Kubernetes components, including the API server and kubelet, are built with Go.
As of Kubernetes v1.33, released in April 2025, the project uses Go 1.24. A quick check of the Kubernetes codebase reveals that Config.CurvePreferences is not explicitly set. This leads to a fascinating conclusion: Kubernetes v1.33, by virtue of using Go 1.24, supports hybrid post-quantum X25519MLKEM768 for TLS connections by default!
You can test this yourself. If you set up a Minikube cluster running Kubernetes v1.33.0, you can connect to the API server using a recent OpenSSL client:
$ minikube start --kubernetes-version=v1.33.0
$ kubectl cluster-info
Kubernetes control plane is running at https://127.0.0.1:<PORT>
$ kubectl config view --minify --raw -o jsonpath='{.clusters[0].cluster.certificate-authority-data}' | base64 -d > ca.crt
$ openssl version
OpenSSL 3.5.0 8 Apr 2025 (Library: OpenSSL 3.5.0 8 Apr 2025)
$ echo -n "Q" | openssl s_client -connect 127.0.0.1:<PORT> -CAfile ca.crt
[...]
Negotiated TLS1.3 group: X25519MLKEM768
[...]
DONE
Lo and behold, the negotiated group is X25519MLKEM768! This is a significant step towards making Kubernetes quantum-safe, seemingly without a major announcement or dedicated KEP (Kubernetes Enhancement Proposal).
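For Go clients and controllers that talk to the API server, the same default applies, and you can also make the preference explicit. The following is a minimal sketch assuming a Go 1.24+ toolchain; the address and the ca.crt path are placeholders matching the Minikube example above.

package main

import (
	"crypto/tls"
	"crypto/x509"
	"fmt"
	"log"
	"os"
)

func main() {
	// Trust the cluster CA extracted with kubectl above.
	caPEM, err := os.ReadFile("ca.crt")
	if err != nil {
		log.Fatal(err)
	}
	pool := x509.NewCertPool()
	pool.AppendCertsFromPEM(caPEM)

	cfg := &tls.Config{
		RootCAs: pool,
		// Leaving CurvePreferences nil already enables X25519MLKEM768 in Go 1.24+.
		// Listing it explicitly documents the intent and keeps a classical fallback.
		CurvePreferences: []tls.CurveID{tls.X25519MLKEM768, tls.X25519},
	}

	// Placeholder address; substitute your API server host and port.
	conn, err := tls.Dial("tcp", "127.0.0.1:6443", cfg)
	if err != nil {
		log.Fatal(err)
	}
	defer conn.Close()
	fmt.Printf("handshake complete, TLS version: %#x\n", conn.ConnectionState().Version)
}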
The Go version mismatch pitfall
An interesting wrinkle emerged with Go versions 1.23 and 1.24. Go 1.23 included experimental support for a draft version of ML-KEM, identified as X25519Kyber768Draft00. This was also enabled by default if Config.CurvePreferences was nil. Kubernetes v1.32 used Go 1.23. However, Go 1.24 removed the draft support and replaced it with the standardized version X25519MLKEM768.
What happens if a client and server are using mismatched Go versions (one on 1.23, the other on 1.24)? They won't have a common PQC KEM to negotiate, and the handshake will fall back to classical ECC curves (e.g., X25519). How could this happen in practice?
Consider a scenario:
A Kubernetes cluster is running v1.32 (using Go 1.23 and thus X25519Kyber768Draft00).
A developer upgrades their kubectl to v1.33, compiled with Go 1.24, only supporting X25519MLKEM768.
Now, when kubectl communicates with the v1.32 API server, they no longer share a common PQC algorithm. The connection will downgrade to classical cryptography, silently losing the PQC protection that has been in place. This highlights the importance of understanding the implications of Go version upgrades, and the details of the TLS stack.
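Kubernetes itself does not expose a switch to enforce or report this today, but if you operate your own Go TLS servers (for example admission webhooks or aggregated API servers), a GetConfigForClient callback can at least make such downgrades visible. A minimal sketch, assuming Go 1.24+; the certificate paths and port are placeholders.

package main

import (
	"crypto/tls"
	"log"
	"net"
	"slices"
)

// newServerTLSConfig returns a config that logs clients which do not offer the
// hybrid post-quantum group, so a silent fallback to classical key exchange
// becomes observable instead of going unnoticed.
func newServerTLSConfig(cert tls.Certificate) *tls.Config {
	return &tls.Config{
		Certificates: []tls.Certificate{cert},
		GetConfigForClient: func(hello *tls.ClientHelloInfo) (*tls.Config, error) {
			if !slices.Contains(hello.SupportedCurves, tls.X25519MLKEM768) {
				log.Printf("client %s did not offer X25519MLKEM768; key exchange will be classical",
					hello.Conn.RemoteAddr())
			}
			return nil, nil // returning a nil config keeps the original config
		},
	}
}

func main() {
	// Placeholder certificate and key paths (assumptions for this sketch).
	cert, err := tls.LoadX509KeyPair("server.crt", "server.key")
	if err != nil {
		log.Fatal(err)
	}
	ln, err := tls.Listen("tcp", ":8443", newServerTLSConfig(cert))
	if err != nil {
		log.Fatal(err)
	}
	for {
		c, err := ln.Accept()
		if err != nil {
			log.Fatal(err)
		}
		go func(c net.Conn) {
			defer c.Close()
			// Force the handshake so the GetConfigForClient callback runs.
			if tc, ok := c.(*tls.Conn); ok {
				_ = tc.Handshake()
			}
		}(c)
	}
}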
Limitations: packet size
One practical consideration with ML-KEM is the size of its public keys: the encoded key is around 1.2 kilobytes for ML-KEM-768. This can cause the initial TLS ClientHello message not to fit inside a single TCP/IP packet, given typical networking constraints (most commonly, the standard Ethernet frame size limit of 1500 bytes). Some TLS libraries or network appliances might not handle this gracefully and assume the ClientHello always fits in one packet. This issue has been observed in some Kubernetes-related projects and networking components, potentially leading to connection failures when PQC KEMs are used. More details can be found at tldr.fail.
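As a rough back-of-the-envelope illustration (a sketch only; the non-key-share extension and header sizes below are assumptions and vary by TLS stack), the hybrid key share alone pushes the ClientHello past a 1500-byte frame:

package main

import "fmt"

func main() {
	const (
		mlkem768KeyShare = 1184 // ML-KEM-768 encapsulation key, bytes
		x25519KeyShare   = 32   // classical X25519 public key, bytes
		otherExtensions  = 400  // rough allowance for SNI, ALPN, cipher suites, etc. (assumption)
		headersOverhead  = 60   // rough TLS record plus TCP/IP header overhead (assumption)
		ethernetMTU      = 1500 // typical Ethernet payload (MTU) limit
	)

	total := mlkem768KeyShare + x25519KeyShare + otherExtensions + headersOverhead
	fmt.Printf("approximate ClientHello on the wire: %d bytes (limit ~%d)\n", total, ethernetMTU)
	fmt.Println("fits in a single packet:", total <= ethernetMTU)
}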
State of Post-Quantum Signatures
While KEMs are seeing broader adoption, PQC digital signatures are further behind in terms of widespread integration into standard toolchains. NIST has published standards for PQC signatures, such as ML-DSA (FIPS-204) and SLH-DSA (FIPS-205). However, implementing these in a way that's broadly usable (e.g., for PQC Certificate Authorities) presents challenges:
Larger Keys and Signatures: PQC signature schemes often have significantly larger public keys and signature sizes compared to classical algorithms like Ed25519 or RSA. For instance, Dilithium2 keys can be 30 times larger than Ed25519 keys, and certificates can be 12 times larger.
Performance: Signing and verification operations can be substantially slower. While some algorithms are on par with classical algorithms, others may have a much higher overhead, sometimes on the order of 10x to 1000x worse performance. To improve this situation, NIST is running a second round of standardization for PQC signatures.
Toolchain Support: Mainstream TLS libraries and CA software do not yet have mature, built-in support for these new signature algorithms. The Go team, for example, has indicated that ML-DSA support is a high priority, but the soonest it might appear in the standard library is Go 1.26 (as of May 2025).
Cloudflare's CIRCL (Cloudflare Interoperable Reusable Cryptographic Library) library implements some PQC signature schemes like variants of Dilithium, and they maintain a fork of Go (cfgo) that integrates CIRCL. Using cfgo, it's possible to experiment with generating certificates signed with PQC algorithms like Ed25519-Dilithium2. However, this requires using a custom Go toolchain and is not yet part of the mainstream Kubernetes or Go distributions.
Conclusion
The journey to a post-quantum secure Kubernetes is underway, and perhaps further along than many realize, thanks to the proactive adoption of ML-KEM in Go. With Kubernetes v1.33, users are already benefiting from hybrid post-quantum key exchange in many TLS connections by default.
However, awareness of potential pitfalls, such as Go version mismatches leading to downgrades and issues with ClientHello packet sizes, is crucial. While PQC for KEMs is becoming a reality, PQC for digital signatures and certificate hierarchies is still in earlier stages of development and adoption for mainstream use. As Kubernetes maintainers and users, keeping an eye on the Go versions used to build clusters and clients, and on the wider PQC landscape, will be important for staying secure as the transition continues.
Blog: Post-Quantum Cryptography in Kubernetes
https://www.kubernetes.dev/blog/2025/07/18/pqc-in-k8s/
Week Ending July 13, 2025
https://lwkd.info/2025/20250717
Developer News
SIG-Network proposed a new AI Gateway Working Group, dedicated to exploring the intersection of AI and networking. The WG will focus on standardizing how Kubernetes manages AI-specific traffic, with particular attention to routing, filters, and policy requirements for AI workloads.
The KubeCon North America 2025 Maintainer Summit CFP is open and closes soon on July 20th. Make sure to submit your talks before the deadline!
LFX Mentorship 2025 Term 3 is now open for SIGs to submit mentorship project ideas. To propose a project, submit a PR to the project_ideas repository by July 29th 2025. If you have any questions about the LFX mentorship program, feel free to ask in the #sig-contribex channel.
Release Schedule
Next Deadline: Code and Test Freeze, July 24/25
Code and Test Freeze starts at 0200 UTC on Friday, July 25. Your PRs should all be merged by then.
Kubernetes v1.34.0-beta.0 has been built and pushed using Golang version 1.24.5.
Patch Releases 1.32.7 and 1.31.11 are released. These releases include bug fixes for Jobs and etcd member promotion in kubeadm.
Featured PRs
132832: add SuccessCriteriaMet status for kubectl get job
This PR updates the kubectl get job output by adding a new SuccessCriteriaMet column. This column indicates whether the job has met its success criteria, based on the Job's successPolicy. This makes it easier for users to see if a job has satisfied its configured success conditions.
132838: Drop Deprecated Etcd Flags in Kubeadm
This PR removes the usage of two long-deprecated etcd flags in Kubeadm:
--experimental-initial-corrupt-check
--experimental-watch-progress-notify-interval
These flags were deprecated in etcd v3.6.0 and removed in v3.7.0. The corresponding functionality is now supported via the feature gate InitialCorruptCheck=true and the renamed flag --watch-progress-notify-interval (without the experimental prefix).
KEP of the Week
KEP-4427: Relaxed DNS search string validation
This KEP proposes relaxing Kubernetes’ strict DNS validation rules for dnsConfig.searches in Pod specs. It allows underscores (_) and a single dot (.), which are commonly used in real-world DNS use cases like SRV records or to bypass Kubernetes’ internal DNS search paths. Without this change, such configurations are rejected due to RFC-1123 hostname restrictions, making it difficult to support some legacy or external systems.
This KEP is tracked as stable in v1.34.
Other Merges
Remaining strPtr replaced with ptr.To
SizeBasedListCostEstimate feature gate added which assigns 1 APF seat per 100KB for LIST requests
Reflector detects unsupported meta.Table GVKs for LIST+WATCH
boolPtrFn replaced with k8s.io/utils/ptr
Service IP processing delayed by 5s during recreate to avoid race conditions
Egress selector support to JWT authenticator
ReplicaSet to ReplicationController conversion test added
DetectCacheInconsistency enabled to compare apiserver cache with etcd and purge inconsistent snapshots
Compactor test added
local-up-cluster cleaned up and support for automated upgrade/downgrade testing added
Compaction revision exposed from compactor
Verbosity of frequent logs in volume binding plugin lowered from V(4) to V(5)
validation-gen adds k8s:enum validators
Kubelet token cache made UID-aware to prevent stale tokens after service account recreation
kubeadm uses named port probe-port for probes in static pod manifests
unschedulablePods struct moved to a separate file
Internal LoadBalancer port uses EndpointSlice container port when targetPort is unspecified
scheduler_perf logs added to report failures in measuring SchedulingThroughput
ServiceAccountTokenCacheType support added to credential provider plugin
Validation error messages simplified by removing redundant field names
validation-gen enhanced with new rules and core refactoring
PreBindPreFlight added and implemented in in-tree plugins
Implications of using hostNetwork with ports documented
kube-proxy considers timeouts when fetching Node objects or NodeIPs as fatal
Inconsistencies reset cache snapshots and block new ones until the cache is marked consistent again
Allocation manager AddPod() unit tests added
Duplicate DaemonSet update validations removed to avoid redundant checks
kube-proxy in nftables mode drops traffic to Services with no endpoints using filter chains at priority 0
In-place pod vertical scaling prioritizes resize requests based on priorityClass and QoS when resources are limited
PodResources API includes only active Pods
CPUManager aligns uncore cache for odd-numbered CPUs
Flag registration moved into kube-apiserver to eliminate global state
Metrics for MutatingAdmissionPolicy
DRA: Improves allocator with better backtracking
Linux masks thermal interrupt info in /proc and /sys
observedGeneration in pod resize conditions fixed under InPlacePodVerticalScaling feature gate
RelaxedEnvironmentVariableValidation test to Conformance
OrderedNamespaceDeletion test to Conformance
Two EndpointSlice e2e tests to Conformance
Promotions
ConsistentListFromCache to GA
KubeletTracing to GA
Version Updates
Bumped dependencies and images to Go 1.24.5 and distroless iptables
Bumped kube-openapi to SHA f3f2b991d03b and updated structured-merge-diff from v4 to v6
Shoutouts
Drew Hagen: Big thanks to @Matteo, @satyampsoni, @Angelos Kolaitis for hovering around late in the day in your time zones to help me cut my first Kubernetes release cut, v1.34.0-alpha.3!!
via Last Week in Kubernetes Development https://lwkd.info/
July 17, 2025 at 12:35PM
Ep29 - Ask Me Anything About Anything with Scott Rosenberg
There are no restrictions in this AMA session. You can ask anything about DevOps, AI, Cloud, Kubernetes, Platform Engineering, containers, or anything else. Scott Rosenberg, regular guest, will be here to help us out.
▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬ Sponsor: Codefresh 🔗 GitOps Argo CD Certifications: https://learning.codefresh.io (use "viktor" for a 50% discount) ▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬
▬▬▬▬▬▬ 👋 Contact me 👋 ▬▬▬▬▬▬ ➡ BlueSky: https://vfarcic.bsky.social ➡ LinkedIn: https://www.linkedin.com/in/viktorfarcic/
▬▬▬▬▬▬ 🚀 Other Channels 🚀 ▬▬▬▬▬▬ 🎤 Podcast: https://www.devopsparadox.com/ 💬 Live streams: https://www.youtube.com/c/DevOpsParadox
via YouTube https://www.youtube.com/watch?v=4bZgHXrMCmU
AI vs Developer: Can GitHub Copilot or Claude Action Replace My Job?
I challenged two autonomous AI coding agents, GitHub Copilot Code Review and Claude Action, to implement the same detailed feature specification without any human help. One of them failed spectacularly, disappointing me with broken promises and poor code quality. The other exceeded my expectations, demonstrating impressive coding skills, clear self-awareness, and effective collaboration. But this experience left me conflicted: if autonomous coding agents can truly handle complex implementations this effectively, are software engineering careers at risk?
In this video, I share the fascinating results of the head-to-head match-up, revealing which AI agent came out on top, why it succeeded, and what this means for the future of programming and our jobs as developers.
#AIcoding #GitHubCopilot #ClaudeAI
Consider joining the channel: https://www.youtube.com/c/devopstoolkit/join
▬▬▬▬▬▬ 🔗 Additional Info 🔗 ▬▬▬▬▬▬ ➡ Transcript and commands: https://devopstoolkit.live/ai/ai-vs-developer-can-github-copilot-or-claude-replace-my-job 🔗 GitHub Copilot Coding Agent: https://github.com/features/copilot
▬▬▬▬▬▬ 💰 Sponsorships 💰 ▬▬▬▬▬▬ If you are interested in sponsoring this channel, please visit https://devopstoolkit.live/sponsor for more information. Alternatively, feel free to contact me over Twitter or LinkedIn (see below).
▬▬▬▬▬▬ 👋 Contact me 👋 ▬▬▬▬▬▬ ➡ BlueSky: https://vfarcic.bsky.social ➡ LinkedIn: https://www.linkedin.com/in/viktorfarcic/
▬▬▬▬▬▬ 🚀 Other Channels 🚀 ▬▬▬▬▬▬ 🎤 Podcast: https://www.devopsparadox.com/ 💬 Live streams: https://www.youtube.com/c/DevOpsParadox
▬▬▬▬▬▬ ⏱ Timecodes ⏱ ▬▬▬▬▬▬ 00:00 GitHub Copilot and Claude Action Autonomous Coding Agents 04:32 Copilot's Epic Fail: When AI Gets Overconfident 10:11 Claude's Redemption: The Agent That Changed My Mind 15:30 The Verdict: Should You Trust Autonomous Agents With Your Code? 19:13 The Clear Winner: Why Claude Dominates Autonomous Coding
via YouTube https://www.youtube.com/watch?v=ahTkFqssZxM
Updating an old Ubuntu to a supported version
https://anonymoushash.vmbrasseur.com/2025/07/old-ubuntu-upgrade.html
I host my own Mastodon instance, which generally is pretty easy to maintain. The great team in the Mastodon community does a super job in making it easy to upgrade as they release new versions. I’ve therefore been keeping my Mastodon installation up to date. Go me!
Unfortunately, I haven’t been keeping up on my operating system updates, so my Digital Ocean droplet was still way back on Ubuntu-22.10. The latest LTS release is 24.04. Ooooops.
More unfortunate, the standard do-release-upgrade won’t work between releases that are as far apart as mine is from the latest release. What to do?
The answer is to work my way through the version upgrades manually. This answer on Ask Ubuntu was especially helpful for figuring out how to do this.
Is this a tedious pain in the ass? Yes, yes it is.
Is it entirely my fault for not keeping my OS up-to-date? Also very much yes.
How I did it
Checking the meta-release file for Ubuntu, I see that I need to do two manual upgrades (from kinetic to lunar, then lunar to mantic), then I should be able to use do-release-upgrade from mantic to noble (aka Noble Numbat, aka the current LTS release).
So for each of lunar and mantic, I did these things…
Downloaded the appropriate UpgradeTool from the meta-release file for Ubuntu
Created a directory then unpacked the upgrade tool tar.gz file into it
Solved problems along the way (see below)
Ran the upgrade tool
Then I was able to run do-release-upgrade and, finally after hours of putzing about trying to get the Ubuntu upgrade going, then update Mastodon. Success!
Problems I solved along the way
Irritatingly, the yarn and postgres errors below needed to be fixed before the kinetic to lunar upgrade tool would run successfully.
The yarn gpg key was expired, causing an error during upgrade
The error in question included this line:
The following signatures were invalid: EXPKEYSIG 23E7166788B63E1E Yarn Packaging yarn@dan.cx
According to this issue in the yarn repo, I should’ve just been able to curl the latest GPG key, run apt-key, and all would be well with the world. Except that didn’t work. No, I don’t know why and I don’t much care. I just wanted to get this thing done.
I found an It’s FOSS article about dealing with GPG keys. It’s not the error I was working on, but it was the information I needed to put the correct key in place:
First I backed up the existing key, which was /usr/share/keyrings/yarnkey.gpg.
Then I downloaded and added the latest GPG key:
curl -sS https://dl.yarnpkg.com/debian/pubkey.gpg | gpg --dearmor | sudo tee /usr/share/keyrings/yarnkey.gpg
And finally, I updated /etc/apt/sources.list.d/yarn.list to use that key for signature verification:
deb [signed-by=/usr/share/keyrings/yarnkey.gpg] https://dl.yarnpkg.com/debian stable main
That solved the expired key problem. At some point I’ll need to change that back to remove the signed-by bit since I doubt that manually updated key will get any automatic updates.
The postgresql source for apt no longer had a release file for kinetic
The sources for a Postgres update were just as out of date as the ones for Ubuntu. This resulted in this error:
Ign http://apt.postgresql.org/pub/repos/apt kinetic-pgdg InRelease
Err http://apt.postgresql.org/pub/repos/apt kinetic-pgdg Release
  404 Not Found [IP: 2a04:4e42:2f::820 80]
The answer for this one was pretty easy, once I finally bothered to read the Postgres wiki page for apt. I needed to change /etc/apt/sources.list.d/postgresql.list to point to apt-archive.postgresql.org instead of apt.postgresql.org. The final file contents look like this:
deb [signed-by=/usr/share/postgresql-common/pgdg/apt.postgresql.org.asc] https://apt-archive.postgresql.org/pub/repos/apt kinetic-pgdg main
Fixing that resolved all of the errors and the kinetic to lunar upgrade tool worked without any problems at all.
via {anonymous => 'hash'}; https://anonymoushash.vmbrasseur.com/
July 10, 2025 at 03:00AM
Week Ending July 6, 2025
https://lwkd.info/2025/20250709
Developer News
SIG Architecture proposes forming a new Working Group focused on AI Conformance Certification. The WG would define a standardized set of capabilities, APIs, and configurations that Kubernetes clusters must support to reliably and efficiently run AI/ML workloads.
Kubernetes has formed a dedicated Checkpoint/Restore Working Group to integrate native Checkpoint/Restore functionality, enabling container migration and workload pre-emption to improve resource efficiency and support advanced use cases like AI/ML.
Release Schedule
Next Deadline: Code and Test Freeze, July 24/25
Code and Test Freeze starts at 0200 UTC on Friday, July 25. Your PRs should all be merged by then. Vyom Yadav has shared mid-cycle status, including 72 tracked changes. Because this means an extra-long Release Blog, the Comms Team requests that leads submit their release highlights early, if you can.
Cherry-picks for the July Patch Releases are due on July 11.
Featured PRs
131641: DRA kubelet: add dra_resource_claims_in_use gauge vector
This PR introduces a new gauge vector metric, dra_resource_claims_in_use, to the kubelet. The metric tracks active DRA drivers and informs administrators when a driver is in use, so that drivers can be removed safely without impacting pod operations. It is useful for determining whether drivers have active ResourceClaims, preventing issues during the driver removal process.
KEP of the Week
KEP-2831: Kubelet Tracing
This KEP adds support for distributed tracing in the kubelet to help diagnose node-level issues like pod creation latency or container startup delays. It solves the problem of limited visibility into how the kubelet talks to the API server and container runtime by exporting trace data. The implementation uses OpenTelemetry to generate and export spans in the OTLP format. An OpenTelemetry Collector, typically deployed as a DaemonSet, receives and forwards this data to a tracing backend. The feature is enabled through the KubeletTracing feature gate and configured using the TracingConfiguration in the kubelet configuration file.
This KEP is tracked as stable in v1.34.
Other Merges
logger.Error replaced with utilruntime.HandleErrorWithXXX where errors cannot be returned
Fix for validation error when specifying resource requirements at the container level for a resource not supported at the pod level
Declarative Validation enabled for CertificateSigningRequest
Names of new Services are validated with NameIsDNSLabel() relaxing pre-existing validation when RelaxedServiceNameValidation feature gate is enabled
allocationManager’s IsPodResizeInProgress method unexported
New dra_resource_claims_in_use kubelet metrics to inform about active ResourceClaims
StatefulSet now respects minReadySeconds
CSIDriverRegistry cleaned up
Function to translate named port to port number cleaned up to avoid duplication
Unit tests for VolumePathHandler
Deprecated
In a major refactoring effort, replaced the deprecated package ‘k8s.io/utils/pointer’ with ‘k8s.io/utils/ptr’ across multiple components
Deprecated gogo protocol definitions removed from k8s.io/externaljwt and k8s.io/cri-api
Subprojects and Dependency Updates
cluster-api v1.11.0-alpha.2: releases alpha version for testing
cluster-api-provider-vsphere v1.14.0-alpha.2: releases alpha version for testing
kustomize: shlex (https://github.com/google/shlex) has been replaced with carapace-shlex; bumped to viper v1.20.0; dropped usage of forked copies of goyaml.v2 and goyaml.v3
Shoutouts
No shoutouts this week. Want to thank someone awesome in the community? Tag them in the #shoutouts channel.
via Last Week in Kubernetes Development https://lwkd.info/
July 08, 2025 at 10:56PM