
Week Ending July 13, 2025
https://lwkd.info/2025/20250717
Developer News
SIG-Network proposed a new AI Gateway Working Group, dedicated to exploring the intersection of AI and networking. The WG will focus on standardizing how Kubernetes manages AI-specific traffic, with particular attention to routing, filters, and policy requirements for AI workloads.
The KubeCon North America 2025 Maintainer Summit CFP is open and closes soon on July 20th. Make sure to submit your talks before the deadline!
LFX Mentorship 2025 Term 3 is now open for SIGs to submit mentorship project ideas. To propose a project, submit a PR to the project_ideas repository by July 29th, 2025. If you have any questions about the LFX mentorship program, feel free to ask in the #sig-contribex channel.
Release Schedule
Next Deadline: Code and Test Freeze, July 24/25
Code and Test Freeze starts at 0200 UTC on Friday, July 25. Your PRs should all be merged by then.
Kubernetes v1.34.0-beta.0 has been built and pushed using Golang version 1.24.5.
Patch Releases 1.32.7 and 1.31.11 are released. These releases include bug fixes for Jobs and etcd member promotion in kubeadm.
Featured PRs
132832: add SuccessCriteriaMet status for kubectl get job
This PR updates the kubectl get job output by adding a new SuccessCriteriaMet column. The column indicates whether the job has met its success criteria, based on the Job's successPolicy, making it easier for users to see whether a job has satisfied its configured success conditions.
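As a sketch of what this looks like in practice, the snippet below stands in for a cluster: the real usage is just `kubectl get job my-job`, and the printed columns are an assumption based on the PR description, not output captured from a real cluster.

```shell
# Real usage would simply be:  kubectl get job my-job
# The printf below simulates illustrative output, including the new
# SUCCESSCRITERIAMET column (layout is an assumption):
out=$(printf '%s\n' \
  'NAME     STATUS     COMPLETIONS   SUCCESSCRITERIAMET   DURATION   AGE' \
  'my-job   Complete   1/1           True                 12s        3m14s')
printf '%s\n' "$out"
```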
132838: Drop Deprecated Etcd Flags in Kubeadm
This PR removes the usage of two long-deprecated etcd flags in Kubeadm:
--experimental-initial-corrupt-check
--experimental-watch-progress-notify-interval
These flags were deprecated in etcd v3.6.0 and removed in v3.7.0. The corresponding functionality is now supported via the InitialCorruptCheck=true feature gate and the renamed flag --watch-progress-notify-interval (without the experimental prefix).
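A rough sketch of the flag migration (the interval value here is illustrative, not taken from kubeadm defaults):

```shell
# Flags kubeadm used to render for etcd (rejected by etcd v3.7+):
old_flags='--experimental-initial-corrupt-check=true --experimental-watch-progress-notify-interval=5s'
# Their replacements on etcd v3.6+: a feature gate plus the renamed flag:
new_flags='--feature-gates=InitialCorruptCheck=true --watch-progress-notify-interval=5s'
printf 'before: etcd %s\nafter:  etcd %s\n' "$old_flags" "$new_flags"
```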
KEP of the Week
KEP-4427: Relaxed DNS search string validation
This KEP proposes relaxing Kubernetes’ strict DNS validation rules for dnsConfig.searches in Pod specs. It allows underscores (_) and a single dot (.), which are commonly used in real-world DNS use cases like SRV records or to bypass Kubernetes’ internal DNS search paths. Without this change, such configurations are rejected due to RFC-1123 hostname restrictions, making it difficult to support some legacy or external systems.
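A minimal Pod spec sketch of what the relaxed rules permit (names and addresses are illustrative):

```shell
# Write an example Pod manifest whose search entries would previously
# have failed RFC-1123 validation:
cat > /tmp/dns-pod.yaml <<'EOF'
apiVersion: v1
kind: Pod
metadata:
  name: dns-example
spec:
  containers:
  - name: app
    image: registry.k8s.io/pause:3.10
  dnsPolicy: None
  dnsConfig:
    nameservers:
    - 10.96.0.10
    searches:
    - _sip._tcp.example.com   # underscores, common for SRV record lookups
    - "."                     # single dot, bypasses cluster search path expansion
EOF
cat /tmp/dns-pod.yaml
```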
This KEP is tracked as stable in v1.34.
Other Merges
Remaining strPtr replaced with ptr.To
SizeBasedListCostEstimate feature gate added which assigns 1 APF seat per 100KB for LIST requests
Reflector detects unsupported meta.Table GVKs for LIST+WATCH
boolPtrFn replaced with k8s.io/utils/ptr
Service IP processing delayed by 5s during recreate to avoid race conditions
Egress selector support to JWT authenticator
ReplicaSet to ReplicationController conversion test added
DetectCacheInconsistency enabled to compare apiserver cache with etcd and purge inconsistent snapshots
Compactor test added
local-up-cluster cleaned up and support for automated upgrade/downgrade testing added
Compaction revision exposed from compactor
Verbosity of frequent logs in volume binding plugin lowered from V(4) to V(5)
validation-gen adds k8s:enum validators
Kubelet token cache made UID-aware to prevent stale tokens after service account recreation
kubeadm uses named port probe-port for probes in static pod manifests
unschedulablePods struct moved to a separate file
Internal LoadBalancer port uses EndpointSlice container port when targetPort is unspecified
scheduler_perf logs added to report failures in measuring SchedulingThroughput
ServiceAccountTokenCacheType support added to credential provider plugin
Validation error messages simplified by removing redundant field names
validation-gen enhanced with new rules and core refactoring
PreBindPreFlight added and implemented in in-tree plugins
Implications of using hostNetwork with ports documented
kube-proxy considers timeouts when fetching Node objects or NodeIPs as fatal
Inconsistencies reset cache snapshots and block new ones until the cache is marked consistent again
Allocation manager AddPod() unit tests added
Duplicate DaemonSet update validations removed to avoid redundant checks
kube-proxy in nftables mode drops traffic to Services with no endpoints using filter chains at priority 0
In-place pod vertical scaling prioritizes resize requests based on priorityClass and QoS when resources are limited
PodResources API includes only active Pods
CPUManager aligns uncore cache for odd-numbered CPUs
Flag registration moved into kube-apiserver to eliminate global state
Metrics for MutatingAdmissionPolicy
DRA: Improves allocator with better backtracking
Linux masks thermal interrupt info in /proc and /sys
observedGeneration in pod resize conditions fixed under InPlacePodVerticalScaling feature gate
RelaxedEnvironmentVariableValidation test to Conformance
OrderedNamespaceDeletion test to Conformance
Two EndpointSlice e2e tests to Conformance
Promotions
ConsistentListFromCache to GA
KubeletTracing to GA
Version Updates
Bumped dependencies and images to Go 1.24.5 and distroless iptables
Bumped kube-openapi to SHA f3f2b991d03b and updated structured-merge-diff from v4 to v6
Shoutouts
Drew Hagen: Big thanks to @Matteo, @satyampsoni, @Angelos Kolaitis for hovering around late in the day in your time zones to help me cut my first Kubernetes release, v1.34.0-alpha.3!!
via Last Week in Kubernetes Development https://lwkd.info/
July 17, 2025 at 12:35PM
Ep29 - Ask Me Anything About Anything with Scott Rosenberg
There are no restrictions in this AMA session. You can ask anything about DevOps, AI, Cloud, Kubernetes, Platform Engineering, containers, or anything else. Scott Rosenberg, regular guest, will be here to help us out.
▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬ Sponsor: Codefresh 🔗 GitOps Argo CD Certifications: https://learning.codefresh.io (use "viktor" for a 50% discount) ▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬
▬▬▬▬▬▬ 👋 Contact me 👋 ▬▬▬▬▬▬ ➡ BlueSky: https://vfarcic.bsky.social ➡ LinkedIn: https://www.linkedin.com/in/viktorfarcic/
▬▬▬▬▬▬ 🚀 Other Channels 🚀 ▬▬▬▬▬▬ 🎤 Podcast: https://www.devopsparadox.com/ 💬 Live streams: https://www.youtube.com/c/DevOpsParadox
via YouTube https://www.youtube.com/watch?v=4bZgHXrMCmU
AI vs Developer: Can GitHub Copilot or Claude Action Replace My Job?
I challenged two autonomous AI coding agents, GitHub Copilot Code Review and Claude Action, to implement the same detailed feature specification without any human help. One of them failed spectacularly, disappointing me with broken promises and poor code quality. The other exceeded my expectations, demonstrating impressive coding skills, clear self-awareness, and effective collaboration. But this experience left me conflicted: if autonomous coding agents can truly handle complex implementations this effectively, are software engineering careers at risk?
In this video, I share the fascinating results of the head-to-head match-up, revealing which AI agent came out on top, why it succeeded, and what this means for the future of programming and our jobs as developers.
#AIcoding #GitHubCopilot #ClaudeAI
Consider joining the channel: https://www.youtube.com/c/devopstoolkit/join
▬▬▬▬▬▬ 🔗 Additional Info 🔗 ▬▬▬▬▬▬ ➡ Transcript and commands: https://devopstoolkit.live/ai/ai-vs-developer-can-github-copilot-or-claude-replace-my-job 🔗 GitHub Copilot Coding Agent: https://github.com/features/copilot
▬▬▬▬▬▬ 💰 Sponsorships 💰 ▬▬▬▬▬▬ If you are interested in sponsoring this channel, please visit https://devopstoolkit.live/sponsor for more information. Alternatively, feel free to contact me over Twitter or LinkedIn (see below).
▬▬▬▬▬▬ 👋 Contact me 👋 ▬▬▬▬▬▬ ➡ BlueSky: https://vfarcic.bsky.social ➡ LinkedIn: https://www.linkedin.com/in/viktorfarcic/
▬▬▬▬▬▬ 🚀 Other Channels 🚀 ▬▬▬▬▬▬ 🎤 Podcast: https://www.devopsparadox.com/ 💬 Live streams: https://www.youtube.com/c/DevOpsParadox
▬▬▬▬▬▬ ⏱ Timecodes ⏱ ▬▬▬▬▬▬ 00:00 GitHub Copilot and Claude Action Autonomous Coding Agents 04:32 Copilot's Epic Fail: When AI Gets Overconfident 10:11 Claude's Redemption: The Agent That Changed My Mind 15:30 The Verdict: Should You Trust Autonomous Agents With Your Code? 19:13 The Clear Winner: Why Claude Dominates Autonomous Coding
via YouTube https://www.youtube.com/watch?v=ahTkFqssZxM
Updating an old Ubuntu to a supported version
https://anonymoushash.vmbrasseur.com/2025/07/old-ubuntu-upgrade.html
I host my own Mastodon instance, which generally is pretty easy to maintain. The great team in the Mastodon community does a super job in making it easy to upgrade as they release new versions. I’ve therefore been keeping my Mastodon installation up to date. Go me!
Unfortunately, I haven’t been keeping up on my operating system updates, so my Digital Ocean droplet was still way back on Ubuntu 22.10. The latest LTS release is 24.04. Ooooops.
More unfortunate, the standard do-release-upgrade won’t work between releases that are as far apart as mine is from the latest release. What to do?
The answer is to work my way through the version upgrades manually. This answer on Ask Ubuntu was especially helpful for figuring out how to do this.
Is this a tedious pain in the ass? Yes, yes it is.
Is it entirely my fault for not keeping my OS up-to-date? Also very much yes.
How I did it
Checking the meta-release file for Ubuntu, I see that I need to do two manual upgrades (from kinetic to lunar, then lunar to mantic), then I should be able to use do-release-upgrade from mantic to noble (aka Noble Numbat, aka the current LTS release).
So for each of lunar and mantic, I did these things…
Downloaded the appropriate UpgradeTool from the meta-release file for Ubuntu
Created a directory then unpacked the upgrade tool tar.gz file into it
Solved problems along the way (see below)
Ran the upgrade tool
Then I was able to run do-release-upgrade and, finally, after hours of putzing about trying to get the Ubuntu upgrade going, update Mastodon. Success!
Problems I solved along the way
Irritatingly, the yarn and postgres errors below needed to be fixed before the kinetic to lunar upgrade tool would run successfully.
The yarn gpg key was expired, causing an error during upgrade
The error in question included this line:
The following signatures were invalid: EXPKEYSIG 23E7166788B63E1E Yarn Packaging yarn@dan.cx
According to this issue in the yarn repo, I should’ve just been able to curl the latest GPG key, run apt-key, and all would be well with the world. Except that didn’t work. No, I don’t know why and I don’t much care. I just wanted to get this thing done.
I found an It’s FOSS article about dealing with GPG keys. It’s not the error I was working on, but it was the information I needed to put the correct key in place:
First I backed up the existing key, which was /usr/share/keyrings/yarnkey.gpg.
Then I downloaded and added the latest GPG key:
curl -sS https://dl.yarnpkg.com/debian/pubkey.gpg | gpg --dearmor | sudo tee /usr/share/keyrings/yarnkey.gpg
And finally, I updated /etc/apt/sources.list.d/yarn.list to use that key for signature verification:
deb [signed-by=/usr/share/keyrings/yarnkey.gpg] https://dl.yarnpkg.com/debian stable main
That solved the expired key problem. At some point I’ll need to change that back to remove the signed-by bit since I doubt that manually updated key will get any automatic updates.
The postgresql source for apt no longer had a release file for kinetic
The sources for a Postgres update were just as out of date as the ones for Ubuntu. This resulted in this error:
Ign http://apt.postgresql.org/pub/repos/apt kinetic-pgdg InRelease
Err http://apt.postgresql.org/pub/repos/apt kinetic-pgdg Release
  404 Not Found [IP: 2a04:4e42:2f::820 80]
The answer for this one was pretty easy, once I finally bothered to read the Postgres wiki page for apt. I needed to change /etc/apt/sources.list.d/postgresql.list to point to apt-archive.postgresql.org instead of apt.postgresql.org. The final file contents look like this:
deb [signed-by=/usr/share/postgresql-common/pgdg/apt.postgresql.org.asc] https://apt-archive.postgresql.org/pub/repos/apt kinetic-pgdg main
Fixing that resolved all of the errors and the kinetic to lunar upgrade tool worked without any problems at all.
via {anonymous => 'hash'}; https://anonymoushash.vmbrasseur.com/
July 10, 2025 at 03:00AM
Week Ending July 6, 2025
https://lwkd.info/2025/20250709
Developer News
SIG-Architecture proposes forming a new Working Group focused on AI Conformance Certification. The WG would define a standardized set of capabilities, APIs, and configurations that Kubernetes clusters must support to reliably and efficiently run AI/ML workloads.
Kubernetes has formed a dedicated Checkpoint/Restore Working Group to integrate native Checkpoint/Restore functionality, enabling container migration and workload pre-emption to improve resource efficiency and support advanced use cases like AI/ML.
Release Schedule
Next Deadline: Code and Test Freeze, July 24/25
Code and Test Freeze starts at 0200 UTC on Friday, July 25. Your PRs should all be merged by then. Vyom Yadav has shared mid-cycle status, including 72 tracked changes. Because this means an extra-long Release Blog, the Comms Team requests that leads submit their release highlights early, if you can.
Cherry-picks for the July Patch Releases are due on July 11.
Featured PRs
131641: DRA kubelet: add dra_resource_claims_in_use gauge vector
This PR introduces a new gauge vector metric, dra_resource_claims_in_use, to the kubelet. The metric tracks active DRA drivers and informs administrators when a driver is in use, ensuring drivers can be removed safely without impacting pod operations. It is useful for determining whether drivers have active ResourceClaims, preventing issues during the driver removal process.
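In a live cluster the gauge could be scraped from the kubelet's metrics endpoint; below, a printf stands in for that output (the label name shown is an assumption, not confirmed from the PR):

```shell
# Real usage (node name hypothetical):
#   kubectl get --raw "/api/v1/nodes/$NODE/proxy/metrics" | grep dra_resource_claims_in_use
# Illustrative Prometheus exposition-format sample:
sample='dra_resource_claims_in_use{driver_name="gpu.example.com"} 2'
printf '%s\n' "$sample"
```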
KEP of the Week
KEP-2831: Kubelet Tracing
This KEP adds support for distributed tracing in the kubelet to help diagnose node-level issues like pod creation latency or container startup delays. It solves the problem of limited visibility into how the kubelet talks to the API server and container runtime by exporting trace data. The implementation uses OpenTelemetry to generate and export spans in the OTLP format. An OpenTelemetry Collector, typically deployed as a DaemonSet, receives and forwards this data to a tracing backend. The feature is enabled through the KubeletTracing feature gate and configured using the TracingConfiguration in the kubelet configuration file.
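A minimal KubeletConfiguration sketch showing where tracing is configured (the collector endpoint and sampling rate are illustrative values):

```shell
# Write an example kubelet configuration enabling tracing:
cat > /tmp/kubelet-tracing.yaml <<'EOF'
apiVersion: kubelet.config.k8s.io/v1beta1
kind: KubeletConfiguration
featureGates:
  KubeletTracing: true          # feature gate; on by default once stable
tracing:
  endpoint: localhost:4317      # OTLP gRPC endpoint of an OpenTelemetry Collector
  samplingRatePerMillion: 1000  # sample 1 in 1000 requests
EOF
cat /tmp/kubelet-tracing.yaml
```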
This KEP is tracked as stable in v1.34.
Other Merges
logger.Error replaced with utilruntime.HandleErrorWithXXX where errors cannot be returned
Fix for validation error when specifying resource requirements at the container level for a resource not supported at the pod level
Declarative Validation enabled for CertificateSigningRequest
Names of new Services are validated with NameIsDNSLabel() relaxing pre-existing validation when RelaxedServiceNameValidation feature gate is enabled
allocationManager’s IsPodResizeInProgress method unexported
New dra_resource_claims_in_use kubelet metrics to inform about active ResourceClaims
StatefulSet now respects minReadySeconds
CSIDriverRegistry cleaned up
Function to translate named port to port number cleaned up to avoid duplication
Unit tests for VolumePathHandler
Deprecated
In a major refactoring effort, replaced the deprecated package ‘k8s.io/utils/pointer’ with ‘k8s.io/utils/ptr’ across multiple components
Deprecated gogo protocol definitions removed from k8s.io/externaljwt and k8s.io/cri-api
Subprojects and Dependency Updates
cluster-api v1.11.0-alpha.2: releases alpha version for testing
cluster-api-provider-vsphere v1.14.0-alpha.2: releases alpha version for testing
kustomize: shlex (https://github.com/google/shlex) replaced with carapace-shlex; bumped viper to v1.20.0; dropped usage of forked copies of goyaml.v2 and goyaml.v3
Shoutouts
No shoutouts this week. Want to thank someone awesome in the community? Tag them in the #shoutouts channel.
via Last Week in Kubernetes Development https://lwkd.info/
July 08, 2025 at 10:56PM