Link Sharing
Week Ending October 26, 2025
https://lwkd.info/2025/20251031
Developer News
The steering committee election voting period closed last week. The results will be announced in the public steering meeting next Wednesday.
Reminders are out for folks attending KubeCon NA 2025 about the Kubernetes Contributor Hour and the SIG/WG Meet and Greet.
Release Schedule
Next Deadline: Code Freeze, 7th November
With the feature blog freeze in place, KEP assignees are expected to open placeholder PRs for their blogs. Please reach out to the Release Comms team for more information. We’re one week away from the v1.35 code freeze. Get your PRs ready and don’t forget to file an early exception if you anticipate any delays!
October patch releases have been skipped altogether.
KEP of the Week
KEP-5007: DRA: Device Binding Conditions
This KEP introduces BindingConditions, enabling the scheduler to delay Pod binding until external resources such as fabric-attached GPUs or FPGAs are confirmed ready. This improves scheduling reliability by preventing premature bindings that could lead to Pod failures or require manual intervention. The mechanism also supports asynchronous or failure-prone scenarios, including remote accelerators and FPGA reprogramming.
This KEP is tracked for beta in v1.35.
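The gating idea described above can be sketched as a small decision function. This is a simplified model under stated assumptions, not scheduler code: the function name, the `Decision` enum, the example condition names, and the separate list of failure conditions are all hypothetical and chosen only to illustrate "wait until external readiness is confirmed, give up if the device reports failure".

```python
from enum import Enum
from typing import Dict, Iterable


class Decision(Enum):
    BIND = "bind"   # every required condition is confirmed true
    WAIT = "wait"   # still waiting on at least one required condition
    FAIL = "fail"   # a failure condition was reported; do not bind


def binding_decision(
    required: Iterable[str],
    failure: Iterable[str],
    observed: Dict[str, bool],
) -> Decision:
    """Decide whether a Pod may be bound, given observed device conditions.

    `observed` maps a condition name to its current status; a condition
    that has not been reported yet is treated as false.
    """
    # Failure takes priority: a device that reported a failed attach or
    # reprogramming attempt should never be bound.
    if any(observed.get(name, False) for name in failure):
        return Decision.FAIL
    # Bind only once all required conditions are true.
    if all(observed.get(name, False) for name in required):
        return Decision.BIND
    return Decision.WAIT


if __name__ == "__main__":
    # Hypothetical condition names, for illustration only.
    required = ["example.com/fabric-attached"]
    failure = ["example.com/attach-failed"]
    print(binding_decision(required, failure, {}).value)  # wait
    print(binding_decision(required, failure, {"example.com/fabric-attached": True}).value)  # bind
```

Checking failure before readiness means a device that reports both states is rejected rather than bound, which matches the goal of avoiding premature bindings.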
Other Merges
DRA resources use eachKey declarative validation to mirror map-key checks and keep generated DV in sync with handwritten rules
CSI NodePublishVolumeRequest now carries pod service account tokens in the gRPC secrets field instead of volume_context
DRA DeviceAttribute now declares its non-discriminated union with +k8s:unionMember, so declarative validation can enforce “exactly one value set”
Add +k8s:maxLength (and +k8s:optional) to NetworkDeviceData so generated DV can cap interfaceName / hardwareAddress lengths and match handwritten validation
Wire storage.k8s.io (StorageClass) into declarative validation and mark provisioner as +k8s:required, so generated DV now matches the old handwritten strategy on create/update
StorageVersionMigration (SVM) graduates to v1beta1 and drops the old v1alpha1/unused fields, so clusters must clean up any storage.k8s.io/v1alpha1 SVM objects before upgrading
kubectl drops support for the long-deprecated certificates.k8s.io/v1beta1 CertificateSigningRequest
Add mtlsclient and mtlsserver for the mtls validations
apiserver cacher’s lister_watcher now exposes WatchList semantics
Enable declarative validation for resource.k8s.io ResourceSlice (v1/v1beta1/v1beta2)
Introduce pod queuing in endpoint/slice controllers
Add k8s-resource-fully-qualified-name format
Implement synthetic create authz permission check for exec, attach, and portforward
Enable declarative validation (DV) support for ClusterRole and RoleBinding
Replace HandleCrash and HandleError calls with their context-aware alternatives
Bump supported etcd version to v3.5.24 for release v1.32, v1.33, and v1.3
Promotions
Pod Generation to GA
ContainerRestartRules to beta
RelaxedServiceNameValidation to beta
PreferSameTrafficDistribution to GA
Version Updates
etcd sdk to v3.6.5
system-validators to v1.12.1
Subprojects and Dependency Updates
containerd v2.2.0-rc.0 (pre-release) adds a mount manager, supports conf.d includes in the default config, and adds back-references in the garbage collector. It improves CRI with ListPodSandboxMetrics and image-volume subpaths, adds parallel image unpack and a referrers fetcher, updates EROFS snapshotter, enables OTEL traces and WASM plugin support in NRI, speeds shim reloads, and postpones some deprecations to 2.3.
containerd API v1.10.0-rc.0 (pre-release) aligns with containerd 2.2, introducing the mount manager and parallel unpack support in the API.
prometheus v3.7.3 fixes a UI redirect regression with -web.external-url and -web.route-prefix, corrects federation for some native histograms, fixes a promtool check config failure when --lint=none is set, and resolves a remote-write queue resharding deadlock.
via Last Week in Kubernetes Development https://lwkd.info/
October 31, 2025 at 02:41PM
Ep38 - Ask Me Anything About Anything with Scott Rosenberg
There are no restrictions in this AMA session. You can ask anything about DevOps, AI, Cloud, Kubernetes, Platform Engineering, containers, or anything else. Scott Rosenberg, a regular guest, will be here to help us out.
▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬ Sponsor: Octopus 🔗 Enterprise Support for Argo: https://octopus.com/support/enterprise-argo-support ▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬
▬▬▬▬▬▬ 👋 Contact me 👋 ▬▬▬▬▬▬ ➡ BlueSky: https://vfarcic.bsky.social ➡ LinkedIn: https://www.linkedin.com/in/viktorfarcic/
▬▬▬▬▬▬ 🚀 Other Channels 🚀 ▬▬▬▬▬▬ 🎤 Podcast: https://www.devopsparadox.com/ 💬 Live streams: https://www.youtube.com/c/DevOpsParadox
via YouTube https://www.youtube.com/watch?v=-nYVMVQosHc
Our Journey to GitOps: Migrating to ArgoCD with Zero Downtime, with Andrew Jeffree
Andrew Jeffree from SafetyCulture walks through their complete migration of 250+ microservices from a fragile Helm-based setup to GitOps with ArgoCD, all without any downtime. He explains how they replaced YAML configurations with a domain-specific language built in CUE, creating a better developer experience while adding stronger validation and reducing operational pain points.
You will learn:
Zero-downtime migration techniques using temporary deployments with prune-last sync options to ensure healthy services before removing legacy ones
How CUE lang improves on YAML by providing schema validation, early error detection, and a cleaner interface for developers
Human-centric platform engineering approaches that prioritize developer experience and reduce on-call burden through empathy-driven design decisions
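The prune-last ordering from the first point above can be sketched as a small simulation. This is a conceptual model only, not Argo CD code and not the approach as SafetyCulture implemented it: `sync_prune_last` and its callback parameters (`apply`, `is_healthy`, `prune`) are hypothetical names. The point it shows is that legacy resources are removed only after every desired resource has been applied and reports healthy.

```python
from typing import Callable, Iterable


def sync_prune_last(
    desired: Iterable[str],
    live: Iterable[str],
    apply: Callable[[str], None],
    is_healthy: Callable[[str], bool],
    prune: Callable[[str], None],
) -> bool:
    """Apply all desired resources, then prune leftovers as a final step.

    Returns True if pruning ran, False if it was skipped because a desired
    resource was unhealthy (legacy resources then stay in place).
    """
    desired_names = sorted(set(desired))
    # Step 1: create or update everything that should exist.
    for name in desired_names:
        apply(name)
    # Step 2: refuse to remove anything until the new resources are healthy.
    if not all(is_healthy(name) for name in desired_names):
        return False
    # Step 3: only now delete resources that are live but no longer desired.
    for name in sorted(set(live) - set(desired_names)):
        prune(name)
    return True


if __name__ == "__main__":
    log = []
    sync_prune_last(
        desired=["svc-new"],
        live=["svc-legacy"],
        apply=lambda n: log.append(("apply", n)),
        is_healthy=lambda n: True,
        prune=lambda n: log.append(("prune", n)),
    )
    print(log)  # [('apply', 'svc-new'), ('prune', 'svc-legacy')]
```

If the health check fails, the function returns early and the legacy service keeps serving traffic, which is what makes the cutover safe to retry.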
Sponsor
This episode is brought to you by Testkube—where teams run millions of performance tests in real Kubernetes infrastructure. From air-gapped environments to massive scale deployments, orchestrate every testing tool in one platform. Check it out at testkube.io
More info
Find all the links and info for this episode here: https://ku.bz/Xvyp1_Qcv
via KubeFM https://kube.fm
October 28, 2025 at 06:00AM
Self-Healing Kubernetes: When to Use AI vs Traditional Automation
Tired of being woken up at 2 AM to manually troubleshoot Kubernetes incidents that could be fixed automatically? This video explores how to build intelligent self-healing systems that watch Kubernetes events, analyze problems, and remediate issues before they ruin your weekend. We'll break down the complete automation pipeline—from understanding how Kubernetes events work and what makes them ideal triggers, to implementing a maturity progression from manual firefighting through rule-based automation to AI-assisted remediation.
Learn when traditional automation works best (alerting and known patterns), where AI genuinely excels (analysis and unknown scenarios), and how to strategically combine both approaches. We'll cover the three phases of incident response—alerting, analysis, and remediation—and show you how to build systems that handle knowns with efficient controllers while leveraging AI for novel problems. The key is creating feedback loops that continuously graduate unknowns into automated knowns, progressively shrinking the surface area where human intervention is needed. Includes links to open-source projects demonstrating these principles in production.
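The split between known and unknown problems, and the feedback loop that graduates unknowns into automated knowns, can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the code from the video or its linked projects: the `Event` and `Remediator` types, the `analyze` callback standing in for an AI-assisted analysis step, and the example event reasons are all hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Tuple


@dataclass(frozen=True)
class Event:
    reason: str   # short machine-readable cause, e.g. "OOMKilled"
    obj: str      # the object the event refers to
    message: str  # human-readable detail


class Remediator:
    """Route events to rule-based fixes when known, to analysis otherwise."""

    def __init__(self, analyze: Callable[[Event], str]) -> None:
        self.rules: Dict[str, Callable[[Event], str]] = {}
        self.analyze = analyze  # stand-in for an AI-assisted analysis step

    def handle(self, event: Event) -> Tuple[str, str]:
        rule = self.rules.get(event.reason)
        if rule is not None:
            # Known pattern: cheap, deterministic automation.
            return ("rule", rule(event))
        # Unknown pattern: ask the analysis step for a proposed action.
        return ("analysis", self.analyze(event))

    def graduate(self, reason: str, rule: Callable[[Event], str]) -> None:
        # Feedback loop: once a fix is understood, promote it to a rule so
        # the same problem no longer needs analysis or a human.
        self.rules[reason] = rule


if __name__ == "__main__":
    r = Remediator(analyze=lambda e: f"investigate {e.obj}: {e.message}")
    ev = Event("OOMKilled", "pod/api", "container exceeded memory limit")
    print(r.handle(ev))  # ('analysis', 'investigate pod/api: container exceeded memory limit')
    r.graduate("OOMKilled", lambda e: f"raise memory limit on {e.obj}")
    print(r.handle(ev))  # ('rule', 'raise memory limit on pod/api')
```

Each call to `graduate` shrinks the set of events that reach the analysis step, which is the progression the description refers to.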
▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬ Sponsor: JFrog Fly 🔗 https://jfrog.com/fly_viktor ▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬
#Kubernetes #SelfHealingSystems #AIAutomation
Consider joining the channel: https://www.youtube.com/c/devopstoolkit/join
▬▬▬▬▬▬ 🔗 Additional Info 🔗 ▬▬▬▬▬▬ ➡ Transcript and commands: https://devopstoolkit.live/kubernetes/self-healing-kubernetes-when-to-use-ai-vs-traditional-automation 🔗 DevOps AI Toolkit: https://github.com/vfarcic/dot-ai
▬▬▬▬▬▬ ⏱ Timecodes ⏱ ▬▬▬▬▬▬
00:00 Kubernetes Remediation
01:15 JFrog Fly (sponsor)
02:43 Kubernetes Events Explained
06:21 Kubernetes Automation Pipeline
12:46 AI-Powered Kubernetes Remediation
19:26 Building Self-Healing Systems
via YouTube https://www.youtube.com/watch?v=rIdcJYLtCdo