
Suggested Reads
Week Ending June 22, 2025
https://lwkd.info/2025/20250627
Developer News
Having completed their work, WG-Policy is being archived. Congrats Policy team!
There is an ongoing discussion in the Kubernetes community regarding the Slack migration, and new platform options are currently being evaluated. Please share your thoughts to help shortlist a suitable new platform.
The CFPs for the CNCF-hosted Co-located Events North America 2025 are closing soon. Make sure to submit your proposals by June 30th.
The KubeCon North America 2025 Maintainer Summit CFP is also open. Please submit your sessions by July 20th.
Release Schedule
Next Deadline: Open Doc Placeholders, July 3
With 70 enhancements tracked, it’s time to wrap up work on those changes. The next step is opening a Docs placeholder PR so that the Docs team knows that you’ll be ready by the Docs deadline on July 29. Didn’t get your Enhancement approved in time? You have until July 7th to request an exception.
Patch releases v1.33.2, v1.32.6, v1.31.10, and v1.30.14 have been released, including a security update for Golang. This is likely to be the last patch release for Kubernetes 1.30, so users on that version should plan to upgrade soon.
Featured PRs
132504: Introduce OpenAPI format support for k8s-short-name and k8s-long-name
This PR introduces support for the k8s-short-name and k8s-long-name formats in OpenAPI schema validation for Custom Resource Definitions (CRDs). These formats are now recognized when CRD schemas are validated, allowing Kubernetes-native name formats to be used consistently across CRD fields.
126619: Show namespace on delete
This PR updates the kubectl delete command to include the namespace in its output. Previously, the output could be ambiguous, especially when a single command deleted resources across multiple namespaces. Showing the namespace explicitly during delete operations avoids that confusion.
KEP of the Week
KEP 4800: Split UncoreCache Topology Awareness in CPU Manager
This KEP introduced a new static policy prefer-align-cpus-by-uncorecache for the CPU Manager that groups CPU resources by uncore cache where possible. An uncore cache refers to the cache that exists at a shared level among CPU cores. This is primarily beneficial for CPU architectures that utilize multiple uncore caches, or split uncore caches, within the processor.
This KEP is tracked for beta in v1.34.
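As a rough illustration of the idea, and not the kubelet's actual allocation code, the sketch below picks CPUs for a request while trying to stay inside a single uncore cache group. The topology mapping, the best-effort strategy, and the function name pick_cpus are all hypothetical.

```python
from collections import defaultdict


def pick_cpus(cpu_to_uncore: dict[int, int], free: set[int], count: int) -> list[int]:
    """Pick `count` free CPUs, preferring CPUs that share an uncore cache.

    Illustrative only: the topology mapping and this best-effort strategy are
    assumptions, not the kubelet's real implementation.
    """
    if count > len(free):
        raise ValueError("not enough free CPUs")

    # Group the free CPUs by the uncore cache they belong to.
    groups: dict[int, list[int]] = defaultdict(list)
    for cpu in sorted(free):
        groups[cpu_to_uncore[cpu]].append(cpu)

    # Best case: the smallest group that can hold the whole request.
    fitting = [g for g in groups.values() if len(g) >= count]
    if fitting:
        return min(fitting, key=len)[:count]

    # Otherwise spill across as few groups as possible, largest group first.
    picked: list[int] = []
    for group in sorted(groups.values(), key=len, reverse=True):
        picked.extend(group[: count - len(picked)])
        if len(picked) == count:
            break
    return picked


# Hypothetical 8-CPU machine with two uncore caches (CPUs 0-3 and 4-7).
topology = {cpu: cpu // 4 for cpu in range(8)}
print(pick_cpus(topology, free={1, 2, 4, 5, 6, 7}, count=3))  # [4, 5, 6]
```

The request for three CPUs lands entirely on the second cache group because the first group only has two free CPUs. A request that no single group can hold falls back to spanning groups, which mirrors the "where possible" wording of the policy.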
Other Merges
Actively poll for namespace termination instead of sleeping
Fix for being able to create custom resources with server-side apply even when their CustomResourceDefinition was terminating
e2e/watchlist test for checking metadata informer
apimachinery/pkg/util/errors to deprecate MessageCountMap
API response for StorageClassList to return a graceful error message if the provided ResourceVersion is too large
MutableCSINodeAllocatableCount storage e2e test refactored to use the Mock CSI driver
omitempty and opt tag added to the API v1beta2 AdminAccess type in the DeviceRequestAllocationResult struct
Job controller now uses controller UID index for pod lookups
ListAll and ListAllByNamespace optimized to return directly when there is nothing to select
Cleanup after alpha feature MountContainers was removed
New runtime.ApplyConfiguration interface added that is implemented by all generated applyconfigs
cloud provider calls in storage/volume_provisioning.go removed
Usage of deprecated function ExtractCommentTags migrated to ExtractFunctionStyleCommentTags
Delay added to node updates after kubelet startup
Conntrack reconciler now considers service’s target port during cleanup of stale flow entries
kube-scheduler: Apply EnablePlugins to CoreResourceEnqueueTestCases
etcd server overrides to etcd probe factory for healthz and readyz
endpointsleases and configmapsleases options removed from leader-elect-resource-lock in LeaderElectionConfiguration
Deprecated --register-schedulable command line argument removed from the kubelet
Promotions
JobPodReplacementPolicy to GA
Subprojects and Dependency Updates
containerd v2.1.3: fixes registry fetch and transfer service issues
cluster-api v1.11.0-alpha.1: releases alpha version for testing
Shoutouts
Josh Berkus (@jberkus): Kudos to Mario Fahlandt (@Mario Fahlandt) for figuring out how to back up private channels from Slack.
via Last Week in Kubernetes Development https://lwkd.info/
June 27, 2025 at 09:08AM
On with the show
https://anonymoushash.vmbrasseur.com/2025/06/on-with-the-show.html
Well, that didn’t work out as hoped, but unfortunately it did work out as expected.
I was legitimately excited by the opportunity and potential for open source and open collaboration in the digital agriculture space. I still am, to be honest. Not only is it good for humanity, there are a lot of fantastically strong business strategies that it could enable.
So when I noted some red flags during the interview process, that opportunity and potential were great enough that I made the informed choice to move forward with the job anyway. There were still far more unknowns than knowns and it was possible (and maybe even likely) that the red flags would turn out to be false alarms.
As you’ve already figured out, that wasn’t the case. First the CEO and other execs were shown the door in December, then the interim leadership made a number of large organisational changes. These changes included cutting a very significant percentage of total headcount, myself included. It’s little comfort knowing that I was in good company when the company laid me off in April, but little is better than none. At least it wasn’t personal.
In the past two months I’ve caught up on sleep, made my yearly pilgrimage to Montreal, hosted guests, and done a little work for the fantastic folks at Open Robotics. I’m feeling much better following my months at Semios and can once again turn my mind to all things strategy, operations, business, and open source.
I’m considering my options for what to do next and having conversations with folks to clarify that direction. Will I remain in the corporate world, or is it finally time for me to dedicate myself fully to the nonprofit space? Do I stay with open source or will I apply my experience and strategic skills more broadly?
I’m figuring out the answers to these and other questions. If you’d like to be a part of those conversations—or if you’d just like to catch up or say howdy—drop me a line. I’d welcome the chance to chat.
via {anonymous => 'hash'}; https://anonymoushash.vmbrasseur.com/
June 26, 2025 at 03:00AM
Image Compatibility In Cloud Native Environments
https://kubernetes.io/blog/2025/06/25/image-compatibility-in-cloud-native-environments/
In industries where systems must run very reliably and meet strict performance criteria, such as telecommunications, high-performance computing, or AI computing, containerized applications often need specific operating system configuration or hardware presence. It is common practice to require the use of specific versions of the kernel, its configuration, device drivers, or system components. Despite the existence of the Open Container Initiative (OCI), a governing community that defines standards and specifications for container images, there has been a gap in the expression of such compatibility requirements. The need to address this issue has led to different proposals and, ultimately, an implementation in Kubernetes' Node Feature Discovery (NFD).
NFD is an open source Kubernetes project that automatically detects and reports hardware and system features of cluster nodes. This information helps users to schedule workloads on nodes that meet specific system requirements, which is especially useful for applications with strict hardware or operating system dependencies.
The need for image compatibility specification
Dependencies between containers and host OS
A container image is built on a base image, which provides a minimal runtime environment, often a stripped-down Linux userland, completely empty or distroless. When an application requires certain features from the host OS, compatibility issues arise. These dependencies can manifest in several ways:
Drivers: Host driver versions must match the supported range of a library version inside the container to avoid compatibility problems. Examples include GPUs and network drivers.
Libraries or Software: The container must come with a specific version or range of versions for a library or software to run optimally in the environment. Examples from high performance computing are MPI, EFA, or Infiniband.
Kernel Modules or Features: Specific kernel features or modules must be present. Examples include support for write-protected huge page faults, or the presence of VFIO.
And more…
While containers in Kubernetes are the most likely unit of abstraction for these needs, the definition of compatibility can extend further to include other container technologies such as Singularity and other OCI artifacts such as binaries from a spack binary cache.
Multi-cloud and hybrid cloud challenges
Containerized applications are deployed across various Kubernetes distributions and cloud providers, where different host operating systems introduce compatibility challenges. These operating systems often have to be pre-configured before workload deployment, or they are immutable. For instance, different cloud providers will include different operating systems like:
RHCOS/RHEL
Photon OS
Amazon Linux 2
Container-Optimized OS
Azure Linux OS
And more...
Each OS comes with unique kernel versions, configurations, and drivers, making compatibility a non-trivial issue for applications requiring specific features. It must be possible to quickly assess a container for its suitability to run on any specific environment.
Image compatibility initiative
An effort was made within the Open Container Initiative Image Compatibility working group to introduce a standard for image compatibility metadata. A specification for compatibility would allow container authors to declare required host OS features, making compatibility requirements discoverable and programmable. The specification implemented in Kubernetes Node Feature Discovery is one of the discussed proposals. It aims to:
Define a structured way to express compatibility in OCI image manifests.
Support a compatibility specification alongside container images in image registries.
Allow automated validation of compatibility before scheduling containers.
The concept has since been implemented in the Kubernetes Node Feature Discovery project.
Implementation in Node Feature Discovery
The solution integrates compatibility metadata into Kubernetes via NFD features and the NodeFeatureGroup API. This interface enables the user to match containers to nodes based on the hardware and software features those nodes expose, allowing for intelligent scheduling and workload optimization.
Compatibility specification
The compatibility specification is a structured list of compatibility objects containing Node Feature Groups. These objects define image requirements and facilitate validation against host nodes. The feature requirements are described by using the list of available features from the NFD project. The schema has the following structure:
version (string) - Specifies the API version.
compatibilities (array of objects) - List of compatibility sets.
    rules (object) - Specifies NodeFeatureGroup to define image requirements.
    weight (int, optional) - Node affinity weight.
    tag (string, optional) - Categorization tag.
    description (string, optional) - Short description.
An example might look like the following:
version: v1alpha1
compatibilities:
- description: "My image requirements"
  rules:
  - name: "kernel and cpu"
    matchFeatures:
    - feature: kernel.loadedmodule
      matchExpressions:
        vfio-pci: {op: Exists}
    - feature: cpu.model
      matchExpressions:
        vendor_id: {op: In, value: ["Intel", "AMD"]}
  - name: "one of available nics"
    matchAny:
    - matchFeatures:
      - feature: pci.device
        matchExpressions:
          vendor: {op: In, value: ["0eee"]}
          class: {op: In, value: ["0200"]}
    - matchFeatures:
      - feature: pci.device
        matchExpressions:
          vendor: {op: In, value: ["0fff"]}
          class: {op: In, value: ["0200"]}
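To make the matching semantics of the example more concrete, here is a minimal Python sketch of how such rules could be evaluated against a node. It is a simplified model and not NFD's implementation: the node feature layout, the helper names, and the assumption that every rule must match are illustrative, and only the Exists and In operators from the example are handled.

```python
def expr_matches(attrs: dict, key: str, expr: dict) -> bool:
    """Evaluate one match expression (only the operators used in the example)."""
    op = expr["op"]
    if op == "Exists":
        return key in attrs
    if op == "In":
        return attrs.get(key) in expr["value"]
    raise ValueError(f"unsupported operator: {op}")


def feature_matches(node: dict, matcher: dict) -> bool:
    """A feature matches if one discovered instance satisfies every expression."""
    instances = node.get(matcher["feature"], [])
    return any(
        all(expr_matches(inst, k, e) for k, e in matcher["matchExpressions"].items())
        for inst in instances
    )


def rule_matches(node: dict, rule: dict) -> bool:
    """matchFeatures entries are ANDed; matchAny alternatives are ORed."""
    ok = all(feature_matches(node, m) for m in rule.get("matchFeatures", []))
    alternatives = rule.get("matchAny")
    if alternatives:
        ok = ok and any(
            all(feature_matches(node, m) for m in alt["matchFeatures"])
            for alt in alternatives
        )
    return ok


def nic(vendor: str) -> dict:
    """Build one matchAny alternative for a NIC from the given PCI vendor."""
    return {"matchFeatures": [{"feature": "pci.device", "matchExpressions": {
        "vendor": {"op": "In", "value": [vendor]},
        "class": {"op": "In", "value": ["0200"]},
    }}]}


# The rules from the example specification, written as Python data.
rules = [
    {"name": "kernel and cpu", "matchFeatures": [
        {"feature": "kernel.loadedmodule",
         "matchExpressions": {"vfio-pci": {"op": "Exists"}}},
        {"feature": "cpu.model",
         "matchExpressions": {"vendor_id": {"op": "In", "value": ["Intel", "AMD"]}}},
    ]},
    {"name": "one of available nics", "matchAny": [nic("0eee"), nic("0fff")]},
]

# Hypothetical node: each feature maps to a list of discovered instances.
node = {
    "kernel.loadedmodule": [{"vfio-pci": ""}],
    "cpu.model": [{"vendor_id": "Intel"}],
    "pci.device": [{"vendor": "0fff", "class": "0200"}],
}

print(all(rule_matches(node, r) for r in rules))  # True
```

The node passes because the vfio-pci module is loaded, the CPU vendor is in the allowed list, and its network device satisfies the second matchAny alternative.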
Client implementation for node validation
To streamline compatibility validation, we implemented a client tool that allows for node validation based on an image's compatibility artifact. In this workflow, the image author would generate a compatibility artifact that points to the image it describes in a registry via the referrers API. When a need arises to assess the fit of an image to a host, the tool can discover the artifact and verify compatibility of an image to a node before deployment. The client can validate nodes both inside and outside a Kubernetes cluster, extending the utility of the tool beyond the single Kubernetes use case.
In the future, image compatibility could play a crucial role in creating specific workload profiles based on image compatibility requirements, aiding in more efficient scheduling. Additionally, it could potentially enable automatic node configuration to some extent, further optimizing resource allocation and ensuring seamless deployment of specialized workloads.
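The discovery step can be sketched with the referrers endpoint defined by the OCI Distribution Specification (GET /v2/<name>/referrers/<digest>, with an optional artifactType query parameter). The Python below is not the nfd client's code. The registry host, repository, digests, and helper names are placeholders, and the artifact type is the one used in the oras command in this article. No network request is made; a hand-written sample response stands in for the registry.

```python
from urllib.parse import quote, urlencode

# Artifact type taken from the oras attach command in this article.
ARTIFACT_TYPE = "application/vnd.nfd.image-compatibility.v1alpha1"


def referrers_url(registry: str, repository: str, digest: str, artifact_type: str) -> str:
    """Build the referrers API URL for an image digest, filtered by artifact type."""
    query = urlencode({"artifactType": artifact_type})
    return f"https://{registry}/v2/{repository}/referrers/{quote(digest, safe=':')}?{query}"


def compatibility_artifacts(index: dict, artifact_type: str) -> list[str]:
    """Return digests of referrers with the wanted artifact type.

    Filtering again on the client side covers registries that ignore the
    artifactType query parameter and return every referrer.
    """
    return [
        m["digest"]
        for m in index.get("manifests", [])
        if m.get("artifactType") == artifact_type
    ]


# Hand-written sample of a referrers response (an OCI image index).
sample_index = {
    "schemaVersion": 2,
    "mediaType": "application/vnd.oci.image.index.v1+json",
    "manifests": [
        {"mediaType": "application/vnd.oci.image.manifest.v1+json",
         "digest": "sha256:1111", "size": 123, "artifactType": ARTIFACT_TYPE},
        {"mediaType": "application/vnd.oci.image.manifest.v1+json",
         "digest": "sha256:2222", "size": 456,
         "artifactType": "application/vnd.example.sbom.v1"},
    ],
}

print(referrers_url("registry.example.com", "team/app", "sha256:abc", ARTIFACT_TYPE))
print(compatibility_artifacts(sample_index, ARTIFACT_TYPE))  # ['sha256:1111']
```

A real client would fetch the URL with authentication, then pull the matching artifact and evaluate its rules against the node.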
Examples of usage
Define image compatibility metadata
A container image can have metadata that describes its requirements based on features discovered from nodes, like kernel modules or CPU models. The previous compatibility specification example in this article exemplified this use case.
Attach the artifact to the image
The image compatibility specification is stored as an OCI artifact. You can attach this metadata to your container image using the oras tool. The registry only needs to support OCI artifacts; support for arbitrary types is not required. Keep in mind that the container image and the artifact must be stored in the same registry. Use the following command to attach the artifact to the image:
oras attach \
  --artifact-type application/vnd.nfd.image-compatibility.v1alpha1 <image-url> \
  <path-to-spec>.yaml:application/vnd.nfd.image-compatibility.spec.v1alpha1+yaml
Validate image compatibility
After attaching the compatibility specification, you can validate whether a node meets the image's requirements. This validation can be done using the nfd client:
nfd compat validate-node --image <image-url>
Read the output from the client
Finally, you can read the report generated by the tool or use your own tools to act based on the generated JSON report.
Conclusion
The addition of image compatibility to Kubernetes through Node Feature Discovery underscores the growing importance of addressing compatibility in cloud native environments. It is only a start, as further work is needed to integrate compatibility into scheduling of workloads within and outside of Kubernetes. However, by integrating this feature into Kubernetes, mission-critical workloads can now define and validate host OS requirements more efficiently. Moving forward, the adoption of compatibility metadata within Kubernetes ecosystems will significantly enhance the reliability and performance of specialized containerized applications, ensuring they meet the stringent requirements of industries like telecommunications, high-performance computing or any environment that requires special hardware or host OS configuration.
Get involved
Join the Kubernetes Node Feature Discovery project if you're interested in getting involved with the design and development of the Image Compatibility API and tools. We always welcome new contributors.
via Kubernetes Blog https://kubernetes.io/
June 24, 2025 at 08:00PM
Dear friend, you have built a Kubernetes, with Mac Chaffee
Mac Chaffee, a platform engineer and security champion, examines why developers often underestimate the complexity of running modern applications and how overconfidence leads to expensive technical mistakes.
You will learn:
Why teams reject Kubernetes then rebuild it piece by piece - understanding the psychological factors, like overconfidence, that drive initial rejection of complex but proven tools
How to identify the tipping point when DIY solutions become more complex than adopting established orchestration tools, especially around scaling and high availability challenges
The right approach to abstracting Kubernetes complexity - why hiding the Kubernetes API often backfires and how to build effective guardrails instead of reinventing interfaces
Why mentorship gaps lead to poor technical decisions - how the lack of proper apprenticeship programs in tech results in teams making expensive mistakes when building infrastructure
Sponsor
This episode is sponsored by Learnk8s — get started on your Kubernetes journey through comprehensive online, in-person or remote training.
More info
Find all the links and info for this episode here: https://ku.bz/9nFPmG85f
via KubeFM https://kube.fm
June 24, 2025 at 06:00AM
Ep25 - Ask Me Anything About Anything with Scott Rosenberg
There are no restrictions in this AMA session. You can ask anything about DevOps, Cloud, Kubernetes, Platform Engineering, containers, or anything else. Scott Rosenberg, regular guest, will be here to help us out.
Sponsor: Codefresh. GitOps Argo CD Certifications: https://learning.codefresh.io (use "viktor" for a 50% discount)
via YouTube https://www.youtube.com/watch?v=6UZnp38Txf4