Suggested Reads

The Kubernetes Course 2025
🚀 Welcome to the ultimate Kubernetes Course! Whether you're just starting out or want to level up your Kubernetes skills, this hands-on course walks you thr...
·youtu.be·
Last Week in Kubernetes Development - Week Ending June 22 2025

Week Ending June 22, 2025

https://lwkd.info/2025/20250627

Developer News

Having completed their work, WG-Policy is being archived. Congrats Policy team!

There is an ongoing discussion in the Kubernetes community regarding the Slack migration, and new platform options are currently being evaluated. Please share your thoughts to help shortlist a suitable new platform.

The CFPs for the CNCF-hosted Co-located Events North America 2025 are closing soon. Make sure to submit your proposals by June 30th.

The KubeCon North America 2025 Maintainer Summit CFP is also open. Please submit your sessions by July 20th.

Release Schedule

Next Deadline: Open Doc Placeholders, July 3

With 70 enhancements tracked, it’s time to wrap up work on those changes. The next step is opening a Docs placeholder PR so that the Docs team knows that you’ll be ready by Docs deadline on Jul 29. Didn’t get your Enhancement approved in time? You have until July 7th to request an exception.

Patch releases v1.33.2, 1.32.6, 1.31.10 and 1.30.14 are released, including a security update for Golang. This is likely to be the last patch release for Kubernetes 1.30, so users on that version should plan to upgrade soon.

Featured PRs

132504: Introduce OpenAPI format support for k8s-short-name and k8s-long-name

This PR introduces support for the k8s-short-name and k8s-long-name formats in OpenAPI schema validation for Custom Resource Definitions (CRDs). Because CRD schema validation now recognizes these formats, Kubernetes-native name formats can be applied consistently to CRD fields.
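
As a rough illustration of the kind of check these formats imply, the sketch below assumes that k8s-short-name corresponds to an RFC 1123 DNS label (at most 63 characters) and k8s-long-name to an RFC 1123 DNS subdomain (at most 253 characters), which is how Kubernetes commonly validates object names. That mapping and the helper names are assumptions made for illustration; they are not taken from the PR.

```python
import re

# Assumption: k8s-short-name ~ RFC 1123 DNS label,
#             k8s-long-name  ~ RFC 1123 DNS subdomain.
_LABEL = r"[a-z0-9]([-a-z0-9]*[a-z0-9])?"
_LABEL_RE = re.compile(_LABEL)
_SUBDOMAIN_RE = re.compile(rf"{_LABEL}(\.{_LABEL})*")


def is_short_name(value: str) -> bool:
    """Hypothetical check: lowercase alphanumerics and '-', at most 63 chars."""
    return len(value) <= 63 and _LABEL_RE.fullmatch(value) is not None


def is_long_name(value: str) -> bool:
    """Hypothetical check: dot-separated labels, at most 253 chars overall."""
    return len(value) <= 253 and _SUBDOMAIN_RE.fullmatch(value) is not None
```

Under these assumptions, a value such as my.app.example would pass as a long name but not as a short name, because a label may not contain dots.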

126619: Show namespace on delete

This PR updates the kubectl delete command to include the namespace in its output. Previously, the output could be ambiguous, especially when a single command targeted resources in different namespaces. Showing the namespace explicitly makes it clear which resource was deleted where.

KEP of the Week

KEP 4800: Split UncoreCache Topology Awareness in CPU Manager

This KEP introduced a new static policy prefer-align-cpus-by-uncorecache for the CPU Manager that groups CPU resources by uncore cache where possible. An uncore cache refers to the cache that exists at a shared level among CPU cores. This is primarily beneficial for CPU architectures that utilize multiple uncore caches, or split uncore caches, within the processor.
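
To make the idea of grouping by uncore cache concrete, here is a minimal sketch of a best-effort allocator that prefers to place a request inside a single uncore cache and only spills across caches when it has to. The function names, the `cpu_to_cache` mapping, and the tie-breaking rules are all hypothetical; the kubelet's actual CPU Manager logic is more involved and is not reproduced here.

```python
from collections import defaultdict


def group_by_uncore_cache(cpu_to_cache: dict[int, int]) -> dict[int, list[int]]:
    """Group CPU ids by the (hypothetical) uncore cache id they share."""
    groups = defaultdict(list)
    for cpu, cache in sorted(cpu_to_cache.items()):
        groups[cache].append(cpu)
    return dict(groups)


def allocate(cpu_to_cache: dict[int, int], free: set[int], count: int) -> list[int]:
    """Pick `count` free CPUs, preferring a single uncore cache (best effort)."""
    groups = {
        cache: [cpu for cpu in cpus if cpu in free]
        for cache, cpus in group_by_uncore_cache(cpu_to_cache).items()
    }
    # Preferred case: the smallest cache group that fits the whole request.
    fitting = [cpus for cpus in groups.values() if len(cpus) >= count]
    if fitting:
        return min(fitting, key=len)[:count]
    # Fallback: span as few caches as possible by draining the fullest first.
    picked: list[int] = []
    for cpus in sorted(groups.values(), key=len, reverse=True):
        picked.extend(cpus[: count - len(picked)])
        if len(picked) == count:
            return picked
    raise ValueError("not enough free CPUs")
```

Choosing the smallest group that fits is one possible design: it keeps larger groups intact for later, larger requests. Other packing strategies would be equally valid for a sketch like this.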

This KEP is tracked for beta in v1.34.

Other Merges

Actively poll for namespace termination instead of sleeping

Fix for being able to create custom resources with server-side apply even when their CustomResourceDefinition was terminating

e2e/watchlist test for checking metadata informer

apimachinery/pkg/util/errors to deprecate MessageCountMap

API response for StorageClassList to return a graceful error message if the provided ResourceVersion is too large

MutableCSINodeAllocatableCount storage e2e test refactored to use the Mock CSI driver

omitempty and opt tag added to the API v1beta2 AdminAccess type in the DeviceRequestAllocationResult struct

Job controller now uses controller UID index for pod lookups

ListAll and ListAllByNamespace optimized to return directly when there is nothing to select

Cleanup after alpha feature MountContainers was removed

New runtime.ApplyConfiguration interface added that is implemented by all generated applyconfigs

cloud provider calls in storage/volume_provisioning.go removed

Usage of deprecated function ExtractCommentTags migrated to ExtractFunctionStyleCommentTags

Delay added to node updates after kubelet startup

Conntrack reconciler now considers service’s target port during cleanup of stale flow entries

kube-scheduler: Apply EnablePlugins to CoreResourceEnqueueTestCases

etcd server overrides to etcd probe factory for healthz and readyz

endpointsleases and configmapsleases options removed from leader-elect-resource-lock in LeaderElectionConfiguration

Deprecated --register-schedulable command line argument removed from the kubelet

Promotions

JobPodReplacementPolicy to GA

Subprojects and Dependency Updates

containerd v2.1.3: fixes registry fetch and transfer service issues

cluster-api v1.11.0-alpha.1: releases alpha version for testing

Shoutouts

Josh Berkus (@jberkus): Kudos to Mario Fahlandt (@Mario Fahlandt) for figuring out how to back up private channels from Slack.

via Last Week in Kubernetes Development https://lwkd.info/

June 27, 2025 at 09:08AM

·lwkd.info·
On with the show

https://anonymoushash.vmbrasseur.com/2025/06/on-with-the-show.html

Well, that didn’t work out as hoped, but unfortunately it did work out as expected.

I was legitimately excited by the opportunity and potential for open source and open collaboration in the digital agriculture space. I still am, to be honest. Not only is it good for humanity, there are a lot of fantastically strong business strategies that it could enable.

So when I noted some red flags during the interview process, that opportunity and potential were great enough that I made the informed choice to move forward with the job anyway. There were still far more unknowns than knowns and it was possible (and maybe even likely) that the red flags would turn out to be false alarms.

As you’ve already figured out, that wasn’t the case. First the CEO and other execs were shown the door in December, then the interim leadership made a number of large organisational changes. These changes included cutting a very significant percentage of total headcount, myself included. It’s little comfort knowing that I was in good company when the company laid me off in April, but little is better than none. At least it wasn’t personal.

In the past two months I’ve caught up on sleep, made my yearly pilgrimage to Montreal, hosted guests, and done a little work for the fantastic folks at Open Robotics. I’m feeling much better following my months at Semios and can once again turn my mind to all things strategy, operations, business, and open source.

I’m considering my options for what to do next and having conversations with folks to clarify that direction. Will I remain in the corporate world, or is it finally time for me to dedicate myself fully to the nonprofit space? Do I stay with open source or will I apply my experience and strategic skills more broadly?

I’m figuring out the answers to these and other questions. If you’d like to be a part of those conversations—or if you’d just like to catch up or say howdy—drop me a line. I’d welcome the chance to chat.

via {anonymous => 'hash'}; https://anonymoushash.vmbrasseur.com/

June 26, 2025 at 03:00AM

·anonymoushash.vmbrasseur.com·
LLM Bias Towards Helpfulness
I've been using Cursor for coding tasks lately, trying to explore what kinds of work it performs well and poorly. It's pretty good at most simple tasks. It's good-to-okay at some complicated tasks. But then, some simple tasks can stump it – usually in very niche domains. Other tasks are so big that they need to be…
·ashfurrow.com·
In Praise of “Normal” Engineers
This article was originally commissioned by Luca Rossi (paywalled) for refactoring.fm, on February 11th, 2025. Luca edited a version of it that emphasized the importance of building “10x engi…
·charity.wtf·
Trump administration scrambles to rehire key federal workers after DOGE firings | CNN Politics
Federal agencies are rehiring and ordering back from leave some of the employees who were laid off in the weeks after President Donald Trump took office as they scramble to fill critical gaps in services left by the Department of Government Efficiency-led effort to shrink the federal workforce.
·cnn.com·
Image Compatibility In Cloud Native Environments

https://kubernetes.io/blog/2025/06/25/image-compatibility-in-cloud-native-environments/

In industries where systems must run very reliably and meet strict performance criteria, such as telecommunications, high-performance computing, or AI computing, containerized applications often need a specific operating system configuration or the presence of specific hardware. It is common practice to require specific versions of the kernel, its configuration, device drivers, or system components. Despite the existence of the Open Container Initiative (OCI), a governing community that defines standards and specifications for container images, there has been a gap in expressing such compatibility requirements. The need to address this issue has led to different proposals and, ultimately, an implementation in Kubernetes' Node Feature Discovery (NFD).

NFD is an open source Kubernetes project that automatically detects and reports hardware and system features of cluster nodes. This information helps users to schedule workloads on nodes that meet specific system requirements, which is especially useful for applications with strict hardware or operating system dependencies.

The need for image compatibility specification

Dependencies between containers and host OS

A container image is built on a base image, which provides a minimal runtime environment, often a stripped-down Linux userland, completely empty or distroless. When an application requires certain features from the host OS, compatibility issues arise. These dependencies can manifest in several ways:

Drivers: Host driver versions must match the supported range of a library version inside the container to avoid compatibility problems. Examples include GPUs and network drivers.

Libraries or Software: The container must come with a specific version or range of versions for a library or software to run optimally in the environment. Examples from high performance computing are MPI, EFA, or Infiniband.

Kernel Modules or Features: Specific kernel features or modules must be present. Examples include support for write-protected huge page faults or the presence of VFIO.

And more…

While containers in Kubernetes are the most likely unit of abstraction for these needs, the definition of compatibility can extend further to include other container technologies such as Singularity and other OCI artifacts such as binaries from a spack binary cache.

Multi-cloud and hybrid cloud challenges

Containerized applications are deployed across various Kubernetes distributions and cloud providers, where different host operating systems introduce compatibility challenges. These host operating systems often have to be pre-configured before workload deployment, or they are immutable. For instance, different cloud providers offer different operating systems, such as:

RHCOS/RHEL

Photon OS

Amazon Linux 2

Container-Optimized OS

Azure Linux OS

And more...

Each OS comes with unique kernel versions, configurations, and drivers, making compatibility a non-trivial issue for applications requiring specific features. It must be possible to quickly assess a container for its suitability to run on any specific environment.

Image compatibility initiative

An effort was made within the Open Container Initiative Image Compatibility working group to introduce a standard for image compatibility metadata. A specification for compatibility would allow container authors to declare required host OS features, making compatibility requirements discoverable and programmable. The specification implemented in Kubernetes Node Feature Discovery is one of the discussed proposals. It aims to:

Define a structured way to express compatibility in OCI image manifests.

Support a compatibility specification alongside container images in image registries.

Allow automated validation of compatibility before scheduling containers.

The concept has since been implemented in the Kubernetes Node Feature Discovery project.

Implementation in Node Feature Discovery

The solution integrates compatibility metadata into Kubernetes via NFD features and the NodeFeatureGroup API. This interface lets the user match containers to nodes based on the hardware and software features that the nodes expose, allowing for intelligent scheduling and workload optimization.

Compatibility specification

The compatibility specification is a structured list of compatibility objects containing Node Feature Groups. These objects define image requirements and facilitate validation against host nodes. The feature requirements are described by using the list of available features from the NFD project. The schema has the following structure:

version (string) - Specifies the API version.

compatibilities (array of objects) - List of compatibility sets. Each object has the following fields:

  rules (object) - Specifies NodeFeatureGroup to define image requirements.

  weight (int, optional) - Node affinity weight.

  tag (string, optional) - Categorization tag.

  description (string, optional) - Short description.

An example might look like the following:

version: v1alpha1
compatibilities:
- description: "My image requirements"
  rules:
  - name: "kernel and cpu"
    matchFeatures:
    - feature: kernel.loadedmodule
      matchExpressions:
        vfio-pci: {op: Exists}
    - feature: cpu.model
      matchExpressions:
        vendor_id: {op: In, value: ["Intel", "AMD"]}
  - name: "one of available nics"
    matchAny:
    - matchFeatures:
      - feature: pci.device
        matchExpressions:
          vendor: {op: In, value: ["0eee"]}
          class: {op: In, value: ["0200"]}
    - matchFeatures:
      - feature: pci.device
        matchExpressions:
          vendor: {op: In, value: ["0fff"]}
          class: {op: In, value: ["0200"]}
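
To show how rules like the ones in the example specification could be read, the following sketch evaluates them against a node's discovered features. It is only an illustration under stated assumptions: the node data model (each feature name mapping to a list of attribute dictionaries), the function names, and the support for just the Exists and In operators are hypothetical simplifications, not the NFD implementation.

```python
# The rules from the example specification, written as plain Python data.
RULES = [
    {"name": "kernel and cpu", "matchFeatures": [
        {"feature": "kernel.loadedmodule",
         "matchExpressions": {"vfio-pci": {"op": "Exists"}}},
        {"feature": "cpu.model",
         "matchExpressions": {"vendor_id": {"op": "In", "value": ["Intel", "AMD"]}}},
    ]},
    {"name": "one of available nics", "matchAny": [
        {"matchFeatures": [{"feature": "pci.device", "matchExpressions": {
            "vendor": {"op": "In", "value": ["0eee"]},
            "class": {"op": "In", "value": ["0200"]}}}]},
        {"matchFeatures": [{"feature": "pci.device", "matchExpressions": {
            "vendor": {"op": "In", "value": ["0fff"]},
            "class": {"op": "In", "value": ["0200"]}}}]},
    ]},
]


def match_expression(value, expr):
    """Evaluate one expression; only 'Exists' and 'In' are sketched here."""
    if expr["op"] == "Exists":
        return value is not None
    if expr["op"] == "In":
        return value in expr["value"]
    raise ValueError(f"unsupported op: {expr['op']}")


def match_feature(node, term):
    """A term matches if one instance of the feature satisfies every expression."""
    instances = node.get(term["feature"], [])
    return any(
        all(match_expression(inst.get(key), expr)
            for key, expr in term["matchExpressions"].items())
        for inst in instances
    )


def match_rule(node, rule):
    """matchFeatures terms must all hold; at least one matchAny alternative must hold."""
    if not all(match_feature(node, t) for t in rule.get("matchFeatures", [])):
        return False
    if "matchAny" in rule:
        return any(
            all(match_feature(node, t) for t in alt["matchFeatures"])
            for alt in rule["matchAny"]
        )
    return True


def node_is_compatible(node, rules=RULES):
    """A node is compatible only if every rule matches."""
    return all(match_rule(node, rule) for rule in rules)
```

A node described as `{"kernel.loadedmodule": [{"vfio-pci": True}], "cpu.model": [{"vendor_id": "AMD"}], "pci.device": [{"vendor": "0fff", "class": "0200"}]}` would satisfy both rules in this sketch, since the second matchAny alternative matches its network device.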

Client implementation for node validation

To streamline compatibility validation, we implemented a client tool that allows for node validation based on an image's compatibility artifact. In this workflow, the image author would generate a compatibility artifact that points to the image it describes in a registry via the referrers API. When a need arises to assess the fit of an image to a host, the tool can discover the artifact and verify compatibility of an image to a node before deployment. The client can validate nodes both inside and outside a Kubernetes cluster, extending the utility of the tool beyond the single Kubernetes use case. In the future, image compatibility could play a crucial role in creating specific workload profiles based on image compatibility requirements, aiding in more efficient scheduling. Additionally, it could potentially enable automatic node configuration to some extent, further optimizing resource allocation and ensuring seamless deployment of specialized workloads.

Examples of usage

Define image compatibility metadata

A container image can have metadata that describes its requirements based on features discovered from nodes, like kernel modules or CPU models. The previous compatibility specification example in this article exemplified this use case.

Attach the artifact to the image

The image compatibility specification is stored as an OCI artifact. You can attach this metadata to your container image using the oras tool. The registry only needs to support OCI artifacts; support for arbitrary types is not required. Keep in mind that the container image and the artifact must be stored in the same registry. Use the following command to attach the artifact to the image:

oras attach \
  --artifact-type application/vnd.nfd.image-compatibility.v1alpha1 <image-url> \
  <path-to-spec>.yaml:application/vnd.nfd.image-compatibility.spec.v1alpha1+yaml
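
If the attach step is scripted, the command above can be assembled programmatically. The sketch below only builds the argument list from the two placeholders in the command; the function name and the example image URL and path are hypothetical, and actually running it assumes the oras CLI is installed and logged in to the registry.

```python
# Media types copied from the oras command above.
ARTIFACT_TYPE = "application/vnd.nfd.image-compatibility.v1alpha1"
SPEC_MEDIA_TYPE = "application/vnd.nfd.image-compatibility.spec.v1alpha1+yaml"


def oras_attach_argv(image_url: str, spec_path: str) -> list[str]:
    """Build the argv for attaching a compatibility spec to an image."""
    return [
        "oras", "attach",
        "--artifact-type", ARTIFACT_TYPE,
        image_url,
        f"{spec_path}:{SPEC_MEDIA_TYPE}",  # file path and its media type
    ]


# Hypothetical values, for illustration only.
argv = oras_attach_argv("registry.example.com/app:1.0", "compat-spec.yaml")
```

The resulting list could then be passed to `subprocess.run(argv, check=True)`; keeping it as a list rather than a shell string avoids quoting problems with the angle-bracket-free arguments.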

Validate image compatibility

After attaching the compatibility specification, you can validate whether a node meets the image's requirements. This validation can be done using the nfd client:

nfd compat validate-node --image <image-url>

Read the output from the client

Finally, you can read the report generated by the tool, or use your own tools to act on the generated JSON report.

Conclusion

The addition of image compatibility to Kubernetes through Node Feature Discovery underscores the growing importance of addressing compatibility in cloud native environments. It is only a start, as further work is needed to integrate compatibility into scheduling of workloads within and outside of Kubernetes. However, by integrating this feature into Kubernetes, mission-critical workloads can now define and validate host OS requirements more efficiently. Moving forward, the adoption of compatibility metadata within Kubernetes ecosystems will significantly enhance the reliability and performance of specialized containerized applications, ensuring they meet the stringent requirements of industries like telecommunications, high-performance computing or any environment that requires special hardware or host OS configuration.

Get involved

Join the Kubernetes Node Feature Discovery project if you're interested in getting involved with the design and development of the Image Compatibility API and tools. We always welcome new contributors.

via Kubernetes Blog https://kubernetes.io/

June 24, 2025 at 08:00PM

·kubernetes.io·
OpenELA’s Automated Process Delivers Rapid and Reliable Access to Enterprise Linux Sources
OpenELA provides a comprehensive resource for ISVs, IHVs, processor manufacturers, and independent developers building downstream enterprise Linux distributions, with sources available within a few days of RHEL releases RENO, Nev., AUSTIN, Texas, and LUXEMBOURG—July 12, 2024— The Open Enterprise Linux Association (OpenELA) has launched an automated process to make new enterprise Linux sources available just days after each release of new versions of Red Hat Enterprise Linux (RHEL). Packages for the most recent such releases —RHEL 9.
·openela.org·
Dear friend, you have built a Kubernetes, with Mac Chaffee

https://ku.bz/9nFPmG85f

Mac Chaffee, a platform engineer and security champion, examines why developers often underestimate the complexity of running modern applications and how overconfidence leads to expensive technical mistakes.

You will learn:

Why teams reject Kubernetes then rebuild it piece by piece - understanding the psychological factors, like overconfidence, that drive initial rejection of complex but proven tools

How to identify the tipping point when DIY solutions become more complex than adopting established orchestration tools, especially around scaling and high availability challenges

The right approach to abstracting Kubernetes complexity - why hiding the Kubernetes API often backfires and how to build effective guardrails instead of reinventing interfaces

Why mentorship gaps lead to poor technical decisions - how the lack of proper apprenticeship programs in tech results in teams making expensive mistakes when building infrastructure

Sponsor

This episode is sponsored by Learnk8s — get started on your Kubernetes journey through comprehensive online, in-person or remote training.

More info

Find all the links and info for this episode here: https://ku.bz/9nFPmG85f

Interested in sponsoring an episode? Learn more.

via KubeFM https://kube.fm

June 24, 2025 at 06:00AM

·kube.fm·
DevOps Toolkit - Ep25 - Ask Me Anything About Anything with Scott Rosenberg - https://www.youtube.com/watch?v=6UZnp38Txf4

Ep25 - Ask Me Anything About Anything with Scott Rosenberg

There are no restrictions in this AMA session. You can ask anything about DevOps, Cloud, Kubernetes, Platform Engineering, containers, or anything else. Scott Rosenberg, regular guest, will be here to help us out.

Sponsor: Codefresh
GitOps Argo CD Certifications: https://learning.codefresh.io (use "viktor" for a 50% discount)

Contact me
BlueSky: https://vfarcic.bsky.social
LinkedIn: https://www.linkedin.com/in/viktorfarcic/

Other Channels
Podcast: https://www.devopsparadox.com/
Live streams: https://www.youtube.com/c/DevOpsParadox

via YouTube https://www.youtube.com/watch?v=6UZnp38Txf4

·youtube.com·