Suggested Reads

The Redis saga continues | Redis is now available under the OSI-approved AGPLv3 open source license.
The rise of hyperscalers like AWS and GCP has unlocked incredible speed and scale for startups and enterprises alike. But for companies rooted in open source, it has posed a fundamental challenge: how do you keep innovating and investing in OSS projects when cloud providers reap the profits and control the infrastructure without proportional contributions […]
·redis.io·
Kubernetes v1.33: New features in DRA

https://kubernetes.io/blog/2025/05/01/kubernetes-v1-33-dra-updates/

Kubernetes Dynamic Resource Allocation (DRA) was originally introduced as an alpha feature in the v1.26 release, and then went through a significant redesign for Kubernetes v1.31. The main DRA feature went to beta in v1.32, and the project hopes it will be generally available in Kubernetes v1.34.

The basic feature set of DRA provides a far more powerful and flexible API for requesting devices than Device Plugin. And while DRA remains a beta feature for v1.33, the DRA team has been hard at work implementing a number of new features and UX improvements. One feature has been promoted to beta, while a number of new features have been added in alpha. The team has also made progress towards getting DRA ready for GA.

Features promoted to beta

Driver-owned Resource Claim Status was promoted to beta. This allows the driver to report driver-specific device status data for each allocated device in a resource claim, which is particularly useful for supporting network devices.

New alpha features

Partitionable Devices lets a driver advertise several overlapping logical devices (“partitions”), and the driver can reconfigure the physical device dynamically based on the actual devices allocated. This makes it possible to partition devices on-demand to meet the needs of the workloads and therefore increase the utilization.

Device Taints and Tolerations allow devices to be tainted and for workloads to tolerate those taints. This makes it possible for drivers or cluster administrators to mark devices as unavailable. Depending on the effect of the taint, this can prevent devices from being allocated or cause eviction of pods that are using the device.

Prioritized List lets users specify a list of acceptable devices for their workloads, rather than just a single type of device. So while the workload might run best on a single high-performance GPU, it might also be able to run on 2 mid-level GPUs. The scheduler will attempt to satisfy the alternatives in the list in order, so the workload will be allocated the best set of devices available in the cluster.

Admin Access has been updated so that only users with access to a namespace carrying the resource.k8s.io/admin-access: "true" label are authorized to create ResourceClaim or ResourceClaimTemplate objects with the adminAccess field within that namespace. The adminAccess field grants administrators access to in-use devices and may enable additional permissions when making the device available in a container. The namespace restriction ensures that non-admin users cannot misuse the feature.
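
The in-order matching described for Prioritized List can be illustrated with a small sketch. This is not the scheduler's algorithm or the DRA API: the Alternative type, the device class names, and the free-device inventory are all hypothetical, chosen only to show the idea of trying each alternative in order and taking the first one that fits.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Alternative:
    """One entry in a prioritized list (hypothetical shape, not the DRA API)."""
    device_class: str
    count: int


def pick_alternative(
    alternatives: list[Alternative], free_devices: dict[str, int]
) -> Optional[Alternative]:
    """Return the first alternative, in list order, that the inventory can satisfy."""
    for alt in alternatives:
        if free_devices.get(alt.device_class, 0) >= alt.count:
            return alt
    return None


# Prefer one high-performance GPU, fall back to two mid-level GPUs.
request = [Alternative("gpu-high", 1), Alternative("gpu-mid", 2)]

# No high-performance GPU is free here, so the second alternative is chosen.
print(pick_alternative(request, {"gpu-high": 0, "gpu-mid": 3}))
```

Because the list is walked in order, the workload gets the most preferred alternative that is actually available, and None signals that nothing in the list can be satisfied.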

Preparing for general availability

A new v1beta2 API has been added to simplify the user experience and to prepare for additional features being added in the future. The RBAC rules for DRA have been improved and support has been added for seamless upgrades of DRA drivers.

What’s next?

The plan for v1.34 is even more ambitious than for v1.33. Most importantly, we (the Kubernetes device management working group) hope to bring DRA to general availability, which will make it available by default on all v1.34 Kubernetes clusters. This also means that many, perhaps all, of the DRA features that are still beta in v1.34 will become enabled by default, making it much easier to use them.

The alpha features that were added in v1.33 will be brought to beta in v1.34.

Getting involved

A good starting point is joining the WG Device Management Slack channel and meetings, which happen at US/EU and EU/APAC friendly time slots.

Not all enhancement ideas are tracked as issues yet, so come talk to us if you want to help or have some ideas yourself! We have work to do at all levels, from difficult core changes to usability enhancements in kubectl, which could be picked up by newcomers.

Acknowledgments

A huge thanks to everyone who has contributed:

Cici Huang (cici37)

Ed Bartosh (bart0sh)

John Belamaric (johnbelamaric)

Jon Huhn (nojnhuh)

Kevin Klues (klueska)

Morten Torkildsen (mortent)

Patrick Ohly (pohly)

Rita Zhang (ritazh)

Shingo Omura (everpeace)

via Kubernetes Blog https://kubernetes.io/

May 01, 2025 at 02:30PM

·kubernetes.io·
Music Assistant
Music Assistant is a music library manager for local and streaming providers
·music-assistant.io·
AirBorne: Wormable Zero-Click RCE in Apple AirPlay Puts Billions of Devices at Risk | Oligo Security
Oligo Security reveals AirBorne, a new set of vulnerabilities in Apple’s AirPlay protocol and SDK. Learn how zero-click RCEs, ACL bypasses, and wormable exploits could endanger Apple and IoT devices worldwide — and how to protect yourself.
·oligo.security·
Kubernetes v1.33: Storage Capacity Scoring of Nodes for Dynamic Provisioning (alpha)

https://kubernetes.io/blog/2025/04/30/kubernetes-v1-33-storage-capacity-scoring-feature/

Kubernetes v1.33 introduces a new alpha feature called StorageCapacityScoring. This feature adds a scoring method for pod scheduling with topology-aware volume provisioning, making it easier to schedule pods on nodes with either the most or the least available storage capacity.

About this feature

This feature extends the kube-scheduler's VolumeBinding plugin to perform scoring using node storage capacity information obtained from Storage Capacity. Without it, the scheduler can only filter out nodes with insufficient storage capacity, so storage-capacity-based pod scheduling requires a scheduler extender.

This feature is useful for provisioning node-local PVs, which have size limits based on the node's storage capacity. By using this feature, you can assign the PVs to the nodes with the most available storage space so that you can expand the PVs later as much as possible.

In another use case, you might want to reduce the number of nodes as much as possible to lower operating costs in cloud environments by choosing the node with the least available storage capacity. This feature helps maximize resource utilization by filling up nodes sequentially, starting with the most utilized nodes that still have enough storage capacity for the requested volume size.

How to use

Enabling the feature

In the alpha phase, StorageCapacityScoring is disabled by default. To use this feature, add StorageCapacityScoring=true to the kube-scheduler command line option --feature-gates.

Configuration changes

You can configure node priorities based on storage utilization using the shape parameter in the VolumeBinding plugin configuration. This allows you to prioritize nodes with higher available storage capacity (default) or, conversely, nodes with lower available storage capacity. For example, to prioritize lower available storage capacity, configure KubeSchedulerConfiguration as follows:

apiVersion: kubescheduler.config.k8s.io/v1
kind: KubeSchedulerConfiguration
profiles:
  ...
  pluginConfig:
  - name: VolumeBinding
    args:
      ...
      shape:
      - utilization: 0
        score: 0
      - utilization: 100
        score: 10

For more details, please refer to the documentation.
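
To make the effect of the shape parameter concrete, here is a minimal sketch of how a list of (utilization, score) points can rank nodes. It assumes the score is interpolated linearly between neighboring shape points; the function name, the node names, and the utilization figures are hypothetical, and this is an illustration rather than the scheduler's own code.

```python
def shape_score(shape: list[tuple[float, float]], utilization: float) -> float:
    """Linearly interpolate a score from (utilization, score) points sorted by utilization."""
    if utilization <= shape[0][0]:
        return shape[0][1]
    for (u0, s0), (u1, s1) in zip(shape, shape[1:]):
        if utilization <= u1:
            return s0 + (s1 - s0) * (utilization - u0) / (u1 - u0)
    return shape[-1][1]


# The shape from the configuration above: fuller nodes score higher.
prefer_full = [(0, 0), (100, 10)]
# The mirrored shape: emptier nodes score higher.
prefer_empty = [(0, 10), (100, 0)]

# Hypothetical nodes, with storage utilization in percent after placing the volume.
nodes = {"node-a": 20.0, "node-b": 75.0}

for name, shape in (("prefer_full", prefer_full), ("prefer_empty", prefer_empty)):
    best = max(nodes, key=lambda n: shape_score(shape, nodes[n]))
    print(name, "->", best)
```

With the first shape the more utilized node wins, which matches the goal of packing volumes onto fewer nodes; mirroring the points favors the node with the most free space instead.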

Further reading

KEP-4049: Storage Capacity Scoring of Nodes for Dynamic Provisioning

Additional note: Relationship with VolumeCapacityPriority

The alpha feature gate VolumeCapacityPriority, which performs node scoring based on available storage capacity during static provisioning, will be deprecated and replaced by StorageCapacityScoring.

Please note that while VolumeCapacityPriority prioritizes nodes with lower available storage capacity by default, StorageCapacityScoring prioritizes nodes with higher available storage capacity by default.

via Kubernetes Blog https://kubernetes.io/

April 30, 2025 at 02:30PM

·kubernetes.io·
DevOps Toolkit - Ep20 - Ask Me Anything About Anything with Scott Rosenberg - https://www.youtube.com/watch?v=nFWGZEI37SA

Ep20 - Ask Me Anything About Anything with Scott Rosenberg

There are no restrictions in this AMA session. You can ask anything about DevOps, Cloud, Kubernetes, Platform Engineering, containers, or anything else. We'll have special guests Scott Rosenberg and Ramiro Berrelleza to help us out.

via YouTube https://www.youtube.com/watch?v=nFWGZEI37SA

·youtube.com·
Last Week in Kubernetes Development - Week Ending April 27 2025

Week Ending April 27, 2025

https://lwkd.info/2025/20250430

Developer News

Benjamin Elder reminded contributors of the changes to the E2E Testing Framework that take effect now. Contributors must use framework.WithFeatureGate(features.YourFeature) for tests related to specific feature gates to ensure proper execution in CI jobs. Tests need to specify both feature gates and cluster configurations.

After 5 long years, SIG-Testing has finally achieved zero hard-coded test skips in pull-kubernetes-e2e-kind and related jobs. This is near parity with pull-kubernetes-e2e-gce (1056 tests vs 1080 tests) in approximately half the runtime (~30m vs ~1h).

Applications for Project Lightning talks, Maintainer’s Track and ContribFest at KubeCon NA 2025 are open! Get your submissions in before 7th July.

Please read and comment on an ongoing discussion about AI-generated contributions to Kubernetes. Several repositories have been receiving AI-generated submissions which look acceptable until carefully reviewed. Younger developers may be more reliant on AI and may not realize that such contributions are unacceptable. Community members are discussing whether we need a more restrictive policy than the Linux Foundation’s.

Release Schedule

Next Deadline: 1.34 Release Cycle Begins – soon

We are in the between-release limbo period, so time to work on whatever you want. That irritating bug, the subproject you’ve been meaning to investigate, a birdhouse, whatever. The call for enhancements will come soon enough.

Featured PRs

131491: kubectl describe service: Add Traffic Distribution

This PR shows the Traffic Distribution field, added in Kubernetes 1.31, in kubectl describe service if the field is set. This makes the field much more accessible and useful to users.

130782: Kubeadm issue #3152 ControlPlane node setup failing with “etcdserver: can only promote a learner member”

This PR fixes a bug in which kubeadm ControlPlane node setup fails with the error “etcdserver: can only promote a learner member”. It adds a check so that promotion is not retried if the member is already promoted, and introduces a call to remove the learner member if the promotion fails entirely.

KEP of the Week

KEP 1769: Speed up recursive SELinux label change

This KEP speeds up volume mounts on SELinux-enforcing systems by using the -o context=XYZ mount option instead of slow recursive relabeling. It has rolled out in three phases: starting with ReadWriteOncePod volumes (v1.28), then adding metrics and an opt-out (v1.32), and finally applying to all volumes by default in 1.33.

Other Merges

Fix for OIDC discovery document publishing when external service account token signing is enabled

hack/update-codegen.sh now automatically ensures goimports and protoc

Deprecated scheduler cache metrics removed

Recovery feature’s status in kubelet now checks for newer resize fields

Fix for the invalid SucceededCriteriaMet condition type in the Job API

Watch handler tests moved to handlers package

Fix for error handling and CSI JSON file removal interaction

Pod resize e2e utilities moved out of e2e/framework

Fix for a possible deadlock in the watch client

Long directory names with e2e pod logs shortened

endpoint-controller and workload-leader-election FlowSchemas removed from the default APF configuration

Fix for the allocatedResourceStatuses Field name mismatch in PVC status validation

scheduler-perf adds option to enable api-server initialization

Kubelet to use the node informer to get the node addresses directly

Fix for a bug in Job controller which could result in creating unnecessary Pods for a finished Job

kube-controller-manager events to support contextual logging

Fix for a bug where NodeResizeError condition was in PVC status when the CSI driver does not support node volume expansion

kubeadm refactoring to reduce code repetition using slice package

Version Updates

google/cel-go to v0.25.0

cri-tools to v1.33.0

mockery to v2.53.3

coredns to v1.12.1

Shoutouts

Ryota: Now that Kubernetes v1.33 is officially out, the Release Team Subteam Leads, rayandas (Docs), Wendy Ha (Release Signal), Dipesh (Enhancements), and Ryota (Comms), want to send a huge shoutout to our amazing Release Lead Nina Polshakova.

via Last Week in Kubernetes Development https://lwkd.info/

April 30, 2025 at 05:00PM

·lwkd.info·
CLOTributor
CLOTributor makes it easier to discover great opportunities to become a Cloud Native contributor.
·clotributor.dev·
Kubernetes v1.33: Image Volumes graduate to beta!

https://kubernetes.io/blog/2025/04/29/kubernetes-v1-33-image-volume-beta/

Image Volumes were introduced as an Alpha feature with the Kubernetes v1.31 release as part of KEP-4639. In Kubernetes v1.33, this feature graduates to beta.

Please note that the feature is still disabled by default, because not all container runtimes fully support it. CRI-O has supported the initial feature since v1.31 and will add support for Image Volumes as beta in v1.33. containerd has merged support for the alpha feature, which will be part of the v2.1.0 release, and is working on beta support as part of PR #11578.

What's new

The major change for the beta graduation of Image Volumes is support for subPath and subPathExpr mounts for containers via spec.containers[*].volumeMounts.[subPath,subPathExpr]. This allows end users to mount a specific subdirectory of an image volume, which is still mounted read-only (noexec). Non-existent subdirectories cannot be mounted by default. As with other subPath and subPathExpr values, Kubernetes ensures that the specified sub path contains no absolute path or relative path components. Container runtimes are also required to double-check those requirements for safety reasons. If a specified subdirectory does not exist within a volume, runtimes should fail on container creation and provide user feedback through existing kubelet events.
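
The kind of sub path check described here can be sketched as follows. This is not the kubelet's or any runtime's validation code; the function name is hypothetical, and the sketch only shows one way to reject absolute paths and parent-directory components.

```python
from pathlib import PurePosixPath


def is_safe_sub_path(sub_path: str) -> bool:
    """Reject absolute paths and any '..' component; accept everything else."""
    path = PurePosixPath(sub_path)
    if path.is_absolute():
        return False
    return ".." not in path.parts


for candidate in ("dir", "dir/nested", "/etc", "../secret", "dir/../../secret"):
    print(f"{candidate!r}: {is_safe_sub_path(candidate)}")
```

Checking path components rather than the raw string catches a ".." hidden in the middle of an otherwise harmless-looking path.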

Besides that, there are also three new kubelet metrics available for image volumes:

kubelet_image_volume_requested_total: Counts the number of requested image volumes.

kubelet_image_volume_mounted_succeed_total: Counts the number of successful image volume mounts.

kubelet_image_volume_mounted_errors_total: Counts the number of failed image volume mounts.

To use an existing subdirectory for a specific image volume, set it as the subPath (or subPathExpr) value of the container's volumeMounts:

apiVersion: v1
kind: Pod
metadata:
  name: image-volume
spec:
  containers:
  - name: shell
    command: ["sleep", "infinity"]
    image: debian
    volumeMounts:
    - name: volume
      mountPath: /volume
      subPath: dir
  volumes:
  - name: volume
    image:
      reference: quay.io/crio/artifact:v2
      pullPolicy: IfNotPresent

Then, create the pod on your cluster:

kubectl apply -f image-volumes-subpath.yaml

Now you can open a shell in the container:

kubectl exec -it image-volume -- bash

And check the content of the file from the dir sub path in the volume:

cat /volume/file

The output will be similar to:

1

Thank you for reading through the end of this blog post! SIG Node is proud and happy to deliver this feature graduation as part of Kubernetes v1.33.

As the writer of this blog post, I would like to give special thanks to everyone involved!

If you would like to provide feedback or suggestions feel free to reach out to SIG Node using the Kubernetes Slack (#sig-node) channel or the SIG Node mailing list.

Further reading

Use an Image Volume With a Pod

image volume overview

via Kubernetes Blog https://kubernetes.io/

April 29, 2025 at 02:30PM

·kubernetes.io·
Rocky Linux Achieves FIPS 140-3 Compliance
Rocky Linux has taken a major leap forward by achieving FIPS 140-3 compliance for versions 8 and 9.2. This achievement makes the already popular...
·linuxsecurity.com·
From Fragile to Faultless: Kubernetes Self-Healing In Practice with Grzegorz Głąb

https://ku.bz/yg_fkP0LN

Discover how to build resilient Kubernetes environments at scale with practical automation strategies from an engineer who's tackled complex production challenges.

Grzegorz Głąb, Kubernetes Engineer at Cloud Kitchens, shares his team's journey developing a comprehensive self-healing framework. He explains how they addressed issues ranging from spot node preemptions to network packet drops caused by unbalanced IRQs, providing concrete examples of automation that prevents downtime and improves reliability.

You will learn:

How managed Kubernetes services like AKS provide benefits but require customization for specific use cases

The architecture of an effective self-healing framework using DaemonSets and deployments with Kubernetes-native components

Practical solutions for common challenges like StatefulSet pods stuck on unreachable nodes and cleaning up orphaned pods

Techniques for workload-level automation, including throttling CPU-hungry pods and automating diagnostic data collection

Sponsor

This episode is sponsored by Learnk8s — get started on your Kubernetes journey through comprehensive online, in-person or remote training.

More info

Find all the links and info for this episode here: https://ku.bz/yg_fkP0LN

via KubeFM https://kube.fm

April 29, 2025 at 06:00AM

·kube.fm·
Only Google Can Run Chrome, Company’s Browser Chief Tells Judge
Google is the only company that can offer the level of features and functionality that its popular Chrome web browser has today, given its “interdependencies” on other parts of the Alphabet Inc. unit, the head of Chrome testified.
·bloomberg.com·