
Suggested Reads
Kubernetes v1.33: New features in DRA
https://kubernetes.io/blog/2025/05/01/kubernetes-v1-33-dra-updates/
Kubernetes Dynamic Resource Allocation (DRA) was originally introduced as an alpha feature in the v1.26 release, and then went through a significant redesign for Kubernetes v1.31. The main DRA feature went to beta in v1.32, and the project hopes it will be generally available in Kubernetes v1.34.
The basic feature set of DRA provides a far more powerful and flexible API for requesting devices than the Device Plugin API. And while DRA remains a beta feature in v1.33, the DRA team has been hard at work implementing a number of new features and UX improvements. One feature has been promoted to beta, while a number of new features have been added in alpha. The team has also made progress towards getting DRA ready for GA.
Features promoted to beta
Driver-owned Resource Claim Status was promoted to beta. This allows the driver to report driver-specific device status data for each allocated device in a resource claim, which is particularly useful for supporting network devices.
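As a rough sketch of what a driver might report (the driver, pool, and device names below are invented and the status is abbreviated; field names follow the resource.k8s.io v1beta1 API, so check the current reference before relying on them), a network DRA driver could populate an allocated claim's status like this:

status:
  devices:
  - driver: net.example.com   # hypothetical network DRA driver
    pool: pool-1
    device: nic-0
    networkData:
      interfaceName: eth1
      ips:
      - 10.0.0.7/24
      hardwareAddress: 0a:58:0a:00:00:07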
New alpha features
Partitionable Devices lets a driver advertise several overlapping logical devices (“partitions”), and the driver can reconfigure the physical device dynamically based on the actual devices allocated. This makes it possible to partition devices on demand to meet the needs of workloads, thereby increasing utilization.
Device Taints and Tolerations allow devices to be tainted and for workloads to tolerate those taints. This makes it possible for drivers or cluster administrators to mark devices as unavailable. Depending on the effect of the taint, this can prevent devices from being allocated or cause eviction of pods that are using the device.
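As an illustrative sketch only (the rule name, driver, and taint key are invented, and this alpha API may change; verify field names against the current API reference), a cluster administrator might mark a failing device as unschedulable like this:

apiVersion: resource.k8s.io/v1alpha3
kind: DeviceTaintRule
metadata:
  name: gpu-0-unhealthy        # hypothetical rule name
spec:
  deviceSelector:
    driver: gpu.example.com    # hypothetical driver
    pool: worker-1-pool
    device: gpu-0
  taint:
    key: example.com/unhealthy # hypothetical key
    effect: NoSchedule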
Prioritized List lets users specify a list of acceptable devices for their workloads, rather than just a single type of device. So while the workload might run best on a single high-performance GPU, it might also be able to run on 2 mid-level GPUs. The scheduler will attempt to satisfy the alternatives in the list in order, so the workload will be allocated the best set of devices available in the cluster.
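A hedged sketch of such a request (the device class names are invented, and the firstAvailable field comes from the alpha API, so verify it against the current reference): the scheduler would try the single large GPU first and fall back to two mid-level GPUs:

apiVersion: resource.k8s.io/v1beta2
kind: ResourceClaim
metadata:
  name: gpu-claim                              # hypothetical name
spec:
  devices:
    requests:
    - name: gpu
      firstAvailable:
      - name: large
        deviceClassName: large-gpu.example.com # hypothetical class
        count: 1
      - name: mid
        deviceClassName: mid-gpu.example.com   # hypothetical class
        count: 2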
Admin Access has been updated so that only users with access to a namespace carrying the resource.k8s.io/admin-access: "true" label are authorized to create ResourceClaim or ResourceClaimTemplate objects with the adminAccess field in that namespace. This grants administrators access to in-use devices and may enable additional permissions when making the device available in a container, and the label requirement ensures that non-admin users cannot misuse the feature.
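In practice, opting a namespace into this feature is just a matter of setting the label described above (the namespace name here is arbitrary):

apiVersion: v1
kind: Namespace
metadata:
  name: dra-admin                     # arbitrary example name
  labels:
    resource.k8s.io/admin-access: "true"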
Preparing for general availability
A new v1beta2 API has been added to simplify the user experience and to prepare for additional features being added in the future. The RBAC rules for DRA have been improved and support has been added for seamless upgrades of DRA drivers.
What’s next?
The plan for v1.34 is even more ambitious than for v1.33. Most importantly, we (the Kubernetes device management working group) hope to bring DRA to general availability, which will make it available by default on all v1.34 Kubernetes clusters. This also means that many, perhaps all, of the DRA features that are still beta in v1.34 will become enabled by default, making it much easier to use them.
The alpha features that were added in v1.33 will be brought to beta in v1.34.
Getting involved
A good starting point is joining the WG Device Management Slack channel and meetings, which happen at US/EU and EU/APAC friendly time slots.
Not all enhancement ideas are tracked as issues yet, so come talk to us if you want to help or have some ideas yourself! We have work to do at all levels, from difficult core changes to usability enhancements in kubectl, which could be picked up by newcomers.
Acknowledgments
A huge thanks to everyone who has contributed:
Cici Huang (cici37)
Ed Bartosh (bart0sh)
John Belamaric (johnbelamaric)
Jon Huhn (nojnhuh)
Kevin Klues (klueska)
Morten Torkildsen (mortent)
Patrick Ohly (pohly)
Rita Zhang (ritazh)
Shingo Omura (everpeace)
via Kubernetes Blog https://kubernetes.io/
May 01, 2025 at 02:30PM
Kubernetes v1.33: Storage Capacity Scoring of Nodes for Dynamic Provisioning (alpha)
https://kubernetes.io/blog/2025/04/30/kubernetes-v1-33-storage-capacity-scoring-feature/
Kubernetes v1.33 introduces a new alpha feature called StorageCapacityScoring. This feature adds a scoring method to pod scheduling for topology-aware volume provisioning, making it easier to schedule pods onto nodes with either the most or the least available storage capacity.
About this feature
This feature extends the kube-scheduler's VolumeBinding plugin to perform scoring using node storage capacity information obtained from Storage Capacity resources. Previously, the plugin could only filter out nodes with insufficient storage capacity, so you had to use a scheduler extender to achieve storage-capacity-based pod scheduling.
This feature is useful for provisioning node-local PVs, which have size limits based on the node's storage capacity. By using this feature, you can assign the PVs to the nodes with the most available storage space so that you can expand the PVs later as much as possible.
In another use case, you might want to keep the number of nodes as small as possible for low operating costs in cloud environments by choosing the node with the least available storage capacity. This feature helps maximize resource utilization by filling up nodes more sequentially, starting with the most utilized nodes that still have enough storage capacity for the requested volume size.
How to use
Enabling the feature
In the alpha phase, StorageCapacityScoring is disabled by default. To use this feature, add StorageCapacityScoring=true to the kube-scheduler command line option --feature-gates.
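For example, if you manage the kube-scheduler command line directly (the same flag goes into the command list of a static Pod manifest), the invocation would look something like this, with all other flags omitted:

kube-scheduler --feature-gates=StorageCapacityScoring=true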
Configuration changes
You can configure node priorities based on storage utilization using the shape parameter in the VolumeBinding plugin configuration. This allows you to prioritize nodes with higher available storage capacity (default) or, conversely, nodes with lower available storage capacity. For example, to prioritize lower available storage capacity, configure KubeSchedulerConfiguration as follows:
apiVersion: kubescheduler.config.k8s.io/v1
kind: KubeSchedulerConfiguration
profiles:
...
  pluginConfig:
  - name: VolumeBinding
    args:
      ...
      shape:
      - utilization: 0
        score: 0
      - utilization: 100
        score: 10
For more details, please refer to the documentation.
Further reading
KEP-4049: Storage Capacity Scoring of Nodes for Dynamic Provisioning
Additional note: Relationship with VolumeCapacityPriority
The alpha feature gate VolumeCapacityPriority, which performs node scoring based on available storage capacity during static provisioning, will be deprecated and replaced by StorageCapacityScoring.
Please note that while VolumeCapacityPriority prioritizes nodes with lower available storage capacity by default, StorageCapacityScoring prioritizes nodes with higher available storage capacity by default.
via Kubernetes Blog https://kubernetes.io/
April 30, 2025 at 02:30PM
Ep20 - Ask Me Anything About Anything with Scott Rosenberg
There are no restrictions in this AMA session. You can ask anything about DevOps, Cloud, Kubernetes, Platform Engineering, containers, or anything else. We'll have special guests Scott Rosenberg and Ramiro Berrelleza to help us out.
▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬ Sponsor: Codefresh 🔗 Codefresh GitOps Cloud: https://codefresh.io ▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬
▬▬▬▬▬▬ 👋 Contact me 👋 ▬▬▬▬▬▬ ➡ BlueSky: https://vfarcic.bsky.social ➡ LinkedIn: https://www.linkedin.com/in/viktorfarcic/
▬▬▬▬▬▬ 🚀 Other Channels 🚀 ▬▬▬▬▬▬ 🎤 Podcast: https://www.devopsparadox.com/ 💬 Live streams: https://www.youtube.com/c/DevOpsParadox
via YouTube https://www.youtube.com/watch?v=nFWGZEI37SA
Week Ending April 27, 2025
https://lwkd.info/2025/20250430
Developer News
Benjamin Elder reminded contributors of the changes to the E2E Testing Framework that take effect now. Contributors must use framework.WithFeatureGate(features.YourFeature) for tests related to specific feature gates to ensure proper execution in CI jobs. Tests need to specify both feature gates and cluster configurations.
After 5 long years, SIG-Testing has finally achieved zero hard-coded test skips in pull-kubernetes-e2e-kind and related jobs. This is near parity with pull-kubernetes-e2e-gce (1056 tests vs 1080 tests) in approximately half the runtime (~30m vs ~1h).
Applications for Project Lightning Talks, the Maintainer Track, and ContribFest at KubeCon NA 2025 are open! Get your submissions in before 7th July.
Please read and comment on an ongoing discussion about AI-generated contributions to Kubernetes. Several repositories have been receiving AI-generated submissions which look acceptable until carefully reviewed. Younger developers may be more reliant on AI and may not realize that such contributions are unacceptable. Community members are discussing whether we need a more restrictive policy than the Linux Foundation’s.
Release Schedule
Next Deadline: 1.34 Release Cycle Begins – soon
We are in the between-release limbo period, so time to work on whatever you want. That irritating bug, the subproject you’ve been meaning to investigate, a birdhouse, whatever. The call for enhancements will come soon enough.
Featured PRs
131491: kubectl describe service: Add Traffic Distribution
This PR shows the Traffic Distribution field, added in Kubernetes 1.31, in kubectl describe service if the field is set. This makes the field much more accessible and useful to users.
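For illustration, a Service using the field might look like this (the service name and selector are made up; trafficDistribution and its PreferClose value are part of the upstream Service API):

apiVersion: v1
kind: Service
metadata:
  name: my-service        # hypothetical name
spec:
  selector:
    app: my-app           # hypothetical selector
  ports:
  - port: 80
  trafficDistribution: PreferClose

With this PR, kubectl describe service my-service additionally prints a Traffic Distribution line (here, PreferClose).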
130782: Kubeadm issue #3152 ControlPlane node setup failing with “etcdserver: can only promote a learner member”
This PR fixes a bug in kubeadm where control plane node setup fails with the error “etcdserver: can only promote a learner member”. It adds a check so that promotion is not retried if the member is already promoted, and introduces a call to remove the learner member if the promotion fails entirely.
KEP of the Week
KEP 1769: Speed up recursive SELinux label change
This KEP speeds up volume mounts on SELinux-enforcing systems by using the -o context=XYZ mount option instead of slow recursive relabeling. It has been rolled out in three phases: starting with ReadWriteOncePod volumes (v1.28), then adding metrics and an opt-out (v1.32), and finally applying to all volumes by default in v1.33.
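Roughly how the fast path is triggered (the pod and claim names below are invented; the kubelet, not the user, applies the mount option): a pod that pins its SELinux label lets the kubelet mount the volume with that context directly instead of relabeling every file:

apiVersion: v1
kind: Pod
metadata:
  name: selinux-app                  # hypothetical name
spec:
  securityContext:
    seLinuxOptions:
      level: "s0:c123,c456"          # pinned label enables the context= mount
  containers:
  - name: app
    image: registry.k8s.io/pause:3.10
    volumeMounts:
    - name: data
      mountPath: /data
  volumes:
  - name: data
    persistentVolumeClaim:
      claimName: my-pvc              # hypothetical claim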
Other Merges
Fix for OIDC discovery document publishing when external service account token signing is enabled
hack/update-codegen.sh now automatically ensures goimports and protoc
Deprecated scheduler cache metrics removed
Recovery feature’s status in kubelet now checks for newer resize fields
Fix for the invalid SucceededCriteriaMet condition type in the Job API
Watch handler tests moved to handlers package
Fix for error handling and CSI JSON file removal interaction
Pod resize e2e utilities moved out of e2e/framework
Fix for a possible deadlock in the watch client
Long directory names with e2e pod logs shortened
endpoint-controller and workload-leader-election FlowSchemas removed from the default APF configuration
Fix for the allocatedResourceStatuses Field name mismatch in PVC status validation
scheduler-perf adds option to enable api-server initialization
Kubelet to use the node informer to get the node addresses directly
Fix for a bug in Job controller which could result in creating unnecessary Pods for a finished Job
kube-controller-manager events to support contextual logging
Fix for a bug where NodeResizeError condition was in PVC status when the CSI driver does not support node volume expansion
kubeadm refactoring to reduce code repetition using slice package
Version Updates
google/cel-go to v0.25.0
cri-tools to v1.33.0
mockery to v2.53.3
coredns to v1.12.1
Shoutouts
Ryota: Now that Kubernetes v1.33 is officially out, the Release Team Subteam Leads — rayandas (Docs), Wendy Ha (Release Signal), Dipesh (Enhancements), and Ryota (Comms) — want to send a huge shoutout to our amazing Release Lead Nina Polshakova!
via Last Week in Kubernetes Development https://lwkd.info/
April 30, 2025 at 05:00PM
Kubernetes v1.33: Image Volumes graduate to beta!
https://kubernetes.io/blog/2025/04/29/kubernetes-v1-33-image-volume-beta/
Image Volumes were introduced as an Alpha feature with the Kubernetes v1.31 release as part of KEP-4639. In Kubernetes v1.33, this feature graduates to beta.
Please note that the feature is still disabled by default, because not all container runtimes have full support for it. CRI-O has supported the initial feature since v1.31 and will ship beta support for Image Volumes in v1.33. containerd has merged support for the alpha feature, which will be part of its v2.1.0 release, and is working on beta support as part of PR #11578.
What's new
The major change for the beta graduation of Image Volumes is support for subPath and subPathExpr mounts for containers via spec.containers[*].volumeMounts.[subPath,subPathExpr]. This allows end users to mount a specific subdirectory of an image volume. The volume is still mounted read-only (and noexec), which means that non-existent subdirectories cannot be created on demand and therefore cannot be mounted by default. As with other subPath and subPathExpr values, Kubernetes ensures that the specified sub path contains no absolute paths or relative path components, and container runtimes are required to double-check those requirements for safety reasons. If a specified subdirectory does not exist within a volume, runtimes should fail on container creation and provide user feedback through existing kubelet events.
Besides that, there are also three new kubelet metrics available for image volumes:
kubelet_image_volume_requested_total: Reports the number of requested image volumes.
kubelet_image_volume_mounted_succeed_total: Counts the number of successful image volume mounts.
kubelet_image_volume_mounted_errors_total: Counts the number of failed image volume mounts.
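One way to inspect these counters on a running node is through the kubelet's metrics endpoint via the API server proxy (assuming your RBAC allows the nodes/proxy subresource; NODE_NAME is a placeholder):

kubectl get --raw "/api/v1/nodes/NODE_NAME/proxy/metrics" | grep kubelet_image_volume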
To use an existing subdirectory for a specific image volume, just use it as subPath (or subPathExpr) value of the containers volumeMounts:
apiVersion: v1
kind: Pod
metadata:
  name: image-volume
spec:
  containers:
  - name: shell
    command: ["sleep", "infinity"]
    image: debian
    volumeMounts:
    - name: volume
      mountPath: /volume
      subPath: dir
  volumes:
  - name: volume
    image:
      reference: quay.io/crio/artifact:v2
      pullPolicy: IfNotPresent
Then, create the pod on your cluster:
kubectl apply -f image-volumes-subpath.yaml
Now you can open a shell in the container:
kubectl exec -it image-volume -- bash
And check the content of the file from the dir sub path in the volume:
cat /volume/file
The output will be similar to:
1
Thank you for reading to the end of this blog post! SIG Node is proud and happy to deliver this feature graduation as part of Kubernetes v1.33.
As the writer of this blog post, I would like to extend my special thanks to all the individuals involved!
If you would like to provide feedback or suggestions feel free to reach out to SIG Node using the Kubernetes Slack (#sig-node) channel or the SIG Node mailing list.
Further reading
Use an Image Volume With a Pod
image volume overview
via Kubernetes Blog https://kubernetes.io/
April 29, 2025 at 02:30PM
From Fragile to Faultless: Kubernetes Self-Healing In Practice, with Grzegorz Głąb
Discover how to build resilient Kubernetes environments at scale with practical automation strategies from an engineer who's tackled complex production challenges.
Grzegorz Głąb, Kubernetes Engineer at Cloud Kitchens, shares his team's journey developing a comprehensive self-healing framework. He explains how they addressed issues ranging from spot node preemptions to network packet drops caused by unbalanced IRQs, providing concrete examples of automation that prevents downtime and improves reliability.
You will learn:
How managed Kubernetes services like AKS provide benefits but require customization for specific use cases
The architecture of an effective self-healing framework using DaemonSets and deployments with Kubernetes-native components
Practical solutions for common challenges like StatefulSet pods stuck on unreachable nodes and cleaning up orphaned pods
Techniques for workload-level automation, including throttling CPU-hungry pods and automating diagnostic data collection
Sponsor
This episode is sponsored by Learnk8s — get started on your Kubernetes journey through comprehensive online, in-person or remote training.
More info
Find all the links and info for this episode here: https://ku.bz/yg_fkP0LN
Interested in sponsoring an episode? Learn more.
via KubeFM https://kube.fm
April 29, 2025 at 06:00AM