54837 bookmarks

Custom sorting

Claude Code for VS Code - Visual Studio Marketplace

Extension for Visual Studio Code - Claude Code for VS Code: Harness the power of Claude Code without leaving your IDE

·marketplace.visualstudio.com·Sep 30, 2025

Claude Code for VS Code - Visual Studio Marketplace

Enabling Claude Code to work more autonomously \ Anthropic

Introducing Claude Code upgrades: native VS Code extension, terminal UX updates, and checkpoints for autonomous development. Handle complex tasks with confidence.

·anthropic.com·Sep 30, 2025

Enabling Claude Code to work more autonomously \ Anthropic

Scaling CI horizontally with Buildkite Kubernetes and multiple pipelines with Ben Poland

Scaling CI horizontally with Buildkite, Kubernetes, and multiple pipelines, with Ben Poland

https://ku.bz/klBmzMY5-

Ben Poland walks through Faire's complete CI transformation, from a single Jenkins instance struggling with thousands of lines of Groovy to a distributed Buildkite system running across multiple Kubernetes clusters.

He details the technical challenges of running CI workloads at scale, including API rate limiting, etcd pressure points, and the trade-offs of splitting monolithic pipelines into service-scoped ones.

You will learn:

How to architect CI systems that match team ownership and eliminate shared failure points across services

Kubernetes scaling patterns for CI workloads, including multi-cluster strategies, predictive node provisioning, and handling API throttling

Performance optimization techniques like Git mirroring, node-level caching, and spot instance management for variable CI demands

Migration strategies and lessons learned from moving away from monolithic CI, including proof-of-concept approaches and avoiding the sunk cost fallacy

Sponsor

This episode is brought to you by Testkube—where teams run millions of performance tests in real Kubernetes infrastructure. From air-gapped environments to massive scale deployments, orchestrate every testing tool in one platform. Check it out at testkube.io

More info

Find all the links and info for this episode here: https://ku.bz/klBmzMY5-

Interested in sponsoring an episode? Learn more.

via KubeFM https://kube.fm

September 30, 2025 at 06:00AM

·kube.fm·Sep 30, 2025

Scaling CI horizontally with Buildkite Kubernetes and multiple pipelines with Ben Poland

Good analysis of the GenAI and GPU market | OpenNvidia could become the AI generation's WinTel

Opinion: Duo could dominate in the same way Microsoft and Intel ruled PCs for decades

·theregister.com·Sep 30, 2025

Good analysis of the GenAI and GPU market | OpenNvidia could become the AI generation's WinTel

A Framework for Rapidly Developing and Deploying Protection Against Large Language Model Attacks

The widespread adoption of Large Language Models (LLMs) has revolutionized AI deployment, enabling autonomous and semi-autonomous applications across industries through intuitive language...

·arxiv.org·Sep 29, 2025

A Framework for Rapidly Developing and Deploying Protection Against Large Language Model Attacks

AI & DevOps Toolkit - How I Tamed Chaotic AI Coding with Simple Workflow Commands - https://www.youtube.com/watch?v=LUFJuj1yIik

How I Tamed Chaotic AI Coding with Simple Workflow Commands

Tired of AI coding agents that jump between tasks chaotically and lose track of context? This video demonstrates a complete systematic workflow for AI-assisted development that keeps both you and your AI agent focused and organized from initial idea through production deployment.

I'll walk you through my entire PRD-based development system, showing real implementation of a complex feature from start to finish. You'll see how to create comprehensive technical requirements with AI analysis, track progress systematically, handle inevitable plan changes, prioritize tasks intelligently, and complete features with full traceability. The workflow uses simple MCP commands like /prd-create, /prd-next, /prd-update-progress, and /prd-done to guide systematic development without requiring complex external tools. By the end, you'll understand how to transform chaotic AI coding sessions into structured, professional development workflows that actually ship reliable software.

▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬ Sponsor: OutSkill 👉 Grab your free seat to the 2-Day AI Mastermind: https://link.outskill.com/AIDOS2 🔐 100% Discount for the first 1000 people 💥 Dive deep into AI and Learn Automations, Build AI Agents, Make videos & images – all for free! 🎁 Bonuses worth $5100+ if you join and attend ▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬

AICoding #PRDWorkflow #ClaudeCode

Consider joining the channel: https://www.youtube.com/c/devopstoolkit/join

▬▬▬▬▬▬ 🔗 Additional Info 🔗 ▬▬▬▬▬▬ ➡ Transcript and commands: https://devopstoolkit.live/development/how-i-tamed-chaotic-ai-coding-with-simple-workflow-commands 🔗 DevOps AI Toolkit: https://github.com/vfarcic/dot-ai 🎬 Stop Wasting Time: Turn AI Prompts and Context Into Production Code: https://youtu.be/XwWCFINXIoU

▬▬▬▬▬▬ 💰 Sponsorships 💰 ▬▬▬▬▬▬ If you are interested in sponsoring this channel, please visit https://devopstoolkit.live/sponsor for more information. Alternatively, feel free to contact me over Twitter or LinkedIn (see below).

▬▬▬▬▬▬ 👋 Contact me 👋 ▬▬▬▬▬▬ ➡ BlueSky: https://vfarcic.bsky.social ➡ LinkedIn: https://www.linkedin.com/in/viktorfarcic/

▬▬▬▬▬▬ 🚀 Other Channels 🚀 ▬▬▬▬▬▬ 🎤 Podcast: https://www.devopsparadox.com/ 💬 Live streams: https://www.youtube.com/c/DevOpsParadox

▬▬▬▬▬▬ ⏱ Timecodes ⏱ ▬▬▬▬▬▬ 00:00 Introduction 01:50 AI Development Workflow 05:03 Outskill (sponsor) 06:25 Create PRDs with AI 12:27 Find Active PRDs with AI 14:17 Start PRD Implementation with AI 18:21 Track Development Progress with AI 20:50 AI Task Prioritization 22:41 Update PRD Decisions with AI 24:56 Complete PRD Workflow with AI 28:44 Key Takeaways

via YouTube https://www.youtube.com/watch?v=LUFJuj1yIik

·youtube.com·Sep 29, 2025

AI & DevOps Toolkit - How I Tamed Chaotic AI Coding with Simple Workflow Commands - https://www.youtube.com/watch?v=LUFJuj1yIik

week one

It’s officially been one week of unemployment. I’m floored by how many people subscribed, so I just wanted to start by saying thank you truly from the bottom of my heart. A lot of the people who subscribed have been watching me grow for the better part of

·kyliebytes.com·Sep 27, 2025

week one

Last Week in Kubernetes Development - Week Ending September 21 2025

Week Ending September 21, 2025

https://lwkd.info/2025/20250925

Developer News

Ray Wainman shared that he is stepping down as co-lead of SIG Autoscaling, and Adrian Moisey will step into the role alongside Jack Francis.

From the SIG K8s Infra leaders Davanum Srinivas (@dims) and Benjamin Elder (@bentheelder) are stepping down and nominating Ciprian Hacman (@hakman) and Dylan Page (@GenPage) as new chairs.

Release Schedule

Next Deadline: PRR Freeze, October 9

The Kubernetes v1.35 release cycle has officially started and we are now collecting enhancements. Work with your SIG leads to get a lead-opted-in label for your KEPs to get them added to the v1.35 cycle.

Please note that the PRR Freeze is a hard deadline starting v1.35. You can read more about the PRR Freeze deadline here and the exception process here.

Other Merges

Replace HandleCrash with HandleCrashWithContext in apiserver — adds contextual logging

Add case-insensitive DNS subdomain validation via k8s-long-name-caseless format — lets long names be validated without forcing lower case

Enable declarative validation for DeviceClass type in resource APIs and resource APIs (v1, v1beta1, v1beta2) — validation-gen tags + tests.

Ensure cacher and etcd3 use consistent key schema requirements

Add RunWithContext variant to EstablishingController — enables context-aware cancellation and richer logging for controller actions

Use iifname in kube-proxy’s nftables mode for interface matching — improves correct filtering by interface name.

Add k8s-label-key & k8s-label-value formats for declarative validation — enables using those formats in +k8s:format= tags so label keys/values are validated automatically

Honor KUBEADM_UPGRADE_DRYRUN_DIR during kubeadm upgrades

Replace WaitForNamedCacheSync with WaitForNamedCacheSyncWithContext in pkg/controller/ and pkg/controller/garbagecollector

Add fine-grained metrics to distinguish declarative validationmismatches & panics — includes a validation_identifier label for better diagnostics.

Add metric for StatefulSet MaxUnavailable violations — tracks when availability drops below spec’s threshold

Enforce API conventions for Conditions fields — ensures metav1.Condition is used and markers/tags follow standard format

Make admission & pod-security admission checks respect emulation version

Add proper goroutine management in kube-controller-manager to prevent leaks

Update MutatingAdmissionPolicy storage version to use v1beta1

Promotions

Graduate ControlPlaneKubeletLocalMode to GA in kubeadm

Deprecated

Set the deprecated version to 1.34.0 for apiserver_storage_objects metric

Remove automaxprocs workaround now that Go 1.25 manages GOMAXPROCS automatically

Version Updates

golangci-lint to v2.4.0

go language version upgraded to v1.25

system-validators to v1.11.1

Bump Go to 1.25.1, update dependencies & distroless iptables images

Subprojects and Dependency Updates

etcd v3.6.5 fixes lease renewals, snapshot/defrag corruption, removes a flag, builds with Go 1.24.7

kubebuilder v4.9.0 upgrades deps, updates Helm CRDs, fixes Docker builds and CRD handling

prometheus v3.6.0 adds PromQL duration funcs, new TSDB blocks API, OTLP/tracing tweaks, bug fixes

vertical-pod-autoscaler v1.5.0 makes In-Place Updates Beta, deprecates Auto mode, adds metrics, supports K8s 1.34

via Last Week in Kubernetes Development https://lwkd.info/

September 25, 2025 at 07:19PM

·lwkd.info·Sep 25, 2025

Last Week in Kubernetes Development - Week Ending September 21 2025

Announcing Changed Block Tracking API support (alpha)

https://kubernetes.io/blog/2025/09/25/csi-changed-block-tracking/

We're excited to announce the alpha support for a changed block tracking mechanism. This enhances the Kubernetes storage ecosystem by providing an efficient way for CSI storage drivers to identify changed blocks in PersistentVolume snapshots. With a driver that can use the feature, you could benefit from faster and more resource-efficient backup operations.

If you're eager to try this feature, you can skip to the Getting Started section.

What is changed block tracking?

Changed block tracking enables storage systems to identify and track modifications at the block level between snapshots, eliminating the need to scan entire volumes during backup operations. The improvement is a change to the Container Storage Interface (CSI), and also to the storage support in Kubernetes itself. With the alpha feature enabled, your cluster can:

Identify allocated blocks within a CSI volume snapshot

Determine changed blocks between two snapshots of the same volume

Streamline backup operations by focusing only on changed data blocks

For Kubernetes users managing large datasets, this API enables significantly more efficient backup processes. Backup applications can now focus only on the blocks that have changed, rather than processing entire volumes.

Note: As of now, the Changed Block Tracking API is supported only for block volumes and not for file volumes. CSI drivers that manage file-based storage systems will not be able to implement this capability.

Benefits of changed block tracking support in Kubernetes

As Kubernetes adoption grows for stateful workloads managing critical data, the need for efficient backup solutions becomes increasingly important. Traditional full backup approaches face challenges with:

Long backup windows: Full volume backups can take hours for large datasets, making it difficult to complete within maintenance windows.

High resource utilization: Backup operations consume substantial network bandwidth and I/O resources, especially for large data volumes and data-intensive applications.

Increased storage costs: Repetitive full backups store redundant data, causing storage requirements to grow linearly even when only a small percentage of data actually changes between backups.

The Changed Block Tracking API addresses these challenges by providing native Kubernetes support for incremental backup capabilities through the CSI interface.

Key components

The implementation consists of three primary components:

CSI SnapshotMetadata Service API: An API, offered by gRPC, that provides volume snapshot and changed block data.

SnapshotMetadataService API: A Kubernetes CustomResourceDefinition (CRD) that advertises CSI driver metadata service availability and connection details to cluster clients.

External Snapshot Metadata Sidecar: An intermediary component that connects CSI drivers to backup applications via a standardized gRPC interface.

Implementation requirements

Storage provider responsibilities

If you're an author of a storage integration with Kubernetes and want to support the changed block tracking feature, you must implement specific requirements:

Implement CSI RPCs: Storage providers need to implement the SnapshotMetadata service as defined in the CSI specifications protobuf. This service requires server-side streaming implementations for the following RPCs:

GetMetadataAllocated: For identifying allocated blocks in a snapshot

GetMetadataDelta: For determining changed blocks between two snapshots

Storage backend capabilities: Ensure the storage backend has the capability to track and report block-level changes.

Deploy external components: Integrate with the external-snapshot-metadata sidecar to expose the snapshot metadata service.

Register custom resource: Register the SnapshotMetadataService resource using a CustomResourceDefinition and create a SnapshotMetadataService custom resource that advertises the availability of the metadata service and provides connection details.

Support error handling: Implement proper error handling for these RPCs according to the CSI specification requirements.

Backup solution responsibilities

A backup solution looking to leverage this feature must:

Set up authentication: The backup application must provide a Kubernetes ServiceAccount token when using the Kubernetes SnapshotMetadataService API. Appropriate access grants, such as RBAC RoleBindings, must be established to authorize the backup application ServiceAccount to obtain such tokens.

Implement streaming client-side code: Develop clients that implement the streaming gRPC APIs defined in the schema.proto file. Specifically:

Implement streaming client code for GetMetadataAllocated and GetMetadataDelta methods

Handle server-side streaming responses efficiently as the metadata comes in chunks

Process the SnapshotMetadataResponse message format with proper error handling

The external-snapshot-metadata GitHub repository provides a convenient iterator support package to simplify client implementation.

Handle large dataset streaming: Design clients to efficiently handle large streams of block metadata that could be returned for volumes with significant changes.

Optimize backup processes: Modify backup workflows to use the changed block metadata to identify and only transfer changed blocks to make backups more efficient, reducing both backup duration and resource consumption.

Getting started

To use changed block tracking in your cluster:

Ensure your CSI driver supports volume snapshots and implements the snapshot metadata capabilities with the required external-snapshot-metadata sidecar

Make sure the SnapshotMetadataService custom resource is registered using CRD

Verify the presence of a SnapshotMetadataService custom resource for your CSI driver

Create clients that can access the API using appropriate authentication (via Kubernetes ServiceAccount tokens)

The API provides two main functions:

GetMetadataAllocated: Lists blocks allocated in a single snapshot

GetMetadataDelta: Lists blocks changed between two snapshots

What’s next?

Depending on feedback and adoption, the Kubernetes developers hope to push the CSI Snapshot Metadata implementation to Beta in the future releases.

Where can I learn more?

For those interested in trying out this new feature:

Official Kubernetes CSI Developer Documentation

The enhancement proposal for the snapshot metadata feature.

GitHub repository for implementation and release status of external-snapshot-metadata

Complete gRPC protocol definitions for snapshot metadata API: schema.proto

Example snapshot metadata client implementation: snapshot-metadata-lister

End-to-end example with csi-hostpath-driver: example documentation

How do I get involved?

This project, like all of Kubernetes, is the result of hard work by many contributors from diverse backgrounds working together. On behalf of SIG Storage, I would like to offer a huge thank you to the contributors who helped review the design and implementation of the project, including but not limited to the following:

Ben Swartzlander (bswartz)

Carl Braganza (carlbraganza)

Daniil Fedotov (hairyhum)

Ivan Sim (ihcsim)

Nikhil Ladha (Nikhil-Ladha)

Prasad Ghangal (PrasadG193)

Praveen M (iPraveenParihar)

Rakshith R (Rakshith-R)

Xing Yang (xing-yang)

Thank also to everyone who has contributed to the project, including others who helped review the KEP and the CSI spec PR

For those interested in getting involved with the design and development of CSI or any part of the Kubernetes Storage system, join the Kubernetes Storage Special Interest Group (SIG). We always welcome new contributors.

The SIG also holds regular Data Protection Working Group meetings. New attendees are welcome to join our discussions.

via Kubernetes Blog https://kubernetes.io/

September 25, 2025 at 09:00AM

·kubernetes.io·Sep 25, 2025

Announcing Changed Block Tracking API support (alpha)

CHAOSScon Africa 2025 Recap - CHAOSS

CHAOSScon Africa 2025 brought together open source enthusiasts in Africa and beyond to share insights on community health, metrics, data, and inclusivity in open scource.

·chaoss.community·Sep 23, 2025

CHAOSScon Africa 2025 Recap - CHAOSS

We’re living in a golden age of affordable mechanical keyboards

All these boards are bangers.

·theverge.com·Sep 23, 2025

We’re living in a golden age of affordable mechanical keyboards

Worried About Phone Searches? 1Password's Travel Mode Can Clean Up Your Data

Travel Mode not only hides your most sensitive data—it acts as if that data never existed in the first place.

·wired.com·Sep 23, 2025

Worried About Phone Searches? 1Password's Travel Mode Can Clean Up Your Data

Cloud models · Ollama Blog

Cloud models are now in preview, letting you run larger models with fast, datacenter-grade hardware. You can keep using your local tools while running larger models that wouldn’t fit on a personal computer.

·ollama.com·Sep 23, 2025

Cloud models · Ollama Blog

Not Every Problem Needs Kubernetes with Danyl Novhorodov

Not Every Problem Needs Kubernetes, with Danyl Novhorodov

https://ku.bz/BYhFw8RwW

Danyl Novhorodov, a veteran .NET engineer and architect at Eneco, presents his controversial thesis that 90% of teams don't actually need Kubernetes. He walks through practical decision-making frameworks, explores powerful alternatives like BEAM runtimes and Actor models, and explains why starting with modular monoliths often beats premature microservices adoption.

You will learn:

The COST decision framework - How to evaluate infrastructure choices based on Complexity, Ownership, Skills, and Time rather than industry hype

Platform engineering vs. managed services - How to honestly assess whether your team can compete with AWS, Azure, and Google's managed container platforms

Evolutionary architecture approach - Why modular monoliths with clear boundaries often provide better foundations than distributed systems from day one

Sponsor

More info

Find all the links and info for this episode here: https://ku.bz/BYhFw8RwW

Interested in sponsoring an episode? Learn more.

via KubeFM https://kube.fm

September 23, 2025 at 06:00AM

·kube.fm·Sep 23, 2025

Not Every Problem Needs Kubernetes with Danyl Novhorodov

Kubernetes v1.34: Pod Level Resources Graduated to Beta

https://kubernetes.io/blog/2025/09/22/kubernetes-v1-34-pod-level-resources/

On behalf of the Kubernetes community, I am thrilled to announce that the Pod Level Resources feature has graduated to Beta in the Kubernetes v1.34 release and is enabled by default! This significant milestone introduces a new layer of flexibility for defining and managing resource allocation for your Pods. This flexibility stems from the ability to specify CPU and memory resources for the Pod as a whole. Pod level resources can be combined with the container-level specifications to express the exact resource requirements and limits your application needs.

Pod-level specification for resources

Until recently, resource specifications that applied to Pods were primarily defined at the individual container level. While effective, this approach sometimes required duplicating or meticulously calculating resource needs across multiple containers within a single Pod. As a beta feature, Kubernetes allows you to specify the CPU, memory and hugepages resources at the Pod-level. This means you can now define resource requests and limits for an entire Pod, enabling easier resource sharing without requiring granular, per-container management of these resources where it's not needed.

Why does Pod-level specification matter?

This feature enhances resource management in Kubernetes by offering flexible resource management at both the Pod and container levels.

It provides a consolidated approach to resource declaration, reducing the need for meticulous, per-container management, especially for Pods with multiple containers.

Pod-level resources enable containers within a pod to share unused resoures amongst themselves, promoting efficient utilization within the pod. For example, it prevents sidecar containers from becoming performance bottlenecks. Previously, a sidecar (e.g., a logging agent or service mesh proxy) hitting its individual CPU limit could be throttled and slow down the entire Pod, even if the main application container had plenty of spare CPU. With pod-level resources, the sidecar and the main container can share Pod's resource budget, ensuring smooth operation during traffic spikes - either the whole Pod is throttled or all containers work.

When both pod-level and container-level resources are specified, pod-level requests and limits take precedence. This gives you – and cluster administrators - a powerful way to enforce overall resource boundaries for your Pods.

For scheduling, if a pod-level request is explicitly defined, the scheduler uses that specific value to find a suitable node, insteaf of the aggregated requests of the individual containers. At runtime, the pod-level limit acts as a hard ceiling for the combined resource usage of all containers. Crucially, this pod-level limit is the absolute enforcer; even if the sum of the individual container limits is higher, the total resource consumption can never exceed the pod-level limit.

Pod-level resources are prioritized in influencing the Quality of Service (QoS) class of the Pod.

For Pods running on Linux nodes, the Out-Of-Memory (OOM) score adjustment calculation considers both pod-level and container-level resources requests.

Pod-level resources are designed to be compatible with existing Kubernetes functionalities, ensuring a smooth integration into your workflows.

How to specify resources for an entire Pod

Using PodLevelResources feature gate requires Kubernetes v1.34 or newer for all cluster components, including the control plane and every node. This feature gate is in beta and enabled by default in v1.34.

Example manifest

You can specify CPU, memory and hugepages resources directly in the Pod spec manifest at the resources field for the entire Pod.

Here’s an example demonstrating a Pod with both CPU and memory requests and limits defined at the Pod level:

apiVersion: v1 kind: Pod metadata: name: pod-resources-demo namespace: pod-resources-example spec:

The 'resources' field at the Pod specification level defines the overall

resource budget for all containers within this Pod combined.

resources: # Pod-level resources

'limits' specifies the maximum amount of resources the Pod is allowed to use.

The sum of the limits of all containers in the Pod cannot exceed these values.

limits: cpu: "1" # The entire Pod cannot use more than 1 CPU core. memory: "200Mi" # The entire Pod cannot use more than 200 MiB of memory.

'requests' specifies the minimum amount of resources guaranteed to the Pod.

This value is used by the Kubernetes scheduler to find a node with enough capacity.

requests: cpu: "1" # The Pod is guaranteed 1 CPU core when scheduled. memory: "100Mi" # The Pod is guaranteed 100 MiB of memory when scheduled. containers:

name: main-app-container image: nginx ... # This container has no resource requests or limits specified.
name: auxiliary-container image: fedora command: ["sleep", "inf"] ... # This container has no resource requests or limits specified.

In this example, the pod-resources-demo Pod as a whole requests 1 CPU and 100 MiB of memory, and is limited to 1 CPU and 200 MiB of memory. The containers within will operate under these overall Pod-level constraints, as explained in the next section.

Interaction with container-level resource requests or limits

When both pod-level and container-level resources are specified, pod-level requests and limits take precedence. This means the node allocates resources based on the pod-level specifications.

Consider a Pod with two containers where pod-level CPU and memory requests and limits are defined, and only one container has its own explicit resource definitions:

apiVersion: v1 kind: Pod metadata: name: pod-resources-demo namespace: pod-resources-example spec: resources: limits: cpu: "1" memory: "200Mi" requests: cpu: "1" memory: "100Mi" containers:

name: main-app-container image: nginx resources: requests: cpu: "0.5" memory: "50Mi"
name: auxiliary-container image: fedora command: [ "sleep", "inf"] # This container has no resource requests or limits specified.

Pod-Level Limits: The pod-level limits (cpu: "1", memory: "200Mi") establish an absolute boundary for the entire Pod. The sum of resources consumed by all its containers is enforced at this ceiling and cannot be surpassed.

Resource Sharing and Bursting: Containers can dynamically borrow any unused capacity, allowing them to burst as needed, so long as the Pod's aggregate usage stays within the overall limit.

Pod-Level Requests: The pod-level requests (cpu: "1", memory: "100Mi") serve as the foundational resource guarantee for the entire Pod. This value informs the scheduler's placement decision and represents the minimum resources the Pod can rely on during node-level contention.

Container-Level Requests: Container-level requests create a priority system within the Pod's guaranteed budget. Because main-app-container has an explicit request (cpu: "0.5", memory: "50Mi"), it is given precedence for its share of resources under resource pressure over the auxiliary-container, which has no such explicit claim.

Limitations

First of all, in-place resize of pod-level resources is not supported for Kubernetes v1.34 (or earlier). Attempting to modify the pod-level resource limits or requests on a running Pod results in an error: the resize is rejected. The v1.34 implementation of Pod level resources focuses on allowing initial declaration of an overall resource envelope, that applies to the entire Pod. That is distinct from in-place pod resize, which (despite what the name might suggest) allows you to make dynamic adjustments to container resource requests and limits, within a running Pod, and potentially without a container restart. In-place resizing is also not yet a stable feature; it graduated to Beta in the v1.33 release.

Only CPU, memory, and hugepages resources can be specified at pod-level.

Pod-level resources are not supported for Windows pods. If the Pod specification explicitly targets Windows (e.g., by setting spec.os.name: "windows"), the API server will reject the Pod during the validation step. If the Pod is not explicitly marked for Windows but is scheduled to a Windows node (e.g., via a nodeSelector), the Kubelet on that Windows node will reject the Pod during its admission process.

The Topology Manager, Memory Manager and CPU Manager do not align pods and containers based on pod-level resources as these resource managers don't currently support pod-level resources.

Getting started and providing feedback

Ready to explore Pod Level Resources feature? You'll need a Kubernetes cluster running version 1.34 or later. Remember to enable the PodLevelResources feature gate across your control plane and all nodes.

As this feature moves through Beta, your feedback is invaluable. Please report any issues or share your experiences via the standard Kubernetes communication channels:

Slack: #sig-node

Mailing list

Open Community Issues/PRs

via Kubernetes Blog https://kubernetes.io/

September 22, 2025 at 02:30PM

·kubernetes.io·Sep 22, 2025

Kubernetes v1.34: Pod Level Resources Graduated to Beta

Blog: Spotlight on the Kubernetes Steering Committee

https://www.kubernetes.dev/blog/2025/09/22/k8s-steering-spotlight-2025/

This interview was conducted in August 2024, and due to the dynamic nature of the Steering Committee membership and election process it might not represent the actual composition accurately. The topics covered are, however, overwhelmingly relevant to understand its scope of work. As we approach the Steering Committee elections, it provides useful insights into the workings of the Committee.

The Kubernetes Steering Committee is the backbone of the Kubernetes project, ensuring that its vibrant community and governance structures operate smoothly and effectively. While the technical brilliance of Kubernetes is often spotlighted through its Special Interest Groups (SIGs) and Working Groups (WGs), the unsung heroes quietly steering the ship are the members of the Steering Committee. They tackle complex organizational challenges, empower contributors, and foster the thriving open source ecosystem that Kubernetes is celebrated for.

But what does it really take to lead one of the world’s largest open source communities? What are the hidden challenges, and what drives these individuals to dedicate their time and effort to such an impactful role? In this exclusive conversation, we sit down with current Steering Committee (SC) members — Ben, Nabarun, Paco, Patrick, and Maciej — to uncover the rewarding, and sometimes demanding, realities of steering Kubernetes. From their personal journeys and motivations to the committee’s vital responsibilities and future outlook, this Spotlight offers a rare behind-the-scenes glimpse into the people who keep Kubernetes on course.

Introductions

Sandipan: Can you tell us a little bit about yourself?

Ben: Hi, I’m Benjamin Elder, also known as BenTheElder. I started in Kubernetes as a Google Summer of Code student in 2015 and have been working at Google in the space since 2017. I have contributed a lot to many areas but especially build, CI, test tooling, etc. My favorite project so far was building KIND. I have been on the release team, a chair of SIG Testing, and currently a tech lead of SIG Testing and SIG K8s Infra.

Nabarun: Hi, I am Nabarun from India. I have been working on Kubernetes since 2019. I have been contributing across multiple areas in Kubernetes: SIG ContribEx (where I am also a chair), API Machinery, Architecture, and SIG Release, where I contributed to several releases including being the Release Team Lead of Kubernetes 1.21.

Paco: I am Paco from China. I worked as an open source team lead in DaoCloud, Shanghai. In the community, I participate mainly in kubeadm, SIG Node and SIG Testing. Besides, I helped in KCD China and was co-chair of the recent KubeCon+CloudNativeCon China 2024 in Hong Kong.

Patrick: Hello! I’m Patrick. I’ve contributed to Kubernetes since 2018. I started in SIG Storage and then got involved in more and more areas. Nowadays, I am a SIG Testing tech lead, logging infrastructure maintainer, organizer of the Structured Logging and Device Management working groups, contributor in SIG Scheduling, and of course member of the Steering Committee. My main focus area currently is Dynamic Resource Allocation (DRA), a new API for accelerators.

Maciej: Hey, my name is Maciej and I’ve been working on Kubernetes since late 2014 in various areas, including controllers, apiserver and kubectl. Aside from being part of the Steering Committee, I’m also helping guide SIG CLI, SIG Apps and WG Batch.

About the Steering Committee

Sandipan: What does Steering do?

Ben: The charter is the definitive answer, but I see Steering as helping resolve Kubernetes-organization-level “people problems” (as opposed to technical problems), such as clarifying project governance and liaising with the Cloud Native Computing Foundation (for example, to request additional resources and support) and other CNCF projects.

Maciej: Our charter nicely describes all the responsibilities. In short, we make sure the project runs smoothly by supporting our maintainers and contributors in their daily tasks.

Patrick: Ideally, we don’t do anything 😀 All of the day-to-day business has been delegated to SIGs and WGs. Steering gets involved when something pops up where it isn’t obvious who should handle it or when conflicts need to be resolved.

**Sandipan: And how is Steering different from SIGs?

Ben: From a governance perspective: Steering delegates all of the ownership of subprojects to the SIGs and/or committees (Security Response, Code Of Conduct, etc.). They’re very different. The SIGs own pieces of the project, and Steering handles some of the overarching people and policy issues. You’ll find all of the software development, releasing, communications and documentation work happening in the SIGs and committees.

Maciej: SIGs or WGs are primarily concerned with the technical direction of a particular area in Kubernetes. Steering, on the other hand, is primarily concerned with ensuring all the SIGs, WGs, and most importantly maintainers have everything they need to run the project smoothly. This includes anything from ensuring financing of our CI systems, through governance structures and policies all the way to supporting individual maintainers in various inquiries.

**Sandipan: You’ve mentioned projects, could you give us an example of a project Steering has worked on recently?

Ben: We’ve been discussing the logistics to sync a better definition of the project’s official maintainers to the CNCF, which are used, for example, to vote for the Technical Oversight Committee (TOC). Currently that list is the Steering Committee, with SIG Contributor Experience and Infra + Release leads having access to the CNCF service desk. This isn’t well standardized yet across CNCF projects but I think it’s important.

Maciej: For the past year I’ve been sitting on the SC, I believe the majority of tasks we’ve been involved in were around providing letters supporting visa applications. Also, like every year, we’ve been helping all the SIGs and WGs with their annual reports.

Patrick: Apparently it has been a quiet year since Maciej and I joined the Steering Committee at the end of 2023. That’s exactly how it should be.

Sandipan: Do you have any examples of projects that came to Steering, which you then redirected to SIGs?

Ben: We often get requests for test/build related resources that we redirect to SIG K8s Infra + SIG Testing, or more specifically about releasing for subprojects that we redirect to SIG K8s Infra / SIG Release.

The road to the Steering Committee

Sandipan: What motivated you to be part of the Steering Committee? What has your journey been like?

Ben: I had a few people reach out and prompt me to run, but I was motivated by my passion for this community and the project. I think we have something really special going here and I care deeply about the ongoing success. I’ve been involved in this space my whole career and while there’s always rough edges, this community has been really supportive and I hope we can keep it that way.

Paco: After my journey to the Kubernetes Contributor Summit EU 2023, I met and chatted with many maintainers and members there, and attended the steering AMA there for the first time as there hadn’t been a contributor summit in China since 2019, and I started to connect with contributors in China to make it later the year. Through conversations at KCS EU and with local contributors, I realized that it is quite important to make it easy to start a contributor journey for APAC contributors and want to attract more contributors to the community. Then, I was elected just after the KCS CN 2023.

Patrick: I had done a lot of technical work, of which some affects and (hopefully) benefits all contributors to Kubernetes (linting and testing) and users (better log output). I saw joining the Steering Committee as an opportunity to help also with the organizational aspects of running a big open source project.

Maciej: I’ve been going through the idea of running for SC for a while now. My biggest drive was conversations with various members of our community. Eventually last year, I decided to follow their advice, and got elected :-)

Sandipan: What is your favorite part of being part of Steering?

Ben: When we get to help contributors directly. For example, sometimes extensive contributors reach out for an official letter from Steering explaining their contribution and its value for visa support. When we get to just purely help out Kubernetes contributors, that’s my favorite part.

Patrick: It’s a good place to learn more about how the project is actually run, directly from the other great people who are doing it.

Maciej: The same thing as with the project — it’s always the people that surround us, that give us opportunities to collaborate and create something interesting and exciting.

Sandipan: What do you think is most challenging about being part of Steering?

Ben: I think we’ve all spent a lot of time grappling with the sustainability issues in the project and not having a single great answer to solve them. A lot of people are working on these problems but we have limited time and resources. We’ve officially delegated most of this (for example, to SIGs Contributor Experience and K8s Infra), but I think we all still consider it very important and deserving of more time and energy, yet we only have so much and the answers are not obvious. The balancing act is hard.

Paco: Sustainability of contributors and maintainers is one of the most challenging aspects to me. I am constantly advocating for OSS users and employers to join the community. Community is a place that developers can learn from each other, discuss issues they encounter, and share their experience or solutions. Ensuring everyone in the community to feel supported and valued is crucial for the long-term health of the project.

Patrick: There is documentation about how things are done,

·kubernetes.dev·Sep 22, 2025

Blog: Spotlight on the Kubernetes Steering Committee

AI & DevOps Toolkit - Teaching AI Your Company Policies: Vector Search Enforcement - https://www.youtube.com/watch?v=hLK9j2cn6c0

Teaching AI Your Company Policies: Vector Search + Enforcement

Ever wondered why AI keeps failing at simple infrastructure tasks? The problem isn't AI itself - it's that AI doesn't know your company's policies. Most organizations have their rules scattered across wikis, Slack messages, and locked in people's heads, making it impossible for AI to make compliant decisions.

This video demonstrates a different approach: extracting tribal knowledge from your brain and turning it into both AI-searchable policies and automatic Kubernetes enforcement. Using a guided workflow, we'll create database regional compliance policies that simultaneously feed semantic search for AI guidance and generate Kyverno policies for cluster enforcement. Watch as AI learns to proactively recommend compliant configurations while Kubernetes blocks any attempts to violate your rules - creating a dual strategy that works whether someone follows the guidance or tries to bypass it entirely.

KubernetesPolicies #DevOpsAI #Kyverno

Consider joining the channel: https://www.youtube.com/c/devopstoolkit/join

▬▬▬▬▬▬ 🔗 Additional Info 🔗 ▬▬▬▬▬▬ ➡ Transcript and commands: https://devopstoolkit.live/ai/teaching-ai-your-company-policies-vector-search-+-enforcement 🔗 DevOps AI Toolkit: https://github.com/vfarcic/dot-ai 🎬 Stop Blaming AI: Vector DBs + RAG = Game Changer: https://youtu.be/zqpJr1qZhTg

▬▬▬▬▬▬ 👋 Contact me 👋 ▬▬▬▬▬▬ ➡ BlueSky: https://vfarcic.bsky.social ➡ LinkedIn: https://www.linkedin.com/in/viktorfarcic/

▬▬▬▬▬▬ 🚀 Other Channels 🚀 ▬▬▬▬▬▬ 🎤 Podcast: https://www.devopsparadox.com/ 💬 Live streams: https://www.youtube.com/c/DevOpsParadox

▬▬▬▬▬▬ ⏱ Timecodes ⏱ ▬▬▬▬▬▬ 00:00 What are Policies? 05:36 AI Policy Extraction 13:36 Policy Enforcement Demo 17:51 Dual Policy Strategy

via YouTube https://www.youtube.com/watch?v=hLK9j2cn6c0

·youtube.com·Sep 22, 2025

AI & DevOps Toolkit - Teaching AI Your Company Policies: Vector Search Enforcement - https://www.youtube.com/watch?v=hLK9j2cn6c0

Kubernetes v1.34: Recovery From Volume Expansion Failure (GA)

https://kubernetes.io/blog/2025/09/19/kubernetes-v1-34-recover-expansion-failure/

Have you ever made a typo when expanding your persistent volumes in Kubernetes? Meant to specify 2TB but specified 20TiB? This seemingly innocuous problem was kinda hard to fix - and took the project almost 5 years to fix. Automated recovery from storage expansion has been around for a while in beta; however, with the v1.34 release, we have graduated this to general availability.

While it was always possible to recover from failing volume expansions manually, it usually required cluster-admin access and was tedious to do (See aformentioned link for more information).

What if you make a mistake and then realize immediately? With Kubernetes v1.34, you should be able to reduce the requested size of the PersistentVolumeClaim (PVC) and, as long as the expansion to previously requested size hadn't finished, you can amend the size requested. Kubernetes will automatically work to correct it. Any quota consumed by failed expansion will be returned to the user and the associated PersistentVolume should be resized to the latest size you specified.

I'll walk through an example of how all of this works.

Reducing PVC size to recover from failed expansion

Imagine that you are running out of disk space for one of your database servers, and you want to expand the PVC from previously specified 10TB to 100TB - but you make a typo and specify 1000TB.

kind: PersistentVolumeClaim apiVersion: v1 metadata: name: myclaim spec: accessModes:

ReadWriteOnce resources: requests: storage: 1000TB # newly specified size - but incorrect!

Now, you may be out of disk space on your disk array or simply ran out of allocated quota on your cloud-provider. But, assume that expansion to 1000TB is never going to succeed.

In Kubernetes v1.34, you can simply correct your mistake and request a new PVC size, that is smaller than the mistake, provided it is still larger than the original size of the actual PersistentVolume.

kind: PersistentVolumeClaim apiVersion: v1 metadata: name: myclaim spec: accessModes:

ReadWriteOnce resources: requests: storage: 100TB # Corrected size; has to be greater than 10TB. # You cannot shrink the volume below its actual size.

This requires no admin intervention. Even better, any surplus Kubernetes quota that you temporarily consumed will be automatically returned.

This fault recovery mechanism does have a caveat: whatever new size you specify for the PVC, it must be still higher than the original size in .status.capacity. Since Kubernetes doesn't support shrinking your PV objects, you can never go below the size that was originally allocated for your PVC request.

Improved error handling and observability of volume expansion

Implementing what might look like a relatively minor change also required us to almost fully redo how volume expansion works under the hood in Kubernetes. There are new API fields available in PVC objects which you can monitor to observe progress of volume expansion.

Improved observability of in-progress expansion

You can query .status.allocatedResourceStatus['storage'] of a PVC to monitor progress of a volume expansion operation. For a typical block volume, this should transition between ControllerResizeInProgress, NodeResizePending and NodeResizeInProgress and become nil/empty when volume expansion has finished.

If for some reason, volume expansion to requested size is not feasible it should accordingly be in states like - ControllerResizeInfeasible or NodeResizeInfeasible.

You can also observe size towards which Kubernetes is working by watching pvc.status.allocatedResources.

Improved error handling and reporting

Kubernetes should now retry your failed volume expansions at slower rate, it should make fewer requests to both storage system and Kubernetes apiserver.

Errors observerd during volume expansion are now reported as condition on PVC objects and should persist unlike events. Kubernetes will now populate pvc.status.conditions with error keys ControllerResizeError or NodeResizeError when volume expansion fails.

Fixes long standing bugs in resizing workflows

This feature also has allowed us to fix long standing bugs in resizing workflow such as Kubernetes issue #115294. If you observe anything broken, please report your bugs to https://github.com/kubernetes/kubernetes/issues, along with details about how to reproduce the problem.

Working on this feature through its lifecycle was challenging and it wouldn't have been possible to reach GA without feedback from @msau42, @jsafrane and @xing-yang.

All of the contributors who worked on this also appreciate the input provided by @thockin and @liggitt at various Kubernetes contributor summits.

via Kubernetes Blog https://kubernetes.io/

September 19, 2025 at 02:30PM

·kubernetes.io·Sep 19, 2025

Kubernetes v1.34: Recovery From Volume Expansion Failure (GA)

Last Week in Kubernetes Development - Week Ending September 14 2025

Week Ending September 14, 2025

https://lwkd.info/2025/20250918

Developer News

The Steering Committee Election is underway. Please make sure to vote before October 25th, and request an exception if you need one before October 20th.

The Kubernetes Steering Committee reaffirmed that SIG Release and the Release Team have full authority to enforce policies, deadlines, and requirements, including blocking releases if needed. Steering does not override release execution but will back policy updates and clearer communication to ensure safe, stable, and predictable releases.

A medium-severity vulnerability (CVE-2025-9708) affects the Kubernetes C# client ≤ v17.0.13, where improper certificate validation could enable man-in-the-middle attacks. Users are advised to upgrade to v17.0.14+ and review any custom CA usage in kubeconfig files. See the GitHub issue. for more details.

Release Schedule

Next Deadline: 1.35 Release Cycle Starts, September 15

Kubernetes 1.35 release cycle kicks off on Sept 15, targeting final release on Dec 17, 2025, with key milestones including Enhancements Freeze on Oct 16 and Code Freeze on Nov 6.

Patch releases v1.34.1, v1.33.5, v1.32.9, v1.31.13 were out last week, delivering the latest fix and updates.

KEP of the Week

KEP-3243: Respect PodTopologySpread after rolling upgrades

This KEP introduces a complementary field, MatchLabelKeys, in TopologySpreadConstraint to enhance pod topology spread. It allows users to specify only label keys, with kube-apiserver resolving their values from the incoming pod and merging them with the existing LabelSelector to identify the target pod group. This simplifies skew calculation, supports revision-level spreading during Deployment rollouts, and is also handled by kube-scheduler when used in cluster-level default constraints.

This KEP is tracked for beta in v1.34.

Other Merges

Remove container name from container event messages

Replace NewIndexerInformerWatcher with NewIndexerInformerWatcherWithLogger

Standardize not found error message of kubectl scale

validation-gen uses JSON names for error paths

Prevent ClusterIP load balancer loss with InternalTrafficPolicy: Local in kube-proxy

Avoid deadlock when gRPC connection to driver goes idle

validation-gen adds uuid format for string fields

client-go/cli-runtime fixes config override when ClientKey/ClientCertificate are set

Replace WaitForNamedCacheSync with WaitForNamedCacheSyncWithContext

Update PodObservedGenerationTracking description in OpenAPI

kubectl includes container fieldPath in event messages

StorageVersionMigrator adds discovery check to avoid stuck migrations

agnhost adds fake-registry-server for e2e image-pull tests

Add E2e test for cleaning of terminated containers

kube-apiserver protects against delete/finalizer race

Update pod resize test to accept new cpu.weight conversion

DRA accepts implicit device-class extended resource names even when extendedResourceName is set in the DeviceClass

Skip creating storage for non-stored and non-served versions

Allow OpenAPI model package names to be declared by APIs

kubelet fixes negative pod startup duration values

kube-scheduler statusz lists registered paths

applyconfiguration-gen preserves struct and field comments in generated code

Scheduler framework interfaces move to k8s.io/kube-scheduler

CRD validation ratchets the max selectableFields limit

apiserver storage only accesses keys under resourcePrefix

apiserver storage replace SetKeysFunc with EnableResourceSizeEstimation

Subprojects and Dependency Updates

grpc v1.75.0 introduces Spiffe verification, OTel C++ retry metrics, bug fixes, and Python and Ruby updates

nerdctl v2.1.4 adds manifest, export, import commands, improves networking, and drops containerd 1.6 support

vertical-pod-autoscaler v1.4.2 improves logging, fixes updater metrics, adjusts webhook CA, and falls back to eviction on failed updates

via Last Week in Kubernetes Development https://lwkd.info/

September 18, 2025 at 05:00PM

·lwkd.info·Sep 18, 2025

Last Week in Kubernetes Development - Week Ending September 14 2025

Kubernetes v1.34: DRA Consumable Capacity

https://kubernetes.io/blog/2025/09/18/kubernetes-v1-34-dra-consumable-capacity/

Dynamic Resource Allocation (DRA) is a Kubernetes API for managing scarce resources across Pods and containers. It enables flexible resource requests, going beyond simply allocating N number of devices to support more granular usage scenarios. With DRA, users can request specific types of devices based on their attributes, define custom configurations tailored to their workloads, and even share the same resource among multiple containers or Pods.

In this blog, we focus on the device sharing feature and dive into a new capability introduced in Kubernetes 1.34: DRA consumable capacity, which extends DRA to support finer-grained device sharing.

Background: device sharing via ResourceClaims

From the beginning, DRA introduced the ability for multiple Pods to share a device by referencing the same ResourceClaim. This design decouples resource allocation from specific hardware, allowing for more dynamic and reusable provisioning of devices.

In Kubernetes 1.33, the new support for partitionable devices allowed resource drivers to advertise slices of a device that are available, rather than exposing the entire device as an all-or-nothing resource. This enabled Kubernetes to model shareable hardware more accurately.

But there was still a missing piece: it didn't yet support scenarios where the device driver manages fine-grained, dynamic portions of a device resource — like network bandwidth — based on user demand, or to share those resources independently of ResourceClaims, which are restricted by their spec and namespace.

That’s where consumable capacity for DRA comes in.

Benefits of DRA consumable capacity support

Here's a taste of what you get in a cluster with the DRAConsumableCapacity feature gate enabled.

Device sharing across multiple ResourceClaims or DeviceRequests

Resource drivers can now support sharing the same device — or even a slice of a device — across multiple ResourceClaims or across multiple DeviceRequests.

This means that Pods from different namespaces can simultaneously share the same device, if permitted and supported by the specific DRA driver.

Device resource allocation

Kubernetes extends the allocation algorithm in the scheduler to support allocating a portion of a device's resources, as defined in the capacity field. The scheduler ensures that the total allocated capacity across all consumers never exceeds the device’s total capacity, even when shared across multiple ResourceClaims or DeviceRequests. This is very similar to the way the scheduler allows Pods and containers to share allocatable resources on Nodes; in this case, it allows them to share allocatable (consumable) resources on Devices.

This feature expands support for scenarios where the device driver is able to manage resources within a device and on a per-process basis — for example, allocating a specific amount of memory (e.g., 8 GiB) from a virtual GPU, or setting bandwidth limits on virtual network interfaces allocated to specific Pods. This aims to provide safe and efficient resource sharing.

DistinctAttribute constraint

This feature also introduces a new constraint: DistinctAttribute, which is the complement of the existing MatchAttribute constraint.

The primary goal of DistinctAttribute is to prevent the same underlying device from being allocated multiple times within a single ResourceClaim, which could happen since we are allocating shares (or subsets) of devices. This constraint ensures that each allocation refers to a distinct resource, even if they belong to the same device class.

It is useful for use cases such as allocating network devices connecting to different subnets to expand coverage or provide redundancy across failure domains.

How to use consumable capacity?

DRAConsumableCapacity is introduced as an alpha feature in Kubernetes 1.34. The feature gate DRAConsumableCapacity must be enabled in kubelet, kube-apiserver, kube-scheduler and kube-controller-manager.

--feature-gates=...,DRAConsumableCapacity=true

As a DRA driver developer

As a DRA driver developer writing in Golang, you can make a device within a ResourceSlice allocatable to multiple ResourceClaims (or devices.requests) by setting AllowMultipleAllocations to true.

Device { ... AllowMultipleAllocations: ptr.To(true), ... }

Additionally, you can define a policy to restrict how each device's Capacity should be consumed by each DeviceRequest by defining RequestPolicy field in the DeviceCapacity. The example below shows how to define a policy that requires a GPU with 40 GiB of memory to allocate at least 5 GiB per request, with each allocation in multiples of 5 GiB.

DeviceCapacity{ Value: resource.MustParse("40Gi"), RequestPolicy: &CapacityRequestPolicy{ Default: ptr.To(resource.MustParse("5Gi")), ValidRange: &CapacityRequestPolicyRange { Min: ptr.To(resource.MustParse("5Gi")), Step: ptr.To(resource.MustParse("5Gi")), } } }

This will be published to the ResourceSlice, as partially shown below:

apiVersion: resource.k8s.io/v1 kind: ResourceSlice ... spec: devices:

name: gpu0 allowMultipleAllocations: true capacity: memory: value: 40Gi requestPolicy: default: 5Gi validRange: min: 5Gi step: 5Gi

An allocated device with a specified portion of consumed capacity will have a ShareID field set in the allocation status.

claim.Status.Allocation.Devices.Results[i].ShareID

This ShareID allows the driver to distinguish between different allocations that refer to the same device or same statically-partitioned slice but come from different ResourceClaim requests.

It acts as a unique identifier for each shared slice, enabling the driver to manage and enforce resource limits independently across multiple consumers.

As a consumer

As a consumer (or user), the device resource can be requested with a ResourceClaim like this:

apiVersion: resource.k8s.io/v1 kind: ResourceClaim ... spec: devices: requests: # for devices

name: req0 exactly:
deviceClassName: resource.example.com capacity: requests: # for resources which must be provided by those devices memory: 10Gi

This configuration ensures that the requested device can provide at least 10GiB of memory.

Notably that any resource.example.com device that has at least 10GiB of memory can be allocated. If a device that does not support multiple allocations is chosen, the allocation would consume the entire device. To filter only devices that support multiple allocations, you can define a selector like this:

selectors:

cel: expression: |- device.allowMultipleAllocations == true

Integration with DRA device status

In device sharing, general device information is provided through the resource slice. However, some details are set dynamically after allocation. These can be conveyed using the .status.devices field of a ResourceClaim. That field is only published in clusters where the DRAResourceClaimDeviceStatus feature gate is enabled.

If you do have device status support available, a driver can expose additional device-specific information beyond the ShareID. One particularly useful use case is for virtual networks, where a driver can include the assigned IP address(es) in the status. This is valuable for both network service operations and troubleshooting.

You can find more information by watching our recording at: KubeCon Japan 2025 - Reimagining Cloud Native Networks: The Critical Role of DRA.

What can you do next?

Check out the CNI DRA Driver project for an example of DRA integration in Kubernetes networking. Try integrating with network resources like macvlan, ipvlan, or smart NICs.

Start enabling the DRAConsumableCapacity feature gate and experimenting with virtualized or partitionable devices. Specify your workloads with consumable capacity (for example: fractional bandwidth or memory).

Let us know your feedback:

✅ What worked well?

⚠️ What didn’t?

If you encountered issues to fix or opportunities to enhance, please file a new issue and reference KEP-5075 there, or reach out via Slack (#wg-device-management).

Conclusion

Consumable capacity support enhances the device sharing capability of DRA by allowing effective device sharing across namespaces, across claims, and tailored to each Pod’s actual needs. It also empowers drivers to enforce capacity limits, improves scheduling accuracy, and unlocks new use cases like bandwidth-aware networking and multi-tenant device sharing.

Try it out, experiment with consumable resources, and help shape the future of dynamic resource allocation in Kubernetes!

TerminalAgents #CodingAI #GPT5

Consider joining the channel: https://www.youtube.com/c/devopstoolkit/join

▬▬▬▬▬▬ 🔗 Additional Info 🔗 ▬▬▬▬▬▬ ➡ Transcript and commands: https://devopstoolkit.live/ai/terminal-agents-codex-vs.-crush-vs.-opencode-vs.-cursor-cli-vs.-claude-code

▬▬▬▬▬▬ 👋 Contact me 👋 ▬▬▬▬▬▬ ➡ BlueSky: https://vfarcic.bsky.social ➡ LinkedIn: https://www.linkedin.com/in/viktorfarcic/

▬▬▬▬▬▬ 🚀 Other Channels 🚀 ▬▬▬▬▬▬ 🎤 Podcast: https://www.devopsparadox.com/ 💬 Live streams: https://www.youtube.com/c/DevOpsParadox

▬▬▬▬▬▬ ⏱ Timecodes ⏱ ▬▬▬▬▬▬ 00:00 Terminal-Based Coding Agents 01:09 Outskill (sponsor) 02:31 Why Terminal AI Agents Matter 06:54 Codex CLI - OpenAI's Terminal Agent 12:03 Charm Crush - Beautiful Terminal UI Agent 17:18 OpenCode - SST's Terminal Agent 20:13 Cursor CLI - From Cursor IDE Makers 24:10 Terminal AI Agents - Final Verdict

via YouTube https://www.youtube.com/watch?v=MXOP4WELkCc

·youtube.com·Sep 15, 2025

AI & DevOps Toolkit - Terminal Agents: Codex vs. Crush vs. OpenCode vs. Cursor CLI vs. Claude Code - https://www.youtube.com/watch?v=MXOP4WELkCc

The making of Flux: The origin

https://ku.bz/5Sf5wpd8y

This episode unpacks the technical and governance milestones that secured Flux's place in the cloud-native ecosystem, from a 45-minute production outage that led to the birth of GitOps to the CNCF process that defines project maturity and the handover of stewardship after Weaveworks' closure.

You will learn:

How a single incident pushed Weaveworks to adopt Git as the source of truth, creating the foundation of GitOps.

How Flux sustained continuity after Weaveworks shut down through community governance.

Where Flux is heading next with security guidance, Flux v2, and an enterprise-ready roadmap.

Sponsor

Join the Flux maintainers and community at FluxCon, November 11th in Salt Lake City—register here

More info

Find all the links and info for this episode here: https://ku.bz/5Sf5wpd8y

Interested in sponsoring an episode? Learn more.

via KubeFM https://kube.fm

September 15, 2025 at 06:00AM

·kube.fm·Sep 15, 2025

The making of Flux: The origin

Kubernetes v1.34: Autoconfiguration for Node Cgroup Driver Goes GA

https://kubernetes.io/blog/2025/09/12/kubernetes-v1-34-cri-cgroup-driver-lookup-now-ga/

Historically, configuring the correct cgroup driver has been a pain point for users running new Kubernetes clusters. On Linux systems, there are two different cgroup drivers: cgroupfs and systemd. In the past, both the kubelet and CRI implementation (like CRI-O or containerd) needed to be configured to use the same cgroup driver, or else the kubelet would misbehave without any explicit error message. This was a source of headaches for many cluster admins. Now, we've (almost) arrived at the end of that headache.

Automated cgroup driver detection

In v1.28.0, the SIG Node community introduced the feature gate KubeletCgroupDriverFromCRI, which instructs the kubelet to ask the CRI implementation which cgroup driver to use. You can read more here. After many releases of waiting for each CRI implementation to have major versions released and packaged in major operating systems, this feature has gone GA as of Kubernetes 1.34.0.

In addition to setting the feature gate, a cluster admin needs to ensure their CRI implementation is new enough:

containerd: Support was added in v2.0.0

CRI-O: Support was added in v1.28.0

Announcement: Kubernetes is deprecating containerd v1.y support

While CRI-O releases versions that match Kubernetes versions, and thus CRI-O versions without this behavior are no longer supported, containerd maintains its own release cycle. containerd support for this feature is only in v2.0 and later, but Kubernetes 1.34 still supports containerd 1.7 and other LTS releases of containerd.

The Kubernetes SIG Node community has formally agreed upon a final support timeline for containerd v1.y. The last Kubernetes release to offer this support will be the last released version of v1.35, and support will be dropped in v1.36.0. To assist administrators in managing this future transition, a new detection mechanism is available. You are able to monitor the kubelet_cri_losing_support metric to determine if any nodes in your cluster are using a containerd version that will soon be outdated. The presence of this metric with a version label of 1.36.0 will indicate that the node's containerd runtime is not new enough for the upcoming requirements. Consequently, an administrator will need to upgrade containerd to v2.0 or a later version before, or at the same time as, upgrading the kubelet to v1.36.0.

via Kubernetes Blog https://kubernetes.io/

September 12, 2025 at 02:30PM

·kubernetes.io·Sep 13, 2025

Kubernetes v1.34: Autoconfiguration for Node Cgroup Driver Goes GA