1_r/devopsish

54941 bookmarks
Spotlight on Policy Working Group

https://kubernetes.io/blog/2025/10/18/wg-policy-spotlight-2025/

(Note: The Policy Working Group has completed its mission and is no longer active. This article reflects its work, accomplishments, and insights into how a working group operates.)

In the complex world of Kubernetes, policies play a crucial role in managing and securing clusters. But have you ever wondered how these policies are developed, implemented, and standardized across the Kubernetes ecosystem? To answer that, let's take a look back at the work of the Policy Working Group.

The Policy Working Group was dedicated to a critical mission: providing an overall architecture that encompasses both current policy-related implementations and future policy proposals in Kubernetes. Their goal was both ambitious and essential: to develop a universal policy architecture that benefits developers and end-users alike.

Through collaborative methods, this working group strove to bring clarity and consistency to the often complex world of Kubernetes policies. By focusing on both existing implementations and future proposals, they ensured that the policy landscape in Kubernetes remains coherent and accessible as the technology evolves.

This blog post dives deeper into the work of the Policy Working Group, guided by insights from its former co-chairs:

Jim Bugwadia

Poonam Lamba

Andy Suderman

Interviewed by Arujjwal Negi.

These co-chairs explained what the Policy Working Group was all about.

Introduction

Hello, thank you for the time! Let's start with some introductions. Could you tell us a bit about yourself, your role, and how you got involved in Kubernetes?

Jim Bugwadia: My name is Jim Bugwadia, and I am a co-founder and the CEO at Nirmata which provides solutions that automate security and compliance for cloud-native workloads. At Nirmata, we have been working with Kubernetes since it started in 2014. We initially built a Kubernetes policy engine in our commercial platform and later donated it to CNCF as the Kyverno project. I joined the CNCF Kubernetes Policy Working Group to help build and standardize various aspects of policy management for Kubernetes and later became a co-chair.

Andy Suderman: My name is Andy Suderman and I am the CTO of Fairwinds, a managed Kubernetes-as-a-Service provider. I began working with Kubernetes in 2016 building a web conferencing platform. I am an author and/or maintainer of several Kubernetes-related open-source projects such as Goldilocks, Pluto, and Polaris. Polaris is a JSON-schema-based policy engine, which started Fairwinds' journey into the policy space and my involvement in the Policy Working Group.

Poonam Lamba: My name is Poonam Lamba, and I currently work as a Product Manager for Google Kubernetes Engine (GKE) at Google. My journey with Kubernetes began back in 2017 when I was building an SRE platform for a large enterprise, using a private cloud built on Kubernetes. Intrigued by its potential to revolutionize the way we deployed and managed applications at the time, I dove headfirst into learning everything I could about it. Since then, I've had the opportunity to build the policy and compliance products for GKE. I lead and contribute to GKE CIS benchmarks. I am also involved with the Gatekeeper project, and I contributed to the Policy WG for over two years and served as a co-chair of the group.

Responses to the following questions represent an amalgamation of insights from the former co-chairs.

About Working Groups

One thing even I am not aware of is the difference between a working group and a SIG. Can you help us understand what a working group is and how it is different from a SIG?

Unlike SIGs, working groups are temporary and focused on tackling specific, cross-cutting issues or projects that may involve multiple SIGs. Their lifespan is defined, and they disband once they've achieved their objective. Generally, working groups don't own code or have long-term responsibility for managing a particular area of the Kubernetes project.

(To learn more about SIGs, see the list of Special Interest Groups.)

You mentioned that Working Groups involve multiple SIGs. Which SIGs was the Policy WG closely involved with, and how did you coordinate with them?

The group collaborated closely with Kubernetes SIG Auth throughout our existence and, since its formation, with SIG Security as well. Our collaboration occurred in a few ways. We provided periodic updates during the SIG meetings to keep them informed of our progress and activities. Additionally, we used other community forums to maintain open lines of communication and ensure our work aligned with the broader Kubernetes ecosystem. This collaborative approach helped the group stay coordinated with related efforts across the Kubernetes community.

Policy WG

Why was the Policy Working Group created?

To enable a broad set of use cases, we recognize that Kubernetes is powered by a highly declarative, fine-grained, and extensible configuration management system. We've observed that a Kubernetes configuration manifest may have different portions that are important to various stakeholders. For example, some parts may be crucial for developers, while others might be of particular interest to security teams or address operational concerns. Given this complexity, we believe that policies governing the usage of these intricate configurations are essential for success with Kubernetes.

Our Policy Working Group was created specifically to research the standardization of policy definitions and related artifacts. We saw a need to bring consistency and clarity to how policies are defined and implemented across the Kubernetes ecosystem, given the diverse requirements and stakeholders involved in Kubernetes deployments.

Can you give me an idea of the work you did in the group?

We worked on several Kubernetes policy-related projects. Our initiatives included:

We worked on a Kubernetes Enhancement Proposal (KEP) for the Kubernetes Policy Reports API. This aims to standardize how policy reports are generated and consumed within the Kubernetes ecosystem.

We conducted a CNCF survey to better understand policy usage in the Kubernetes space. This helped gauge the practices and needs across the community at the time.

We wrote a paper that will guide users in achieving PCI-DSS compliance for containers. This is intended to help organizations meet important security standards in their Kubernetes environments.

We also worked on a paper highlighting how shifting security down can benefit organizations. This focuses on the advantages of implementing security measures earlier in the development and deployment process.

Can you tell us what were the main objectives of the Policy Working Group and some of your key accomplishments?

The charter of the Policy WG was to help standardize policy management for Kubernetes and educate the community on best practices.

To accomplish this we updated the Kubernetes documentation (Policies | Kubernetes), produced several whitepapers (Kubernetes Policy Management, Kubernetes GRC), and created the Policy Reports API (API reference) which standardizes reporting across various tools. Several popular tools such as Falco, Trivy, Kyverno, kube-bench, and others support the Policy Report API. A major milestone for the Policy WG was promoting the Policy Reports API to a SIG-level API or finding it a stable home.
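To make the idea of standardized reporting concrete, here is a minimal Python sketch of how a consumer might summarize results from several tools once they share one report shape. The dictionary layout and the field names (`results`, `policy`, `source`, `result`) are assumptions chosen for illustration, not a copy of the Policy Reports API schema.

```python
from collections import Counter

# Hypothetical report, shaped loosely like a policy report object.
# Field names and policy names are illustrative assumptions.
report = {
    "results": [
        {"policy": "require-labels", "source": "kyverno", "result": "pass"},
        {"policy": "disallow-privileged", "source": "kyverno", "result": "fail"},
        {"policy": "cis-1.2.1", "source": "kube-bench", "result": "warn"},
        {"policy": "image-scan", "source": "trivy", "result": "fail"},
    ]
}


def summarize(report):
    """Count results per outcome, regardless of which tool produced them."""
    return dict(Counter(entry["result"] for entry in report["results"]))


summary = summarize(report)
print(summary)
```

Because every tool writes the same outcome field, the consumer needs no per-tool parsing, which is the benefit a shared report format is meant to provide.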

Beyond that, as ValidatingAdmissionPolicy and MutatingAdmissionPolicy approached GA in Kubernetes, a key goal of the WG was to guide and educate the community on the tradeoffs and appropriate usage patterns for these built-in API objects and other CNCF policy management solutions like OPA/Gatekeeper and Kyverno.

Challenges

What were some of the major challenges that the Policy Working Group worked on?

During our work in the Policy Working Group, we encountered several challenges:

One of the main issues we faced was finding time to consistently contribute. Given that many of us have other professional commitments, it can be difficult to dedicate regular time to the working group's initiatives.

Another challenge we experienced was related to our consensus-driven model. While this approach ensures that all voices are heard, it can sometimes lead to slower decision-making processes. We valued thorough discussion and agreement, but this can occasionally delay progress on our projects.

We've also encountered occasional differences of opinion among group members. These situations require careful navigation to ensure that we maintain a collaborative and productive environment while addressing diverse viewpoints.

Lastly, we've noticed that newcomers to the group may find it difficult to contribute effectively without consistent attendance at our meetings. The complex nature of our work often requires ongoing context, which can be challenging for those who aren't able to participate regularly.

Can you tell me more about those challenges? How did you discover each one? What has the impact been? What were some strategies you used to address them?

There are no easy answers, but having more contributors and maintainers greatly helps! Overall the CNCF community is great to work with and is very welcoming to beginners. So, if folks out there are hesitating to get involved, I highly encourage them to attend a WG or SIG meeting and just listen in.

It often takes a few meetings to fully understand the discussions, so don't feel discouraged if you don't grasp everything right away. We made a point to emphasize this and encouraged new members to review documentation as a starting point for getting involved.

Additionally, differences of opinion were valued and encouraged within the Policy-WG. We adhered to the CNCF core values and resolved disagreements by maintaining respect for one another. We also strove to timebox our decisions and assign clear responsibilities to keep things movin

·kubernetes.io·
Blog: Spotlight on Policy Working Group

https://www.kubernetes.dev/blog/2025/10/18/wg-policy-spotlight-2025/

·kubernetes.dev·
Last Week in Kubernetes Development - Week Ending October 12 2025

Week Ending October 12, 2025

https://lwkd.info/2025/20251015

Developer News

The ballots for the Steering Committee Elections are due on October 24th. If you haven’t already, submit your Steering votes. If you have contributed to Kubernetes in the last year but haven’t met the eligibility requirements, you will need to submit an exception request to vote in the steering election; the deadline for exception requests is October 22nd.

The CFP for Maintainer Summit: KubeCon + CloudNativeCon Europe 2026 is open. Please send in your submissions before 14th December 2025.

SIG-Testing is continuing to improve alpha/beta feature coverage, including moving kind-beta-features to release blocking and several other beta jobs to release-informing.

Release Schedule

Next Deadline: Docs Deadline for placeholder PRs, October 23

We are in PRR freeze. Enhancements Freeze will begin this week (16th October). If you are going to miss the deadline, please file an Exception.

Patch releases have been delayed until 22nd October.

Featured PRs

134433: kubeadm print errors during control-plane-wait retries

This PR improves troubleshooting during control plane startup by ensuring that errors encountered while waiting for control plane components are printed during each retry at log verbosity level 5. Previously, these errors were not shown, which made it harder to identify issues when components failed to become ready. With this change, administrators can now see the actual errors without additional steps, making failure causes more visible and debugging faster.
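The behavior described above can be pictured with a small Python sketch: a retry loop that reports each failure when verbosity is high enough instead of swallowing it. This is an illustration of the pattern only, not kubeadm's code; `wait_for_component`, `flaky_check`, and the error text are hypothetical names.

```python
def wait_for_component(check, retries, verbosity, log=print):
    """Retry check(); surface each failure when verbosity is high enough.

    A real wait loop would sleep between attempts; that is omitted here.
    """
    for attempt in range(1, retries + 1):
        try:
            check()
            return attempt
        except Exception as err:
            if verbosity >= 5:
                log(f"attempt {attempt}: {err}")
    raise TimeoutError(f"component not ready after {retries} attempts")


# Hypothetical check that fails twice before succeeding.
calls = {"n": 0}


def flaky_check():
    calls["n"] += 1
    if calls["n"] < 3:
        raise ConnectionError("kube-apiserver not reachable yet")


messages = []
attempts = wait_for_component(flaky_check, retries=5, verbosity=5, log=messages.append)
print(attempts, messages)
```

With a lower verbosity the same loop still succeeds on the third attempt, but the two intermediate errors are never shown, which is the situation the PR addresses.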

KEP of the Week

KEP-4622: New TopologyManager Policy which configure the value of maxAllowableNUMANodes

This KEP introduces a new TopologyManager policy option called max-allowable-numa-nodes, allowing users to configure the maximum number of NUMA nodes supported by the TopologyManager. Previously, this value was hardcoded to 8 as a temporary measure to prevent state explosion. By making it configurable, the KEP enables better support for high-end CPUs with more than 8 NUMA nodes, without changing existing TopologyManager policies or addressing broader resource management aspects.

This KEP is tracked as stable in v1.35
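To see why an upper bound on NUMA nodes guards against state explosion, the following Python sketch counts candidate affinity masks. It assumes candidates are enumerated as non-empty subsets of NUMA node IDs; that is a simplifying assumption for illustration, not a description of the TopologyManager's actual implementation.

```python
from itertools import combinations


def numa_affinity_masks(numa_nodes):
    """Enumerate every non-empty subset of NUMA node IDs as a tuple."""
    ids = range(numa_nodes)
    return [c for size in range(1, numa_nodes + 1) for c in combinations(ids, size)]


def mask_count(numa_nodes):
    """Closed form for the number of non-empty subsets."""
    return 2 ** numa_nodes - 1


for n in (2, 8, 16):
    print(n, mask_count(n))
```

Under this assumption the candidate count roughly doubles with each added NUMA node, so raising the limit is a trade-off the operator should make deliberately.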

Other Merges

Enforce valid label-key format in device tolerations

Add declarative validation and path normalization for ResourceClaim fields

Remove runtime gogo protobuf dependencies from Kubernetes API types

Fix IPv6 allocator for /64 CIDRs

Add -n shorthand flag for kubectl config set-context

Add k8s.update flag to enable validation rules just for updates

Prevent panic when creating an invalid CronJob schedule

Stop calling --chunk-size beta, it’s been around since 2017

Make sure that the eviction controller knows about NoExecute device tolerations

APIApprovalController can run with contextual logging

kubeadm: show control plane retry errors

ResourceClaim: ensure that fields don’t exceed list limits, that shareID is validated, and that it supports the immutable tag and long name format

Add test for endpoint/endpointslice headless label propagation

Maybe don’t let folks create ResourceQuotas with request > limit

kubectl gets -n shorthand for --namespace

Set FeatureGates simultaneously during tests to avoid dependency problems

DeviceRequests exactly and firstAvailable shortcut some logic

Refactor away most of the dependencies on the unmaintained gogo protobuf library

Allocate within IPv6 subnets correctly

resource.k8s.io v1 API is now the default

Prometheus client can handle deprecated/missing metrics

APIserver will abort startup due to invalid CA configuration

Subprojects and Dependency Updates

headlamp v0.36.0 adds EndpointSlice support, label-based search, and clipboard copy for resource names.

cloud-provider-openstack v1.34.1 updates test dependencies and fixes build-script issues across OCCM and CSI plugins. Multiple Helm charts were also updated.

csi-driver-nfs v4.12.1 updates CSI release tools and documentation for NFS volumes.

csi-driver-smb v1.19.1 updates CSI release tooling and improves maintenance scripts.

kubespray v2.29.0 adds new configuration options, supports Kubernetes v1.33.1 and Debian 13 Trixie, and upgrades major components.

prometheus v3.7.0 adds experimental anchored and smoothed rate functions, introduces NHCB, improves rule evaluation and TSDB logging, and deprecates several remote-write metrics.

via Last Week in Kubernetes Development https://lwkd.info/

October 15, 2025 at 06:00PM

·lwkd.info·
AWS Deprecates Two Dozen Services (Most of Which You've Never Heard Of)
AWS has done its quarterly housecleaning / "Googling" of its services, and deprecated what appears at first glance to be a startlingly long list. However, going through them put my mind at ease, and I'm hoping this post can do the same for you.
·lastweekinaws.com·
The Data Engineer's guide to optimizing Kubernetes with Niels Claeys

The Data Engineer's guide to optimizing Kubernetes, with Niels Claeys

https://ku.bz/hGRfkzDJW

Niels Claeys shares how his team at DataMinded built Conveyor, a data platform processing up to 1.5 million core hours monthly. He explains the specific optimizations they discovered through production experience, from scheduler changes that immediately reduce costs by 10-15% to achieving 97% spot instance usage without reliability issues.

You will learn:

Why the default Kubernetes scheduler wastes money on batch workloads and how switching from "least allocated" to "most allocated" scheduling enables faster scale-down and better resource utilization

How to achieve 97% spot instance adoption through strategic instance type diversification, region selection, and Spark-specific techniques

Node pool design principles that balance Kubernetes overhead with workload efficiency

Platform-specific gotchas like AWS cross-AZ data transfer costs that can spike bills unexpectedly
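The difference between the two scheduling strategies mentioned in the list above can be sketched in a few lines of Python. The formulas are simplified to a single resource and the node names and numbers are hypothetical; the real scheduler weighs several resources and plugins.

```python
def least_allocated_score(requested, capacity):
    """Higher score for emptier nodes: spreads pods out."""
    return (capacity - requested) * 100 / capacity


def most_allocated_score(requested, capacity):
    """Higher score for fuller nodes: packs pods together."""
    return requested * 100 / capacity


# Hypothetical nodes: CPU millicores already requested, out of 4000 each.
nodes = {"node-a": 3000, "node-b": 500}
capacity = 4000
pod_request = 500


def pick(score_fn):
    # Score each node as if the pod were placed on it, then take the best.
    return max(nodes, key=lambda name: score_fn(nodes[name] + pod_request, capacity))


spread_choice = pick(least_allocated_score)
packed_choice = pick(most_allocated_score)
print(spread_choice, packed_choice)
```

Packing the new pod onto the already busy node leaves the other node closer to empty, which is what lets an autoscaler remove it sooner for batch workloads.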

Sponsor

This episode is brought to you by Testkube—where teams run millions of performance tests in real Kubernetes infrastructure. From air-gapped environments to massive scale deployments, orchestrate every testing tool in one platform. Check it out at testkube.io

More info

Find all the links and info for this episode here: https://ku.bz/hGRfkzDJW

Interested in sponsoring an episode? Learn more.

via KubeFM https://kube.fm

October 14, 2025 at 02:00AM

·kube.fm·
How to use your network to get a job
Common advice for job seekers is to "use your network" but what does that mean, exactly? Let's break...
·dev.to·
DevOps & AI Toolkit - Why Your Infrastructure AI Sucks (And How to Fix It) - https://www.youtube.com/watch?v=Ma3gKmuXahc

Why Your Infrastructure AI Sucks (And How to Fix It)

Discover why your AI agent is completely failing at infrastructure management and learn to build an AI-powered Internal Developer Platform that actually works. Most organizations are treating AI like a search engine, asking vague questions and getting generic answers that break in production. This video reveals the five critical components that transform useless AI into intelligent infrastructure automation.

You'll learn to build capabilities discovery using Vector databases for semantic search across Kubernetes resources, capture organizational patterns from tribal knowledge and documentation, create enforceable policies that guide AI toward compliance, implement proper context management to avoid the bloated mess most systems become, and design intelligent workflows that guide users to the right solutions instead of relying on guesswork. Watch as we demonstrate the complete transformation from a generic AI response to a fully functional PostgreSQL deployment that follows organizational patterns, enforces compliance policies, and deploys correctly the first time.

▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬ Sponsor: Tuple 🔗 https://tuple.app/DOT 👉 Promo code: DOT2025 ▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬

#AIInfrastructure #InternalDeveloperPlatform #KubernetesAI

Consider joining the channel: https://www.youtube.com/c/devopstoolkit/join

▬▬▬▬▬▬ 🔗 Additional Info 🔗 ▬▬▬▬▬▬
➡ Transcript and commands: https://devopstoolkit.live/internal-developer-platforms/why-your-infrastructure-ai-sucks-and-how-to-fix-it
🔗 DevOps AI Toolkit: https://github.com/vfarcic/dot-ai
🎬 Stop Blaming AI: Vector DBs + RAG = Game Changer: https://youtu.be/zqpJr1qZhTg
🎬 Why Kubernetes Discovery Sucks for AI (And How Vector DBs Fix It): https://youtu.be/MSNstHj4rmk

▬▬▬▬▬▬ 💰 Sponsorships 💰 ▬▬▬▬▬▬ If you are interested in sponsoring this channel, please visit https://devopstoolkit.live/sponsor for more information. Alternatively, feel free to contact me over Twitter or LinkedIn (see below).

▬▬▬▬▬▬ 👋 Contact me 👋 ▬▬▬▬▬▬
➡ BlueSky: https://vfarcic.bsky.social
➡ LinkedIn: https://www.linkedin.com/in/viktorfarcic/

▬▬▬▬▬▬ 🚀 Other Channels 🚀 ▬▬▬▬▬▬
🎤 Podcast: https://www.devopsparadox.com/
💬 Live streams: https://www.youtube.com/c/DevOpsParadox

▬▬▬▬▬▬ ⏱ Timecodes ⏱ ▬▬▬▬▬▬
00:00 AI for Infrastructure Challenges
01:42 Tuple (sponsor)
03:16 Why Your AI Agent Is Useless
09:52 Kubernetes API Discovery That Actually Works
13:41 Organizational Knowledge AI Can Actually Use
17:49 Stop Breaking Production With AI
22:17 The Context Window Disaster Nobody Talks About
25:16 Smart Conversations That Get Results
29:34 Your Complete AI-Powered IDP Blueprint

via YouTube https://www.youtube.com/watch?v=Ma3gKmuXahc

·youtube.com·
The Making of Flux: The Scale a KubeFM Original Series

The Making of Flux: The Scale, a KubeFM Original Series

https://ku.bz/tWcHlJm7M

In this episode, Philippe Ensarguet, VP of Software Engineering at Orange, and Arnab Chatterjee, Global Head of Container & AI Platforms at Nomura, share how large enterprises are adopting Flux to drive reliable, compliant, and scalable platforms.

How Orange uses Flux to manage bare-metal Kubernetes through its SYLVR project.

Why Nomura relies on GitOps to balance agility with governance in financial services.

How Flux helps enterprises achieve resilience, compliance, and repeatability at scale.

Sponsor

Join the Flux maintainers and community at FluxCon, November 11th in Atlanta—register here

More info

Find all the links and info for this episode here: https://ku.bz/tWcHlJm7M

Interested in sponsoring an episode? Learn more.

via KubeFM https://kube.fm

October 13, 2025 at 06:00AM

·kube.fm·
Last Week in Kubernetes Development - Week Ending October 5 2025

Week Ending October 5, 2025

https://lwkd.info/2025/20251010

Developer News

Joaquim Rocha has been nominated to be one of the new SIG UI leads. Congrats Joaquim!

Folks are discussing the deprecation of cgroups v1. The whole discussion is on the mailing list.

There are some updates in the release informing and blocking jobs to improve alpha/beta coverage. Find the full list of jobs moved to release informing and blocking status here

Release Schedule

Next Deadline: Enhancements Freeze, October 16

All enhancements are expected to have met the requirements by the freeze. Those that don’t meet the requirements will be removed from the milestone and will require an Exception.

Kubernetes v1.35.0-alpha.1 is out!

The cherry-pick deadline for patch releases is Oct 10.

Steering Committee Election

The Steering Committee Election voting ends on Friday, 24th October, AoE. You can check your eligibility to vote in the voting app, and file an exception request by October 22 if you need an exception. Don’t forget to cast your votes if you haven’t already!

Featured PRs

133697: Codify feature gate dependencies

With this PR, feature gate dependencies can be explicitly declared and enforced. This has been ad-hoc or implicit in the past. Components will now refuse to start if a feature is enabled without its required dependencies. Feature Owners should review the backfilled dependencies, while users who manually toggle feature gates must ensure dependent features are also enabled—especially noting that AllAlpha=true now requires AllBeta=true or equivalent beta features to be set.
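As a rough illustration of the startup check described above, here is a minimal Python sketch. It is not the Kubernetes implementation (which is written in Go); the gate names `FeatureA`, `FeatureB`, `FeatureC` and the `DEPENDENCIES` map are hypothetical and exist only to show the idea of refusing to start when an enabled gate has a disabled dependency.

```python
from typing import Dict, List

# Hypothetical gate names and dependency map, for illustration only.
DEPENDENCIES: Dict[str, List[str]] = {
    "FeatureB": ["FeatureA"],
    "FeatureC": ["FeatureB"],
}


def validate_gates(enabled: Dict[str, bool]) -> List[str]:
    """Return one error per enabled gate whose declared dependency is disabled."""
    errors: List[str] = []
    for gate, deps in sorted(DEPENDENCIES.items()):
        if not enabled.get(gate, False):
            continue  # a disabled gate imposes no requirements
        for dep in deps:
            if not enabled.get(dep, False):
                errors.append(f"{gate} requires {dep} to be enabled")
    return errors


def start_component(enabled: Dict[str, bool]) -> str:
    """Refuse to start (raise) if any dependency check fails."""
    errors = validate_gates(enabled)
    if errors:
        raise ValueError("; ".join(errors))
    return "started"
```

In this sketch, `start_component({"FeatureA": True, "FeatureB": True})` succeeds, while `start_component({"FeatureB": True})` raises because `FeatureA` is not enabled.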

KEP of the Week

KEP 859: Include kubectl command metadata in http request headers

This KEP aims to add extra HTTP headers to kubectl requests sent to the Kubernetes apiserver. These headers would share details such as which kubectl command was used, the flags included, a session ID, and whether the command is deprecated. This would help cluster administrators understand how users interact with the cluster, making it easier to debug issues, track usage, and gather insights, without exposing any sensitive data.

This KEP is tracked for GA in v1.35.
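To make the idea concrete, here is a small Python sketch of how such metadata headers could be assembled before a request is sent. The header names (`Kubectl-Command`, `Kubectl-Session`, `Kubectl-Flags`, `Kubectl-Deprecated`) and the function `build_kubectl_headers` are assumptions for illustration; consult the KEP for the exact names and semantics. Recording flag names without their values is one assumed way to avoid exposing sensitive data.

```python
import uuid
from typing import Dict, List


def build_kubectl_headers(
    command_path: List[str],
    flag_names: List[str],
    session_id: str,
    deprecated: bool,
) -> Dict[str, str]:
    """Build illustrative metadata headers for one kubectl invocation.

    Header names are assumptions, not confirmed by the summary above.
    """
    headers = {
        "Kubectl-Command": " ".join(command_path),
        "Kubectl-Session": session_id,
    }
    if flag_names:
        # Only flag names are recorded, never flag values.
        headers["Kubectl-Flags"] = ",".join(sorted(flag_names))
    if deprecated:
        headers["Kubectl-Deprecated"] = "true"
    return headers


# One session ID per invocation lets an administrator group related requests.
session = str(uuid.uuid4())
example = build_kubectl_headers(
    ["kubectl", "get", "pods"], ["--namespace", "--output"], session, False
)
```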

Other Merges

Disable SchedulerAsyncAPICalls feature gate to prevent scheduler performance issues under high API server load.

Add path normalization to error matcher for improved field validation.

DeviceClass now enforces a maximum of 32 selectors and configs via declarative validation.

Add declarative validation +k8s:maxItems tag to ResourceClaim.

HPA controller now exposes desired_replicas metric to track scaling history.

Fix preemptor pod behavior to prevent endless scheduling loops during slow victim deletion.

Feature gate dependencies are now explicit and validated at startup, preventing enabling a feature if its dependencies are disabled.

kube-scheduler introduces lightweight AssumeCache in VolumeBinding plugin to fix occasional pod scheduling delays.

Version Updates

etcd to v3.6.5

Subprojects and Dependency Updates

cluster-api v1.11.2 extends Kubernetes support to v1.34 for both management and workload clusters, adds CoreDNS migration v1.0.28, and introduces Metal3 as an IPAM provider.

cluster-api v1.10.7 adds Kubernetes v1.33 compatibility and updates CoreDNS migration to v1.0.28.

coredns v1.13.1 updates Go to v1.25.2 to address security issues, improves performance, and enhances the sign plugin by rejecting invalid UTF-8 tokens.

coredns v1.13.0 introduces a new Nomad plugin, fixes Corefile loop and import issues, improves shutdown handling, and hardens gRPC and reload behavior.

containerd API v1.10.0-beta.1 adds a mount manager and aligns with containerd 2.2 APIs (pre-release).

kOps v1.34.0-beta.1 updates AWS and Azure components (VPC CNI v1.20.2, Cilium v1.18.2, Calico v3.30.3), upgrades etcd to v3.6.5, drops Canal support, and removes Kubernetes 1.28 compatibility.

autoscaler vertical-pod-autoscaler v1.5.1 updates the default VPA version and client-go dependency to improve stability.

autoscaler cluster-autoscaler-chart v0.1.1 introduces automatic resource adjustment for workloads through Helm.

csi-driver-nfs v4.12.0 updates Go to 1.24, fixes a goroutine leak, and adds support for creating multiple storage classes with Helm.

csi-driver-smb v1.19.0 improves secret handling with special characters, updates CSI sidecars and resizer to v1.14.0, and adds Helm support for multiple storage classes.

headlamp v0.36.0 adds support for EndpointSlice resources, label-based search, and clipboard copy for resource names. It improves table sorting memory, standardizes resource naming, and enhances Helm charts with optional PodDisruptionBudget, backend TLS termination, and security context updates. The release also fixes several UI issues, improves plugin management, and updates shipped Prometheus and App Catalog plugins.

Shoutouts

Drew Hagen – I’d like to take a moment to acknowledge @Matteo for the seriously impressive leadership of a newer release branch management shadow program for the 1.34 release, and all the amazing work putting together strong documentation for branch management!! I remember my experience releasing alpha 3 being very clear what to do and going really smooth. Very little tribal knowledge. And we did most releases async, which I think speaks to how strong this handbook is. I thank you for still being around to observe and help, even if it meant some later nights in your time zone. @xmudrii @jimangel Great work! Y’all have set the foundation for many more cycles to come. Thank you for all of your patience, guidance and support. It was really great learning and working with you all @Angelos Kolaitis @satyampsoni

via Last Week in Kubernetes Development https://lwkd.info/

October 10, 2025 at 09:03AM

·lwkd.info·
Last Week in Kubernetes Development - Week Ending October 5 2025
SYNOLOGY SUPPORT SEAGATE & WD AGAIN - TOO LITTLE, TOO LATE?
Synology (FINALLY) Gives In to 3rd Party HDD Support in 2025 PLUS Series NAS 7/10/25 - Updated with information supplied by Synology on how verifications and product ranges will support different HDD/SSD in DSM 7.3 Of all the stories of 2025, very few had the level of impact on the NAS industry th
·nascompares.com·
SYNOLOGY SUPPORT SEAGATE & WD AGAIN - TOO LITTLE, TOO LATE?
Web KAT Attack! Launch Trailer
Our first game built with Godot. Web-KAT Attack is a straightforward hi-score attack twin-stick shooter, available now on itch.io: https://thehungrybuppis.itch....
·youtube.com·
Web KAT Attack! Launch Trailer
Red Hat GitLab Data Breach: The Crimson Collective's Attack
This breach exposed 570GB of data from 28,000 repositories, affecting 800+ organizations. Crimson Collective leaked Customer Engagement Reports containing credentials, API keys, and infrastructure details from major enterprises.
·blog.gitguardian.com·
Red Hat GitLab Data Breach: The Crimson Collective's Attack
CHAOSScast Episode 120: Practitioner Guides: #5 Demonstrating Organizational Value
In this episode of CHAOSScast, Harmony Elendu talks with Dawn Foster and Bob Killen about their extensive experience in open source and the motivations behind the creation of the CHAOSS Practitioner Guides. These guides aim to help practitioners navigate the overwhelming amount of data related to open source projects and understand how to improve project health and sustainability. The discussion covers strategies for communicating the business value of open source efforts to leadership, framing contributions in a way that resonates with organizational priorities, and prioritizing investments in critical projects. Press download now!
·podcast.chaoss.community·
CHAOSScast Episode 120: Practitioner Guides: #5 Demonstrating Organizational Value
DevOps & AI Toolkit - Ep36 - Ask Me Anything About Anything - https://www.youtube.com/watch?v=iZoTwl8BWCI

Ep36 - Ask Me Anything About Anything

There are no restrictions in this AMA session. You can ask anything about DevOps, AI, Cloud, Kubernetes, Platform Engineering, containers, or anything else.

▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬ Sponsor: Octopus 🔗 Enterprise Support for Argo: https://octopus.com/support/enterprise-argo-support ▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬

▬▬▬▬▬▬ 👋 Contact me 👋 ▬▬▬▬▬▬ ➡ BlueSky: https://vfarcic.bsky.social ➡ LinkedIn: https://www.linkedin.com/in/viktorfarcic/

▬▬▬▬▬▬ 🚀 Other Channels 🚀 ▬▬▬▬▬▬ 🎤 Podcast: https://www.devopsparadox.com/ 💬 Live streams: https://www.youtube.com/c/DevOpsParadox

via YouTube https://www.youtube.com/watch?v=iZoTwl8BWCI

·youtube.com·
DevOps & AI Toolkit - Ep36 - Ask Me Anything About Anything - https://www.youtube.com/watch?v=iZoTwl8BWCI
Asked to do something illegal at work? Here’s what these software engineers did
At FTX, Frank, and Pollen, software engineers were asked to do something potentially illegal, or to go along with what looked like fraud. They obliged in two out of three cases, landed in hot water, and now face jail time. A reminder why it’s never a good idea to go along with such requests.
·blog.pragmaticengineer.com·
Asked to do something illegal at work? Here’s what these software engineers did
How We Integrated Native macOS Workloads with Kubernetes with Vitalii Horbachov

How We Integrated Native macOS Workloads with Kubernetes, with Vitalii Horbachov

https://ku.bz/q_JS76SvM

Vitalii Horbachov explains how Agoda built macOS VZ Kubelet, a custom solution that registers macOS hosts as Kubernetes nodes and spins up macOS VMs using Apple's native virtualization framework. He details their journey from managing 200 Mac minis with bash scripts to a Kubernetes-native approach that handles 20,000 iOS tests at scale.

You will learn:

How to build hybrid runtime pods that combine macOS VMs with Docker sidecar containers for complex CI/CD workflows

Custom OCI image format implementation for managing 55-60GB macOS VM images with layered copy-on-write disks and digest validation

Networking and security challenges including Apple entitlements, direct NIC access, and implementing kubectl exec over SSH

Real-world adoption considerations including MDM-based host lifecycle management and the build vs. buy decision for Apple infrastructure at scale

Sponsor

This episode is brought to you by Testkube—where teams run millions of performance tests in real Kubernetes infrastructure. From air-gapped environments to massive scale deployments, orchestrate every testing tool in one platform. Check it out at testkube.io

More info

Find all the links and info for this episode here: https://ku.bz/q_JS76SvM

Interested in sponsoring an episode? Learn more.

via KubeFM https://kube.fm

October 07, 2025 at 06:00AM

·kube.fm·
How We Integrated Native macOS Workloads with Kubernetes with Vitalii Horbachov
The YAML Games | A KubeCon Quiz Series
Join us for four epic battles of wits at KubeCon where you'll face impossible questions and win exclusive swag.
·yaml.games·
The YAML Games | A KubeCon Quiz Series
Introducing Headlamp Plugin for Karpenter - Scaling and Visibility

Introducing Headlamp Plugin for Karpenter - Scaling and Visibility

https://kubernetes.io/blog/2025/10/06/introducing-headlamp-plugin-for-karpenter/

Headlamp is an open‑source, extensible Kubernetes SIG UI project designed to let you explore, manage, and debug cluster resources.

Karpenter is a Kubernetes Autoscaling SIG node provisioning project that helps clusters scale quickly and efficiently. It launches new nodes in seconds, selects appropriate instance types for workloads, and manages the full node lifecycle, including scale-down.

The new Headlamp Karpenter Plugin adds real-time visibility into Karpenter’s activity directly from the Headlamp UI. It shows how Karpenter resources relate to Kubernetes objects, displays live metrics, and surfaces scaling events as they happen. You can inspect pending pods during provisioning, review scaling decisions, and edit Karpenter-managed resources with built-in validation. The Karpenter plugin was built as part of an LFX mentorship project.

The Karpenter plugin for Headlamp aims to make it easier for Kubernetes users and operators to understand, debug, and fine-tune autoscaling behavior in their clusters. Now we will give a brief tour of the Headlamp plugin.

Map view of Karpenter Resources and how they relate to Kubernetes resources

Easily see how Karpenter resources such as NodeClasses, NodePools, and NodeClaims connect with core Kubernetes resources such as Pods and Nodes.

Visualization of Karpenter Metrics

Get instant insight into resource usage vs. limits, allowed disruptions, pending pods, provisioning latency, and more.

Scaling decisions

See which instances are being provisioned for your workloads and understand why Karpenter made those choices. This is helpful when debugging.

Config editor with validation support

Make live edits to Karpenter configurations. The editor includes diff previews and resource validation for safer adjustments.

Real-time view of Karpenter resources

View and track Karpenter-specific resources, such as NodeClaims, in real time as your cluster scales up and down.

Dashboard for Pending Pods

View all pending pods with unmet scheduling requirements or FailedScheduling events, with highlights showing why they couldn't be scheduled.

Karpenter Providers

This plugin should work with most Karpenter providers, but so far it has been tested only on the ones marked in the table. Each provider also exposes some extra provider-specific information; the table shows which of it the plugin displays.

Provider Name

Tested

Extra provider specific info supported

AWS

Azure

AlibabaCloud

Bizfly Cloud

Cluster API

GCP

Proxmox

Oracle Cloud Infrastructure (OCI)

Please submit an issue if you test one of the untested providers or if you want support for a particular provider (PRs are also gladly accepted).

How to use

Please see plugins/karpenter/README.md for instructions on how to use it.

Feedback and Questions

Please submit an issue if you use Karpenter and have any other ideas or feedback, or come to the Headlamp channel on the Kubernetes Slack for a chat.

via Kubernetes Blog https://kubernetes.io/

October 05, 2025 at 08:00PM

·kubernetes.io·
Introducing Headlamp Plugin for Karpenter - Scaling and Visibility
DevOps & AI Toolkit - Kubernetes Controllers Deep Dive: How They Really Work - https://www.youtube.com/watch?v=kss081c8EqY

Kubernetes Controllers Deep Dive: How They Really Work

Most people using Kubernetes know how to write YAML and run kubectl apply, but when things break, they're completely lost. The secret they're missing? Understanding controllers - the beating heart that makes Kubernetes actually work. Controllers are what automatically restart your crashed pods, scale your applications, and make custom resources feel native to the platform.

This video dives deep into the real mechanics of how Kubernetes controllers operate. You'll discover how controllers consume and emit events to coordinate with each other, how the reconciliation loop continuously maintains your desired state, and how the Watch API efficiently streams changes without overwhelming the system. We'll explore custom resource definitions that extend Kubernetes, controller communication patterns, and the event-driven architecture that makes everything self-healing. Whether you're debugging cluster issues or building your own controllers, this knowledge will transform how you think about Kubernetes from just throwing YAML at the wall to truly understanding the orchestration engine underneath.
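The reconciliation loop mentioned above can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions, not code from the video or from Kubernetes: state is modeled as plain dictionaries mapping a workload name to a replica count, the function names are hypothetical, and the loop polls in-process, whereas real controllers react to changes streamed through the Watch API.

```python
from typing import Dict, List, Tuple

Action = Tuple[str, str, int]  # (verb, workload name, replica count)


def reconcile(desired: Dict[str, int], actual: Dict[str, int]) -> List[Action]:
    """Compare desired and actual replica counts and return the actions needed."""
    actions: List[Action] = []
    for name, want in sorted(desired.items()):
        have = actual.get(name, 0)
        if have < want:
            actions.append(("create", name, want - have))
        elif have > want:
            actions.append(("delete", name, have - want))
    # Anything running that is no longer desired gets removed.
    for name, have in sorted(actual.items()):
        if name not in desired and have > 0:
            actions.append(("delete", name, have))
    return actions


def apply_actions(actions: List[Action], actual: Dict[str, int]) -> None:
    """Mutate the actual state as if the actions had been carried out."""
    for verb, name, count in actions:
        delta = count if verb == "create" else -count
        actual[name] = actual.get(name, 0) + delta
        if actual[name] == 0:
            del actual[name]


def control_loop(desired: Dict[str, int], actual: Dict[str, int], max_iterations: int = 10) -> int:
    """Reconcile until nothing is left to do; return the number of passes that acted."""
    for i in range(max_iterations):
        actions = reconcile(desired, actual)
        if not actions:
            return i
        apply_actions(actions, actual)
    return max_iterations
```

The key property is that `reconcile` depends only on the current desired and actual state, not on which event triggered it, so running it again after convergence produces no further actions.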

#KubernetesControllers #Kubernetes #DevOpsEngineering

Consider joining the channel: https://www.youtube.com/c/devopstoolkit/join

▬▬▬▬▬▬ 🔗 Additional Info 🔗 ▬▬▬▬▬▬ ➡ Transcript and commands: https://devopstoolkit.live/kubernetes/kubernetes-controllers-deep-dive-how-they-really-work 🔗 Kubernetes: https://kubernetes.io

▬▬▬▬▬▬ 💰 Sponsorships 💰 ▬▬▬▬▬▬ If you are interested in sponsoring this channel, please visit https://devopstoolkit.live/sponsor for more information. Alternatively, feel free to contact me over Twitter or LinkedIn (see below).

▬▬▬▬▬▬ 👋 Contact me 👋 ▬▬▬▬▬▬ ➡ BlueSky: https://vfarcic.bsky.social ➡ LinkedIn: https://www.linkedin.com/in/viktorfarcic/

▬▬▬▬▬▬ 🚀 Other Channels 🚀 ▬▬▬▬▬▬ 🎤 Podcast: https://www.devopsparadox.com/ 💬 Live streams: https://www.youtube.com/c/DevOpsParadox

▬▬▬▬▬▬ ⏱ Timecodes ⏱ ▬▬▬▬▬▬ 00:00 Kubernetes Controllers Deep Dive 01:18 Kubernetes Control Loops Explained 04:12 How Kubernetes Controllers Watch Events 07:35 Kubernetes Event Emission 11:56 Kubernetes Reconciliation Loop 17:12 Kubernetes Watch API 21:01 Kubernetes Custom Resource Definitions (CRDs) 21:13 Kubernetes Controller Communication 25:22 Kubernetes Controllers Mastery

via YouTube https://www.youtube.com/watch?v=kss081c8EqY

·youtube.com·
DevOps & AI Toolkit - Kubernetes Controllers Deep Dive: How They Really Work - https://www.youtube.com/watch?v=kss081c8EqY