1_r/devopsish

Super-Scaling Open Policy Agent with Batch Queries with Nicholaos Mouzourakis
Super-Scaling Open Policy Agent with Batch Queries with Nicholaos Mouzourakis

Super-Scaling Open Policy Agent with Batch Queries, with Nicholaos Mouzourakis

https://ku.bz/S-2vQ_j-4

Dive into the technical challenges of scaling authorization in Kubernetes with this in-depth conversation about Open Policy Agent (OPA).

Nicholaos Mouzourakis, Staff Product Security Engineer at Gusto, explains how his team re-architected Kubernetes native authorization using OPA to support scale, latency guarantees, and audit requirements across services. He shares detailed insights about their journey optimizing OPA performance through batch queries and solving unexpected interactions between Kubernetes resource limits and Go's runtime behavior.

You will learn:

Why traditional authorization approaches (code-driven and data-driven) fall short in microservice architectures, and how OPA provides a more flexible, decoupled solution

How batch authorization can improve performance by up to 18x by reducing network round-trips

The unexpected interaction between Kubernetes CPU limits and Go's thread management (GOMAXPROCS) that can severely impact OPA performance (see the illustrative sketch after this list)

Practical deployment strategies for OPA in production environments, including considerations for sidecars, daemon sets, and WASM modules
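
The GOMAXPROCS issue mentioned above is commonly mitigated by aligning Go's scheduler with the container's CPU limit. Below is a minimal, illustrative sketch only; the deployment name, image tag, and limits are placeholders, not the configuration discussed in the episode:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: opa                      # placeholder name
spec:
  replicas: 1
  selector:
    matchLabels:
      app: opa
  template:
    metadata:
      labels:
        app: opa
    spec:
      containers:
        - name: opa
          image: openpolicyagent/opa:latest
          args: ["run", "--server"]
          resources:
            limits:
              cpu: "4"
              memory: 1Gi
          env:
            - name: GOMAXPROCS          # the Go runtime honors this env var
              valueFrom:
                resourceFieldRef:
                  resource: limits.cpu  # rounds the CPU limit to a whole number of CPUs
                  divisor: "1"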

Sponsor

This episode is sponsored by Learnk8s — get started on your Kubernetes journey through comprehensive online, in-person or remote training.

More info

Find all the links and info for this episode here: https://ku.bz/S-2vQ_j-4

Interested in sponsoring an episode? Learn more.

via KubeFM https://kube.fm

May 13, 2025 at 06:00AM

·kube.fm·
Super-Scaling Open Policy Agent with Batch Queries with Nicholaos Mouzourakis
Multiple Security Issues in Screen
Multiple Security Issues in Screen
Screen is the traditional terminal multiplexer software used on Linux and Unix systems. We found a local root exploit in Screen 5.0.0 affecting Arch Linux and NetBSD, as well as a couple of other issues that partly also affect older Screen versions, which are still found in the majority of distributions.
·security.opensuse.org·
Multiple Security Issues in Screen
Kubernetes v1.33: Image Pull Policy the way you always thought it worked!
Kubernetes v1.33: Image Pull Policy the way you always thought it worked!

Kubernetes v1.33: Image Pull Policy the way you always thought it worked!

https://kubernetes.io/blog/2025/05/12/kubernetes-v1-33-ensure-secret-pulled-images-alpha/

Image Pull Policy the way you always thought it worked!

Some things in Kubernetes are surprising, and the way imagePullPolicy behaves might be one of them. Given Kubernetes is all about running pods, it may be peculiar to learn that there has been a caveat to restricting pod access to authenticated images for over 10 years in the form of issue 18787! It is an exciting release when you can resolve a ten-year-old issue.

Note: Throughout this blog post, the term "pod credentials" will be used often. In this context, the term generally encapsulates the authentication material that is available to a pod to authenticate a container image pull.

IfNotPresent, even if I'm not supposed to have it

The gist of the problem is that the imagePullPolicy: IfNotPresent strategy has done precisely what it says, and nothing more. Let's set up a scenario. To begin, Pod A in Namespace X is scheduled to Node 1 and requires image Foo from a private repository. For its image pull authentication material, the pod references Secret 1 in its imagePullSecrets. Secret 1 contains the necessary credentials to pull from the private repository. The Kubelet will utilize the credentials from Secret 1 as supplied by Pod A and it will pull container image Foo from the registry. This is the intended (and secure) behavior.

But now things get curious. If Pod B in Namespace Y happens to also be scheduled to Node 1, unexpected (and potentially insecure) things happen. Pod B may reference the same private image, specifying the IfNotPresent image pull policy. Pod B does not reference Secret 1 (or in our case, any secret) in its imagePullSecrets. When the Kubelet tries to run the pod, it honors the IfNotPresent policy. The Kubelet sees that the image Foo is already present locally, and will provide image Foo to Pod B. Pod B gets to run the image even though it did not provide credentials authorizing it to pull the image in the first place.

Using a private image pulled by a different pod

While IfNotPresent should not pull image Foo if it is already present on the node, it is an incorrect security posture to allow all pods scheduled to a node to have access to a previously pulled private image. These pods were never authorized to pull the image in the first place.
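
To make the scenario concrete, here is a minimal sketch of the two pods; the names, namespaces, registry, and secret are illustrative:

# Pod A in Namespace X: supplies credentials for the private image
apiVersion: v1
kind: Pod
metadata:
  name: pod-a
  namespace: namespace-x
spec:
  imagePullSecrets:
    - name: secret-1                          # contains registry credentials
  containers:
    - name: app
      image: registry.example.com/private/foo:1.0
      imagePullPolicy: IfNotPresent
---
# Pod B in Namespace Y: same image and policy, but no imagePullSecrets.
# Before v1.33, it could still run the image if Pod A had already pulled it to the node.
apiVersion: v1
kind: Pod
metadata:
  name: pod-b
  namespace: namespace-y
spec:
  containers:
    - name: app
      image: registry.example.com/private/foo:1.0
      imagePullPolicy: IfNotPresent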

IfNotPresent, but only if I am supposed to have it

In Kubernetes v1.33, we - SIG Auth and SIG Node - have finally started to address this (really old) problem and to get the verification right! The basic expected behavior is not changed. If an image is not present, the Kubelet will attempt to pull the image. The credentials each pod supplies will be utilized for this task. This matches behavior prior to 1.33.

If the image is present, then the behavior of the Kubelet changes. The Kubelet will now verify the pod's credentials before allowing the pod to use the image.

Performance and service stability have been a consideration while revising the feature. Pods utilizing the same credential will not be required to re-authenticate. This is also true when pods source credentials from the same Kubernetes Secret object, even when the credentials are rotated.

Never pull, but use if authorized

The imagePullPolicy: Never option does not fetch images. However, if the container image is already present on the node, any pod attempting to use the private image will be required to provide credentials, and those credentials require verification.

Pods utilizing the same credential will not be required to re-authenticate. Pods that do not supply credentials previously used to successfully pull an image will not be allowed to use the private image.

Always pull, if authorized

The imagePullPolicy: Always has always worked as intended. Each time an image is requested, the request goes to the registry and the registry will perform an authentication check.

In the past, forcing the Always image pull policy via pod admission was the only way to ensure that your private container images didn't get reused by other pods on nodes which already pulled the images.

Fortunately, this was somewhat performant. Only the image manifest was pulled, not the image. However, there was still a cost and a risk. During a new rollout, scale up, or pod restart, the image registry that provided the image MUST be available for the auth check, putting the image registry in the critical path for stability of services running inside of the cluster.

How it all works

The feature is based on persistent, file-based caches that are present on each of the nodes. The following is a simplified description of how the feature works. For the complete version, please see KEP-2535.

The process of requesting an image for the first time goes like this:

A pod requesting an image from a private registry is scheduled to a node.

The image is not present on the node.

The Kubelet makes a record of the intention to pull the image.

The Kubelet extracts credentials from the Kubernetes Secret referenced by the pod as an image pull secret, and uses them to pull the image from the private registry.

After the image has been successfully pulled, the Kubelet makes a record of the successful pull. This record includes details about credentials used (in the form of a hash) as well as the Secret from which they originated.

The Kubelet removes the original record of intent.

The Kubelet retains the record of successful pull for later use.

When future pods scheduled to the same node request the previously pulled private image:

The Kubelet checks the credentials that the new pod provides for the pull.

If the hash of these credentials, or their source Secret, matches the hash or source Secret recorded for a previous successful pull, the pod is allowed to use the previously pulled image.

If the credentials or their source Secret are not found in the records of successful pulls for that image, the Kubelet will attempt to use these new credentials to request a pull from the remote registry, triggering the authorization flow.

Try it out

In Kubernetes v1.33 we shipped the alpha version of this feature. To give it a spin, enable the KubeletEnsureSecretPulledImages feature gate for your 1.33 Kubelets.
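
For example, with a kubelet configuration file, the gate can be enabled like this (fragment only; the rest of your KubeletConfiguration stays as-is):

apiVersion: kubelet.config.k8s.io/v1beta1
kind: KubeletConfiguration
featureGates:
  KubeletEnsureSecretPulledImages: true   # alpha in v1.33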

You can learn more about the feature and additional optional configuration on the concept page for Images in the official Kubernetes documentation.

What's next?

In future releases we are going to:

Make this feature work together with Projected service account tokens for Kubelet image credential providers which adds a new, workload-specific source of image pull credentials.

Write a benchmarking suite to measure the performance of this feature and assess the impact of any future changes.

Implement an in-memory caching layer so that we don't need to read files for each image pull request.

Add support for credential expirations, thus forcing previously validated credentials to be re-authenticated.

How to get involved

Reading KEP-2535 is a great way to understand these changes in depth.

If you are interested in further involvement, reach out to us on the #sig-auth-authenticators-dev channel on Kubernetes Slack (for an invitation, visit https://slack.k8s.io/). You are also welcome to join the bi-weekly SIG Auth meetings, held every other Wednesday.

via Kubernetes Blog https://kubernetes.io/

May 12, 2025 at 02:30PM

·kubernetes.io·
Kubernetes v1.33: Image Pull Policy the way you always thought it worked!
DevOps Toolkit - Claude Code: AI Agent for DevOps SRE and Platform Engineering - https://www.youtube.com/watch?v=h-6LP133o6w
DevOps Toolkit - Claude Code: AI Agent for DevOps SRE and Platform Engineering - https://www.youtube.com/watch?v=h-6LP133o6w

Claude Code: AI Agent for DevOps, SRE, and Platform Engineering

Discover the ultimate AI agent for DevOps, SRE, and Platform Engineering! This video explores Claude Code from Anthropic, comparing it to popular tools like GitHub Copilot and Cursor. Learn how Claude Code excels in terminal-based operations, understanding complex project structures, and executing commands with precision. See examples of its capabilities in setting up environments, running tests, and analyzing code. Uncover the pros and cons, including its superior performance and potential cost considerations. This video offers valuable insights into the future of AI in software engineering.

#AICodeAssistant, #DevOpsTools, #SoftwareEngineering

Consider joining the channel: https://www.youtube.com/c/devopstoolkit/join

▬▬▬▬▬▬ 🔗 Additional Info 🔗 ▬▬▬▬▬▬ ➡ Transcript and commands: https://devopstoolkit.live/ai/claude-code-ai-agent-for-devops-sre-and-platform-engineering 🔗 Anthropic: https://anthropic.com

▬▬▬▬▬▬ 💰 Sponsorships 💰 ▬▬▬▬▬▬ If you are interested in sponsoring this channel, please visit https://devopstoolkit.live/sponsor for more information. Alternatively, feel free to contact me over Twitter or LinkedIn (see below).

▬▬▬▬▬▬ 👋 Contact me 👋 ▬▬▬▬▬▬ ➡ BlueSky: https://vfarcic.bsky.social ➡ LinkedIn: https://www.linkedin.com/in/viktorfarcic/

▬▬▬▬▬▬ 🚀 Other Channels 🚀 ▬▬▬▬▬▬ 🎤 Podcast: https://www.devopsparadox.com/ 💬 Live streams: https://www.youtube.com/c/DevOpsParadox

▬▬▬▬▬▬ ⏱ Timecodes ⏱ ▬▬▬▬▬▬
00:00 The Best AI Agent for Software Engineers
01:57 Claude Code AI Agent in Action
13:02 Claude Code AI Agent Pros and Cons

via YouTube https://www.youtube.com/watch?v=h-6LP133o6w

·youtube.com·
DevOps Toolkit - Claude Code: AI Agent for DevOps SRE and Platform Engineering - https://www.youtube.com/watch?v=h-6LP133o6w
What is Hollow Core Fiber (HCF)?
What is Hollow Core Fiber (HCF)?
Stay informed with HOLIGHT Fiber Optic's latest blog posts and insights. Discover industry news, product updates, and valuable information in our Posts section.
·holightoptic.com·
What is Hollow Core Fiber (HCF)?
Kubernetes v1.33: Streaming List responses
Kubernetes v1.33: Streaming List responses

Kubernetes v1.33: Streaming List responses

https://kubernetes.io/blog/2025/05/09/kubernetes-v1-33-streaming-list-responses/

Managing Kubernetes cluster stability becomes increasingly critical as your infrastructure grows. One of the most challenging aspects of operating large-scale clusters has been handling List requests that fetch substantial datasets - a common operation that could unexpectedly impact your cluster's stability.

Today, the Kubernetes community is excited to announce a significant architectural improvement: streaming encoding for List responses.

The problem: unnecessary memory consumption with large resources

Current API response encoders just serialize an entire response into a single contiguous memory block and perform one ResponseWriter.Write call to transmit data to the client. Despite HTTP/2's capability to split responses into smaller frames for transmission, the underlying HTTP server continues to hold the complete response data as a single buffer. Even as individual frames are transmitted to the client, the memory associated with these frames cannot be freed incrementally.

When cluster size grows, the single response body can be substantial - like hundreds of megabytes in size. At large scale, the current approach becomes particularly inefficient, as it prevents incremental memory release during transmission. Imagine that network congestion occurs: that large response body’s memory block stays active for tens of seconds or even minutes. This limitation leads to unnecessarily high and prolonged memory consumption in the kube-apiserver process. If multiple large List requests occur simultaneously, the cumulative memory consumption can escalate rapidly, potentially leading to an Out-of-Memory (OOM) situation that compromises cluster stability.

The encoding/json package uses sync.Pool to reuse memory buffers during serialization. While efficient for consistent workloads, this mechanism creates challenges with sporadic large List responses. When processing these large responses, memory pools expand significantly. But due to sync.Pool's design, these oversized buffers remain reserved after use. Subsequent small List requests continue utilizing these large memory allocations, preventing garbage collection and maintaining persistently high memory consumption in the kube-apiserver even after the initial large responses complete.

Additionally, Protocol Buffers are not designed to handle large datasets, but they are great for handling individual messages within a large data set. This highlights the need for streaming-based approaches that can process and transmit large collections incrementally rather than as monolithic blocks.

As a general rule of thumb, if you are dealing in messages larger than a megabyte each, it may be time to consider an alternate strategy.

From https://protobuf.dev/programming-guides/techniques/

Streaming encoder for List responses

The streaming encoding mechanism is specifically designed for List responses, leveraging their common well-defined collection structures. The core idea focuses exclusively on the Items field within collection structures, which represents the bulk of memory consumption in large responses. Rather than encoding the entire Items array as one contiguous memory block, the new streaming encoder processes and transmits each item individually, allowing memory to be freed progressively as each frame or chunk is transmitted. As a result, encoding items one by one significantly reduces the memory footprint required by the API server.

With Kubernetes objects typically limited to 1.5 MiB (from etcd), streaming encoding keeps memory consumption predictable and manageable regardless of how many objects are in a List response. The result is significantly improved API server stability, reduced memory spikes, and better overall cluster performance - especially in environments where multiple large List operations might occur simultaneously.

To ensure perfect backward compatibility, the streaming encoder validates Go struct tags rigorously before activation, guaranteeing byte-for-byte consistency with the original encoder. Standard encoding mechanisms process all fields except Items, maintaining identical output formatting throughout. This approach seamlessly supports all Kubernetes List types—from built-in *List objects to Custom Resource UnstructuredList objects - requiring zero client-side modifications or awareness that the underlying encoding method has changed.

Performance gains you'll notice

Reduced Memory Consumption: Significantly lowers the memory footprint of the API server when handling large list requests, especially when dealing with large resources.

Improved Scalability: Enables the API server to handle more concurrent requests and larger datasets without running out of memory.

Increased Stability: Reduces the risk of OOM kills and service disruptions.

Efficient Resource Utilization: Optimizes memory usage and improves overall resource efficiency.

Benchmark results

To validate the results, Kubernetes has introduced a new list benchmark which concurrently executes 10 list requests, each returning 1 GB of data.

The benchmark showed a 20x improvement, reducing memory usage from 70-80 GB to 3 GB.

List benchmark memory usage

via Kubernetes Blog https://kubernetes.io/

May 09, 2025 at 02:30PM

·kubernetes.io·
Kubernetes v1.33: Streaming List responses
Kubernetes 1.33: Volume Populators Graduate to GA
Kubernetes 1.33: Volume Populators Graduate to GA

Kubernetes 1.33: Volume Populators Graduate to GA

https://kubernetes.io/blog/2025/05/08/kubernetes-v1-33-volume-populators-ga/

Kubernetes volume populators are now generally available (GA)! The AnyVolumeDataSource feature gate is treated as always enabled for Kubernetes v1.33, which means that users can specify any appropriate custom resource as the data source of a PersistentVolumeClaim (PVC).

An example of how to use dataSourceRef in PVC:

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: pvc1
spec:
  ...
  dataSourceRef:
    apiGroup: provider.example.com
    kind: Provider
    name: provider1

What is new

There are four major enhancements from beta.

Populator Pod is optional

During the beta phase, contributors to Kubernetes identified potential resource leaks with PersistentVolumeClaim (PVC) deletion while volume population was in progress; these leaks happened due to limitations in finalizer handling. Ahead of the graduation to general availability, the Kubernetes project added support to delete temporary resources (PVC prime, etc.) if the original PVC is deleted.

To accommodate this, we've introduced three new plugin-based functions:

PopulateFn(): Executes the provider-specific data population logic.

PopulateCompleteFn(): Checks if the data population operation has finished successfully.

PopulateCleanupFn(): Cleans up temporary resources created by the provider-specific functions after data population is completed.

A provider example is added in lib-volume-populator/example.

Mutator functions to modify the Kubernetes resources

For GA, the CSI volume populator controller code gained a MutatorConfig, allowing the specification of mutator functions to modify Kubernetes resources. For example, if the PVC prime is not an exact copy of the PVC and you need provider-specific information for the driver, you can include this information in the optional MutatorConfig. This allows you to customize the Kubernetes objects in the volume populator.

Flexible metric handling for providers

Our beta phase highlighted a new requirement: the need to aggregate metrics not just from lib-volume-populator, but also from other components within the provider's codebase.

To address this, SIG Storage introduced a provider metric manager. This enhancement delegates the implementation of metrics logic to the provider itself, rather than relying solely on lib-volume-populator. This shift provides greater flexibility and control over metrics collection and aggregation, enabling a more comprehensive view of provider performance.

Clean up for temporary resources

During the beta phase, we identified potential resource leaks with PersistentVolumeClaim (PVC) deletion while volume population was in progress, due to limitations in finalizer handling. We have improved the populator to support the deletion of temporary resources (PVC prime, etc.) if the original PVC is deleted in this GA release.

How to use it

To try it out, please follow the steps in the previous beta blog.

Future directions and potential feature requests

As a next step, there are several potential feature requests for the volume populator:

Multi sync: the current implementation is a one-time unidirectional sync from source to destination. This can be extended to support multiple syncs, enabling periodic syncs or allowing users to sync on demand

Bidirectional sync: an extension of multi sync above, but making it bidirectional between source and destination

Populate data with priorities: with a list of different dataSourceRef, populate based on priorities

Populate data from multiple sources of the same provider: populate multiple different sources to one destination

Populate data from multiple sources of the different providers: populate multiple different sources to one destination, pipelining different resources’ population

To ensure we're building something truly valuable, Kubernetes SIG Storage would love to hear about any specific use cases you have in mind for this feature. For any inquiries or specific questions related to volume populator, please reach out to the SIG Storage community.

via Kubernetes Blog https://kubernetes.io/

May 08, 2025 at 02:30PM

·kubernetes.io·
Kubernetes 1.33: Volume Populators Graduate to GA
Last Week in Kubernetes Development - Week Ending May 4 2025
Last Week in Kubernetes Development - Week Ending May 4 2025

Week Ending May 4, 2025

https://lwkd.info/2025/20250508

Developer News

Joel Speed is being nominated as a technical lead for SIG Cloud Provider. This was discussed in the April 23, 2025 SIG Cloud Meeting. Joel has been active in SIG Cloud Provider for about four years.

As development is being planned for the various SIGs for Kubernetes v1.34, Dims is requesting all contributors to evaluate the current state of all feature gates and see if progress can be made on moving them forward. Paco and Baofa created a Google Sheet a few months ago to help get clarity on the state of the feature gates.

The WG Node Lifecycle has received tons of great feedback from the community and is coordinating with the stakeholder SIGs. The next step for the Working Group is to vote on a time and schedule the first meeting. The first two meetings will be used to finalize the WG proposal and ensure that the goals are well defined and prioritized.

Your SIG has 1 week left to propose a project for an LFX intern for this term. If someone has time to mentor, please pitch a project.

Release Schedule

Next Deadline: Release cycle begins soon

Interested in being part of the release team? Now’s your chance, apply to be a release team shadow. Applications are due May 18th.

Cherry-picks for the next set of Patch Releases are due May 9th.

Featured PRs

131627: kube-apiserver to treat error decoding a mutating webhook patch as error calling the webhook

kube-apiserver now treats webhook patch decode failures as webhook call errors; this makes debugging easier by surfacing bad webhook patches as webhook errors instead of server errors.

131586: Completion enabled for aliases defined in kuberc

kubectl enables completion for aliases defined in .kuberc; this makes CLI shortcuts easier to use by allowing shell autocompletion for custom command aliases (see the sketch below).
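
For reference, a kuberc preferences file looks roughly like the sketch below. This is based on the alpha kuberc format; the alias name and arguments are made up, and field names may still change:

apiVersion: kubectl.config.k8s.io/v1alpha1
kind: Preference
aliases:
  - name: getn                 # hypothetical alias, invoked as `kubectl getn`
    command: get
    appendArgs:
      - namespaces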

KEP of the Week

KEP 4818: Allow zero value for Sleep Action of PreStop Hook

This KEP is built on KEP-3960, which introduced the sleep action for the PreStop hook, by allowing a duration of 0 seconds. Previously disallowed, this value is valid in Go’s time.After(0) and acts as a no-op. The change enabled users to define PreStop hooks with sleep: 0s, useful for opting out of default webhook-injected sleeps without triggering validation errors.

This KEP was implemented in Kubernetes 1.33.
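
A minimal illustration of a zero-value sleep in a PreStop hook; the pod name, image, and command are placeholders:

apiVersion: v1
kind: Pod
metadata:
  name: prestop-sleep-zero
spec:
  containers:
    - name: app
      image: registry.k8s.io/e2e-test-images/agnhost:2.45
      command: ["sh", "-c", "sleep 1h"]
      lifecycle:
        preStop:
          sleep:
            seconds: 0    # valid since v1.33; effectively a no-op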

Other Merges

CEL UnstructuredToVal and TypedToVal has() tests expanded

Zero-value metadata.creationTimestamp values are now omitted and no longer serialize an explicit null in JSON, YAML and CBOR output

kubeadm to use named ports for coredns probe

DRA introduces special handling for updates involving a single resource slice

Structured authentication config adds support for CEL expressions with escaped names

Reading of disk geometry before calling expansion for ext and xfs filesystems disabled

Declarative validation simplifies handling of subresources

Fixed a bug in CEL’s common.UnstructuredToVal to respect nil fields

Fixes for bad handling of pointers and aliases in validation

Windows memory pressure eviction test stabilized

New ContainerIter utility added for ranging over pod containers

DRA: Improvements to resource slice publishing

kube-proxy --proxy-mode nftables to not log a bunch of errors when run on a machine with no ipvs support

Request#RequestURI to honor configured context root

ToUnstructured to match stdlib omitempty and anonymous behavior

Version Updates

CNI plugins to v1.7.1

golangci-lint to v2

via Last Week in Kubernetes Development https://lwkd.info/

May 08, 2025 at 03:30PM

·lwkd.info·
Last Week in Kubernetes Development - Week Ending May 4 2025
Major University Open Source Lab Faces Shutdown - Techstrong IT
Major University Open Source Lab Faces Shutdown - Techstrong IT
One of the open source sector's most significant incubators, Oregon State University’s (OSU) Open Source Lab (OSL), is facing budget cuts and may soon shut down. OSL is in financial peril due to a decline in corporate donations and the Trump administration’s cutbacks on federal funding for higher education. “Unless we secure $250,000
·techstrong.it·
Major University Open Source Lab Faces Shutdown - Techstrong IT
SK Telecom scrambles to restore trust after massive data breach
SK Telecom scrambles to restore trust after massive data breach
South Korea’s leading mobile carrier SK Telecom is facing mounting fallout from a recent hacking incident, with more than 70,000 users switching to rival providers in just two days after the company began offering free USIM card replacements. Amid growing concerns that the data breach could spill over into the financial sector, South Korean financial authorities on Wednesday launched an emergency response team and tightened security protocols. According to industry sources, 35,902 SK Telecom use
·koreaherald.com·
SK Telecom scrambles to restore trust after massive data breach
How Not to Disagree
How Not to Disagree
Disagree & Commit is easy to say but hard to do. Here is what happens if you do not.
·boz.com·
How Not to Disagree
Kubernetes v1.33: From Secrets to Service Accounts: Kubernetes Image Pulls Evolved
Kubernetes v1.33: From Secrets to Service Accounts: Kubernetes Image Pulls Evolved

Kubernetes v1.33: From Secrets to Service Accounts: Kubernetes Image Pulls Evolved

https://kubernetes.io/blog/2025/05/07/kubernetes-v1-33-wi-for-image-pulls/

Kubernetes has steadily evolved to reduce reliance on long-lived credentials stored in the API. A prime example of this shift is the transition of Kubernetes Service Account (KSA) tokens from long-lived, static tokens to ephemeral, automatically rotated tokens with OpenID Connect (OIDC)-compliant semantics. This advancement enables workloads to securely authenticate with external services without needing persistent secrets.

However, one major gap remains: image pull authentication. Today, Kubernetes clusters rely on image pull secrets stored in the API, which are long-lived and difficult to rotate, or on node-level kubelet credential providers, which allow any pod running on a node to access the same credentials. This presents security and operational challenges.

To address this, Kubernetes is introducing Service Account Token Integration for Kubelet Credential Providers, now available in alpha. This enhancement allows credential providers to use pod-specific service account tokens to obtain registry credentials, which kubelet can then use for image pulls — eliminating the need for long-lived image pull secrets.

The problem with image pull secrets

Currently, Kubernetes administrators have two primary options for handling private container image pulls:

Image pull secrets stored in the Kubernetes API

These secrets are often long-lived because they are hard to rotate.

They must be explicitly attached to a service account or pod.

Compromise of a pull secret can lead to unauthorized image access.

Kubelet credential providers

These providers fetch credentials dynamically at the node level.

Any pod running on the node can access the same credentials.

There’s no per-workload isolation, increasing security risks.

Neither approach aligns with the principles of least privilege or ephemeral authentication, leaving Kubernetes with a security gap.

The solution: Service Account token integration for Kubelet credential providers

This new enhancement enables kubelet credential providers to use workload identity when fetching image registry credentials. Instead of relying on long-lived secrets, credential providers can use service account tokens to request short-lived credentials tied to a specific pod’s identity.

This approach provides:

Workload-specific authentication: Image pull credentials are scoped to a particular workload.

Ephemeral credentials: Tokens are automatically rotated, eliminating the risks of long-lived secrets.

Seamless integration: Works with existing Kubernetes authentication mechanisms, aligning with cloud-native security best practices.

How it works

  1. Service Account tokens for credential providers

Kubelet generates short-lived, automatically rotated tokens for service accounts if the credential provider it communicates with has opted into receiving a service account token for image pulls. These tokens conform to OIDC ID token semantics and are provided to the credential provider as part of the CredentialProviderRequest. The credential provider can then use this token to authenticate with an external service.

  2. Image registry authentication flow

When a pod starts, the kubelet requests credentials from a credential provider.

If the credential provider has opted in, the kubelet generates a service account token for the pod.

The service account token is included in the CredentialProviderRequest, allowing the credential provider to authenticate and exchange it for temporary image pull credentials from a registry (e.g. AWS ECR, GCP Artifact Registry, Azure ACR).

The kubelet then uses these credentials to pull images on behalf of the pod.

Benefits of this approach

Security: Eliminates long-lived image pull secrets, reducing attack surfaces.

Granular Access Control: Credentials are tied to individual workloads rather than entire nodes or clusters.

Operational Simplicity: No need for administrators to manage and rotate image pull secrets manually.

Improved Compliance: Helps organizations meet security policies that prohibit persistent credentials in the cluster.

What's next?

For Kubernetes v1.34, we expect to ship this feature in beta while continuing to gather feedback from users.

In the coming releases, we will focus on:

Implementing caching mechanisms to improve performance for token generation.

Giving more flexibility to credential providers to decide how the registry credentials returned to the kubelet are cached.

Making the feature work with Ensure Secret Pulled Images to ensure pods that use an image are authorized to access that image when service account tokens are used for authentication.

You can learn more about this feature on the service account token for image pulls page in the Kubernetes documentation.

You can also follow along on the KEP-4412 to track progress across the coming Kubernetes releases.

Try it out

To try out this feature:

Ensure you are running Kubernetes v1.33 or later.

Enable the ServiceAccountTokenForKubeletCredentialProviders feature gate on the kubelet.

Ensure credential provider support: Modify or update your credential provider to use service account tokens for authentication.

Update the credential provider configuration to opt into receiving service account tokens by configuring the tokenAttributes field (an illustrative sketch follows this list).

Deploy a pod that uses the credential provider to pull images from a private registry.
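
As a rough illustration of that opt-in, a CredentialProviderConfig could look like the sketch below. The provider name and image pattern are placeholders, and the tokenAttributes field names are assumptions based on the alpha API described in KEP-4412; check the documentation for the exact schema:

apiVersion: kubelet.config.k8s.io/v1
kind: CredentialProviderConfig
providers:
  - name: example-registry-credential-provider    # placeholder plugin binary name
    apiVersion: credentialprovider.kubelet.k8s.io/v1
    matchImages:
      - "*.registry.example.com"
    defaultCacheDuration: "10m"
    tokenAttributes:                               # assumed alpha fields (KEP-4412)
      serviceAccountTokenAudience: "registry.example.com"
      requireServiceAccount: true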

We would love to hear your feedback on this feature. Please reach out to us on the #sig-auth-authenticators-dev channel on Kubernetes Slack (for an invitation, visit https://slack.k8s.io/).

How to get involved

If you are interested in getting involved in the development of this feature, sharing feedback, or participating in any other ongoing SIG Auth projects, please reach out on the #sig-auth channel on Kubernetes Slack.

You are also welcome to join the bi-weekly SIG Auth meetings, held every other Wednesday.

via Kubernetes Blog https://kubernetes.io/

May 07, 2025 at 02:30PM

·kubernetes.io·
Kubernetes v1.33: From Secrets to Service Accounts: Kubernetes Image Pulls Evolved
Major Linux & Open Source Sponsor Needs Your Help
Major Linux & Open Source Sponsor Needs Your Help
The Oregon State University Open Source Lab is an incredibly valuable resource in the FOSS world but recently they've announced they have some serious money ...
·youtube.com·
Major Linux & Open Source Sponsor Needs Your Help
DevOps Toolkit - Ep21 - Ask Me Anything About Anything with Scott Rosenberg - https://www.youtube.com/watch?v=TWKmRwBaEEU
DevOps Toolkit - Ep21 - Ask Me Anything About Anything with Scott Rosenberg - https://www.youtube.com/watch?v=TWKmRwBaEEU

Ep21 - Ask Me Anything About Anything with Scott Rosenberg

There are no restrictions in this AMA session. You can ask anything about DevOps, Cloud, Kubernetes, Platform Engineering, containers, or anything else. We'll have special guests Scott Rosenberg and Ramiro Berrelleza to help us out.

▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬ Sponsor: Codefresh 🔗 Codefresh GitOps Cloud: https://codefresh.io ▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬

▬▬▬▬▬▬ 👋 Contact me 👋 ▬▬▬▬▬▬ ➡ BlueSky: https://vfarcic.bsky.social ➡ LinkedIn: https://www.linkedin.com/in/viktorfarcic/

▬▬▬▬▬▬ 🚀 Other Channels 🚀 ▬▬▬▬▬▬ 🎤 Podcast: https://www.devopsparadox.com/ 💬 Live streams: https://www.youtube.com/c/DevOpsParadox

via YouTube https://www.youtube.com/watch?v=TWKmRwBaEEU

·youtube.com·
DevOps Toolkit - Ep21 - Ask Me Anything About Anything with Scott Rosenberg - https://www.youtube.com/watch?v=TWKmRwBaEEU
Kubernetes v1.33: Fine-grained SupplementalGroups Control Graduates to Beta
Kubernetes v1.33: Fine-grained SupplementalGroups Control Graduates to Beta

Kubernetes v1.33: Fine-grained SupplementalGroups Control Graduates to Beta

https://kubernetes.io/blog/2025/05/06/kubernetes-v1-33-fine-grained-supplementalgroups-control-beta/

The new field, supplementalGroupsPolicy, was introduced as an opt-in alpha feature for Kubernetes v1.31 and has graduated to beta in v1.33; the corresponding feature gate (SupplementalGroupsPolicy) is now enabled by default. This feature enables more precise control over supplemental groups in containers, which can strengthen the security posture, particularly when accessing volumes. Moreover, it also enhances the transparency of UID/GID details in containers, offering improved security oversight.

Please be aware that this beta release contains a behavioral breaking change. See the Behavioral Changes Introduced In Beta and Upgrade Considerations sections for details.

Motivation: Implicit group memberships defined in /etc/group in the container image

Although the majority of Kubernetes cluster admins/users may not be aware, Kubernetes, by default, merges group information from the Pod with information defined in /etc/group in the container image.

Let's look at an example. The Pod manifest below specifies runAsUser=1000, runAsGroup=3000 and supplementalGroups=4000 in the Pod's security context.

apiVersion: v1
kind: Pod
metadata:
  name: implicit-groups
spec:
  securityContext:
    runAsUser: 1000
    runAsGroup: 3000
    supplementalGroups: [4000]
  containers:
    - name: ctr
      image: registry.k8s.io/e2e-test-images/agnhost:2.45
      command: [ "sh", "-c", "sleep 1h" ]
      securityContext:
        allowPrivilegeEscalation: false

What is the result of the id command in the ctr container? The output should be similar to this:

uid=1000 gid=3000 groups=3000,4000,50000

Where does group ID 50000 in supplementary groups (groups field) come from, even though 50000 is not defined in the Pod's manifest at all? The answer is the /etc/group file in the container image.

Checking the contents of /etc/group in the container image shows the following:

user-defined-in-image:x:1000:
group-defined-in-image:x:50000:user-defined-in-image

This shows that the container's primary user 1000 belongs to the group 50000 in the last entry.

Thus, the group membership defined in /etc/group in the container image for the container's primary user is implicitly merged with the information from the Pod. Please note that this was a design decision the current CRI implementations inherited from Docker, and the community never really reconsidered it until now.

What's wrong with it?

The implicitly merged group information from /etc/group in the container image poses a security risk. These implicit GIDs can't be detected or validated by policy engines because there's no record of them in the Pod manifest. This can lead to unexpected access control issues, particularly when accessing volumes (see kubernetes/kubernetes#112879 for details) because file permission is controlled by UID/GIDs in Linux.

Fine-grained supplemental groups control in a Pod: supplementalGroupsPolicy

To tackle the above problem, a Pod's .spec.securityContext now includes the supplementalGroupsPolicy field.

This field lets you control how Kubernetes calculates the supplementary groups for container processes within a Pod. The available policies are:

Merge: The group membership defined in /etc/group for the container's primary user will be merged. If not specified, this policy will be applied (i.e. as-is behavior for backward compatibility).

Strict: Only the group IDs specified in fsGroup, supplementalGroups, or runAsGroup are attached as supplementary groups to the container processes. Group memberships defined in /etc/group for the container's primary user are ignored.

Let's see how the Strict policy works. The Pod manifest below specifies supplementalGroupsPolicy: Strict:

apiVersion: v1
kind: Pod
metadata:
  name: strict-supplementalgroups-policy
spec:
  securityContext:
    runAsUser: 1000
    runAsGroup: 3000
    supplementalGroups: [4000]
    supplementalGroupsPolicy: Strict
  containers:
    - name: ctr
      image: registry.k8s.io/e2e-test-images/agnhost:2.45
      command: [ "sh", "-c", "sleep 1h" ]
      securityContext:
        allowPrivilegeEscalation: false

The result of the id command in the ctr container should be similar to this:

uid=1000 gid=3000 groups=3000,4000

You can see that the Strict policy excludes group 50000 from groups!

Thus, ensuring supplementalGroupsPolicy: Strict (enforced by some policy mechanism) helps prevent the implicit supplementary groups in a Pod.

Note: A container with sufficient privileges can change its process identity. The supplementalGroupsPolicy only affects the initial process identity. See the following section for details.

Attached process identity in Pod status

This feature also exposes the process identity attached to the first container process of the container via the .status.containerStatuses[].user.linux field. This is helpful for checking whether implicit group IDs are attached.

...
status:
  containerStatuses:
    - name: ctr
      user:
        linux:
          gid: 3000
          supplementalGroups:
            - 3000
            - 4000
          uid: 1000
...

Note: The value in the status.containerStatuses[].user.linux field is the process identity initially attached to the first container process in the container. If the container has sufficient privilege to call system calls related to process identity (e.g. setuid(2), setgid(2) or setgroups(2)), the container process can change its identity. Thus, the actual process identity will be dynamic.

Strict Policy requires newer CRI versions

The CRI runtime (e.g. containerd, CRI-O) plays a core role in calculating the supplementary group IDs to be attached to the containers. Thus, SupplementalGroupsPolicy=Strict requires a CRI runtime that supports this feature (SupplementalGroupsPolicy: Merge works with CRI runtimes that do not support it, because that policy is fully backward compatible).

Here are some CRI runtimes that support this feature, and the versions you need to be running:

containerd: v2.0 or later

CRI-O: v1.31 or later

You can check whether the feature is supported in the Node's .status.features.supplementalGroupsPolicy field.

apiVersion: v1
kind: Node
...
status:
  features:
    supplementalGroupsPolicy: true

The behavioral changes introduced in beta

In the alpha release, when a Pod with supplementalGroupsPolicy: Strict was scheduled to a node that did not support the feature (i.e., .status.features.supplementalGroupsPolicy=false), the Pod's supplemental groups policy silently fell back to Merge.

In v1.33, this has entered beta and the policy is now enforced more strictly: the kubelet rejects pods scheduled to nodes that cannot ensure the specified policy. If your pod is rejected, you will see warning events with reason=SupplementalGroupsPolicyNotSupported like the one below:

apiVersion: v1
kind: Event
...
type: Warning
reason: SupplementalGroupsPolicyNotSupported
message: "SupplementalGroupsPolicy=Strict is not supported in this node"
involvedObject:
  apiVersion: v1
  kind: Pod
...

Upgrade consideration

If you're already using this feature, especially the supplementalGroupsPolicy: Strict policy, we assume that your cluster's CRI runtimes already support this feature. In that case, you don't need to worry about the pod rejections described above.

However, if your cluster:

uses the supplementalGroupsPolicy: Strict policy, but

its CRI runtimes do NOT yet support the feature (i.e., .status.features.supplementalGroupsPolicy=false),

you need to prepare for the behavioral change (pod rejection) when upgrading your cluster.

We recommend several ways to avoid unexpected pod rejections:

Upgrading your cluster's CRI runtimes together with Kubernetes, or before the Kubernetes upgrade

Labeling your nodes to indicate whether their CRI runtime supports this feature, and adding a node selector to pods with the Strict policy so they are scheduled only to such nodes, as sketched below (but you will then need to monitor the number of Pending pods instead of pod rejections).
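
A sketch of that second approach; the label key and value are purely illustrative:

# Label nodes whose CRI runtime supports the feature, for example:
#   kubectl label node <node-name> example.com/supplementalgroupspolicy=supported
apiVersion: v1
kind: Pod
metadata:
  name: strict-policy-pod
spec:
  nodeSelector:
    example.com/supplementalgroupspolicy: "supported"   # illustrative label
  securityContext:
    runAsUser: 1000
    runAsGroup: 3000
    supplementalGroupsPolicy: Strict
  containers:
    - name: ctr
      image: registry.k8s.io/e2e-test-images/agnhost:2.45
      command: [ "sh", "-c", "sleep 1h" ]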

Getting involved

This feature is driven by the SIG Node community. Please join us to connect with the community and share your ideas and feedback around the above feature and beyond. We look forward to hearing from you!

How can I learn more?

Configure a Security Context for a Pod or Container for the further details of supplementalGroupsPolicy

KEP-3619: Fine-grained SupplementalGroups control

via Kubernetes Blog https://kubernetes.io/

May 06, 2025 at 02:30PM

·kubernetes.io·
Kubernetes v1.33: Fine-grained SupplementalGroups Control Graduates to Beta