Suggested Reads

Container Runtime Interface streaming explained

https://kubernetes.io/blog/2024/05/01/cri-streaming-explained/

The Kubernetes Container Runtime Interface (CRI) acts as the main connection between the kubelet and the container runtime. Runtimes have to provide a gRPC server that implements a Protocol Buffer interface defined by Kubernetes. This API definition evolves over time, for example when contributors add new features or deprecate fields.

In this blog post, I'd like to dive into the functionality and history of three extraordinary Remote Procedure Calls (RPCs), which are truly outstanding in terms of how they work: Exec, Attach and PortForward.

Exec can be used to run dedicated commands within the container and stream the output to a client like kubectl or crictl. It also allows interaction with that process using standard input (stdin), for example if users want to run a new shell instance within an existing workload.

Attach streams the output of the currently running process via standard I/O from the container to the client and also allows interaction with it. This is particularly useful if users want to see what is going on in the container and be able to interact with the process.

PortForward can be used to forward a port from the host to the container so that users can interact with it using third-party network tools. This makes it possible to bypass Kubernetes services for a certain workload and interact with its network interface directly.

What is so special about them?

All RPCs of the CRI use either gRPC unary calls or the server-side streaming feature (only GetContainerEvents right now). This means that nearly all RPCs receive a single client request and return a single server response. The same applies to Exec, Attach, and PortForward, whose protocol definitions look like this:

    // Exec prepares a streaming endpoint to execute a command in the container.
    rpc Exec(ExecRequest) returns (ExecResponse) {}

    // Attach prepares a streaming endpoint to attach to a running container.
    rpc Attach(AttachRequest) returns (AttachResponse) {}

    // PortForward prepares a streaming endpoint to forward ports from a PodSandbox.
    rpc PortForward(PortForwardRequest) returns (PortForwardResponse) {}

The requests carry everything required to allow the server to do the work, for example, the ContainerId or command (Cmd) to be run in case of Exec. More interestingly, all of their responses only contain a url:

    message ExecResponse {
        // Fully qualified URL of the exec streaming server.
        string url = 1;
    }

    message AttachResponse {
        // Fully qualified URL of the attach streaming server.
        string url = 1;
    }

    message PortForwardResponse {
        // Fully qualified URL of the port-forward streaming server.
        string url = 1;
    }

Why is it implemented like that? Well, the original design document for those RPCs even predates Kubernetes Enhancement Proposals (KEPs) and was originally outlined back in 2016. The kubelet had a native implementation for Exec, Attach, and PortForward before the initiative to bring the functionality to the CRI started. Before that, everything was bound to Docker or the later abandoned container runtime rkt.

The CRI-related design document also elaborates on the option to use native RPC streaming for exec, attach, and port forward. The downsides outweighed the benefits of this approach: the kubelet would still create a network bottleneck, and future runtimes would not be free to choose the server implementation details. Another option, in which the kubelet implements a portable, runtime-agnostic solution, was also dropped in favor of the final design, because it would mean another project to maintain that would nevertheless be runtime dependent.

This means that the basic flow for Exec, Attach and PortForward was proposed to look like this:

    sequenceDiagram
        participant crictl
        participant kubectl
        participant API as API Server
        participant kubelet
        participant runtime as Container Runtime
        participant streaming as Streaming Server
        alt Client alternatives
            Note over kubelet,runtime: Container Runtime Interface (CRI)
            kubectl->>API: exec, attach, port-forward
            API->>kubelet: 
            kubelet->>runtime: Exec, Attach, PortForward
        else
            Note over crictl,runtime: Container Runtime Interface (CRI)
            crictl->>runtime: Exec, Attach, PortForward
        end
        runtime->>streaming: New Session
        streaming->>runtime: HTTP endpoint (URL)
        alt Client alternatives
            runtime->>kubelet: Response URL
            kubelet->>API: 
            API-->>streaming: Connection upgrade (SPDY or WebSocket)
            streaming-)API: Stream data
            API-)kubectl: Stream data
        else
            runtime->>crictl: Response URL
            crictl-->>streaming: Connection upgrade (SPDY or WebSocket)
            streaming-)crictl: Stream data
        end

Clients like crictl or the kubelet (via kubectl) request a new exec, attach or port forward session from the runtime using the gRPC interface. The runtime implements a streaming server that also manages the active sessions. This streaming server provides an HTTP endpoint for the client to connect to. The client upgrades the connection to use the SPDY streaming protocol or (in the future) to a WebSocket connection and starts to stream the data back and forth.

This implementation allows runtimes to have the flexibility to implement Exec, Attach and PortForward the way they want, and also allows a simple test path. Runtimes can change the underlying implementation to support any kind of feature without having a need to modify the CRI at all.

Many smaller enhancements to this overall approach have been merged into Kubernetes in the past years, but the general pattern has always stayed the same. The kubelet source code was transformed into a reusable library, which container runtimes can nowadays use to implement the basic streaming capability.

How does the streaming actually work?

At first glance, it looks like all three RPCs work the same way, but that's not the case. It's possible to group the functionality of Exec and Attach, while PortForward follows a distinct internal protocol definition.

Exec and Attach

Kubernetes defines Exec and Attach as remote commands, whose protocol definition exists in five different versions:

| # | Version | Note |
| --- | --- | --- |
| 1 | channel.k8s.io | Initial (unversioned) SPDY sub protocol (#13394, #13395) |
| 2 | v2.channel.k8s.io | Resolves the issues present in the first version (#15961) |
| 3 | v3.channel.k8s.io | Adds support for resizing container terminals (#25273) |
| 4 | v4.channel.k8s.io | Adds support for exit codes using JSON errors (#26541) |
| 5 | v5.channel.k8s.io | Adds support for a CLOSE signal (#119157) |

On top of that, there is an overall effort to replace the SPDY transport protocol with WebSockets as part of KEP #4006. Runtimes have to satisfy those protocols over their life cycle to stay up to date with the Kubernetes implementation.
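To make the versioning concrete, here is a minimal sketch of how a server could pick a sub-protocol from the versions listed above. The negotiate helper and the serverProtocols variable are hypothetical names used only for illustration; they are not part of the kubelet library, and the real negotiation happens inside the SPDY or WebSocket upgrade handling.

```go
package main

import "fmt"

// serverProtocols lists the remote command sub-protocols named in the
// article, ordered from newest to oldest (the server's preference order).
var serverProtocols = []string{
	"v5.channel.k8s.io",
	"v4.channel.k8s.io",
	"v3.channel.k8s.io",
	"v2.channel.k8s.io",
	"channel.k8s.io",
}

// negotiate returns the first server-preferred protocol that the client
// also offers. The boolean is false when there is no common protocol.
// Hypothetical helper: an illustration, not the Kubernetes implementation.
func negotiate(server, client []string) (string, bool) {
	offered := make(map[string]struct{}, len(client))
	for _, p := range client {
		offered[p] = struct{}{}
	}
	for _, p := range server {
		if _, ok := offered[p]; ok {
			return p, true
		}
	}
	return "", false
}

func main() {
	// A client that only knows the unversioned and the v4 protocol.
	proto, ok := negotiate(serverProtocols, []string{"channel.k8s.io", "v4.channel.k8s.io"})
	fmt.Println(proto, ok)
}
```

Preferring the newest common version is an assumption of this sketch; it simply reflects that later versions add features such as terminal resizing and exit codes.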

Let's assume that a client uses the latest (v5) version of the protocol as well as communicating over WebSockets. In that case, the general flow would be:

The client requests a URL endpoint for Exec or Attach using the CRI.

The server (runtime) validates the request, inserts it into a connection tracking cache, and provides the HTTP endpoint URL for that request.

The client connects to that URL, upgrades the connection to establish a WebSocket, and starts to stream data.

In the case of Attach, the server has to stream the main container process data to the client.

In the case of Exec, the server has to create the subprocess command within the container and then stream the output to the client.

If stdin is required, then the server needs to listen for that as well and redirect it to the corresponding process.
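The first two steps of this flow (validate the request, track it, hand back a URL) can be sketched as follows. Everything here is an assumption made for illustration: the execRequest and sessionCache types, the /exec/<token> URL layout, the example base URL, and the single-use token behavior are hypothetical and do not describe how the kubelet library or any specific runtime does it.

```go
package main

import (
	"crypto/rand"
	"encoding/hex"
	"errors"
	"fmt"
	"sync"
)

// execRequest is a simplified, hypothetical stand-in for the CRI ExecRequest.
type execRequest struct {
	ContainerID string
	Cmd         []string
}

// sessionCache tracks pending requests by a random token (assumed design).
type sessionCache struct {
	mu       sync.Mutex
	baseURL  string
	sessions map[string]execRequest
}

func newSessionCache(baseURL string) *sessionCache {
	return &sessionCache{baseURL: baseURL, sessions: make(map[string]execRequest)}
}

// Insert validates the request, stores it under a fresh random token, and
// returns the URL that would be sent back in the ExecResponse.
func (c *sessionCache) Insert(req execRequest) (string, error) {
	if req.ContainerID == "" || len(req.Cmd) == 0 {
		return "", errors.New("container ID and command are required")
	}
	raw := make([]byte, 16)
	if _, err := rand.Read(raw); err != nil {
		return "", err
	}
	token := hex.EncodeToString(raw)
	c.mu.Lock()
	c.sessions[token] = req
	c.mu.Unlock()
	return c.baseURL + "/exec/" + token, nil
}

// Consume looks up the request for a token and removes it, so that each
// URL can be used for one connection only (an assumption of this sketch).
func (c *sessionCache) Consume(token string) (execRequest, bool) {
	c.mu.Lock()
	defer c.mu.Unlock()
	req, ok := c.sessions[token]
	delete(c.sessions, token)
	return req, ok
}

func main() {
	cache := newSessionCache("http://127.0.0.1:10010")
	url, err := cache.Insert(execRequest{ContainerID: "abc", Cmd: []string{"sh"}})
	fmt.Println(url, err)
}
```

When the client later connects to the returned URL, the streaming server would call Consume with the token from the path and then start the actual Exec or Attach work.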

Interpreting data for the defined protocol is fairly simple: The first byte of every input and output packet defines the actual stream:

| First Byte | Type | Description |
| --- | --- | --- |
| 0 | standard input | Data streamed from stdin |
| 1 | standard output | Data streamed to stdout |
| 2 | standard error | Data streamed to stderr |
| 3 | stream error | A streaming error occurred |
| 4 | stream resize | A terminal resize event |
| 255 | stream close | Stream should be closed (for WebSockets) |
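As an illustration of the first-byte convention in the table above, the following sketch splits a packet into its stream identifier and payload. The constant and function names (streamStdin, splitFrame, streamName) are hypothetical and chosen for this example; the kubelet library already ships its own protocol handling, so runtimes normally do not need to write this themselves.

```go
package main

import (
	"errors"
	"fmt"
)

// Stream identifiers as listed in the table (names are illustrative).
const (
	streamStdin  byte = 0
	streamStdout byte = 1
	streamStderr byte = 2
	streamError  byte = 3
	streamResize byte = 4
	streamClose  byte = 255
)

// streamName maps a stream identifier to the type name used in the table.
func streamName(id byte) (string, bool) {
	switch id {
	case streamStdin:
		return "standard input", true
	case streamStdout:
		return "standard output", true
	case streamStderr:
		return "standard error", true
	case streamError:
		return "stream error", true
	case streamResize:
		return "stream resize", true
	case streamClose:
		return "stream close", true
	}
	return "", false
}

// splitFrame separates the leading stream byte from the payload.
func splitFrame(frame []byte) (byte, []byte, error) {
	if len(frame) == 0 {
		return 0, nil, errors.New("empty frame")
	}
	return frame[0], frame[1:], nil
}

func main() {
	frame := append([]byte{streamStdout}, "hello\n"...)
	id, payload, err := splitFrame(frame)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	name, _ := streamName(id)
	fmt.Printf("%s: %q\n", name, payload)
}
```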

How should runtimes now implement the streaming server methods for Exec and Attach by using the provided kubelet library? The key is that the streaming server implementation in the kubelet outlines an interface called Runtime which has to be fulfilled by the actual container runtime if it wants to use that library:

    // Runtime is the interface to execute the commands and provide the streams.
    type Runtime interface {
        Exec(ctx context.Context, containerID string, cmd []string, in io.Reader, out, err io.WriteCloser, tty bool, resize <-chan remotecommand.TerminalSize) error
        Attach(ctx context.Context, containerID string, in io.Reader, out, err io.WriteCloser, tty bool, resize <-chan remotecommand.TerminalSize) error
        PortForward(ctx context.Context, podSandboxID string, port int32, stream io.ReadWriteCloser) error
    }

Everything related to the protocol interpretation is already in place and runtimes only have to implement the actual Exec and Attach logic. For example, the container runtime CRI-O implements it along the lines of this pseudocode:

    func (s StreamService) Exec(
        ctx context.Context,
        containerID string,
        cmd []string,
        stdin io.Reader,
        stdout, stderr io.WriteCloser,
        tty bool,
        resizeChan <-chan remotecommand.TerminalSize,
    ) error {
        // Retrieve the container by the provided containerID
        // …

        // Update the container status and verify that the workload is running
        // …

        // Execute the command and stream the data
        return s.runtimeServer.Runtime().ExecContainer(
            s.ctx, c, cmd, stdin, stdout, stderr, tty, resizeChan,
        )
    }

PortForward

Forwarding ports to a container works a bit differently when comparing it to streaming IO data from a workload. The server still has to provide a URL endpoint for the client to connect to, but then the container runtime has to enter the network namespace of the container, allocate the port as well as stream the data back and forth. There is n

·kubernetes.io·
Marp: Markdown Presentation Ecosystem

Marp: Markdown Presentation Ecosystem. Find Marp tools on GitHub! Create beautiful slide decks using an intuitive Markdown experience. Marp (also known as the…

April 30, 2024 at 02:43PM

via Instapaper

·marp.app·
I’m a sucker for cheat sheets | LLM Cheatsheet: Top 15 LLM Terms You Need to Know in 2024 — The Cloud Girl
Large Language Models (LLMs) are revolutionizing the way we interact with technology. But with all this innovation comes a new vocabulary! Fear not, fellow AI enthusiasts, for this blog is your decoder ring to the fascinating world of LLM lingo. Let's dive into some essential terms:
·thecloudgirl.dev·
The Verge hires Robison to cover artificial intelligence - Talking Biz News

Kylie Robison is joining as senior AI reporter, where she’ll lead the technology publication’s coverage of artificial intelligence. She will start…

April 30, 2024 at 01:28PM

via Instapaper

·talkingbiznews.com·
How an empty S3 bucket can make your AWS bill explode

A few weeks ago, I began working on the PoC of a document indexing system for my client. I created a single S3 bucket in the eu-west-1 region and uploaded some…

April 30, 2024 at 10:31AM

via Instapaper

·medium.com·
The hyper-clouds are open source's friends

Opinion: One of the knee-jerk arguments made by companies abandoning their open source roots is that they can't make money because the bad hyper-cloud companies…

April 30, 2024 at 10:16AM

via Instapaper

·theregister.com·
Atmosphere Verified Operating System
Atmosphere is a full-featured microkernel developed in Rust and verified with Verus. Conceptually Atmosphere is similar to the line of L4 microkernels. Atmosphere pushes most kernel functionality to user-space, e.g., device drivers, network stack, file systems, etc. The microkernel supports a minimal set of mechanisms to implement address spaces, page-tables, coarse-grained memory management, and threads of execution that together with address spaces implement an abstraction of a process. Each process has a page table and a collection of schedulable threads.
·mars-research.github.io·
Kubernetes 1.30: Preventing unauthorized volume mode conversion moves to GA

https://kubernetes.io/blog/2024/04/30/prevent-unauthorized-volume-mode-conversion-ga/

With the release of Kubernetes 1.30, the feature to prevent the modification of the volume mode of a PersistentVolumeClaim that was created from an existing VolumeSnapshot in a Kubernetes cluster, has moved to GA!

The problem

The Volume Mode of a PersistentVolumeClaim refers to whether the underlying volume on the storage device is formatted into a filesystem or presented as a raw block device to the Pod that uses it.

Users can leverage the VolumeSnapshot feature, which has been stable since Kubernetes v1.20, to create a PersistentVolumeClaim (shortened as PVC) from an existing VolumeSnapshot in the Kubernetes cluster. The PVC spec includes a dataSource field, which can point to an existing VolumeSnapshot instance. Visit Create a PersistentVolumeClaim from a Volume Snapshot for more details on how to create a PVC from an existing VolumeSnapshot in a Kubernetes cluster.

When leveraging the above capability, there is no logic that validates whether the mode of the original volume, whose snapshot was taken, matches the mode of the newly created volume.

This presents a security gap that allows malicious users to potentially exploit an as-yet-unknown vulnerability in the host operating system.

There is a valid use case to allow some users to perform such conversions. Typically, storage backup vendors convert the volume mode during the course of a backup operation, to retrieve changed blocks for greater efficiency of operations. This prevents Kubernetes from blocking the operation completely and presents a challenge in distinguishing trusted users from malicious ones.

Preventing unauthorized users from converting the volume mode

In this context, an authorized user is one who has access rights to perform update or patch operations on VolumeSnapshotContents, which is a cluster-level resource.

It is up to the cluster administrator to provide these rights only to trusted users or applications, like backup vendors. Users apart from such authorized ones will never be allowed to modify the volume mode of a PVC when it is being created from a VolumeSnapshot.

To convert the volume mode, an authorized user must do the following:

1. Identify the VolumeSnapshot that is to be used as the data source for a newly created PVC in the given namespace.

2. Identify the VolumeSnapshotContent bound to the above VolumeSnapshot.

       kubectl describe volumesnapshot -n <namespace> <name>

3. Add the annotation snapshot.storage.kubernetes.io/allow-volume-mode-change: "true" to the above VolumeSnapshotContent. The VolumeSnapshotContent annotations must include one similar to the following manifest fragment:

       kind: VolumeSnapshotContent
       metadata:
         annotations:
           snapshot.storage.kubernetes.io/allow-volume-mode-change: "true"
       ...

   Note: For pre-provisioned VolumeSnapshotContents, you must take an extra step of setting the spec.sourceVolumeMode field to either Filesystem or Block, depending on the mode of the volume from which this snapshot was taken.

   An example is shown below:

       apiVersion: snapshot.storage.k8s.io/v1
       kind: VolumeSnapshotContent
       metadata:
         annotations:
           snapshot.storage.kubernetes.io/allow-volume-mode-change: "true"
         name: <volume-snapshot-content-name>
       spec:
         deletionPolicy: Delete
         driver: hostpath.csi.k8s.io
         source:
           snapshotHandle: <snapshot-handle>
         sourceVolumeMode: Filesystem
         volumeSnapshotRef:
           name: <volume-snapshot-name>
           namespace: <namespace>

4. Repeat steps 1 to 3 for all VolumeSnapshotContents whose volume mode needs to be converted during a backup or restore operation. This can be done either via software with credentials of an authorized user or manually by the authorized user(s).

If the annotation shown above is present on a VolumeSnapshotContent object, Kubernetes will not prevent the volume mode from being converted. Users should keep this in mind before they attempt to add the annotation to any VolumeSnapshotContent.

Action required

The prevent-volume-mode-conversion feature flag is enabled by default in the external-provisioner v4.0.0 and external-snapshotter v7.0.0. Volume mode change will be rejected when creating a PVC from a VolumeSnapshot unless the steps described above have been performed.
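The rule described here can be summarized in a few lines of Go. This is only a sketch of the decision as the article states it, not the code of the external-provisioner: the conversionAllowed function is hypothetical, and edge cases such as an unset source volume mode are deliberately left out.

```go
package main

import "fmt"

// Annotation key taken from the article.
const allowVolumeModeChangeAnnotation = "snapshot.storage.kubernetes.io/allow-volume-mode-change"

// conversionAllowed reports whether a PVC with requestedMode may be created
// from a snapshot whose original volume had sourceMode. A mode change is
// only accepted when the VolumeSnapshotContent carries the annotation with
// the value "true". Hypothetical helper for illustration only.
func conversionAllowed(sourceMode, requestedMode string, annotations map[string]string) bool {
	if sourceMode == requestedMode {
		// No conversion is requested, so there is nothing to reject.
		return true
	}
	return annotations[allowVolumeModeChangeAnnotation] == "true"
}

func main() {
	annotated := map[string]string{allowVolumeModeChangeAnnotation: "true"}
	fmt.Println(conversionAllowed("Filesystem", "Block", nil))
	fmt.Println(conversionAllowed("Filesystem", "Block", annotated))
}
```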

What's next

To determine which CSI external sidecar versions support this feature, please head over to the CSI docs page. For any queries or issues, join Kubernetes on Slack and create a thread in the #csi or #sig-storage channel. Alternately, create an issue in the CSI external-snapshotter repository.

via Kubernetes Blog https://kubernetes.io/

April 29, 2024 at 08:00PM

·kubernetes.io·
Earth Formation Site
It's not far from the sign marking the exact latitude and longitude of the Earth's core.
·xkcd.com·
Cultivating a culture of lifelong learning
In a world where new technologies, market trends, and ways of working emerge at a rapid pace, organizations that prioritize lifelong learning are better positioned to navigate them successfully.
·chieflearningofficer.com·
How Platform Engineering Compares to Running a Restaurant

Dive into the fascinating world of platform engineering while we draw parallels between the complex operations of a bustling eatery and the intricate processes of platform engineering. Just as a successful restaurant relies on a harmonious blend of ingredients, staff, and ambiance to delight customers, platform engineering integrates various technologies, teams, and practices to deliver robust software solutions. Join us as we explore the similarities in skill sets in both fields.

#PlatformEngineering #InternalDeveloperPlatform #IDP

Timecodes:
00:00 Platform Engineering vs. Restaurant
01:54 DoubleCloud (sponsor)
02:54 Platform Engineering vs. Restaurant (cont.)

via YouTube https://www.youtube.com/watch?v=vHQtWrqrFho

·youtube.com·
How Burnout Became Normal — and How to Push Back Against It
Slowly but steadily, while we’ve been preoccupied with trying to meet demands that outstrip our resources, grappling with unfair treatment, or watching our working hours encroach upon our downtime, burnout has become the new baseline in many work environments. From the 40% of Gen Z workers who believe burnout is an inevitable part of success, to executives who believe high-pressure, “trial-by-fire” assignments are a required rite of passage, to toxic hustle culture that pushes busyness as a badge of honor, too many of us now expect to feel overwhelmed, over-stressed, and eventually burned out at work. When pressures are mounting and your work environment continues to be stressful, it’s all the more important to take proactive steps to return to your personal sweet spot of stress and remain there as long as you can. The author presents several strategies.
·hbr.org·
Project Bluefin Tour on Framework Laptop
Project Bluefin is a new Linux distribution designed for reliability, performance, and sustainability. Bluefin is built with the Cloud Native Desktop model. ...
·youtube.com·
Project Bluefin Tour on Framework Laptop | LinkedIn
Project Bluefin is a new Linux distribution designed for reliability, performance, and sustainability. Bluefin is built with the Cloud Native Desktop model. Jorge Castro, the creator of Universal Blue, joins Chris Short as Chris sets up a Framework Laptop to help contributing to the project.
·linkedin.com·
Kubernetes 1.30: Multi-Webhook and Modular Authorization Made Much Easier

https://kubernetes.io/blog/2024/04/26/multi-webhook-and-modular-authorization-made-much-easier/

With Kubernetes 1.30, we (SIG Auth) are moving Structured Authorization Configuration to beta.

Today's article is about authorization: deciding what someone can and cannot access. Check yesterday's article to find out what's new in Kubernetes v1.30 around authentication (finding out who's performing a task, and checking that they are who they say they are).

Introduction

Kubernetes continues to evolve to meet the intricate requirements of system administrators and developers alike. A critical aspect of Kubernetes that ensures the security and integrity of the cluster is the API server authorization. Until recently, the configuration of the authorization chain in kube-apiserver was somewhat rigid, limited to a set of command-line flags and allowing only a single webhook in the authorization chain. This approach, while functional, restricted the flexibility needed by cluster administrators to define complex, fine-grained authorization policies. The latest Structured Authorization Configuration feature (KEP-3221) aims to revolutionize this aspect by introducing a more structured and versatile way to configure the authorization chain, focusing on enabling multiple webhooks and providing explicit control mechanisms.

The Need for Improvement

Cluster administrators have long sought the ability to specify multiple authorization webhooks within the API Server handler chain and have control over detailed behavior like timeout and failure policy for each webhook. This need arises from the desire to create layered security policies, where requests can be validated against multiple criteria or sets of rules in a specific order. The previous limitations also made it difficult to dynamically configure the authorizer chain, leaving no room to manage complex authorization scenarios efficiently.

The Structured Authorization Configuration feature addresses these limitations by introducing a configuration file format to configure the Kubernetes API Server Authorization chain. This format allows specifying multiple webhooks in the authorization chain (all other authorization types are specified no more than once). Each webhook authorizer has well-defined parameters, including timeout settings, failure policies, and conditions for invocation with CEL rules to pre-filter requests before they are dispatched to webhooks, helping you prevent unnecessary invocations. The configuration also supports automatic reloading, ensuring changes can be applied dynamically without restarting the kube-apiserver. This feature addresses current limitations and opens up new possibilities for securing and managing Kubernetes clusters more effectively.

Sample Configurations

Here is a sample structured authorization configuration along with descriptions for all fields, their defaults, and possible values.

    apiVersion: apiserver.config.k8s.io/v1beta1
    kind: AuthorizationConfiguration
    authorizers:
      - type: Webhook
        # Name used to describe the authorizer
        # This is explicitly used in monitoring machinery for metrics
        # Note:
        #   - Validation for this field is similar to how K8s labels are validated today.
        # Required, with no default
        name: webhook
        webhook:
          # The duration to cache 'authorized' responses from the webhook
          # authorizer.
          # Same as setting --authorization-webhook-cache-authorized-ttl flag
          # Default: 5m0s
          authorizedTTL: 30s
          # The duration to cache 'unauthorized' responses from the webhook
          # authorizer.
          # Same as setting --authorization-webhook-cache-unauthorized-ttl flag
          # Default: 30s
          unauthorizedTTL: 30s
          # Timeout for the webhook request
          # Maximum allowed is 30s.
          # Required, with no default.
          timeout: 3s
          # The API version of the authorization.k8s.io SubjectAccessReview to
          # send to and expect from the webhook.
          # Same as setting --authorization-webhook-version flag
          # Required, with no default
          # Valid values: v1beta1, v1
          subjectAccessReviewVersion: v1
          # MatchConditionSubjectAccessReviewVersion specifies the SubjectAccessReview
          # version the CEL expressions are evaluated against
          # Valid values: v1
          # Required, no default value
          matchConditionSubjectAccessReviewVersion: v1
          # Controls the authorization decision when a webhook request fails to
          # complete or returns a malformed response or errors evaluating
          # matchConditions.
          # Valid values:
          #   - NoOpinion: continue to subsequent authorizers to see if one of
          #     them allows the request
          #   - Deny: reject the request without consulting subsequent authorizers
          # Required, with no default.
          failurePolicy: Deny
          connectionInfo:
            # Controls how the webhook should communicate with the server.
            # Valid values:
            # - KubeConfig: use the file specified in kubeConfigFile to locate the
            #   server.
            # - InClusterConfig: use the in-cluster configuration to call the
            #   SubjectAccessReview API hosted by kube-apiserver. This mode is not
            #   allowed for kube-apiserver.
            type: KubeConfig
            # Path to KubeConfigFile for connection info
            # Required, if connectionInfo.Type is KubeConfig
            kubeConfigFile: /kube-system-authz-webhook.yaml
          # matchConditions is a list of conditions that must be met for a request to be sent to this
          # webhook. An empty list of matchConditions matches all requests.
          # There are a maximum of 64 match conditions allowed.
          #
          # The exact matching logic is (in order):
          #   1. If at least one matchCondition evaluates to FALSE, then the webhook is skipped.
          #   2. If ALL matchConditions evaluate to TRUE, then the webhook is called.
          #   3. If at least one matchCondition evaluates to an error (but none are FALSE):
          #      - If failurePolicy=Deny, then the webhook rejects the request
          #      - If failurePolicy=NoOpinion, then the error is ignored and the webhook is skipped
          matchConditions:
            # expression represents the expression which will be evaluated by CEL. Must evaluate to bool.
            # CEL expressions have access to the contents of the SubjectAccessReview in v1 version.
            # If version specified by subjectAccessReviewVersion in the request variable is v1beta1,
            # the contents would be converted to the v1 version before evaluating the CEL expression.
            #
            # Documentation on CEL: https://kubernetes.io/docs/reference/using-api/cel/
            #
            # only send resource requests to the webhook
            - expression: has(request.resourceAttributes)
            # only intercept requests to kube-system
            - expression: request.resourceAttributes.namespace == 'kube-system'
            # don't intercept requests from kube-system service accounts
            # (quoted because a plain YAML scalar must not start with '!')
            - expression: "!('system:serviceaccounts:kube-system' in request.user.groups)"
      - type: Node
        name: node
      - type: RBAC
        name: rbac
      - type: Webhook
        name: in-cluster-authorizer
        webhook:
          authorizedTTL: 5m
          unauthorizedTTL: 30s
          timeout: 3s
          subjectAccessReviewVersion: v1
          failurePolicy: NoOpinion
          connectionInfo:
            type: InClusterConfig
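The matching logic documented in the matchConditions comment of the sample can be expressed as a small decision function. The sketch below follows the three rules in the stated order; the conditionResult and decision types and the evaluate function are hypothetical names for this example and are not taken from the kube-apiserver source.

```go
package main

import (
	"errors"
	"fmt"
)

// conditionResult is the outcome of evaluating one CEL matchCondition.
type conditionResult struct {
	Matched bool  // value of the expression when Err is nil
	Err     error // non-nil if the expression failed to evaluate
}

// decision describes what happens with the webhook for a request.
type decision string

const (
	callWebhook decision = "call webhook"
	skipWebhook decision = "skip webhook"
	denyRequest decision = "deny request"
)

// errEval is a placeholder evaluation error used in the example.
var errEval = errors.New("evaluation failed")

// evaluate applies the documented rules in order:
//  1. any FALSE condition skips the webhook,
//  2. all TRUE (or an empty list) calls the webhook,
//  3. an error without any FALSE follows the failure policy.
func evaluate(results []conditionResult, failurePolicy string) decision {
	sawError := false
	for _, r := range results {
		if r.Err != nil {
			sawError = true
			continue
		}
		if !r.Matched {
			return skipWebhook
		}
	}
	if !sawError {
		return callWebhook
	}
	if failurePolicy == "Deny" {
		return denyRequest
	}
	return skipWebhook // NoOpinion: the error is ignored
}

func main() {
	results := []conditionResult{{Matched: true}, {Err: errEval}}
	fmt.Println(evaluate(results, "Deny"))
	fmt.Println(evaluate(results, "NoOpinion"))
}
```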

The following configuration examples illustrate real-world scenarios that need the ability to specify multiple webhooks with distinct settings, precedence order, and failure modes.

Protecting Installed CRDs

Ensuring the availability of Custom Resource Definitions (CRDs) at cluster startup has been a key demand. One of the blockers of having a controller reconcile those CRDs is having a protection mechanism for them, which can be achieved through multiple authorization webhooks. Previously, the Kubernetes API Server authorization chain did not allow specifying more than one authorization webhook. Now, with the Structured Authorization Configuration feature, administrators can specify multiple webhooks, offering a solution where RBAC falls short, especially when denying permissions to 'non-system' users for certain CRDs.

Assuming the following for this scenario:

The "protected" CRDs are installed.

They can only be modified by users in the group admin.

    apiVersion: apiserver.config.k8s.io/v1beta1
    kind: AuthorizationConfiguration
    authorizers:
      - type: Webhook
        name: system-crd-protector
        webhook:
          unauthorizedTTL: 30s
          timeout: 3s
          subjectAccessReviewVersion: v1
          matchConditionSubjectAccessReviewVersion: v1
          failurePolicy: Deny
          connectionInfo:
            type: KubeConfig
            kubeConfigFile: /files/kube-system-authz-webhook.yaml
          matchConditions:
            # only send resource requests to the webhook
            - expression: has(request.resourceAttributes)
            # only intercept requests for CRDs
            # (CEL compares with '==', a single '=' is not a valid operator)
            - expression: request.resourceAttributes.resource.resource == "customresourcedefinitions"
            - expression: request.resourceAttributes.resource.group == ""
            # only intercept update, patch, delete, or deletecollection requests
            - expression: request.resourceAttributes.verb in ['update', 'patch', 'delete','deletecollection']
      - type: Node
      - type: RBAC

Preventing unnecessarily nested webhooks

A system administrator wants to apply specific validations to requests before handing them off to webhooks using frameworks like Open Policy Agent. In the past, this would require running nested webhooks within the one added to the authorization chain to achieve the desired result. The Structured Authorization Configuration feature simplifies this process, offering a structured API to selectively trigger additional webhooks when needed. It also enables administrators to set distinct failure policies for each webhook, ensuring more consistent and predictable responses.

    apiVersion: apiserver.config.k8s.io/v1beta1
    kind: AuthorizationConfiguration
    authorizers:
      - type: Webhook
        name: system-crd-protector
        webhook:
          unauthorizedTTL: 30s
          timeout: 3s
          subjectAccessReviewVersion: v1
          matchConditionSubjectAccessReviewVersion: v1
          failurePolicy: Deny
          connectionInfo:
            type: KubeConfig
            kubeConfigFile: /files/kube-system-authz-webhook.yaml
          matchConditions:
            # only send resource requests to the webhook
            - expression: has(request.resourceAttributes)
            # only intercept requests for CRDs
            - expression: request.resourceAttributes.re
·kubernetes.io·
Kubernetes 1.30: Multi-Webhook and Modular Authorization Made Much Easier
Kubernetes 1.30: Structured Authentication Configuration Moves to Beta

https://kubernetes.io/blog/2024/04/25/structured-authentication-moves-to-beta/

With Kubernetes 1.30, we (SIG Auth) are moving Structured Authentication Configuration to beta.

Today's article is about authentication: finding out who's performing a task, and checking that they are who they say they are. Check back tomorrow to find out what's new in Kubernetes v1.30 around authorization (deciding what someone can and can't access).

Motivation

Kubernetes has had a long-standing need for a more flexible and extensible authentication system. The current system, while powerful, has some limitations that make it difficult to use in certain scenarios. For example, it is not possible to use multiple authenticators of the same type (e.g., multiple JWT authenticators) or to change the configuration without restarting the API server. The Structured Authentication Configuration feature is the first step towards addressing these limitations and providing a more flexible and extensible way to configure authentication in Kubernetes.

What is structured authentication configuration?

Kubernetes v1.30 builds on the experimental support for configuring authentication from a file, which was added as alpha in Kubernetes v1.29. At this beta stage, Kubernetes only supports configuring JWT authenticators, which serve as the next iteration of the existing OIDC authenticator. A JWT authenticator authenticates Kubernetes users using JWT-compliant tokens. The authenticator attempts to parse a raw ID token and verify that it has been signed by the configured issuer.

The Kubernetes project added configuration from a file so that it can provide more flexibility than using command line options (which continue to work, and are still supported). Supporting a configuration file also makes it easy to deliver further improvements in upcoming releases.

Benefits of structured authentication configuration

Here's why using a configuration file to configure cluster authentication is a benefit:

Multiple JWT authenticators: You can configure multiple JWT authenticators simultaneously. This allows you to use multiple identity providers (e.g., Okta, Keycloak, GitLab) without needing to use an intermediary like Dex that handles multiplexing between multiple identity providers.

Dynamic configuration: You can change the configuration without restarting the API server. This allows you to add, remove, or modify authenticators without disrupting the API server.

Any JWT-compliant token: You can use any JWT-compliant token for authentication. This allows you to use tokens from any identity provider that supports JWT. The minimum valid JWT payload must contain the claims documented on the structured authentication configuration page in the Kubernetes documentation.

CEL (Common Expression Language) support: You can use CEL to determine whether the token's claims match the user's attributes in Kubernetes (e.g., username, group). This allows you to use complex logic to determine whether a token is valid.

Multiple audiences: You can configure multiple audiences for a single authenticator. This allows you to use the same authenticator for multiple audiences, such as using a different OAuth client for kubectl and dashboard.

Using identity providers that don't support OpenID Connect discovery: You can use identity providers that don't support OpenID Connect discovery. The only requirement is to host the discovery document at a different location than the issuer (such as locally in the cluster) and specify the issuer.discoveryURL in the configuration file.
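To make the "any JWT-compliant token" and "multiple audiences" points more concrete, here is a minimal Python sketch that decodes a token's payload and checks its audience against a configured list. The helper names decode_jwt_payload and audience_matches are hypothetical, and the sketch deliberately skips signature verification, which the real authenticator performs against the configured issuer. It is only meant for inspecting which claims a token carries.

```python
import base64
import json


def decode_jwt_payload(token: str) -> dict:
    """Decode the payload (second segment) of a compact JWT.

    WARNING: this does NOT verify the signature; use it for inspection only.
    """
    parts = token.split(".")
    if len(parts) != 3:
        raise ValueError("expected a token of the form header.payload.signature")
    payload_b64 = parts[1]
    # base64url segments omit padding; restore it before decoding.
    payload_b64 += "=" * (-len(payload_b64) % 4)
    return json.loads(base64.urlsafe_b64decode(payload_b64))


def audience_matches(claims: dict, configured: list) -> bool:
    """Return True if at least one audience in the token is configured,
    which is how a MatchAny-style policy is assumed to behave here."""
    aud = claims.get("aud", [])
    if isinstance(aud, str):
        # The "aud" claim may be a single string or a list of strings.
        aud = [aud]
    return any(a in configured for a in aud)
```

For example, audience_matches(decode_jwt_payload(token), ["audience1", "audience2"]) tells you whether a given token would even be considered by an authenticator configured with those two audiences.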

How to use Structured Authentication Configuration

To use structured authentication configuration, you specify the path to the authentication configuration using the --authentication-config command line argument in the API server. The configuration file is a YAML file that specifies the authenticators and their configuration. Here is an example configuration file that configures two JWT authenticators:

apiVersion: apiserver.config.k8s.io/v1beta1
kind: AuthenticationConfiguration
# Someone with a valid token from either of these issuers could authenticate
# against this cluster.
jwt:
- issuer:
    url: https://issuer1.example.com
    audiences:
    - audience1
    - audience2
    audienceMatchPolicy: MatchAny
  claimValidationRules:
  - expression: 'claims.hd == "example.com"'
    message: "the hosted domain name must be example.com"
  claimMappings:
    username:
      expression: 'claims.username'
    groups:
      expression: 'claims.groups'
    uid:
      expression: 'claims.uid'
    extra:
    - key: 'example.com/tenant'
      expression: 'claims.tenant'
  userValidationRules:
  - expression: "!user.username.startsWith('system:')"
    message: "username cannot use reserved system: prefix"
# second authenticator that exposes the discovery document at a different location
# than the issuer
- issuer:
    url: https://issuer2.example.com
    discoveryURL: https://discovery.example.com/.well-known/openid-configuration
    audiences:
    - audience3
    - audience4
    audienceMatchPolicy: MatchAny
  claimValidationRules:
  - expression: 'claims.hd == "example.com"'
    message: "the hosted domain name must be example.com"
  claimMappings:
    username:
      expression: 'claims.username'
    groups:
      expression: 'claims.groups'
    uid:
      expression: 'claims.uid'
    extra:
    - key: 'example.com/tenant'
      expression: 'claims.tenant'
  userValidationRules:
  - expression: "!user.username.startsWith('system:')"
    message: "username cannot use reserved system: prefix"
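To show how the three rule sections in this example fit together, the following Python sketch mimics the order of evaluation: claim validation, then claim mapping, then user validation. It is a plain-Python stand-in for the CEL expressions in the example, not how the API server evaluates them; the function name map_claims_to_user is hypothetical, and the handling of missing claims is a simplification.

```python
def map_claims_to_user(claims: dict) -> dict:
    """Apply the example's rules to an already-verified set of token claims."""
    # claimValidationRules: claims.hd == "example.com"
    if claims.get("hd") != "example.com":
        raise PermissionError("the hosted domain name must be example.com")

    # claimMappings: each user attribute is computed from the claims.
    user = {
        "username": claims["username"],
        "groups": claims.get("groups", []),
        "uid": claims.get("uid", ""),
        "extra": {"example.com/tenant": claims.get("tenant", "")},
    }

    # userValidationRules: !user.username.startsWith('system:')
    if user["username"].startswith("system:"):
        raise PermissionError("username cannot use reserved system: prefix")

    return user
```

The ordering matters: the user validation rule runs against the mapped username, so it still catches a reserved prefix even if a mapping expression were to add or rewrite one.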

Migration from command line arguments to configuration file

The Structured Authentication Configuration feature is designed to be backwards-compatible with the existing approach, based on command line options, for configuring the JWT authenticator. This means that you can continue to use the existing command-line options to configure the JWT authenticator. However, we (Kubernetes SIG Auth) recommend migrating to the new configuration file-based approach, as it provides more flexibility and extensibility.

Note

If you specify --authentication-config along with any of the --oidc-* command line arguments, this is a misconfiguration. In this situation, the API server reports an error and then immediately exits.

If you want to switch to using structured authentication configuration, you have to remove the --oidc-* command line arguments, and use the configuration file instead.
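If you generate API server flags from deployment tooling, a small pre-flight check can catch this misconfiguration before the API server refuses to start. The sketch below is a hypothetical helper (check_auth_flags is not part of Kubernetes) that only inspects flag names in an argument list.

```python
def check_auth_flags(argv: list) -> None:
    """Raise ValueError if --authentication-config is combined with any --oidc-* flag."""
    uses_config_file = any(
        arg == "--authentication-config" or arg.startswith("--authentication-config=")
        for arg in argv
    )
    # Collect the distinct --oidc-* flag names, ignoring their values.
    oidc_flags = sorted(
        {arg.split("=", 1)[0] for arg in argv if arg.startswith("--oidc-")}
    )
    if uses_config_file and oidc_flags:
        raise ValueError(
            "--authentication-config cannot be combined with: " + ", ".join(oidc_flags)
        )
```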

Here is an example of how to migrate from the command-line flags to the configuration file:

Command-line arguments

--oidc-issuer-url=https://issuer.example.com
--oidc-client-id=example-client-id
--oidc-username-claim=username
--oidc-groups-claim=groups
--oidc-username-prefix=oidc:
--oidc-groups-prefix=oidc:
--oidc-required-claim="hd=example.com"
--oidc-required-claim="admin=true"
--oidc-ca-file=/path/to/ca.pem

There is no equivalent in the configuration file for the --oidc-signing-algs flag. For Kubernetes v1.30, the authenticator supports all the asymmetric algorithms listed in oidc.go.

Configuration file

apiVersion: apiserver.config.k8s.io/v1beta1
kind: AuthenticationConfiguration
jwt:
- issuer:
    url: https://issuer.example.com
    audiences:
    - example-client-id
    certificateAuthority: <value is the content of file /path/to/ca.pem>
  claimMappings:
    username:
      claim: username
      prefix: "oidc:"
    groups:
      claim: groups
      prefix: "oidc:"
  claimValidationRules:
  - claim: hd
    requiredValue: "example.com"
  - claim: admin
    requiredValue: "true"
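The flag-to-field mapping in this migration example can be sketched as a small Python function. The function name oidc_flags_to_config and the shape of its flags argument are assumptions made for illustration; the sketch covers only the flags shown in the example and does not model the defaults that the command-line flags apply when they are omitted.

```python
from typing import Optional


def oidc_flags_to_config(flags: dict, ca_pem: Optional[str] = None) -> dict:
    """Build an AuthenticationConfiguration dict from --oidc-* flag values.

    `flags` maps flag names (without leading dashes) to their values;
    "oidc-required-claim" maps to a list of "key=value" strings.
    `ca_pem` is the content of the file named by --oidc-ca-file, if any.
    """
    issuer = {
        "url": flags["oidc-issuer-url"],
        "audiences": [flags["oidc-client-id"]],
    }
    if ca_pem is not None:
        # The config file embeds the certificate content, not a file path.
        issuer["certificateAuthority"] = ca_pem

    claim_mappings = {
        "username": {
            "claim": flags["oidc-username-claim"],
            "prefix": flags.get("oidc-username-prefix", ""),
        },
    }
    if "oidc-groups-claim" in flags:
        claim_mappings["groups"] = {
            "claim": flags["oidc-groups-claim"],
            "prefix": flags.get("oidc-groups-prefix", ""),
        }

    # Each --oidc-required-claim=key=value becomes one validation rule.
    rules = []
    for item in flags.get("oidc-required-claim", []):
        claim, _, value = item.partition("=")
        rules.append({"claim": claim, "requiredValue": value})

    authenticator = {"issuer": issuer, "claimMappings": claim_mappings}
    if rules:
        authenticator["claimValidationRules"] = rules

    return {
        "apiVersion": "apiserver.config.k8s.io/v1beta1",
        "kind": "AuthenticationConfiguration",
        "jwt": [authenticator],
    }
```

The returned dict can be serialized with whichever YAML or JSON library you already use, then reviewed by hand before being passed to the API server.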

What's next?

For Kubernetes v1.31, we expect the feature to stay in beta while we get more feedback. In the coming releases, we want to investigate:

Making distributed claims work via CEL expressions.

Egress selector configuration support for calls to issuer.url and issuer.discoveryURL.

You can learn more about this feature on the structured authentication configuration page in the Kubernetes documentation. You can also follow along on KEP-3331 to track progress across the coming Kubernetes releases.

Try it out

In this post, I have covered the benefits the Structured Authentication Configuration feature brings in Kubernetes v1.30. To use this feature, you must specify the path to the authentication configuration using the --authentication-config command line argument. From Kubernetes v1.30, the feature is in beta and enabled by default. If you want to keep using command line arguments instead of a configuration file, those will continue to work as-is.

We would love to hear your feedback on this feature. Please reach out to us on the #sig-auth-authenticators-dev channel on Kubernetes Slack (for an invitation, visit https://slack.k8s.io/).

How to get involved

If you are interested in getting involved in the development of this feature, share feedback, or participate in any other ongoing SIG Auth projects, please reach out on the #sig-auth channel on Kubernetes Slack.

You are also welcome to join the biweekly SIG Auth meetings, held every other Wednesday.

via Kubernetes Blog https://kubernetes.io/

April 24, 2024 at 08:00PM
