r/devopsish

#K8s

DIY: Create Your Own Cloud with Kubernetes (Part 1)

https://kubernetes.io/blog/2024/04/05/diy-create-your-own-cloud-with-kubernetes-part-1/

Author: Andrei Kvapil (Ænix)

At Ænix, we have a deep affection for Kubernetes and dream that all modern technologies will soon start utilizing its remarkable patterns.

Have you ever thought about building your own cloud? I bet you have. But is it possible to do this using only modern technologies and approaches, without leaving the cozy Kubernetes ecosystem? Our experience in developing Cozystack required us to delve deeply into it.

You might argue that Kubernetes is not intended for this purpose, and that you could simply use OpenStack for bare metal servers and run Kubernetes inside it as intended. But by doing so, you would simply shift the responsibility from your hands to the hands of OpenStack administrators, adding at least one more huge and complex system to your ecosystem.

Why complicate things? After all, Kubernetes already has everything needed to run tenant Kubernetes clusters.

I want to share with you our experience in developing a cloud platform based on Kubernetes, highlighting the open-source projects that we use ourselves and believe deserve your attention.

In this series of articles, I will tell you our story of how we prepare managed Kubernetes on bare metal using only open-source technologies: starting from the basics of data center preparation, through running virtual machines, isolating networks, and setting up fault-tolerant storage, to provisioning full-featured Kubernetes clusters with dynamic volume provisioning, load balancers, and autoscaling.

With this article, I start a series consisting of several parts:

Part 1: Preparing the groundwork for your cloud. Challenges faced during the preparation and operation of Kubernetes on bare metal and a ready-made recipe for provisioning infrastructure.

Part 2: Networking, storage, and virtualization. How to turn Kubernetes into a tool for launching virtual machines and what is needed for this.

Part 3: Cluster API and how to start provisioning Kubernetes clusters at the push of a button. How autoscaling works, dynamic provisioning of volumes, and load balancers.

I will try to describe various technologies as independently as possible, but at the same time, I will share our experience and why we came to one solution or another.

To begin with, let's understand the main advantage of Kubernetes and how it has changed the approach to using cloud resources.

It is important to understand that the use of Kubernetes in the cloud and on bare metal differs.

Kubernetes in the cloud

When you operate Kubernetes in the cloud, you don't worry about persistent volumes, cloud load balancers, or the process of provisioning nodes. All of this is handled by your cloud provider, who accepts your requests in the form of Kubernetes objects. In other words, the server side is completely hidden from you, and you don't really want to know exactly how the cloud provider implements it, as it's not in your area of responsibility.

A diagram showing cloud Kubernetes, with load balancing and storage done outside the cluster

Kubernetes offers convenient abstractions that work the same everywhere, allowing you to deploy your application on any Kubernetes in any cloud.

In the cloud, you commonly have several separate entities: the Kubernetes control plane, virtual machines, persistent volumes, and load balancers. Using these entities, you can create highly dynamic environments.

Thanks to Kubernetes, virtual machines are now only seen as a utility entity for utilizing cloud resources. You no longer store data inside virtual machines. You can delete all your virtual machines at any moment and recreate them without breaking your application. The Kubernetes control plane will continue to hold information about what should run in your cluster. The load balancer will keep sending traffic to your workload, simply changing the endpoint to send traffic to a new node. And your data will be safely stored in external persistent volumes provided by the cloud.

This approach is fundamental when using Kubernetes in clouds. The reason for it is quite obvious: the simpler the system, the more stable it is, and it is this simplicity that you are buying when you choose Kubernetes in the cloud.

Kubernetes on bare metal

Using Kubernetes in the clouds is really simple and convenient, which cannot be said about bare metal installations. In the bare metal world, Kubernetes, on the contrary, becomes unbearably complex. Firstly, because the entire network, backend storage, cloud load balancers, etc. are usually run not outside, but inside your cluster. As a result, such a system is much more difficult to update and maintain.

A diagram showing bare metal Kubernetes, with load balancing and storage done inside the cluster

Judge for yourself: in the cloud, to update a node, you typically delete the virtual machine (or even use kubectl delete node) and you let your node management tooling create a new one, based on an immutable image. The new node will join the cluster and “just work” as a node, following a very simple and commonly used pattern in the Kubernetes world. Many clusters order new virtual machines every few minutes, simply because they can use cheaper spot instances. However, when you have a physical server, you can't just delete and recreate it, firstly because it often runs some cluster services, stores data, and its update process is significantly more complicated.

There are different approaches to solving this problem, ranging from in-place updates, as done by kubeadm, kubespray, and k3s, to full automation of provisioning physical nodes through Cluster API and Metal3.

I like the hybrid approach offered by Talos Linux, where your entire system is described in a single configuration file. Most parameters of this file can be applied without rebooting or recreating the node, including the version of Kubernetes control-plane components. However, it still keeps the maximum declarative nature of Kubernetes. This approach minimizes unnecessary impact on cluster services when updating bare metal nodes. In most cases, you won't need to migrate your virtual machines and rebuild the cluster filesystem on minor updates.
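For illustration, such a configuration change is applied with talosctl; a minimal sketch (the node address and file name are placeholders, not values from this article) looks like:

talosctl apply-config --nodes 10.0.0.10 --file controlplane.yaml

If the change only touches hot-reloadable parameters, Talos can apply it in place; changes that do require a reboot are handled according to the apply mode you choose.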

Preparing a base for your future cloud

So, suppose you've decided to build your own cloud. To start somewhere, you need a base layer. You need to think not only about how you will install Kubernetes on your servers but also about how you will update and maintain it. Consider the fact that you will have to think about things like updating the kernel, installing necessary modules, as well as packages and security patches. Now there is much more to deal with than you would ever have to worry about when using a ready-made Kubernetes in the cloud.

Of course, you can use standard distributions like Ubuntu or Debian, or you can consider specialized ones like Flatcar Container Linux, Fedora CoreOS, and Talos Linux. Each has its advantages and disadvantages.

What about us? At Ænix, we use quite a few specific kernel modules like ZFS, DRBD, and OpenvSwitch, so we decided to go the route of forming a system image with all the necessary modules in advance. In this case, Talos Linux turned out to be the most convenient for us. For example, such a config is enough to build a system image with all the necessary kernel modules:

arch: amd64
platform: metal
secureboot: false
version: v1.6.4
input:
  kernel:
    path: /usr/install/amd64/vmlinuz
  initramfs:
    path: /usr/install/amd64/initramfs.xz
  baseInstaller:
    imageRef: ghcr.io/siderolabs/installer:v1.6.4
  systemExtensions:
    - imageRef: ghcr.io/siderolabs/amd-ucode:20240115
    - imageRef: ghcr.io/siderolabs/amdgpu-firmware:20240115
    - imageRef: ghcr.io/siderolabs/bnx2-bnx2x:20240115
    - imageRef: ghcr.io/siderolabs/i915-ucode:20240115
    - imageRef: ghcr.io/siderolabs/intel-ice-firmware:20240115
    - imageRef: ghcr.io/siderolabs/intel-ucode:20231114
    - imageRef: ghcr.io/siderolabs/qlogic-firmware:20240115
    - imageRef: ghcr.io/siderolabs/drbd:9.2.6-v1.6.4
    - imageRef: ghcr.io/siderolabs/zfs:2.1.14-v1.6.4
output:
  kind: installer
  outFormat: raw

Then we use the docker command line tool to build an OS image:

cat config.yaml | docker run --rm -i -v /dev:/dev --privileged "ghcr.io/siderolabs/imager:v1.6.4" -

And as a result, we get a Docker container image with everything we need, which we can use to install Talos Linux on our servers. You can do the same; this image will contain all the necessary firmware and kernel modules.

But the question arises, how do you deliver the freshly formed image to your nodes?

I have been contemplating the idea of PXE booting for quite some time. For example, the Kubefarm project, which I wrote an article about two years ago, was entirely built using this approach. But unfortunately, it doesn't help you to deploy your very first parent cluster that will hold the others. So now I have prepared a solution that will help you do the same using the PXE approach.

Essentially, all you need to do is run temporary DHCP and PXE servers inside containers. Then your nodes will boot from your image, and you can use a simple Debian-flavored script to help you bootstrap your nodes.

The source for that talos-bootstrap script is available on GitHub.

This script allows you to deploy Kubernetes on bare metal in five minutes and obtain a kubeconfig for accessing it. However, many unresolved issues still lie ahead.

Delivering system components

At this stage, you already have a Kubernetes cluster capable of running various workloads. However, it is not fully functional yet. In other words, you need to set up networking and storage, as well as install necessary cluster extensions, like KubeVirt to run virtual machines, as well as the monitoring stack and other system-wide components.

Traditionally, this is solved by installing Helm charts into your cluster. You can do this by running helm install commands.
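As a hedged illustration only (these chart names are common community choices, not necessarily the exact set Cozystack installs), bringing up a CNI and a monitoring stack could look like:

helm repo add cilium https://helm.cilium.io/
helm install cilium cilium/cilium --namespace kube-system

helm repo add prometheus-community https://prometheus-community.github.io/helm-charts
helm install monitoring prometheus-community/kube-prometheus-stack --namespace monitoring --create-namespace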

·kubernetes.io·
DIY: Create Your Own Cloud with Kubernetes (Part 1)

Week Ending March 31, 2024

https://lwkd.info/2024/20240402

Developer News

SIG-Release is looking for insights to make the official Kubernetes artifacts (container images, binaries/tarballs, and system packages) more reliable and secure. Please share your feedback in this survey.

SIG Annual Reports are live, please make sure to submit your reports by May 1st.

Release Schedule

Next Deadline: Release Day, April 17th

Kubernetes 1.30.0-rc.0 and 1.30.0-rc.1 are live!

KEP of the Week

KEP 4381: Structured Parameters for Dynamic Resource Allocation

Dynamic Resource Allocation, which was added to Kubernetes as an alpha feature in 1.26, defines an alternative to the traditional device plugin API for requesting access to third-party resources. By default, DRA uses parameters for resources that are completely opaque to core Kubernetes. This approach creates issues for higher-level controllers, like the Cluster Autoscaler, that need to make decisions for a group of Pods. Structured Parameters is an extension to DRA that takes care of this problem by making claim parameters less opaque.

This KEP focuses on defining the framework necessary to enable different structured models to be added to Kubernetes over time, rather than defining any single model itself. This KEP is tracked for alpha in the upcoming 1.30 release.

Subprojects and Dependency Updates

etcd to v3.5.13 Fix leases wrongly revoked by the leader

kubebuilder to v3.14.1 Upgrade controller-runtime from v0.17.0 to v0.17.2

prometheus to v2.51.1 Bugfix release

via Last Week in Kubernetes Development https://lwkd.info/

April 02, 2024 at 04:50AM

·lwkd.info·
Week Ending March 31 2024

Introducing the Windows Operational Readiness Specification

https://kubernetes.io/blog/2024/04/03/intro-windows-ops-readiness/

Authors: Jay Vyas (Tesla), Amim Knabben (Broadcom), and Tatenda Zifudzi (AWS)

Since Windows support graduated to stable with Kubernetes 1.14 in 2019, the capability to run Windows workloads has been much appreciated by the end user community. The level of and availability of Windows workload support has consistently been a major differentiator for Kubernetes distributions used by large enterprises. However, with more Windows workloads being migrated to Kubernetes and new Windows features being continuously released, it became challenging to test Windows worker nodes in an effective and standardized way.

The Kubernetes project values the ability to certify conformance without requiring a closed-source license for a certified distribution or service that has no intention of offering Windows.

Some notable examples brought to the attention of SIG Windows were:

An issue with load balancer source address ranges functionality not operating correctly on Windows nodes, detailed in a GitHub issue: kubernetes/kubernetes#120033.

Reports of functionality issues with Windows features, such as GMSA not working with containerd, discussed in microsoft/Windows-Containers#44.

Challenges developing networking policy tests that could objectively evaluate Container Network Interface (CNI) plugins across different operating system configurations, as discussed in kubernetes/kubernetes#97751.

SIG Windows therefore recognized the need for a tailored solution to ensure Windows nodes' operational readiness before their deployment into production environments. Thus, the idea to develop a Windows Operational Readiness Specification was born.

Can’t we just run the official Conformance tests?

The Kubernetes project contains a set of conformance tests, which are standardized tests designed to ensure that a Kubernetes cluster meets the required Kubernetes specifications.

However, these tests were originally defined at a time when Linux was the only operating system compatible with Kubernetes, and thus, they were not easily extendable for use with Windows. Given that Windows workloads, despite their importance, account for a smaller portion of the Kubernetes community, it was important to ensure that the primary conformance suite, relied upon by many Kubernetes distributions to certify Linux conformance, didn't become encumbered with Windows-specific features or enhancements such as GMSA or multi-operating system kube-proxy behavior.

Therefore, since there was a specialized need for Windows conformance testing, SIG Windows went down the path of offering Windows specific conformance tests through the Windows Operational Readiness Specification.

Can’t we just run the Kubernetes end-to-end test suite?

In the Linux world, tools such as Sonobuoy simplify execution of the conformance suite, relieving users from needing to be aware of Kubernetes' compilation paths or the semantics of Ginkgo tags.

Regarding compilation, we realized that Windows users might find the process of compiling and running the Kubernetes e2e suite from scratch similarly undesirable; hence, there was a clear need to provide a user-friendly, "push-button" solution that is ready to go. Moreover, regarding Ginkgo tags, applying conformance tests to Windows nodes through a set of Ginkgo tags would also be burdensome for any user, Linux enthusiasts and experienced Windows system admins alike.

To bridge the gap and give users a straightforward way to confirm their clusters support a variety of features, the Kubernetes SIG for Windows therefore found it necessary to create the Windows Operational Readiness application. This application, written in Go, simplifies the process of running the necessary Windows-specific tests while delivering results in a clear, accessible format.

This initiative has been a collaborative effort, with contributions from different cloud providers and platforms, including Amazon, Microsoft, SUSE, and Broadcom.

A closer look at the Windows Operational Readiness Specification

The Windows Operational Readiness specification specifically targets and executes tests found within the Kubernetes repository in a more user-friendly way than simply targeting Ginkgo tags. It introduces a structured test suite that is split into sets of core and extended tests, with each set containing categories directed at a specific area, such as networking. Core tests target fundamental and critical functionalities that Windows nodes should support as defined by the Kubernetes specification. On the other hand, extended tests cover more complex features, diving deeper into Windows-specific capabilities such as integrations with Active Directory. The goal of these tests is to be extensive, covering a wide array of Windows-specific capabilities to ensure compatibility with a diverse set of workloads and configurations, extending beyond basic requirements. Below is the current list of categories.

  • Core.Network: Tests minimal networking functionality (ability to access pod-by-pod IP).
  • Core.Storage: Tests minimal storage functionality (ability to mount a hostPath storage volume).
  • Core.Scheduling: Tests minimal scheduling functionality (ability to schedule a pod with CPU limits).
  • Core.Concurrent: Tests minimal concurrent functionality (the ability of a node to handle traffic to multiple pods concurrently).
  • Extend.HostProcess: Tests features related to Windows HostProcess pod functionality.
  • Extend.ActiveDirectory: Tests features related to Active Directory functionality.
  • Extend.NetworkPolicy: Tests features related to Network Policy functionality.
  • Extend.Network: Tests advanced networking functionality (ability to support IPv6).
  • Extend.Worker: Tests features related to Windows worker node functionality (ability for nodes to access TCP and UDP services in the same cluster).

How to conduct operational readiness tests for Windows nodes

To run the Windows Operational Readiness test suite, refer to the test suite's README, which explains how to set it up and run it. The test suite offers flexibility in how you can execute tests, either using a compiled binary or a Sonobuoy plugin. You also have the choice to run the tests against the entire test suite or by specifying a list of categories. Cloud providers have the choice of uploading their conformance results, enhancing transparency and reliability.

Once you have checked out that code, you can run a test. For example, this sample command runs the tests from the Core.Concurrent category:

./op-readiness --kubeconfig $KUBE_CONFIG --category Core.Concurrent

As a contributor to Kubernetes, if you want to test your changes against a specific pull request using the Windows Operational Readiness Specification, use the following bot command in the new pull request.

/test operational-tests-capz-windows-2019

Looking ahead

We’re looking to improve our curated list of Windows-specific tests by adding new tests to the Kubernetes repository and also identifying existing test cases that can be targeted. The long-term goal for the specification is to continually enhance test coverage for Windows worker nodes and improve the robustness of Windows support, facilitating a seamless experience across diverse cloud environments. We also have plans to integrate the Windows Operational Readiness tests into the official Kubernetes conformance suite.

If you are interested in helping us out, please reach out to us! We welcome help in any form, from giving once-off feedback to making a code contribution, to having long-term owners to help us drive changes. The Windows Operational Readiness specification is owned by the SIG Windows team. You can reach out to the team on the Kubernetes Slack workspace #sig-windows channel. You can also explore the Windows Operational Readiness test suite and make contributions directly to the GitHub repository.

Special thanks to Kulwant Singh (AWS), Pramita Gautam Rana (VMWare), Xinqi Li (Google) for their help in making notable contributions to the specification. Additionally, appreciation goes to James Sturtevant (Microsoft), Mark Rossetti (Microsoft), Claudiu Belu (Cloudbase Solutions) and Aravindh Puthiyaparambil (Softdrive Technologies Group Inc.) from the SIG Windows team for their guidance and support.

via Kubernetes Blog https://kubernetes.io/

April 02, 2024 at 08:00PM

·kubernetes.io·
Introducing the Windows Operational Readiness Specification

Week Ending March 24, 2024

https://lwkd.info/2024/20240327

Developer News

Kubernetes Contributor Summit happened last week and was attended by more than 220 contributors. Held the day before KubeCon EU 2024, it had multiple sessions around the Kubernetes project as well as Q&As with the CNCF team and the Kubernetes Steering Committee. There were also four unconference sessions, for example revisiting the Kubernetes hardware resource model. A big thanks to the community organizers & volunteers. Pictures can be found here.

The CFPs for KubeCon + CloudNativeCon and Open Source Summit China are open at https://events.linuxfoundation.org/kubecon-cloudnativecon-open-source-summit-ai-dev-china/

Release Schedule

Next Deadline: Release Day, April 17th

Kubernetes 1.30.0-rc.0 is live! Also, the docs freeze is now in effect!

KEP of the Week

KEP 2876: CRD Validation Expression Language

This KEP proposes integrating the Common Expression Language (CEL) into CRDs so that validation can be done without the use of webhooks. It’s lightweight and can run in the kube-apiserver. It also supports pre-parsing and typechecking of expressions, allowing syntax and type errors to be caught at CRD registration time.
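As a short sketch (the fields under spec are made up for the example), such a rule is embedded directly in the CRD schema via x-kubernetes-validations:

openAPIV3Schema:
  type: object
  properties:
    spec:
      type: object
      properties:
        minReplicas:
          type: integer
        maxReplicas:
          type: integer
      x-kubernetes-validations:
        - rule: "self.minReplicas <= self.maxReplicas"
          message: "minReplicas must not exceed maxReplicas"

Because the expression is compiled and type-checked at CRD registration time, a mistake like referencing a nonexistent field is rejected up front rather than at admission time.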

This KEP graduated to stable in the v1.29 release.

Subprojects and Dependency Updates

ocicni to v0.4.2 Use ‘ifconfig -j’ to access jail network state

containerd to v1.7.14 Register imagePullThroughput and count with MiB. Move high volume event logs to Trace level also v1.6.30

nerdctl to v1.7.5 update containerd (1.7.14), slirp4netns (1.2.3), CNI plugins (1.4.1), RootlessKit (2.0.2), Kubo (0.27.0), imgcrypt (1.1.10)

etcd to v3.4.31 mvcc: print backend database size and size in use in compaction logs

prometheus to v2.51.0 Relabel rules for AlertManagerConfig; allows routing alerts to different alertmanagers

via Last Week in Kubernetes Development https://lwkd.info/

March 27, 2024 at 04:28PM

·lwkd.info·
Week Ending March 24 2024

Using Go workspaces in Kubernetes

https://kubernetes.io/blog/2024/03/19/go-workspaces-in-kubernetes/

Author: Tim Hockin (Google)

The Go programming language has played a huge role in the success of Kubernetes. As Kubernetes has grown, matured, and pushed the bounds of what "regular" projects do, the Go project team has also grown and evolved the language and tools. In recent releases, Go introduced a feature called "workspaces" which was aimed at making projects like Kubernetes easier to manage.

We've just completed a major effort to adopt workspaces in Kubernetes, and the results are great. Our codebase is simpler and less error-prone, and we're no longer off on our own technology island.

GOPATH and Go modules

Kubernetes is one of the most visible open source projects written in Go. The earliest versions of Kubernetes, dating back to 2014, were built with Go 1.3. Today, 10 years later, Go is up to version 1.22 — and let's just say that a whole lot has changed.

In 2014, Go development was entirely based on GOPATH. As a Go project, Kubernetes lived by the rules of GOPATH. In the buildup to Kubernetes 1.4 (mid 2016), we introduced a directory tree called staging. This allowed us to pretend to be multiple projects, but still exist within one git repository (which had advantages for development velocity). The magic of GOPATH allowed this to work.

Kubernetes depends on several code-generation tools which have to find, read, and write Go code packages. Unsurprisingly, those tools grew to rely on GOPATH. This all worked pretty well until Go introduced modules in Go 1.11 (mid 2018).

Modules were an answer to many issues around GOPATH. They gave more control to projects on how to track and manage dependencies, and were overall a great step forward. Kubernetes adopted them. However, modules had one major drawback — most Go tools could not work on multiple modules at once. This was a problem for our code-generation tools and scripts.

Thankfully, Go offered a way to temporarily disable modules (GO111MODULE to the rescue). We could get the dependency tracking benefits of modules, but the flexibility of GOPATH for our tools. We even wrote helper tools to create fake GOPATH trees and played tricks with symlinks in our vendor directory (which holds a snapshot of our external dependencies), and we made it all work.

And for the last 5 years it has worked pretty well. That is, it worked well unless you looked too closely at what was happening. Woe be upon you if you had the misfortune to work on one of the code-generation tools, or the build system, or the ever-expanding suite of bespoke shell scripts we use to glue everything together.

The problems

Like any large software project, we Kubernetes developers have all learned to deal with a certain amount of constant low-grade pain. Our custom staging mechanism let us bend the rules of Go; it was a little clunky, but when it worked (which was most of the time) it worked pretty well. When it failed, the errors were inscrutable and un-Googleable — nobody else was doing the silly things we were doing. Usually the fix was to re-run one or more of the update-* shell scripts in our aptly named hack directory.

As time went on we drifted farther and farther from "regular" Go projects. At the same time, Kubernetes got more and more popular. For many people, Kubernetes was their first experience with Go, and it wasn't always a good experience.

Our eccentricities also impacted people who consumed some of our code, such as our client library and the code-generation tools (which turned out to be useful in the growing ecosystem of custom resources). The tools only worked if you stored your code in a particular GOPATH-compatible directory structure, even though GOPATH had been replaced by modules more than four years prior.

This state persisted because of the confluence of three factors:

Most of the time it only hurt a little (punctuated with short moments of more acute pain).

Kubernetes was still growing in popularity - we all had other, more urgent things to work on.

The fix was not obvious, and whatever we came up with was going to be both hard and tedious.

As a Kubernetes maintainer and long-timer, my fingerprints were all over the build system, the code-generation tools, and the hack scripts. While the pain of our mess may have been low on average, I was one of the people who felt it regularly.

Enter workspaces

Along the way, the Go language team saw what we (and others) were doing and didn't love it. They designed a new way of stitching multiple modules together into a new workspace concept. Once enrolled in a workspace, Go tools had enough information to work in any directory structure and across modules, without GOPATH or symlinks or other dirty tricks.
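For readers unfamiliar with the feature: enrolling modules into a workspace is done with the go tool itself, roughly like this (the paths here are only illustrative):

go work init
go work use . ./staging/src/k8s.io/api ./staging/src/k8s.io/client-go

The resulting go.work file lists the participating modules, and Go commands run anywhere in the tree can resolve packages across all of them.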

When I first saw this proposal I knew that this was the way out. This was how to break the logjam. If workspaces was the technical solution, then I would put in the work to make it happen.

The work

Adopting workspaces was deceptively easy. I very quickly had the codebase compiling and running tests with workspaces enabled. I set out to purge the repository of anything GOPATH related. That's when I hit the first real bump - the code-generation tools.

We had about a dozen tools, totalling several thousand lines of code. All of them were built using an internal framework called gengo, which was built on Go's own parsing libraries. There were two main problems:

Those parsing libraries didn't understand modules or workspaces.

GOPATH allowed us to pretend that Go package paths and directories on disk were interchangeable in trivial ways. They are not.

Switching to a modules- and workspaces-aware parsing library was the first step. Then I had to make a long series of changes to each of the code-generation tools. Critically, I had to find a way to do it that was possible for some other person to review! I knew that I needed reviewers who could cover the breadth of changes and reviewers who could go into great depth on specific topics like gengo and Go's module semantics. Looking at the history for the areas I was touching, I asked Joe Betz and Alex Zielenski (SIG API Machinery) to go deep on gengo and code-generation, Jordan Liggitt (SIG Architecture and all-around wizard) to cover Go modules and vendoring and the hack scripts, and Antonio Ojea (wearing his SIG Testing hat) to make sure the whole thing made sense. We agreed that a series of small commits would be easiest to review, even if the codebase might not actually work at each commit.

Sadly, these were not mechanical changes. I had to dig into each tool to figure out where they were processing disk paths versus where they were processing package names, and where those were being conflated. I made extensive use of the delve debugger, which I just can't say enough good things about.

One unfortunate result of this work was that I had to break compatibility. The gengo library simply did not have enough information to process packages outside of GOPATH. After discussion with gengo and Kubernetes maintainers, we agreed to make gengo/v2. I also used this as an opportunity to clean up some of the gengo APIs and the tools' CLIs to be more understandable and not conflate packages and directories. For example you can't just string-join directory names and assume the result is a valid package name.

Once I had the code-generation tools converted, I shifted attention to the dozens of scripts in the hack directory. One by one I had to run them, debug, and fix failures. Some of them needed minor changes and some needed to be rewritten.

Along the way we hit some cases that Go did not support, like workspace vendoring. Kubernetes depends on vendoring to ensure that our dependencies are always available, even if their source code is removed from the internet (it has happened more than once!). After discussing with the Go team, and looking at possible workarounds, they decided the right path was to implement workspace vendoring.
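With Go 1.22, vendoring a workspace is a single command (shown here as a sketch; see the Go release notes for details):

go work vendor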

The eventual Pull Request contained over 200 individual commits.

Results

Now that this work has been merged, what does this mean for Kubernetes users? Pretty much nothing. No features were added or changed. This work was not about fixing bugs (and hopefully none were introduced).

This work was mainly for the benefit of the Kubernetes project, to help and simplify the lives of the core maintainers. In fact, it would not be a lie to say that it was rather self-serving - my own life is a little bit better now.

This effort, while unusually large, is just a tiny fraction of the overall maintenance work that needs to be done. Like any large project, we have lots of "technical debt" — tools that made point-in-time assumptions and need revisiting, internal APIs whose organization doesn't make sense, code which doesn't follow conventions which didn't exist at the time, and tests which aren't as rigorous as they could be, just to throw out a few examples. This work is often called "grungy" or "dirty", but in reality it's just an indication that the project has grown and evolved. I love this stuff, but there's far more than I can ever tackle on my own, which makes it an interesting way for people to get involved. As our unofficial motto goes: "chop wood and carry water".

Kubernetes used to be a case-study of how not to do large-scale Go development, but now our codebase is simpler (and in some cases faster!) and more consistent. Things that previously seemed like they should work, but didn't, now behave as expected.

Our project is now a little more "regular". Not completely so, but we're getting closer.

Thanks

This effort would not have been possible without tons of support.

First, thanks to the Go team for hearing our pain, taking feedback, and solving the problems for us.

Special mega-thanks goes to Michael Matloob, on the Go team at Google, who designed and implemented workspaces. He guided me every step of the way, and was very generous with his time.

·kubernetes.io·
Using Go workspaces in Kubernetes

Week Ending March 10, 2024

https://lwkd.info/2024/20240315

Developer News

Kubernetes Contributor Summit EU is happening next Tuesday, March 19, 2024. Make sure to register by March 15. If you want to bring a family member to the social, send an email to summit-team@kubernetes.io. We’re eagerly looking forward to receiving your contributions to the unconference topics.

Also, don’t forget to help your SIG staff its table at the Kubernetes Meet and Greet on Kubecon Friday.

Take a peek at the upcoming Kubernetes v1.30 Release in this Blog.

Release Schedule

Next Deadline: Draft Doc PRs, Mar 12th

Kubernetes v1.30.0-beta.0 is live!

Your SIG should be working on any feature blogs, and discussing what “themes” to feature in the Release Notes.

Featured PR

123516: DRA: structured parameters

DRA, or Dynamic Resource Allocation, is a way to bridge new types of schedulable resources into Kubernetes. A common example of this is GPU accelerator cards, but the system is built as generically as possible. Maybe you want to schedule based on cooling capacity, or cash register hardware, or nearby interns; it’s up to you. DRA launched as an alpha feature back in 1.26 but came with some hard limitations. Notably, the bespoke logic for simulating scale-ups and scale-downs in cluster-autoscaler had no way to understand how those would interact with these opaque resources. This PR pulls back the veil a tiny bit, keeping things generic but allowing more forms of structured interaction so core tools like the scheduler and autoscalers can understand dynamic resources.

This happens from a few directions. First, on the node itself a DRA driver plugin provides information about what is available locally, which the kubelet publishes as a NodeResourceSlice object. In parallel, an operator component from the DRA implementation creates ResourceClaimParameters as needed to describe a particular resource claim. The claim parameters include CEL selector expressions for each piece of the claim, allowing anything which can evaluate CEL to check them independently of the DRA plugin. These two new objects combine with the existing ResourceClaim object to allow bidirectional communication between Kubernetes components and the DRA plugin without either side needing to wait for the other in most operations.

While this does increase the implementation complexity of a new DRA provider, it also dramatically expands their capabilities. New resources can be managed with effectively zero overhead and without the even greater complexity of custom schedulers or a plugin-driven autoscaler.

KEP of the Week

KEP-647: APIServer Tracing

This KEP proposes to update the kube-apiserver to allow tracing requests. This is proposed to be done with OpenTelemetry libraries and the data will be exported in the OpenTelemetry format. The kube-apiserver currently uses kubernetes/utils/trace for tracing, but we can make use of distributed tracing to improve ease of use and to make analysis of the data easier. The proposed implementation involves wrapping the API Server’s http server and http clients with otelhttp.

This KEP is tracked to graduate to stable in the upcoming v1.30 release.
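A minimal sketch of such a configuration file, passed to the kube-apiserver via --tracing-config-file (the API version and values shown here are illustrative and may differ by release):

apiVersion: apiserver.config.k8s.io/v1beta1
kind: TracingConfiguration
endpoint: localhost:4317
samplingRatePerMillion: 100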

Other Merges

podLogsDir key in kubelet configuration to configure default location of pod logs.

New custom flag to kubectl debug for adding custom debug profiles.

PodResources API now has initContainers with containerRestartPolicy of Always when SidecarContainers are enabled.

Fix to the disruption controller’s PDB status sync to maintain PDB conditions during an update.

Service NodePort can now be set to 0 if AllocateLoadBalancerNodePorts is false.

Field selector for services that allows filtering by clusterIP field.

The ‘.memorySwap.swapBehavior’ field in kubelet configuration gets NoSwap as the default value.

kubectl get jobs now prints the status of the listed jobs.

Bugfix for initContainer with containerRestartPolicy Always where it couldn’t update its Pod state from terminated to non-terminated.

The StorageVersionMigration API, which was previously available as a CRD, is now a built-in API.

InitContainer’s image location will now be considered in scheduling when prioritizing nodes.

Almost all printable ASCII characters are now allowed in environment variables.

Added support for configuring multiple JWT authenticators in Structured Authentication Configuration.

New trafficDistribution field added to the Service spec which allows configuring how traffic is distributed to the endpoints of a Service.

JWT authenticator config set via the –authentication-config flag is now dynamically reloaded as the file changes on disk.

Promotions

StructuredAuthorizationConfiguration to beta.

HPAContainerMetrics to GA.

MinDomainsInPodTopologySpread to GA.

ValidatingAdmissionPolicy to GA.

StructuredAuthenticationConfiguration to beta.

KubeletConfigDropInDir to beta.

Version Updates

google.golang.org/protobuf updated to v1.33.0 to resolve CVE-2024-24786.

Subprojects and Dependency Updates

gRPC to v1.62.1 fix a bug that results in no matching virtual host found RPC errors

cloud-provider-openstack to v1.28.2 Implement imagePullSecret support for release 1.28

via Last Week in Kubernetes Development https://lwkd.info/

March 14, 2024 at 11:53PM

·lwkd.info·
Week Ending March 10 2024

A Peek at Kubernetes v1.30

https://kubernetes.io/blog/2024/03/12/kubernetes-1-30-upcoming-changes/

Authors: Amit Dsouza, Frederick Kautz, Kristin Martin, Abigail McCarthy, Natali Vlatko

A quick look: exciting changes in Kubernetes v1.30

It's a new year and a new Kubernetes release. We're halfway through the release cycle and have quite a few interesting and exciting enhancements coming in v1.30. From brand new features in alpha, to established features graduating to stable, to long-awaited improvements, this release has something for everyone to pay attention to!

To tide you over until the official release, here's a sneak peek of the enhancements we're most excited about in this cycle!

Major changes for Kubernetes v1.30

Structured parameters for dynamic resource allocation (KEP-4381)

Dynamic resource allocation was added to Kubernetes as an alpha feature in v1.26. It defines an alternative to the traditional device-plugin API for requesting access to third-party resources. By design, dynamic resource allocation uses parameters for resources that are completely opaque to core Kubernetes. This approach poses a problem for the Cluster Autoscaler (CA) or any higher-level controller that needs to make decisions for a group of pods (e.g. a job scheduler). It cannot simulate the effect of allocating or deallocating claims over time. Only the third-party DRA drivers have the information available to do this.

Structured Parameters for dynamic resource allocation is an extension to the original implementation that addresses this problem by building a framework to support making these claim parameters less opaque. Instead of handling the semantics of all claim parameters themselves, drivers could manage resources and describe them using a specific "structured model" pre-defined by Kubernetes. This would allow components aware of this "structured model" to make decisions about these resources without outsourcing them to some third-party controller. For example, the scheduler could allocate claims rapidly without back-and-forth communication with dynamic resource allocation drivers. Work done for this release centers on defining the framework necessary to enable different "structured models" and to implement the "named resources" model. This model allows listing individual resource instances and, compared to the traditional device plugin API, adds the ability to select those instances individually via attributes.

Node memory swap support (KEP-2400)

In Kubernetes v1.30, memory swap support on Linux nodes gets a big change to how it works - with a strong emphasis on improving system stability. In previous Kubernetes versions, the NodeSwap feature gate was disabled by default, and when enabled, it used UnlimitedSwap behavior as the default behavior. To achieve better stability, UnlimitedSwap behavior (which might compromise node stability) will be removed in v1.30.

The updated, still-beta support for swap on Linux nodes will be available by default. However, the default behavior will be to run the node set to NoSwap (not UnlimitedSwap) mode. In NoSwap mode, the kubelet supports running on a node where swap space is active, but Pods don't use any of the page file. You'll still need to set --fail-swap-on=false for the kubelet to run on that node. However, the big change is the other mode: LimitedSwap. In this mode, the kubelet actually uses the page file on that node and allows Pods to have some of their virtual memory paged out. Containers (and their parent pods) do not have access to swap beyond their memory limit, but the system can still use the swap space if available.
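A minimal kubelet configuration sketch for opting into this mode (assuming the documented KubeletConfiguration fields):

apiVersion: kubelet.config.k8s.io/v1beta1
kind: KubeletConfiguration
failSwapOn: false
memorySwap:
  swapBehavior: LimitedSwap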

Kubernetes' Node special interest group (SIG Node) will also update the documentation to help you understand how to use the revised implementation, based on feedback from end users, contributors, and the wider Kubernetes community.

Read the previous blog post or the node swap documentation for more details on Linux node swap support in Kubernetes.

Support user namespaces in pods (KEP-127)

User namespaces is a Linux-only feature that better isolates pods to prevent or mitigate several CVEs rated high/critical, including CVE-2024-21626, published in January 2024. In Kubernetes 1.30, support for user namespaces is migrating to beta and now supports pods with and without volumes, custom UID/GID ranges, and more!
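A minimal sketch of opting a pod into its own user namespace (the image is just a placeholder):

apiVersion: v1
kind: Pod
metadata:
  name: userns-demo
spec:
  hostUsers: false
  containers:
    - name: app
      image: registry.k8s.io/pause:3.9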

Structured authorization configuration (KEP-3221)

Support for structured authorization configuration is moving to beta and will be enabled by default. This feature enables the creation of authorization chains with multiple webhooks with well-defined parameters that validate requests in a particular order and allows fine-grained control – such as explicit Deny on failures. The configuration file approach even allows you to specify CEL rules to pre-filter requests before they are dispatched to webhooks, helping you to prevent unnecessary invocations. The API server also automatically reloads the authorizer chain when the configuration file is modified.

You must specify the path to that authorization configuration using the --authorization-config command line argument. If you want to keep using command line flags instead of a configuration file, those will continue to work as-is. To gain access to new authorization webhook capabilities like multiple webhooks, failure policy, and pre-filter rules, switch to putting options in an --authorization-config file. From Kubernetes 1.30, the configuration file format is beta-level, and only requires specifying --authorization-config since the feature gate is enabled by default. An example configuration with all possible values is provided in the Authorization docs. For more details, read the Authorization docs.
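A trimmed sketch of such a file (the webhook name, kubeconfig path, and CEL expression are illustrative; the Authorization docs show the full schema):

apiVersion: apiserver.config.k8s.io/v1beta1
kind: AuthorizationConfiguration
authorizers:
  - type: Webhook
    name: example-policy-agent
    webhook:
      timeout: 3s
      subjectAccessReviewVersion: v1
      matchConditionSubjectAccessReviewVersion: v1
      failurePolicy: Deny
      connectionInfo:
        type: KubeConfigFile
        kubeConfigFile: /etc/kubernetes/authz-webhook.kubeconfig
      matchConditions:
        - expression: "request.resourceAttributes.namespace != 'kube-system'"
  - type: Node
    name: node
  - type: RBAC
    name: rbac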

Container resource based pod autoscaling (KEP-1610)

Horizontal pod autoscaling based on ContainerResource metrics will graduate to stable in v1.30. This new behavior for HorizontalPodAutoscaler allows you to configure automatic scaling based on the resource usage for individual containers, rather than the aggregate resource use over a Pod. See our previous article for further details, or read container resource metrics.

CEL for admission control (KEP-3488)

Integrating Common Expression Language (CEL) for admission control in Kubernetes introduces a more dynamic and expressive way of evaluating admission requests. This feature allows complex, fine-grained policies to be defined and enforced directly through the Kubernetes API, enhancing security and governance capabilities without compromising performance or flexibility.

CEL's addition to Kubernetes admission control empowers cluster administrators to craft intricate rules that can evaluate the content of API requests against the desired state and policies of the cluster without resorting to Webhook-based access controllers. This level of control is crucial for maintaining the integrity, security, and efficiency of cluster operations, making Kubernetes environments more robust and adaptable to various use cases and requirements. For more information on using CEL for admission control, see the API documentation for ValidatingAdmissionPolicy.
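A small sketch of a CEL-based policy (the rule and names are illustrative; a ValidatingAdmissionPolicyBinding, not shown, selects where it applies):

apiVersion: admissionregistration.k8s.io/v1
kind: ValidatingAdmissionPolicy
metadata:
  name: replica-limit-demo
spec:
  failurePolicy: Fail
  matchConstraints:
    resourceRules:
      - apiGroups: ["apps"]
        apiVersions: ["v1"]
        operations: ["CREATE", "UPDATE"]
        resources: ["deployments"]
  validations:
    - expression: "object.spec.replicas <= 5"
      message: "deployments must not request more than 5 replicas"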

We hope you're as excited for this release as we are. Keep an eye out for the official release blog in a few weeks for more highlights!

via Kubernetes Blog https://kubernetes.io/

March 11, 2024 at 08:00PM

·kubernetes.io·
A Peek at Kubernetes v1.30

Week Ending March 3, 2024

https://lwkd.info/2024/20240307

Developer News

All CI jobs must be on K8s community infra as of yesterday. While the infra team will migrate ones that are simple, other jobs that you don’t help them move may be deleted. Update your jobs now.

Monitoring dashboards for the GKE and EKS build clusters are live. Also, there was an outage in EKS jobs last week.

After a year of work led by Tim Hockin, Go Workspaces support for hacking on Kubernetes is now available, eliminating a lot of GOPATH pain.

It’s time to start working on your SIG Annual Reports, which you should find a lot shorter and easier than in previous years. Note that you don’t have to be a SIG Chair to write these; the chairs just have to review them.

Release Schedule

Next Deadline: Test Freeze, March 27th

Code Freeze is now in effect. If your KEP did not get tracked and you want to get your KEP shipped in the 1.30 release, please file an exception as soon as possible.

March Cherry Pick deadline for patch releases is the 8th.

Featured PRs

122717: KEP-4358: Custom Resource Field Selectors

Selectors in Kubernetes have long been a way to limit large API calls like List and Watch, requesting things with only specific labels or similar. In operators this can be very important to reduce memory usage of shared informer caches, as well as generally keeping apiserver load down. Some core objects extended selectors beyond labels, allowing filtering on other fields such as listing Pods based on spec.nodeName. But this set of fields was limited and could feel random if you didn’t know the specific history of the API (e.g. Pods need a node name filter because it’s the main request made by the kubelet). And it wasn’t available at all to custom types. This PR expands the system, allowing each custom type to declare selectable fields which will be checked and indexed automatically. The declaration uses JSONPath in a very similar way to the additionalPrinterColumns feature:

selectableFields:
  - jsonPath: .spec.color
  - jsonPath: .spec.size

These can then be used in the API just like any other field selector:

c.List(context.Background(), &redThings, client.MatchingFields{ "spec.color": "red", })

As an alpha feature, this is behind the CustomResourceFieldSelectors feature gate.
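
To show where the new stanza lives, here is a minimal sketch of a CustomResourceDefinition using selectableFields; the group, kind, and field names mirror the snippet above and are purely illustrative:

apiVersion: apiextensions.k8s.io/v1
kind: CustomResourceDefinition
metadata:
  name: things.example.com                        # hypothetical custom resource
spec:
  group: example.com
  names:
    kind: Thing
    plural: things
  scope: Namespaced
  versions:
    - name: v1
      served: true
      storage: true
      schema:
        openAPIV3Schema:
          type: object
          properties:
            spec:
              type: object
              properties:
                color: { type: string }
                size:  { type: string }
      selectableFields:                            # new in this alpha feature
        - jsonPath: .spec.color
        - jsonPath: .spec.size

With the feature gate enabled, a request such as kubectl get things --field-selector spec.color=red should then be filtered server-side, just like the built-in Pod field selectors.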

KEP of the Week

KEP-1610: Container Resource based Autoscaling

For scaling pods based on resource usage, the HPA currently calculates the sum of the individual containers’ resource usage across the Pod. This is not suitable for workloads where the containers are not related to each other. This KEP proposes that the HPA also provide an option to scale pods based on the resource usage of individual containers in a Pod. The KEP proposes adding a new ContainerResourceMetricSource metric source, with a new Container field, which will be used to identify the container for which the resources should be tracked. When there are multiple containers in a Pod, the individual resource usages of each container can change at different rates. Adding a way to specify the target container gives more fine-grained control over scaling.

This KEP is in beta since v1.27 and is planned to graduate to stable in v1.30.

Other Merges

Tunnel kubectl port-forwarding through websockets

Enhanced conflict detection for Service Account and JWT

Create token duration can be zero

Reject empty usernames in OIDC authentication

OpenAPI V2 won’t publish non-matching group-version

New metrics: authorization webhook match conditions, jwt auth latency, watch cache latency

Kubeadm: list nodes needing upgrades, don’t pass duplicate default flags, better upgrade plans, WaitForAllControlPlaneComponents, upgradeConfiguration timeouts, upgradeConfiguration API

Implement strict JWT compact serialization enforcement

Don’t leak discovery documents via the Spec.Service field

Let the container runtime garbage-collect images by tagging them

Client-Go can upgrade subresource fields, and handles cache deletions

Wait for the ProviderID to be available before initializing a node

Don’t panic if nodecondition is nil

Broadcaster logging is now logging level 3

Access mode label for SELinux mounts

AuthorizationConfiguration v1alpha1 is also v1beta1

Kubelet user mapping IDs are configurable

Filter group versions in aggregated API requests

Match condition e2e tests are conformance

Kubelet gets constants from cadvisor

Promotions

PodSchedulingReadiness to GA

ImageMaximumGCAge to Beta

StructuredAuthorizationConfiguration to beta

MinDomainsInPodTopologySpread to beta

RemoteCommand Over Websockets to beta

ContainerCheckpoint to beta

ServiceAccountToken Info to beta

AggregatedDiscovery v2 to GA

PodHostIPs to GA

Version Updates

cadvisor to v0.49.0

kubedns to 1.23.0

Subprojects and Dependency Updates

kubespray to v2.24.1: set Kubernetes v1.28.6 as the default Kubernetes version

prometheus to v2.50.1: fix for broken /metadata API endpoint

via Last Week in Kubernetes Development https://lwkd.info/

March 07, 2024 at 05:00PM

·lwkd.info·
Week Ending March 3 2024
CRI-O: Applying seccomp profiles from OCI registries
CRI-O: Applying seccomp profiles from OCI registries

CRI-O: Applying seccomp profiles from OCI registries

https://kubernetes.io/blog/2024/03/07/cri-o-seccomp-oci-artifacts/

Author: Sascha Grunert

Seccomp stands for secure computing mode and has been a feature of the Linux kernel since version 2.6.12. It can be used to sandbox the privileges of a process, restricting the calls it is able to make from userspace into the kernel. Kubernetes lets you automatically apply seccomp profiles loaded onto a node to your Pods and containers.

But distributing those seccomp profiles is a major challenge in Kubernetes, because the JSON files have to be available on all nodes where a workload can possibly run. Projects like the Security Profiles Operator solve that problem by running as a daemon within the cluster, which makes me wonder which part of that distribution could be done by the container runtime.

Runtimes usually apply the profiles from a local path, for example:

apiVersion: v1
kind: Pod
metadata:
  name: pod
spec:
  containers:
    - name: container
      image: nginx:1.25.3
      securityContext:
        seccompProfile:
          type: Localhost
          localhostProfile: nginx-1.25.3.json

The profile nginx-1.25.3.json has to be available in the seccomp subdirectory of the kubelet's root directory. This means the default location for the profile on-disk would be /var/lib/kubelet/seccomp/nginx-1.25.3.json. If the profile is not available, then runtimes will fail on container creation like this:

kubectl get pods

NAME   READY   STATUS                 RESTARTS   AGE
pod    0/1     CreateContainerError   0          38s

kubectl describe pod/pod | tail

Tolerations:  node.kubernetes.io/not-ready:NoExecute op=Exists for 300s
              node.kubernetes.io/unreachable:NoExecute op=Exists for 300s
Events:
  Type     Reason     Age                 From               Message
  ----     ------     ----                ----               -------
  Normal   Scheduled  117s                default-scheduler  Successfully assigned default/pod to 127.0.0.1
  Normal   Pulling    117s                kubelet            Pulling image "nginx:1.25.3"
  Normal   Pulled     111s                kubelet            Successfully pulled image "nginx:1.25.3" in 5.948s (5.948s including waiting)
  Warning  Failed     7s (x10 over 111s)  kubelet            Error: setup seccomp: unable to load local profile "/var/lib/kubelet/seccomp/nginx-1.25.3.json": open /var/lib/kubelet/seccomp/nginx-1.25.3.json: no such file or directory
  Normal   Pulled     7s (x9 over 111s)   kubelet            Container image "nginx:1.25.3" already present on machine

The major obstacle of having to manually distribute the Localhost profiles leads many end users to fall back to RuntimeDefault or even to running their workloads as Unconfined (with seccomp disabled).

CRI-O to the rescue

The Kubernetes container runtime CRI-O provides various features using custom annotations. The v1.30 release adds support for a new set of annotations called seccomp-profile.kubernetes.cri-o.io/POD and seccomp-profile.kubernetes.cri-o.io/<CONTAINER>. Those annotations allow you to specify:

a seccomp profile for a specific container, when used as: seccomp-profile.kubernetes.cri-o.io/<CONTAINER> (example: seccomp-profile.kubernetes.cri-o.io/webserver: 'registry.example/example/webserver:v1')

a seccomp profile for every container within a pod, when used without the container name suffix but the reserved name POD: seccomp-profile.kubernetes.cri-o.io/POD

a seccomp profile for a whole container image, if the image itself contains the annotation seccomp-profile.kubernetes.cri-o.io/POD or seccomp-profile.kubernetes.cri-o.io/<CONTAINER>.

CRI-O will only respect the annotation if the runtime is configured to allow it, as well as for workloads running as Unconfined. All other workloads will still use the value from the securityContext with a higher priority.

The annotations alone will not help much with the distribution of the profiles, but the way they can be referenced will! For example, you can now specify seccomp profiles like regular container images by using OCI artifacts:

apiVersion: v1
kind: Pod
metadata:
  name: pod
  annotations:
    seccomp-profile.kubernetes.cri-o.io/POD: quay.io/crio/seccomp:v2
spec: …

The image quay.io/crio/seccomp:v2 contains a seccomp.json file, which contains the actual profile content. Tools like ORAS or Skopeo can be used to inspect the contents of the image:

oras pull quay.io/crio/seccomp:v2

Downloading 92d8ebfa89aa seccomp.json
Downloaded  92d8ebfa89aa seccomp.json
Pulled [registry] quay.io/crio/seccomp:v2
Digest: sha256:f0205dac8a24394d9ddf4e48c7ac201ca7dcfea4c554f7ca27777a7f8c43ec1b

jq . seccomp.json | head

{ "defaultAction": "SCMP_ACT_ERRNO", "defaultErrnoRet": 38, "defaultErrno": "ENOSYS", "archMap": [ { "architecture": "SCMP_ARCH_X86_64", "subArchitectures": [ "SCMP_ARCH_X86", "SCMP_ARCH_X32"

Inspect the plain manifest of the image:

skopeo inspect --raw docker://quay.io/crio/seccomp:v2 | jq .

{ "schemaVersion": 2, "mediaType": "application/vnd.oci.image.manifest.v1+json", "config": { "mediaType": "application/vnd.cncf.seccomp-profile.config.v1+json", "digest": "sha256:ca3d163bab055381827226140568f3bef7eaac187cebd76878e0b63e9e442356", "size": 3, }, "layers": [ { "mediaType": "application/vnd.oci.image.layer.v1.tar", "digest": "sha256:92d8ebfa89aa6dd752c6443c27e412df1b568d62b4af129494d7364802b2d476", "size": 18853, "annotations": { "org.opencontainers.image.title": "seccomp.json" }, }, ], "annotations": { "org.opencontainers.image.created": "2024-02-26T09:03:30Z" }, }

The image manifest contains a reference to a specific required config media type (application/vnd.cncf.seccomp-profile.config.v1+json) and a single layer (application/vnd.oci.image.layer.v1.tar) pointing to the seccomp.json file. But now, let's give that new feature a try!

Using the annotation for a specific container or whole pod

CRI-O needs to be configured adequately before it can utilize the annotation. To do this, add the annotation to the allowed_annotations array for the runtime. This can be done by using a drop-in configuration /etc/crio/crio.conf.d/10-crun.conf like this:

[crio.runtime]
default_runtime = "crun"

[crio.runtime.runtimes.crun]
allowed_annotations = [
    "seccomp-profile.kubernetes.cri-o.io",
]

Now, let's run CRI-O from the latest main commit. This can be done by either building it from source, using the static binary bundles or the prerelease packages.

To demonstrate this, I ran the crio binary from my command line using a single node Kubernetes cluster via local-up-cluster.sh. Now that the cluster is up and running, let's try a pod without the annotation running as seccomp Unconfined:

cat pod.yaml

apiVersion: v1
kind: Pod
metadata:
  name: pod
spec:
  containers:
    - name: container
      image: nginx:1.25.3
      securityContext:
        seccompProfile:
          type: Unconfined

kubectl apply -f pod.yaml

The workload is up and running:

kubectl get pods

NAME   READY   STATUS    RESTARTS   AGE
pod    1/1     Running   0          15s

And no seccomp profile got applied if I inspect the container using crictl:

export CONTAINER_ID=$(sudo crictl ps --name container -q)
sudo crictl inspect $CONTAINER_ID | jq .info.runtimeSpec.linux.seccomp

null

Now, let's modify the pod to apply the profile quay.io/crio/seccomp:v2 to the container:

apiVersion: v1
kind: Pod
metadata:
  name: pod
  annotations:
    seccomp-profile.kubernetes.cri-o.io/container: quay.io/crio/seccomp:v2
spec:
  containers:
    - name: container
      image: nginx:1.25.3

I have to delete and recreate the Pod, because only recreation will apply a new seccomp profile:

kubectl delete pod/pod

pod "pod" deleted

kubectl apply -f pod.yaml

pod/pod created

The CRI-O logs will now indicate that the runtime pulled the artifact:

WARN[…] Allowed annotations are specified for workload [seccomp-profile.kubernetes.cri-o.io]
INFO[…] Found container specific seccomp profile annotation: seccomp-profile.kubernetes.cri-o.io/container=quay.io/crio/seccomp:v2 id=26ddcbe6-6efe-414a-88fd-b1ca91979e93 name=/runtime.v1.RuntimeService/CreateContainer
INFO[…] Pulling OCI artifact from ref: quay.io/crio/seccomp:v2 id=26ddcbe6-6efe-414a-88fd-b1ca91979e93 name=/runtime.v1.RuntimeService/CreateContainer
INFO[…] Retrieved OCI artifact seccomp profile of len: 18853 id=26ddcbe6-6efe-414a-88fd-b1ca91979e93 name=/runtime.v1.RuntimeService/CreateContainer

And the container is finally using the profile:

export CONTAINER_ID=$(sudo crictl ps --name container -q)
sudo crictl inspect $CONTAINER_ID | jq .info.runtimeSpec.linux.seccomp | head

{ "defaultAction": "SCMP_ACT_ERRNO", "defaultErrnoRet": 38, "architectures": [ "SCMP_ARCH_X86_64", "SCMP_ARCH_X86", "SCMP_ARCH_X32" ], "syscalls": [ {

The same would work for every container in the pod, if users replace the /container suffix with the reserved name /POD, for example:

apiVersion: v1
kind: Pod
metadata:
  name: pod
  annotations:
    seccomp-profile.kubernetes.cri-o.io/POD: quay.io/crio/seccomp:v2
spec:
  containers:
    - name: container
      image: nginx:1.25.3

Using the annotation for a container image

While specifying seccomp profiles as OCI artifacts on certain workloads is a cool feature, the majority of end users would like to link seccomp profiles to published container images. This can be done by using a container image annotation; instead of being applied to a Kubernetes Pod, the annotation is some metadata applied at the container image itself. For example, Podman can be used to add the image annotation directly during image build:

podman build \
    --annotation seccomp-profile.kubernetes.cri-o.io=quay.io/crio/seccomp:v2 \
    -t quay.io/crio/nginx-seccomp:v2 .

The pushed image then contains the annotation:

skopeo inspect --raw docker://quay.io/crio/nginx-seccomp:v2 |
    jq '.annotations."seccomp-profile.kubernetes.cri-o.io"'

"quay.io/crio/seccomp:v2"

If I now use that image in a CRI-O test pod definition:

apiVersion: v1
kind: Pod
metadata:
  name: pod
  # no Pod annotations set
spec:
  containers:
    - name: container
      image: quay.io/crio/nginx-seccomp:v2

Then the CRI-O logs will indicate that the image annotation got evaluated and the profile got applied:

kubectl delete pod/pod

pod "pod" deleted

kubectl apply -f pod.yaml

po

·kubernetes.io·
CRI-O: Applying seccomp profiles from OCI registries
Spotlight on SIG Cloud Provider
Spotlight on SIG Cloud Provider

Spotlight on SIG Cloud Provider

https://kubernetes.io/blog/2024/03/01/sig-cloud-provider-spotlight-2024/

Author: Arujjwal Negi

One of the most popular ways developers use Kubernetes-related services is via cloud providers, but have you ever wondered how cloud providers can do that? How does this whole process of integration of Kubernetes to various cloud providers happen? To answer that, let's put the spotlight on SIG Cloud Provider.

SIG Cloud Provider works to create seamless integrations between Kubernetes and various cloud providers. Their mission? Keeping the Kubernetes ecosystem fair and open for all. By setting clear standards and requirements, they ensure every cloud provider plays nicely with Kubernetes. It is their responsibility to configure cluster components to enable cloud provider integrations.

In this blog of the SIG Spotlight series, Arujjwal Negi interviews Michael McCune (Red Hat), also known as elmiko, co-chair of SIG Cloud Provider, to give us an insight into the workings of this group.

Introduction

Arujjwal: Let's start by getting to know you. Can you give us a small intro about yourself and how you got into Kubernetes?

Michael: Hi, I’m Michael McCune, most people around the community call me by my handle, elmiko. I’ve been a software developer for a long time now (Windows 3.1 was popular when I started!), and I’ve been involved with open-source software for most of my career. I first got involved with Kubernetes as a developer of machine learning and data science applications; the team I was on at the time was creating tutorials and examples to demonstrate the use of technologies like Apache Spark on Kubernetes. That said, I’ve been interested in distributed systems for many years and when an opportunity arose to join a team working directly on Kubernetes, I jumped at it!

Functioning and working

Arujjwal: Can you give us an insight into what SIG Cloud Provider does and how it functions?

Michael: SIG Cloud Provider was formed to help ensure that Kubernetes provides a neutral integration point for all infrastructure providers. Our largest task to date has been the extraction and migration of in-tree cloud controllers to out-of-tree components. The SIG meets regularly to discuss progress and upcoming tasks and also to answer questions and bugs that arise. Additionally, we act as a coordination point for cloud provider subprojects such as the cloud provider framework, specific cloud controller implementations, and the Konnectivity proxy project.

Arujjwal: After going through the project README, I learned that SIG Cloud Provider works with the integration of Kubernetes with cloud providers. How does this whole process go?

Michael: One of the most common ways to run Kubernetes is by deploying it to a cloud environment (AWS, Azure, GCP, etc). Frequently, the cloud infrastructures have features that enhance the performance of Kubernetes, for example, by providing elastic load balancing for Service objects. To ensure that cloud-specific services can be consistently consumed by Kubernetes, the Kubernetes community has created cloud controllers to address these integration points. Cloud providers can create their own controllers either by using the framework maintained by the SIG or by following the API guides defined in the Kubernetes code and documentation. One thing I would like to point out is that SIG Cloud Provider does not deal with the lifecycle of nodes in a Kubernetes cluster; for those types of topics, SIG Cluster Lifecycle and the Cluster API project are more appropriate venues.

Important subprojects

Arujjwal: There are a lot of subprojects within this SIG. Can you highlight some of the most important ones and what job they do?

Michael: I think the two most important subprojects today are the cloud provider framework and the extraction/migration project. The cloud provider framework is a common library to help infrastructure integrators build a cloud controller for their infrastructure. This project is most frequently the starting point for new people coming to the SIG. The extraction and migration project is the other big subproject and a large part of why the framework exists. A little history might help explain further: for a long time, Kubernetes needed some integration with the underlying infrastructure, not necessarily to add features but to be aware of cloud events like instance termination. The cloud provider integrations were built into the Kubernetes code tree, and thus the term "in-tree" was created (check out this article on the topic for more info). The activity of maintaining provider-specific code in the main Kubernetes source tree was considered undesirable by the community. The community’s decision inspired the creation of the extraction and migration project to remove the "in-tree" cloud controllers in favor of "out-of-tree" components.

Arujjwal: What makes [the cloud provider framework] a good place to start? Does it have consistent good beginner work? What kind?

Michael: I feel that the cloud provider framework is a good place to start as it encodes the community’s preferred practices for cloud controller managers and, as such, will give a newcomer a strong understanding of how and what the managers do. Unfortunately, there is not a consistent stream of beginner work on this component; this is due in part to the mature nature of the framework and that of the individual providers as well. For folks who are interested in getting more involved, having some Go language knowledge is good and also having an understanding of how at least one cloud API (e.g., AWS, Azure, GCP) works is also beneficial. In my personal opinion, being a newcomer to SIG Cloud Provider can be challenging as most of the code around this project deals directly with specific cloud provider interactions. My best advice to people wanting to do more work on cloud providers is to grow your familiarity with one or two cloud APIs, then look for open issues on the controller managers for those clouds, and always communicate with the other contributors as much as possible.

Accomplishments

Arujjwal: Can you share about an accomplishment(s) of the SIG that you are proud of?

Michael: Since I joined the SIG, more than a year ago, we have made great progress in advancing the extraction and migration subproject. We have moved from an alpha status on the defining KEP to a beta status and are inching ever closer to removing the old provider code from the Kubernetes source tree. I've been really proud to see the active engagement from our community members and to see the progress we have made towards extraction. I have a feeling that, within the next few releases, we will see the final removal of the in-tree cloud controllers and the completion of the subproject.

Advice for new contributors

Arujjwal: Is there any suggestion or advice for new contributors on how they can start at SIG Cloud Provider?

Michael: This is a tricky question in my opinion. SIG Cloud Provider is focused on the code pieces that integrate between Kubernetes and an underlying infrastructure. It is very common, but not necessary, for members of the SIG to be representing a cloud provider in an official capacity. I recommend that anyone interested in this part of Kubernetes should come to an SIG meeting to see how we operate and also to study the cloud provider framework project. We have some interesting ideas for future work, such as a common testing framework, that will cut across all cloud providers and will be a great opportunity for anyone looking to expand their Kubernetes involvement.

Arujjwal: Are there any specific skills you're looking for that we should highlight? To give you an example from our own [SIG ContribEx] (https://github.com/kubernetes/community/blob/master/sig-contributor-experience/README.md): if you're an expert in Hugo, we can always use some help with k8s.dev!

Michael: The SIG is currently working through the final phases of our extraction and migration process, but we are looking toward the future and starting to plan what will come next. One of the big topics that the SIG has discussed is testing. Currently, we do not have a generic common set of tests that can be exercised by each cloud provider to confirm the behaviour of their controller manager. If you are an expert in Ginkgo and the Kubetest framework, we could probably use your help in designing and implementing the new tests.

This is where the conversation ends. I hope this gave you some insights about SIG Cloud Provider's aim and working. This is just the tip of the iceberg. To know more and get involved with SIG Cloud Provider, try attending their meetings here.

via Kubernetes Blog https://kubernetes.io/

February 29, 2024 at 07:00PM

·kubernetes.io·
Spotlight on SIG Cloud Provider
Week Ending February 25 2024
Week Ending February 25 2024

Week Ending February 25, 2024

http://lwkd.info/2024/20240227

Developer News

There’s an updated Kubernetes v1.30 State of the Release and Important Deadlines

Contributor Summit Paris schedule is live. If you have a new topic, time to suggest an unconference item.

Release Schedule

Next Deadline: Code Freeze Begins, March 5th

Kubernetes v1.30.0-alpha.3 is live!

The Code Freeze milestone for the Kubernetes 1.30 release cycle is approaching rapidly. Have all your necessary changes been submitted? Following this, there’s the usual release countdown: submit documentation PRs by February 26th, publish the deprecation blog on Thursday, and conclude the test freeze and documentation finalization next week. Once we enter Code Freeze, please promptly address any test failures. Questions can be asked in #sig-release.

Featured PRs

122589: promote contextual logging to beta, enabled by default

Adding contextual logging to Kubernetes has been a long, long road. Removing the tree-wide dependency on klog required refactoring code all over Kubernetes, which took the time of hundreds of contributors. This PR enables contextual logging by default since many components and clients now support it.

123157: Add SELinuxMount feature gate

Use this one neat SELinux trick for faster relabeling of volumes! Users with SELinux in enforcing mode currently suffer latency because all content on a volume has to be relabeled before pods can access it. SELinuxMount instead mounts the volume using -o context=XYZ, which skips the recursive walk. Currently alpha; needs tests, disabled by default.
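
For context, the relabeling cost shows up for pods that carry an SELinux label like the hypothetical one below; with SELinuxMount enabled, the volume would be mounted with -o context for that label instead of every file being relabeled (a sketch to illustrate the affected workload shape, not taken from the PR):

apiVersion: v1
kind: Pod
metadata:
  name: selinux-demo                # hypothetical pod
spec:
  securityContext:
    seLinuxOptions:
      level: "s0:c123,c456"         # volumes are mounted with this SELinux context
  containers:
    - name: app
      image: registry.k8s.io/pause:3.9
      volumeMounts:
        - name: data
          mountPath: /data
  volumes:
    - name: data
      persistentVolumeClaim:
        claimName: data-pvc         # hypothetical PVC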

KEP of the Week

KEP-4176: A new static policy to prefer allocating cores from different CPUs on the same socket

This KEP proposes a new CPU Manager Static Policy Option called distribute-cpus-across-cores to prefer allocating CPUs from different physical cores on the same socket. This will be similar to the distribute-cpus-across-numa policy option, but it proposes to spread CPU allocations instead of packing them together. Such a policy is useful if an application wants to avoid being a noisy neighbor with itself, but still wants to take advantage of the L2 cache.
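
A rough sketch of how the option would be switched on via the kubelet configuration, assuming it ships as an alpha CPU Manager policy option behind the CPUManagerPolicyAlphaOptions gate; the option name follows the KEP, but treat the exact wiring as an assumption:

apiVersion: kubelet.config.k8s.io/v1beta1
kind: KubeletConfiguration
featureGates:
  CPUManagerPolicyAlphaOptions: true          # assumed gate for alpha policy options
cpuManagerPolicy: static
cpuManagerPolicyOptions:
  distribute-cpus-across-cores: "true"        # the option proposed by KEP-4176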

Other Merges

kubeadm certs check-expiration JSON and YAML support

Improved skip node search in specific cases for scheduler performance

kube_codegen --plural-exceptions and improved API type detection

Fix for kubeadm upgrade mounting a new device.

Flag to disable force detach behaviour in kube-controller-manager

Added the MutatingAdmissionPolicy flag to enable mutation policy in admission chain

kubelet adds an image field to the image_garbage_collected_total metric

Promotions

LoadBalancerIPMode to beta

ImageMaximumGCAge to beta

NewVolumeManagerReconstruction to GA

Version Updates

sampleapiserver is now v1.29.2

golangci-lint to v1.56.0 to support Go 1.22

Subprojects and Dependency Updates

prometheus to 2.50.0: automated memory limit handling, multiple PromQL improvements

cri-o to v1.29.2: Enable automatic OpenTelemetry instrumentation of ttrpc calls to NRI plugins; Also released v1.28.4 and v1.27.4

via Last Week in Kubernetes Development http://lwkd.info/

February 27, 2024 at 05:00PM

·lwkd.info·
Week Ending February 25 2024
Week Ending February 18 2024
Week Ending February 18 2024

Week Ending February 18, 2024

http://lwkd.info/2024/20240221

Developer News

Kubernetes Contributor Summit Paris scheduled session speakers have been notified. The Schedule will be available on 25th Feb.

Natasha Sarkar is stepping down as SIG-CLI co-chair and Kustomize lead; Marly Puckett is replacing her as co-Chair, and Yugo Kobayashi is taking on Kustomize.

Release Schedule

Next Deadline: Exception Requests Due, February 26th

We are in Enhancements Freeze with 85 Enhancements on the tracking board. Any KEPs that wish to join the v1.30 release must now have an approved Exception.

KEP of the Week

4402: go workspaces

SIG-Arch is adding Go workspace support to Kubernetes to simplify our build tools and remove code. However, all code generation tools based on gengo will break, and the CLI for kube_codegen will change. If you use any of our code generation tools, you will have work to do after the PR merges, probably for 1.31.

Other Merges

Add a user namespace field to Runtime in prep for namespace support

Add serializer and decoder support for CBOR instead of JSON; rest of CBOR support needs to be added before alpha

AuthenticationConfiguration now has an audienceMatchPolicy API field to support configuring multiple audiences in the authenticator (see the sketch after this list)

kube-apiserver now reports metrics for authorization decisions

Integration tests for multiple audience support in structured authentication

The JWT authenticator will verify tokens even when they are not signed using the RS256 algorithm.

kube-apiserver can retry create requests which fail due to a name conflict

New metrics: kubelet image_pull_duration_seconds, kube-apiserver apiserver_encryption_config_controller_automatic_reloads_total

Job controller only logs deletionTimestamp if it’s not nil

Reduce watch request memory usage by spawning a separate goroutine

Prevent data race in resourceclaim.Lookup.

Kubelet concurrent log rotation is configurable through containerLogMaxWorkers
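
Returning to the AuthenticationConfiguration item above, here is a minimal sketch of a structured authentication configuration using the new audienceMatchPolicy field, assuming the alpha apiserver.config.k8s.io/v1alpha1 API; the issuer URL and audience names are placeholders:

apiVersion: apiserver.config.k8s.io/v1alpha1
kind: AuthenticationConfiguration
jwt:
  - issuer:
      url: https://auth.example.com              # placeholder issuer
      audiences:
        - kubernetes-api
        - internal-tooling
      audienceMatchPolicy: MatchAny              # accept a token if any listed audience matches
    claimMappings:
      username:
        claim: sub
        prefix: ""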

Promotions

CRDValidationRatcheting to beta

Subprojects and Dependency Updates

kubernetes-sigs/kind v0.22.0 released with support for building node images on hosts with proxies.

Prometheus 2.50.0-rc.0 released. New features includes analyze command for histograms and automatic memory limit handling.

grpc v1.62.0-pre1 released, including refinements, improvements and bug fixes.

via Last Week in Kubernetes Development http://lwkd.info/

February 21, 2024 at 05:00PM

·lwkd.info·
Week Ending February 18 2024
A look into the Kubernetes Book Club
A look into the Kubernetes Book Club

A look into the Kubernetes Book Club

https://kubernetes.io/blog/2024/02/22/k8s-book-club/

Author: Frederico Muñoz (SAS Institute)

Learning Kubernetes and the entire ecosystem of technologies around it is not without its challenges. In this interview, we will talk with Carlos Santana (AWS) to learn a bit more about how he created the Kubernetes Book Club, how it works, and how anyone can join in to take advantage of a community-based learning experience.

Frederico Muñoz (FSM): Hello Carlos, thank you so much for your availability. To start with, could you tell us a bit about yourself?

Carlos Santana (CS): Of course. My experience in deploying Kubernetes in production six years ago opened the door for me to join Knative and then contribute to Kubernetes through the Release Team. Working on upstream Kubernetes has been one of the best experiences I've had in open-source. Over the past two years, in my role as a Senior Specialist Solutions Architect at AWS, I have been assisting large enterprises build their internal developer platforms (IDP) on top of Kubernetes. Going forward, my open source contributions are directed towards CNOE and CNCF projects like Argo, Crossplane, and Backstage.

Creating the Book Club

FSM: So your path led you to Kubernetes, and at that point what was the motivating factor for starting the Book Club?

CS: The idea for the Kubernetes Book Club sprang from a casual suggestion during a TGIK livestream. For me, it was more than just about reading a book; it was about creating a learning community. This platform has not only been a source of knowledge but also a support system, especially during the challenging times of the pandemic. It's gratifying to see how this initiative has helped members cope and grow. The first book, Production Kubernetes, took 36 weeks, starting on March 5th, 2021. These days it doesn't take that long to cover a book; we go through one or two chapters per week.

FSM: Could you describe the way the Kubernetes Book Club works? How do you select the books and how do you go through them?

CS: We collectively choose books based on the interests and needs of the group. This practical approach helps members, especially beginners, grasp complex concepts more easily. We have two weekly series, one for the EMEA timezone, and I organize the US one. Each organizer works with their co-host and picks a book on Slack, then sets up a lineup of hosts for a couple of weeks to discuss each chapter.

FSM: If I’m not mistaken, the Kubernetes Book Club is in its 17th book, which is significant: is there any secret recipe for keeping things active?

CS: The secret to keeping the club active and engaging lies in a couple of key factors.

Firstly, consistency has been crucial. We strive to maintain a regular schedule, only cancelling meetups for major events like holidays or KubeCon. This regularity helps members stay engaged and builds a reliable community.

Secondly, making the sessions interesting and interactive has been vital. For instance, I often introduce pop-up quizzes during the meetups, which not only tests members' understanding but also adds an element of fun. This approach keeps the content relatable and helps members understand how theoretical concepts are applied in real-world scenarios.

Topics covered in the Book Club

FSM: The main topics of the books have been Kubernetes, GitOps, Security, SRE, and Observability: is this a reflection of the cloud native landscape, especially in terms of popularity?

CS: Our journey began with 'Production Kubernetes', setting the tone for our focus on practical, production-ready solutions. Since then, we've delved into various aspects of the CNCF landscape, aligning our books with a different theme. Each theme, whether it be Security, Observability, or Service Mesh, is chosen based on its relevance and demand within the community. For instance, in our recent themes on Kubernetes Certifications, we brought the book authors into our fold as active hosts, enriching our discussions with their expertise.

FSM: I know that the project had recent changes, namely being integrated into the CNCF as a Cloud Native Community Group. Could you talk a bit about this change?

CS: The CNCF graciously accepted the book club as a Cloud Native Community Group. This is a significant development that has streamlined our operations and expanded our reach. This alignment has been instrumental in enhancing our administrative capabilities, similar to those used by Kubernetes Community Days (KCD) meetups. Now, we have a more robust structure for memberships, event scheduling, mailing lists, hosting web conferences, and recording sessions.

FSM: How has your involvement with the CNCF impacted the growth and engagement of the Kubernetes Book Club over the past six months?

CS: Since becoming part of the CNCF community six months ago, we've witnessed significant quantitative changes within the Kubernetes Book Club. Our membership has surged to over 600 members, and we've successfully organized and conducted more than 40 events during this period. What's even more promising is the consistent turnout, with an average of 30 attendees per event. This growth and engagement are clear indicators of the positive influence of our CNCF affiliation on the Kubernetes Book Club's reach and impact in the community.

Joining the Book Club

FSM: For anyone wanting to join, what should they do?

CS: There are three steps to join:

First, join the Kubernetes Book Club Community

Then RSVP to the events on the community page

Lastly, join the CNCF Slack channel #kubernetes-book-club

FSM: Excellent, thank you! Any final comments you would like to share?

CS: The Kubernetes Book Club is more than just a group of professionals discussing books; it's a vibrant community with amazing volunteers who help organize and host: Neependra Khare, Eric Smalling, Sevi Karakulak, Chad M. Crowell, and Walid (CNJ) Shaari. Look us up at KubeCon and get your Kubernetes Book Club sticker!

via Kubernetes Blog https://kubernetes.io/

February 21, 2024 at 07:00PM

·kubernetes.io·
A look into the Kubernetes Book Club
Week Ending February 11 2024
Week Ending February 11 2024

Week Ending February 11, 2024

http://lwkd.info/2024/20240211

Developer News

The Contributor Summit is looking for volunteers and a few more pre-planned sessions; remember that KCS sessions need to target contributors.

Need a technical summer intern? We can still accept project proposals for the CNCF Google Summer of Code application if you get them in soon.

Release Schedule

Next Deadline: Docs Deadline for placeholder PRs, February 22nd

We are in Enhancements Freeze now, and currently have 84 opted-in, 56 tracked, and 28 removed features. If your feature missed the deadline, you need to file an Exception.

Patch releases, including a Go update, are due out this week for Valentine’s Day! This is likely to be the last patch release for Kubernetes 1.26. Tell your partner you love them by updating all their clusters.

Roses are red
Violets are blue
Golang’s outdated
1.26 is EOL too

KEP of the Week

KEP-3962: Mutating Admission Policies

This KEP introduces mutating admission policies, declared using CEL expressions, improving on mutating admission webhooks. It leverages the power of CEL object construction and Server Side Apply’s merge algorithms to allow in-process mutations.

Mutations are specified within a MutatingAdmissionPolicy resource, referencing parameter resources for configuration. Reinvocation will be supported as well. Metrics and safety checks are being developed to ensure idempotence and deterministic final states. While limitations exist (e.g., no deletion), this feature offers a declarative and efficient way to perform common mutations, reducing complexity and improving performance.

This KEP was created in 2023, and is planned to reach its alpha milestone in v1.30 release.

Other Merges

ValidatingAdmissionPolicy supports variables in type checks

kubectl explain shows enum values if available

Wildcard events will get requeued

kubeadm: finalize phase uses auth context

Priority and Fairness allows ConcurrencyShares to be zero

Add porto support for vanity imports of the Kubernetes code

Promotions

CloudDualStackNodeIPs is GA

Deprecated

SecurityContextDeny admission plugin is removed; use PodSecurity instead

Version Updates

go to 1.21.7 in 1.26 through 1.29, and to 1.22 in 1.30

debian-base for images to bookworm 1.0.1

etcd to 3.5.12

Subprojects and Dependency Updates

kubespray to v2.22.2: make Kubernetes 1.26.13 the default version

via Last Week in Kubernetes Development http://lwkd.info/

February 11, 2024 at 05:00PM

·lwkd.info·
Week Ending February 11 2024