DIY: Create Your Own Cloud with Kubernetes (Part 3)
https://kubernetes.io/blog/2024/04/05/diy-create-your-own-cloud-with-kubernetes-part-3/
Author: Andrei Kvapil (Ænix)
Approaching the most interesting phase, this article delves into running Kubernetes within Kubernetes. Technologies such as Kamaji and Cluster API are highlighted, along with their integration with KubeVirt.
Previous articles covered preparing Kubernetes on bare metal and turning Kubernetes into a virtual machine management system. This article concludes the series by explaining how, using all of the above, you can build a full-fledged managed Kubernetes and run virtual Kubernetes clusters with just a click.
First up, let's dive into the Cluster API.
Cluster API
Cluster API is an extension for Kubernetes that allows the management of Kubernetes clusters as custom resources within another Kubernetes cluster.
The main goal of the Cluster API is to provide a unified interface for describing the basic entities of a Kubernetes cluster and managing their lifecycle. This makes it possible to automate the creation, updating, and deletion of clusters, and it simplifies scaling and infrastructure management.
Within the context of Cluster API, there are two terms: management cluster and tenant clusters.
A management cluster is a Kubernetes cluster used to deploy and manage other clusters. It contains all the necessary Cluster API components and is responsible for describing, creating, and updating tenant clusters. It is often dedicated to this purpose alone.
Tenant clusters are the user clusters, that is, the clusters deployed using the Cluster API. They are created by describing the relevant resources in the management cluster, and end users then deploy their applications and services to them.
It's important to understand that, physically, tenant clusters do not have to run on the same infrastructure as the management cluster; more often, they run elsewhere.
A diagram showing interaction of management Kubernetes cluster and tenant Kubernetes clusters using Cluster API
Cluster API relies on the concept of providers: separate controllers, each responsible for a specific component of the cluster being created. There are several types of providers. The major ones are:
Infrastructure Provider, which is responsible for providing the computing infrastructure, such as virtual machines or physical servers.
Control Plane Provider, which provides the Kubernetes control plane, namely the components kube-apiserver, kube-scheduler, and kube-controller-manager.
Bootstrap Provider, which is used for generating cloud-init configuration for the virtual machines and servers being created.
To get started, you will need to install the Cluster API itself and one provider of each type. You can find a complete list of supported providers in the project's documentation.
For installation, you can use the clusterctl utility, or the Cluster API Operator if you prefer a more declarative method.
Choosing providers
Infrastructure provider
To run Kubernetes clusters using KubeVirt, the KubeVirt Infrastructure Provider must be installed. It deploys the virtual machines for worker nodes in the same management cluster where the Cluster API operates.
Control plane provider
The Kamaji project offers a ready solution for running the Kubernetes control plane for tenant clusters as containers within the management cluster. This approach has several significant advantages:
Cost-effectiveness: Running the control plane in containers avoids the use of separate control plane nodes for each cluster, thereby significantly reducing infrastructure costs.
Stability: The architecture is simpler because complex multi-layered deployment schemes are eliminated. Instead of sequentially launching a virtual machine and then installing etcd and Kubernetes components inside it, there's a simple control plane that is deployed and run as a regular application inside Kubernetes and managed by an operator.
Security: The cluster's control plane is hidden from the end user, which reduces the possibility of its components being compromised and eliminates user access to the cluster's certificate store. This approach of keeping the control plane invisible to the user is often used by cloud providers.
Bootstrap provider
Kubeadm serves as the Bootstrap Provider, as it is the standard method for preparing clusters in Cluster API. This provider is developed as part of the Cluster API itself. It requires only a prepared system image with kubelet and kubeadm installed and can generate configs in the cloud-init and ignition formats.
It's worth noting that Talos Linux also supports provisioning via the Cluster API and has providers for this. Although previous articles discussed using Talos Linux to set up a management cluster on bare-metal nodes, the Kamaji+Kubeadm approach has more advantages for provisioning tenant clusters. It allows Kubernetes control planes to be deployed in containers, removing the need for separate virtual machines for control plane instances. This simplifies management and reduces costs.
How it works
The primary object in Cluster API is the Cluster resource, which acts as the parent for all the others. Typically, this resource references two others: a resource describing the control plane and a resource describing the infrastructure, each managed by a separate provider.
Unlike the Cluster, these two resources are not standardized, and their kind depends on the specific provider you are using:
A diagram showing the relationship of a Cluster resource and the resources it links to in Cluster API
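As a rough illustration of this relationship, the sketch below builds a Cluster object as a plain Python dictionary and prints it as JSON. The `controlPlaneRef` and `infrastructureRef` fields belong to the Cluster API `Cluster` spec; the API versions shown for the Kamaji and KubeVirt providers, as well as the names `tenant-a` and `default`, are assumptions for illustration and may differ in your installation.

```python
import json


def cluster_manifest(name: str, namespace: str) -> dict:
    """Build a Cluster object that links to provider-specific resources."""
    return {
        "apiVersion": "cluster.x-k8s.io/v1beta1",
        "kind": "Cluster",
        "metadata": {"name": name, "namespace": namespace},
        "spec": {
            # Control plane resource, managed by the control plane provider.
            # The apiVersion is an assumption; check your provider's release.
            "controlPlaneRef": {
                "apiVersion": "controlplane.cluster.x-k8s.io/v1alpha1",
                "kind": "KamajiControlPlane",
                "name": name,
            },
            # Infrastructure resource, managed by the infrastructure provider.
            # The apiVersion is an assumption; check your provider's release.
            "infrastructureRef": {
                "apiVersion": "infrastructure.cluster.x-k8s.io/v1alpha1",
                "kind": "KubevirtCluster",
                "name": name,
            },
        },
    }


# Hypothetical tenant cluster name and namespace.
manifest = cluster_manifest("tenant-a", "default")
print(json.dumps(manifest, indent=2))
```

Only the two references change when you swap providers; the Cluster kind itself stays the same.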
Within Cluster API, there is also a resource named MachineDeployment, which describes a group of nodes, whether they are physical servers or virtual machines. This resource functions similarly to standard Kubernetes resources such as Deployment, ReplicaSet, and Pod, providing a mechanism for the declarative description of a group of nodes and automatic scaling.
In other words, the MachineDeployment resource allows you to declaratively describe nodes for your cluster, automating their creation, deletion, and updating according to specified parameters and the requested number of replicas.
A diagram showing the relationship of a MachineDeployment resource and its children in Cluster API
To create machines, MachineDeployment refers to a template for generating the machine itself and a template for generating its cloud-init config:
A diagram showing the relationship of a MachineDeployment resource and the resources it links to in Cluster API
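The two template references can be sketched as follows. This is a trimmed-down MachineDeployment built as a Python dictionary, limited to the fields discussed above; a real manifest may need more (for example a selector and labels). The provider API versions, the object names, and the Kubernetes version string are assumptions for illustration.

```python
def machine_deployment(cluster: str, name: str, replicas: int, version: str) -> dict:
    """Build a minimal MachineDeployment describing a group of worker nodes."""
    return {
        "apiVersion": "cluster.x-k8s.io/v1beta1",
        "kind": "MachineDeployment",
        "metadata": {"name": name},
        "spec": {
            "clusterName": cluster,
            # Requested number of machines; changing it scales the group.
            "replicas": replicas,
            "template": {
                "spec": {
                    "clusterName": cluster,
                    "version": version,
                    # Template used to generate the cloud-init config.
                    "bootstrap": {
                        "configRef": {
                            "apiVersion": "bootstrap.cluster.x-k8s.io/v1beta1",
                            "kind": "KubeadmConfigTemplate",
                            "name": name,
                        }
                    },
                    # Template used to generate the machine itself
                    # (apiVersion is an assumption).
                    "infrastructureRef": {
                        "apiVersion": "infrastructure.cluster.x-k8s.io/v1alpha1",
                        "kind": "KubevirtMachineTemplate",
                        "name": name,
                    },
                }
            },
        },
    }


# Hypothetical names and version, for illustration only.
md = machine_deployment("tenant-a", "tenant-a-workers", replicas=3, version="v1.29.0")
template_spec = md["spec"]["template"]["spec"]
print(template_spec["bootstrap"]["configRef"]["kind"])
print(template_spec["infrastructureRef"]["kind"])
```

Scaling the node group then amounts to changing `replicas` and re-applying the object, much like scaling a regular Deployment.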
To deploy a new Kubernetes cluster using Cluster API, you will need to prepare the following set of resources:
A general Cluster resource
A KamajiControlPlane resource, responsible for the control plane operated by Kamaji
A KubevirtCluster resource, describing the cluster configuration in KubeVirt
A KubevirtMachineTemplate resource, responsible for the virtual machine template
A KubeadmConfigTemplate resource, responsible for generating tokens and cloud-init
At least one MachineDeployment to create some workers
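Before applying a set of manifests, it can help to confirm that every kind from the list above is present. The helper below is a hypothetical convenience, not part of Cluster API; it only compares the `kind` fields of already-parsed manifests against the list.

```python
# Kinds named in the list above for the Kamaji + KubeVirt + Kubeadm setup.
REQUIRED_KINDS = {
    "Cluster",
    "KamajiControlPlane",
    "KubevirtCluster",
    "KubevirtMachineTemplate",
    "KubeadmConfigTemplate",
    "MachineDeployment",
}


def missing_kinds(manifests: list[dict]) -> list[str]:
    """Return the required kinds that do not appear in the given manifests."""
    present = {m.get("kind") for m in manifests}
    return sorted(REQUIRED_KINDS - present)


# An incomplete set: only the cluster-level resources are described so far.
partial = [
    {"kind": "Cluster"},
    {"kind": "KamajiControlPlane"},
    {"kind": "KubevirtCluster"},
]
missing = missing_kinds(partial)
print(missing)
# → ['KubeadmConfigTemplate', 'KubevirtMachineTemplate', 'MachineDeployment']
```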
Polishing the cluster
In most cases, this is sufficient, but depending on the providers used, you may need other resources as well. You can find examples of the resources created for each type of provider in the Kamaji project documentation.
At this stage, you already have a ready tenant Kubernetes cluster, but so far, it contains nothing but API workers and a few core plugins that are included by default in any Kubernetes installation: kube-proxy and CoreDNS. For full integration, you will need to install several more components.
To install additional components, you can use a separate Cluster API Add-on Provider for Helm, or the same FluxCD discussed in previous articles.
When creating resources in FluxCD, it's possible to specify the target cluster by referring to the kubeconfig generated by Cluster API. Then, the installation will be performed directly into it. Thus, FluxCD becomes a universal tool for managing resources both in the management cluster and in the user tenant clusters.
A diagram showing the interaction scheme of FluxCD, which can install components in both management and tenant Kubernetes clusters
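As a sketch of how such a reference might look, the function below builds a Flux HelmRelease as a Python dictionary whose `spec.kubeConfig.secretRef` points at a kubeconfig Secret. The secret name `tenant-a-kubeconfig` assumes the provider follows Cluster API's `<cluster-name>-kubeconfig` convention, which may not hold for every setup; the `apiVersion`, the chart, the repository name, and the interval are also assumptions for illustration.

```python
def helm_release_for_tenant(release: str, kubeconfig_secret: str,
                            chart: str, repo: str) -> dict:
    """Build a HelmRelease that installs a chart into a remote cluster."""
    return {
        # Assumed API version; older Flux releases use a beta version.
        "apiVersion": "helm.toolkit.fluxcd.io/v2",
        "kind": "HelmRelease",
        "metadata": {"name": release},
        "spec": {
            "interval": "5m",
            # Target cluster: a Secret holding the tenant cluster kubeconfig.
            # It must live in the same namespace as the HelmRelease.
            "kubeConfig": {"secretRef": {"name": kubeconfig_secret}},
            "chart": {
                "spec": {
                    "chart": chart,
                    "sourceRef": {"kind": "HelmRepository", "name": repo},
                }
            },
        },
    }


# Hypothetical release, secret, chart, and repository names.
release = helm_release_for_tenant(
    "cilium", "tenant-a-kubeconfig", chart="cilium", repo="cilium"
)
print(release["spec"]["kubeConfig"]["secretRef"]["name"])
```

Omitting the `kubeConfig` block would make the same release install into the cluster where Flux itself runs, which is how one tool can serve both the management and tenant clusters.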
What components are being discussed here? Generally, the set includes the following:
CNI Plugin
To ensure communication between pods in a tenant Kubernetes cluster, it's necessary to deploy a CNI plugin. This plugin creates a virtual network that allows pods to interact with each other and is traditionally deployed as a DaemonSet on the cluster's worker nodes. You can choose and install any CNI plugin that you find suitable.
A diagram showing a CNI plugin installed inside the tenant Kubernetes cluster on a scheme of nested Kubernetes clusters
Cloud Controller Manager
The main task of the Cloud Controller Manager (CCM) is to integrate Kubernetes with the cloud infrastructure provider's environment (in your case, the management Kubernetes cluster in which all workers of the tenant Kubernetes cluster are provisioned). Here are some tasks it performs:
When a service of type LoadBalancer is created, the CCM initiates the process of creating a cloud load balancer, which directs traffic to your Kubernetes cluster.
If a node is removed from the cloud infrastructure, the CCM ensures its removal from your cluster as well, maintaining the cluster's current state.
When using the CCM, nodes are added to the cluster with a special taint, node.cloudprovider.kubernetes.io/uninitialized, which allows for the processing of additional business logic if necessary. After successful initialization, this taint is removed from the node.
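To make the taint behavior concrete, here is a minimal sketch that checks whether a Node object, represented as a parsed dictionary, still carries the uninitialized taint. The function name and the sample node data are hypothetical; only the taint key comes from the text above.

```python
UNINITIALIZED_TAINT = "node.cloudprovider.kubernetes.io/uninitialized"


def is_uninitialized(node: dict) -> bool:
    """Return True if the node still has the CCM 'uninitialized' taint."""
    taints = node.get("spec", {}).get("taints") or []
    return any(t.get("key") == UNINITIALIZED_TAINT for t in taints)


# A freshly joined node, before the CCM has processed it (sample data).
new_node = {
    "metadata": {"name": "worker-1"},
    "spec": {
        "taints": [
            {"key": UNINITIALIZED_TAINT, "value": "true", "effect": "NoSchedule"}
        ]
    },
}
# The same node after initialization: the taint has been removed.
ready_node = {"metadata": {"name": "worker-1"}, "spec": {}}

print(is_uninitialized(new_node))    # → True
print(is_uninitialized(ready_node))  # → False
```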
Depending on the cloud provider, the CCM can operate both inside and outside the tenant cluster.
The KubeVirt Cloud Provider is designed to be installed in the external parent management cluster. Thus, creating servi