Found 5 bookmarks
Linux Containers: what they are and why all modern software is packaged in containers

https://cloudificationgmbh.blogspot.com/2025/04/linux-containers-what-they-are-and-why.html

In today's IT world, containers are far beyond the buzzword stage – they are the foundation of how modern software is built, shipped, and operated. Whether you're browsing your favorite app, processing transactions on a banking platform, or managing infrastructure in the cloud, there's a very good chance that Linux containers are quietly doing the heavy lifting in the background.

But what exactly is a container? Why has this technology become central to cloud computing, system engineering, DevOps, and modern software architectures? And how do platforms like Kubernetes and solutions like our very own c12n leverage containers to unlock new levels of efficiency?

In this article, we’ll dive deep into the world of Linux containers and discover how they work, why they matter, and the role they play in modern infrastructures.

Let's unpack the container magic 🪄

What is a Linux container?

A container is a standardized unit of software that packages code together with everything it needs to run – libraries, binaries, configuration files, and other dependencies – into an isolated environment. Imagine you're moving into a new house. You could throw your stuff loose into the truck, or you could pack everything into labeled boxes, sealed and stackable. Containers are those boxes, but for software. They are a form of operating-system-level virtualization that allows you to run multiple isolated applications on a single Linux kernel.

Containers are smaller and faster than traditional virtual machines (VMs) because they do not require a full guest operating system (OS). Instead, containers run as isolated processes on top of the host OS. They share the host system's Linux kernel while remaining fully sandboxed from other processes, which makes them incredibly efficient, portable, and perfect for cloud-native workloads.

You might remember from our previous blog post about hypervisors that traditional virtualization adds an extra layer called a hypervisor, which manages virtual machines and can be either Type 1 (bare-metal) or Type 2 (hosted). Like virtualization, containers rely on an extra layer of software to manage isolation; however, instead of a hypervisor, they use Linux kernel features such as namespaces (which provide process and network isolation) and control groups (aka cgroups, which control resource allocation like CPU and memory) to achieve lightweight, process-level isolation.

Together, these mechanisms create lightweight, fast-starting environments that can be deployed consistently across different infrastructures – whether on a developer's laptop, a testing environment, or a production cluster in the cloud. This works even if kernel versions or Linux distributions differ: the developer could run Fedora while the test or production environment uses Ubuntu.

TL;DR

A Linux container is a lightweight, standalone unit that bundles:

Your application

All its dependencies (libraries, binaries, etc.)

Instead of its own OS, a container shares the host's Linux kernel to run.

The result? You can run your app anywhere – from a developer’s laptop to a cloud data center – without the “but it worked on my machine!” headache.

Why Containers Matter

Before containers, software was often deployed in a "works on my machine" style. Developers would build applications on their local machines with specific libraries or configurations, only to see them break in staging or production environments due to slight discrepancies. A common example: a developer has a newer version of a required library locally, while production runs an older, incompatible one. Everything works, the unit tests pass, but production breaks right after the new deployment.

Containers solve this by making environments portable and predictable. A container runs the same, no matter where you launch it. That consistency accelerates development cycles, simplifies testing, and greatly reduces deployment risks.

Moreover, containers are incredibly lightweight compared to VMs. They consume fewer resources because they share the host OS kernel and don’t need to run their own kernel at all. They launch in seconds, and are easy to duplicate (clone), scale, or discard. This efficiency translates directly into reduced infrastructure costs and faster delivery pipelines.

So... how do Linux containers actually work?

Linux containers create isolated environments for applications to run in, without needing a full operating system for each app. This is achieved by combining several key features built into the Linux kernel.

  1. Containers Use the Host’s Linux Kernel

Unlike virtual machines (VMs), which emulate hardware and run a separate OS, containers don’t need to bring their own operating system. They use the Linux kernel of the host machine but isolate everything else – filesystems, processes, networking, etc.

That's what makes them fast and lightweight. You're not booting up a whole OS for each container – you're running isolated processes on the same kernel, and the application has no idea that it is running in a container.
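A quick way to see the shared kernel in action – a minimal sketch, assuming Docker and the alpine image are available on a Linux host:

# Kernel version on the host:
$ uname -r
# Kernel version reported inside a container – it prints the same string,
# because the container is just an isolated process on the host kernel:
$ docker run --rm alpine uname -r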

  2. Linux Namespaces: the Illusion of Separation?

Containers rely on Linux namespaces to make them feel like self-contained environments.

Namespaces create isolated views of:

Processes (PID namespace) – each container sees only its own processes and process IDs.

File systems (Mount namespace) – provides an isolated view of the file system for each container.

Networking (Net namespace) – containers can have their own virtual interfaces and IPs and are not aware of the underlying hardware NICs.

Users (User namespace) – containers can map users differently from the host and have their own UIDs (user ids).

Inter-process communication (IPC namespace) – separates shared memory and message queues to prevent cross-container communication.

Hostnames (UTS namespace) – lets containers set their own hostname and domain name.

Cgroups (Cgroup namespace) – isolates access to cgroups, which control and limit system resource usage.

This gives containers the illusion that they're running in their own separate OS, when in fact they might be sharing the host OS with 30 other containers. You can try the namespace isolation yourself with the quick sketch below.
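A minimal sketch using unshare from util-linux (available on most Linux distributions), no container engine required:

# Start a shell in new PID, mount and UTS namespaces:
$ sudo unshare --pid --fork --mount-proc --uts /bin/bash
# Inside that shell:
$ hostname demo-container   # changes the hostname only within the new UTS namespace
$ ps aux                    # shows only the processes of the new PID namespace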

  3. Control Groups (cgroups): Resource Management

Namespaces isolate, while cgroups control.

Control groups are another Linux kernel feature that lets you limit and monitor the host's resources. Cgroups control how much CPU, memory, disk I/O, and network bandwidth a process inside a container can use. This ensures that one container doesn't take over the entire system's resources.

What is considered a resource:

CPU time

RAM

Disk I/O

Network bandwidth

So if one container tries to eat all your RAM during a memory leak, cgroups can limit the "blast radius" and kill the process when it goes over a configured RAM limit (e.g. 1 GB), without letting the whole host OS freeze when RAM runs out.

Such functionality is essential in multi-tenant environments, or when you're running lots of containers side by side. You don't want one container hogging all the resources and impairing neighboring containers or the host OS itself.
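As a rough illustration, container engines simply translate their resource flags into cgroup settings. A hedged sketch, assuming Docker on a cgroup v2 host (paths and group names are illustrative):

# Let the container engine set the limits for you:
$ docker run --rm --memory=1g --cpus=0.5 nginx
# Or create and populate a cgroup by hand:
$ sudo mkdir /sys/fs/cgroup/demo
$ echo $((1024*1024*1024)) | sudo tee /sys/fs/cgroup/demo/memory.max   # 1 GB hard limit
$ echo $$ | sudo tee /sys/fs/cgroup/demo/cgroup.procs                  # move this shell into the group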

The image below shows how the host CPU and RAM hierarchy has two groups, each with its own tasks. Tasks refer to the processes that are members of a specific control group – in this example, crond and its fork in the /cg1 group and httpd in the /cg2 group. We can also see that the httpd process is part of the /cg3 group on the right. The net_cls hierarchy allows tagging network packets with a class identifier, making it possible to identify and manage traffic originating from specific cgroups. This handy mechanism gives us a way to manage network traffic based on the cgroup a process belongs to.

  4. Layered Storage

Containers typically use a union filesystem (such as OverlayFS), which stacks filesystem layers. This means containers can share common layers (like the OS or libraries), while each container gets its own read/write layer on top. This is essential because it allows a base container image to be shared while providing each container with its own readable and (optionally) writable filesystem layer.

Here’s how it works:

A base image (for example, ubuntu:22.04) forms the bottom layer

Your app and its dependencies go on top of it in a new layer

Any changes during runtime are stored in a separate write layer

The write layer can be ephemeral and discarded automatically when the container exits

Or, the write layer can be persistent if the data has to survive container restart

Such a layer-based model is super efficient because:

Base images can be shared between multiple containers

Only the differences need to be saved

Start-up times are fast as we don’t need to copy all data all over

That's why the first pull of a container image from Docker Hub might take a minute or two, depending on your connection. But when you start another container with the same base image, the layers are already cached and it starts almost instantly! This saves space, makes containers faster to pull and deploy, and allows version control via image tags.
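To see the layering yourself – a small sketch, assuming Docker is installed:

# Download an image and inspect the read-only layers it is built from:
$ docker pull ubuntu:22.04
$ docker history ubuntu:22.04
# A second container from the same image reuses the cached layers
# and only adds its own thin write layer on top:
$ docker run --rm -it ubuntu:22.04 bash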

  5. Container Runtimes: the engine that runs the show

To create, run, and manage containers, you’ll need a container runtime installed.

Some of the popular container runtime examples:

runc – the low-level container runtime (used by Podman, Docker, containerd, etc.)

containerd – a higher-level runtime used by Kubernetes

CRI-O – a lightweight alternative runtime optimized for Kubernetes

The runtime is the piece of software that actually uses the Linux kernel features (namespaces and cgroups) to spin up and manage containers under the hood. Container runtimes are sometimes categorized as "high-level" or "low-level", based on their level of abstraction and complexity. In a nutshell, high-level

·cloudificationgmbh.blogspot.com·

GitOps Automated Openstack: Simplifying Release Upgrades and Day-2 Ops

https://cloudificationgmbh.blogspot.com/2025/04/gitops-automated-openstack-simplifying.html

Managing OpenStack can be very complex, especially when dealing with frequent release updates, configuration changes, large-scale deployments and tight deadlines. At Cloudification we operate tens of OpenStack deployments of different scales, locations and configurations. To ensure consistency across environments and to handle possible unforeseen complications, an organized approach is needed.

For a couple of years now, we've used GitOps as a powerful framework for automating OpenStack deployments and making updates and day-2 operations smooth and predictable.

In this article you will learn how GitOps can be leveraged to simplify OpenStack release upgrades and streamline operational workflows, and how we take it even further with our c12n private cloud solution by integrating GitOps automation into all stages of cloud deployment.

Let’s start by taking a look at OpenStack’s Architecture:

As we have covered in previous articles, OpenStack consists of multiple core services with a modular architecture that enables additional features. Each service typically has multiple configuration files and components, along with dependencies such as databases (MariaDB or similar), message queues (RabbitMQ), caches (Memcached) and other service-specific configuration (coordination, policy, etc.).

This can quickly turn into a configuration nightmare, even for experienced administrators. Now multiply that by the number of OpenStack deployments with their specifics and you’d probably end up with the famous meme:

Fortunately, nowadays we have various tools available to simplify OpenStack configuration and deployment, including devstack, openstack-ansible, kolla-ansible, puppet, microstack and Helm charts for Kubernetes. Each tool has its merits, but selecting the best one depends on the specific use case and operational requirements.

But how do I pick one?

To find out, we first have to talk about containers and why they have become the default way to deploy and manage OpenStack today.

The Role of Containers in OpenStack Deployments

Containers have become the new standard for deploying modern applications, offering great portability and ease of management because they include all application dependencies.

OpenStack container deployment is supported through multiple community-backed approaches, among which the openstack-helm charts are a natural choice for organizations operating Kubernetes. Helm charts are packages for Kubernetes clusters that provide a structured way to deploy OpenStack services along with their configurations and dependencies. The charts are versioned and updated alongside OpenStack releases, ensuring seamless configuration changes and compatibility with the latest versions.
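As a rough sketch of what this looks like in practice (the chart layout and values files depend on the openstack-helm release you use, so treat the paths below as illustrative):

# Fetch the openstack-helm charts and install a single service, e.g. Keystone:
$ git clone https://opendev.org/openstack/openstack-helm.git
$ cd openstack-helm
$ helm upgrade --install keystone ./keystone \
    --namespace openstack --create-namespace \
    --values /path/to/keystone-values.yaml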

But why introduce Kubernetes as an additional abstraction layer when kolla-ansible can deploy OpenStack containers directly onto control and compute nodes?

Kubernetes has been the buzzword of the past years, and today it is the most popular container orchestration solution. It manages the full lifecycle of containers while offering modularity through plugins that enable different functionality and extend the Kubernetes API for special cases. Similar to OpenStack, Kubernetes enables network and workload segmentation, allowing specific roles to be assigned per node (network, storage, control, etc.), but it does so for containers rather than VMs.

Additionally, Kubernetes' extensibility makes it easy to integrate observability tools (for instance Prometheus, Grafana, OpenSearch), additional authentication mechanisms (Keycloak, Dex) and even compliance tools (OPA agents, Trivy, Falco, etc.). All of these can be deployed using the respective Helm charts, which can be found in abundance on ArtifactHub.

Automating OpenStack Deployments with GitOps

The Deployment Process

GitOps introduces automation and operational efficiency to OpenStack deployment through two key tools: Git and ArgoCD.

Git serves as the single source of truth for a complete system configuration, providing visibility, version control, and auditing capabilities.

ArgoCD continuously monitors and reconciles the desired state of OpenStack components by monitoring Kubernetes objects and maintaining them in the desired state defined in Git.

ArgoCD’s features make it particularly powerful for multi-layered cloud deployments enabling:

Automated reconciliation of system configuration in case of a ‘drift’ or unexpected manual interventions.

Dependency management via ArgoCD sync waves. Sync waves ensure the correct sequence of service deployments – for example, databases and message buses are deployed before the OpenStack services, and Keystone is the first OpenStack service to come up – as sketched below.
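A hedged sketch of what such an Argo CD Application with a sync wave can look like (names, repository URL and paths are illustrative, not our actual configuration):

$ cat <<'EOF' | kubectl apply -f -
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: mariadb
  namespace: argocd
  annotations:
    argocd.argoproj.io/sync-wave: "0"   # databases come up before OpenStack services in later waves
spec:
  project: default
  source:
    repoURL: https://git.example.com/infra/openstack-config.git
    targetRevision: main
    path: charts/mariadb
  destination:
    server: https://kubernetes.default.svc
    namespace: openstack
  syncPolicy:
    automated:
      prune: true
      selfHeal: true
EOF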

Once the Kubernetes cluster is set up with ArgoCD and all apps turn green, the rollout and state synchronization with Git are complete. At this point, adding new cloud nodes of any kind becomes a pretty straightforward process: install Kubernetes (we use the official Kubespray SIG project), join the node into the cluster and let the GitOps automation take care of the rest.

On the screenshots below you can see how the OpenStack components, each with their own App, look in the ArgoCD dashboard:

For each Argo App, Kubernetes objects (StatefulSets, Deployments, DaemonSets, Services, etc.) are created, and the K8s controllers automatically deploy the necessary OpenStack services based on how the new node is labeled (role label). In our case we use the c12n-compute role for compute nodes, c12n-storage-controller for Ceph storage nodes and c12n-control-plane for all OpenStack APIs and satellite components (DBs, RabbitMQs, Memcached, etc.):

kubectl get nodes

NAME                             STATUS   ROLES                                                                     AGE    VERSION
master-1.dev.cloudification.io   Ready    c12n-control-plane,c12n-storage-controller,control-plane,local-storage   139d   v1.30.3
master-2.dev.cloudification.io   Ready    c12n-control-plane,c12n-storage-controller,control-plane,local-storage   139d   v1.30.3
master-3.dev.cloudification.io   Ready    c12n-control-plane,c12n-storage-controller,control-plane,local-storage   139d   v1.30.3
worker-1.dev.cloudification.io   Ready    c12n-compute,local-storage                                                139d   v1.30.3
worker-2.dev.cloudification.io   Ready    c12n-compute,local-storage                                                139d   v1.30.3
worker-3.dev.cloudification.io   Ready    c12n-compute,local-storage                                                139d   v1.30.3
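With that in place, onboarding another compute node is mostly a matter of labeling it with the right role. A minimal sketch (the node name is hypothetical, and the exact label keys behind the ROLES column may differ in your setup):

$ kubectl label node worker-4.dev.cloudification.io node-role.kubernetes.io/c12n-compute=""
$ kubectl label node worker-4.dev.cloudification.io node-role.kubernetes.io/local-storage=""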

And here is what the Nova pods in Kubernetes, including the HA DB and RabbitMQ clusters, look like:

# kubectl get pods -n openstack | grep nova

db-nova-haproxy-0                    2/2   Running   0   11d
db-nova-haproxy-1                    2/2   Running   0   11d
db-nova-haproxy-2                    2/2   Running   0   11d
db-nova-pxc-0                        4/4   Running   0   11d
db-nova-pxc-1                        4/4   Running   0   11d
db-nova-pxc-2                        4/4   Running   0   11d
nova-api-metadata-5dbb65685f-dsbwg   1/1   Running   0   11d
nova-api-metadata-5dbb65685f-nrg4j   1/1   Running   0   11d
nova-api-metadata-5dbb65685f-znspr   1/1   Running   0   11d
nova-api-osapi-6cff9f8679-5t5bg      1/1   Running   0   11d
nova-api-osapi-6cff9f8679-bgwv7      1/1   Running   0   11d
nova-api-osapi-6cff9f8679-cljjg      1/1   Running   0   11d
nova-compute-default-9jxp6           2/2   Running   0   11d
nova-compute-default-mpdzp           2/2   Running   0   11d
nova-compute-default-vk62x           2/2   Running   0   11d
nova-conductor-586d8d66d8-nkmpm      1/1   Running   0   11d
nova-novncproxy-58458d5c66-trqfn     1/1   Running   0   11d
nova-novncproxy-58458d5c66-xx2gh     1/1   Running   0   11d
nova-scheduler-584f98bfff-2wpbk      1/1   Running   0   11d
nova-scheduler-584f98bfff-x66cx      1/1   Running   0   11d
rabbitmq-nova-server-0               1/1   Running   0   11d
rabbitmq-nova-server-1               1/1   Running   0   11d
rabbitmq-nova-server-2               1/1   Running   0   11d

Simplifying OpenStack Upgrades with GitOps

How does GitOps make upgrades easier?

OpenStack upgrades have a reputation for being complex, sometimes even painful in older releases 😅 But the situation has improved a lot with recent OpenStack versions. In fact, you only need to upgrade once a year to keep up with the release cycle, because every second release is a SLURP release, which stands for Skip Level Upgrade Release Process:

Yet, in our case we want to make our customers' lives easier, so we do fully GitOps-automated release upgrades, which run smoothly and fast thanks to:

Helm Chart Versioning: Helm ensures OpenStack components are upgraded systematically, reflecting the necessary configuration changes.

Git-based Change Management: Changes to OpenStack configurations are human readable in Git, all

·cloudificationgmbh.blogspot.com·

Kernel-based Virtual Machine (KVM): The Major Open-Source Hypervisor Powering World Clouds

https://cloudificationgmbh.blogspot.com/2025/03/kernel-based-virtual-machine-kvm-major.html

IT infrastructure wouldn’t be what it is today without virtualization. By optimizing resource utilization, ensuring workload isolation, and enabling cloud computing, virtualization has revolutionized modern IT infrastructures.

The concept of virtualization dates back to the 1960s, when the only way of running applications was bare metal and IBM introduced time-sharing systems to maximize the use of rare and expensive mainframe hardware. Over the decades, virtualization shifted workloads from bare metal to Virtual Machines (VMs) and more recently to containers. It has evolved from hardware partitioning to hypervisor-based virtualization, which became mainstream in the early 2000s with the rise of x86-based virtualization and paved the way for cloud computing as we know it today.

At the heart of virtualization is the hypervisor software, and the Kernel-based Virtual Machine (KVM) is among the most popular open-source hypervisors, built directly into the Linux kernel.

KVM is widely adopted today in enterprise virtualization and all kinds of cloud environments, making it the default hypervisor in OpenStack, including Cloudification's c12n. It is used by major cloud providers such as Amazon Web Services (AWS), Google Cloud, Oracle Cloud, OVH, UpCloud and many others.

What is a Hypervisor?

Before diving into KVM in more detail, let’s take a look at hypervisors in general.

A hypervisor is software that enables virtualization by creating and managing virtual machines (VMs) on a physical host system. It allows multiple operating systems to run independently on the same hardware, maximizing resource efficiency and scalability.

Hypervisor Architecture

Hypervisors can be classified into two main types. While both hypervisor types serve the same purpose of running VMs, their concept is rather different.

Type 1 Hypervisors (Bare-metal)

Type 1 hypervisors directly access the underlying hardware resources and do not require a host operating system (except for KVM, which is part of the Linux kernel). This architecture provides the best performance, security, and resource efficiency, which is why Type 1 hypervisors are used in data centers, clouds and enterprises. Well-known examples of Type 1 hypervisors are:

KVM

VMware ESXi

Microsoft Hyper-V

XEN

Type 2 Hypervisors (Hosted)

Type 2 hypervisors run as an application on top of a host operating system, making them easier to install and use, but generally less efficient than Type 1 hypervisors. They are often used for development, testing, and simple desktop virtualization. Examples include:

VMware Workstation

Oracle VirtualBox

Parallels Desktop

Types of Hypervisors

While KVM is classified as a Type 1 hypervisor, it operates within a Linux host, making it unique compared to traditional bare-metal hypervisors like ESXi or XEN. Because Linux itself runs on the hardware, KVM inherits the performance advantages of a Type 1 hypervisor while benefiting from Linux's vast ecosystem and security features.

What is KVM?

KVM is a part of the Linux kernel that allows you to turn any modern Linux distribution into a fully functional hypervisor. KVM leverages the built-in capabilities of the Linux kernel, turning it into a robust virtualization platform that supports multiple guest operating systems, including Linux, Windows, BSD and even macOS. The guest operating systems are the OSes that run inside the VMs.

The core components of KVM include the following (a quick host-side check is sketched after the list):

A loadable kernel module (kvm.ko) that enables virtualization capabilities.

CPU-specific modules (kvm-intel.ko and kvm-amd.ko) for hardware acceleration using Intel VT-x and AMD-V depending on the processor of the host.

QEMU (Quick Emulator) integration, which provides device emulation and facilitates guest VM management. (We’ll talk about it in more detail below).

Libvirt, a toolkit that provides an abstraction layer for managing virtualization platforms, including KVM, simplifying VM provisioning, automation, and orchestration.
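A quick way to verify these pieces on a host – a minimal sketch that should work on most Linux distributions:

$ egrep -c '(vmx|svm)' /proc/cpuinfo   # >0 means Intel VT-x or AMD-V is available
$ lsmod | grep kvm                     # expect kvm plus kvm_intel or kvm_amd to be loaded
$ ls -l /dev/kvm                       # the device node that user space (QEMU) talks to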

Where does KVM fit into the stack?

The Role of QEMU and Libvirt in KVM Virtualization

While KVM provides the core virtualization capabilities, QEMU and libvirt play essential roles in making KVM a fully functional virtualization stack.

QEMU

QEMU is a generic, open-source machine emulator and virtualizer. As a user-space emulator, it enables essential device emulation, allowing virtual machines to interact with virtualized hardware components.

It enhances KVM by enabling features such as VM snapshots, live migration, and advanced networking options. QEMU relies on hypervisors like XEN or KVM to leverage CPU virtualization extensions (HVM).

When used as a virtualizer, QEMU can achieve near native performance by executing the guest code directly on the host CPU.
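For illustration, this is roughly how a VM can be booted with QEMU using KVM acceleration (disk and ISO file names are placeholders):

$ qemu-system-x86_64 \
    -enable-kvm -cpu host \
    -m 2048 -smp 2 \
    -drive file=disk.qcow2,format=qcow2 \
    -cdrom ubuntu-22.04-live-server-amd64.iso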

 

Libvirt

Libvirt is another piece of software designed to simplify the management of virtual machines and related virtualization tasks, including storage and network interface management.

Libvirt is a widely used virtualization management framework that abstracts the complexity of KVM and QEMU. It provides a unified API and a set of command-line tools (such as virsh) to manage VMs, networks, and storage, making KVM-based virtualization easier to deploy and maintain. Besides KVM, it also supports XEN hypervisor and LXC containers.
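Typical day-to-day management then happens through virsh; a small sketch (the domain name vm1 is illustrative):

$ virsh list --all        # list defined and running domains (VMs)
$ virsh start vm1         # boot a VM
$ virsh dominfo vm1       # show CPU, memory and state of the domain
$ virsh shutdown vm1      # gracefully shut it down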

KVM Management - libvirt

KVM meets OpenStack

OpenStack is a widely adopted open-source cloud computing platform designed to manage distributed compute, network, and storage resources within data centers. It acts as an orchestration layer that aggregates physical infrastructure into a unified pool and enables users to provision virtual resources (VMs, storage, networks, routers) on demand through a self-service portal or APIs. You can learn more about OpenStack in our previous post here.

While OpenStack itself does not perform virtualization, it integrates seamlessly with KVM to provide virtualization capabilities. By leveraging KVM as the default hypervisor together with libvirt, OpenStack allows organizations to build cost-effective, scalable, and vendor-independent cloud environments without any license costs.
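Concretely, Nova's compute service is pointed at KVM through libvirt in its configuration; a minimal, illustrative excerpt (defaults and tuning vary per deployment):

# /etc/nova/nova.conf (excerpt)
[libvirt]
virt_type = kvm        # use KVM hardware acceleration instead of plain QEMU emulation
cpu_mode = host-model  # expose a CPU model close to the host's for near-native performance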

In recent years OpenStack has emerged as a great alternative to proprietary cloud and virtualization solutions from VMware and Microsoft and today it helps businesses all over the world to optimize costs and compete with hyperscale cloud providers.

Why Choose KVM?

In almost 20 years of its existence, KVM has become the virtualization standard for public cloud providers due to its performance, scalability, and open-source nature. Cloud giants such as AWS, Google and Oracle have adopted KVM because it:

Delivers near-native performance through hardware-assisted virtualization.

Supports live migration, enabling seamless VM movement across physical hosts without interruption.

Provides strong security features, including sVirt (SELinux integration) and Seccomp.

Avoids vendor lock-in, giving organizations full control over their infrastructure and costs.

Scales efficiently, handling everything from deployments with just a few VMs to hyperscale cloud environments with hundreds of thousands of VMs.

Has broad industry adoption, with an active open-source community and enterprise-grade solutions from multiple vendors including Red Hat and Canonical.

Closing Thoughts

KVM has solidified its position as a leading open-source hypervisor, enabling cloud providers and enterprises to build scalable, efficient, and secure virtualization environments. As the default hypervisor for OpenStack and Cloudification’s c12n, KVM continues to be the backbone of modern cloud infrastructure, providing an open, flexible alternative to proprietary solutions like VMware ESXi and Microsoft Hyper-V. Its widespread adoption by major cloud providers demonstrates KVM’s reliability and performance, making it the go-to choice for virtualization in the cloud era.

Are you considering integrating KVM into your cloud infrastructure or optimizing your OpenStack environment? Contact us today and let our experts help you build a scalable, cost-effective, and high-performance cloud infrastructure.

·cloudificationgmbh.blogspot.com·

Openstack vs. Proxmox: Which Virtualization Solution is Right for Me?

https://cloudificationgmbh.blogspot.com/2025/03/openstack-vs-proxmox-which.html

Today many organizations seek cost-effective and scalable cloud solutions (and alternatives to migrate away from VMware), and two players – OpenStack and Proxmox – have emerged as leading open-source virtualization solutions with over a decade of development behind them. Both offer similar features, but they cater to different use cases and scales. In this article we compare OpenStack vs. Proxmox to help you choose the best solution for your business case. Let's take a closer look.

First things first, what are OpenStack and Proxmox?

What is OpenStack?

OpenStack is an open-source cloud computing platform designed to manage large pools of compute, storage, and networking resources. It is a collection of open-source software modules that integrate together to orchestrate and manage cloud infrastructure resources such as VMs, Volumes, Load-Balancers, Routers and more. It is a great option for enterprises of all sizes looking to build private, hybrid, or public cloud infrastructure. In fact some of the well-known public clouds are running OpenStack – for example OVH, Open Telekom Cloud and Rackspace to name a few.

OpenStack's key features include scalability, support for multi-region cloud deployments and strong multi-tenancy, as well as native integration with Kubernetes (KaaS), advanced SDN networking features with Neutron and flexible storage options (Ceph, Swift, NFS, and more than 30 vendor integrations such as NetApp, Pure, HP, Dell, Huawei, etc.), making it a versatile, fully featured cloud solution suitable for all kinds of scales and industries.

Check our previous post to learn more about Openstack.

 

What is Proxmox?

Proxmox Virtual Environment (PVE) is an open-source virtualization management platform developed by Proxmox Server Solutions GmbH. It supports two virtualization technologies, namely KVM (Kernel-based Virtual Machine which is also used in OpenStack) for full virtualization and LXC (Linux Containers) for lightweight container-based virtualization. It has recently gained popularity thanks to its simplicity and ease of use.

Key features of Proxmox include a user-friendly web-based GUI, high-availability support, integrated backup, multiple storage options (ZFS, Ceph, LVM, and NFS) and lightweight container virtualization, making it a great fit for small businesses and homelabs.

Key differences between OpenStack and Proxmox

Architecture
OpenStack: Modular (microservice) and highly scalable, with separate components for compute (Nova), storage (Cinder, Swift), networking (Neutron), identity (Keystone), etc. The main hypervisor is KVM, with optional support for ESXi, XEN and LXC.
Proxmox: Integrated stack with built-in KVM for VMs and LXC support for containers.

User Interface
OpenStack: Horizon or Skyline dashboards (web-based) and CLI. Third-party commercial UI and billing solutions exist (HostBill, OSIE, Fleio).
Proxmox: Simple, user-friendly web-based GUI and CLI.

API
OpenStack: Extensive REST API for automation, orchestration and integration with third-party tools such as Terraform (OpenTofu), Ansible and backup solutions.
Proxmox: RESTful API that lets users programmatically manage their virtualization environments, with endpoints for virtual machines, containers, storage and networking.

Deployment Complexity
OpenStack: High. Requires knowledge of the architecture and of multiple components and services.
Proxmox: Moderate. Easier to set up, with a web-based management interface.

Base OS Layer
OpenStack: Has to be deployed on top of a Linux operating system (any distribution) or in containers.
Proxmox: Based on Debian with its own kernel; no other distributions are supported.

Scalability
OpenStack: Highly scalable, designed for large-scale infrastructure with multi-region support. Regions can grow to thousands of hypervisors.
Proxmox: Limited scalability compared to OpenStack; ideal for small clusters between 3 and 30 nodes.

Storage
OpenStack: Supports Object, Block and Share types with a variety of backends (Ceph, Swift, LVM, NFS, NetApp, Pure, HP, etc.) and additional features including access control.
Proxmox: Supports Block and Share types via ZFS, Ceph, LVM and NFS. Very limited support for storage vendor solutions.

Networking
OpenStack: Advanced SDN networking with Neutron; supports security groups, complex routing and multi-tenancy (ML2, OVS/OVN); VLAN, VxLAN and GRE tunnels; VPNaaS, L2GW. Load balancing with Octavia.
Proxmox: Basic SDN networking (VLAN, DHCP, FRR, VxLAN).

Multi-Tenancy
OpenStack: Yes, built-in support with the concept of Keystone Domains, Projects and Sub-projects and a fully configurable RBAC.
Proxmox: Very limited; requires 3rd-party solutions such as multiportal.io.

High Availability
OpenStack: Requires third-party tools or advanced configuration (e.g. HAProxy, Pacemaker, Keepalived).
Proxmox: Built-in high availability and clustering features.

Licensing and Costs
OpenStack: Open source, with optional but often required enterprise support or custom development.
Proxmox: Open source, with optional subscriptions to receive official packages and updates.

Use Cases
OpenStack: IaaS & PaaS for public and private clouds, enterprise-grade multi-tenancy, and large-scale automation. Lots of customization options with drivers and integrations.
Proxmox: Small business virtualization, homelabs, test and dev environments where multi-tenancy is not required. Limited customization options.

The Business Model behind OpenStack and Proxmox

Both OpenStack and Proxmox are offered as open-source software. However, their business models and target audiences differ quite a lot. Let's check in more detail.

OpenStack

OpenStack follows a community-driven approach. It is governed by the OpenInfra Foundation (previously known as the OpenStack Foundation), a non-profit organization dedicated to promoting open-source cloud technologies and OpenStack specifically. The foundation itself is financed by its supporting members, with different levels of membership and contributions depending on the size of the organization.

OpenStack is licensed under the Apache 2.0 license, ensuring broad usage rights and flexibility for both end-users and companies that want to offer public or private cloud solutions with OpenStack.

OpenStack is backed by a broad ecosystem of contributors, including major cloud providers and enterprise IT vendors, fostering an open development community. The source code of OpenStack is released simultaneously to all users ensuring equal access and transparency in development. With over 2000 active contributors both individuals and organizations, a regular voting for board of directors and Project Technical Leads (PTL), OpenStack can be considered a truly open-source software.

Proxmox

Proxmox, on the other hand, is backed by Proxmox Server Solutions GmbH, a for-profit company based in Austria. It operates under the GNU Affero General Public License (AGPL) v3, which requires users who modify and distribute the software to share their changes, unlike with OpenStack.

While the Proxmox Virtual Environment (VE) is free and open-source, Proxmox offers a freemium model where users can access a free community-supported version and enterprise users can purchase paid support subscriptions, which provide access to stable repositories, security updates and professional assistance. In other words, enterprise features, stability and support are gated behind paid subscriptions.

While both organizations actively promote open source, there is a potential licensing risk with Proxmox, given the for-profit company behind it. In recent years we have seen a number of companies alter the licensing terms of their open-source solutions in order to charge money for commercial usage. Big examples are HashiCorp with Terraform, Elastic with Elasticsearch and Red Hat with CentOS. Such a risk is less likely with OpenStack, which is governed by a non-profit, community-driven foundation.

In case you are concerned with the possibility of licensing changes, you should probably consider the implications of picking Proxmox over OpenStack for your organization.

OpenStack vs. Proxmox: Which One is for Me?

Choose OpenStack if…

…you need multi-tenancy with advanced permission separation and access management across many teams or customers.

…your infrastructure needs to scale across large clusters or multiple regions.

…you need LBaaS, KaaS or other PaaS features from the platform.

…you want enterprise storage and networking integrations and large-scale automation via APIs.

Choose Proxmox if…

…you need an easy-to-use virtualization platform with easy setup.

…your infrastructure is limited to a single cluster or a few small clusters.

…your team/organisation is limited to a single tenant without the need for advanced permission separation or access management.

…you want built-in VM backup, high availability, and snapshot management out of the box.

…you do not need LBaaS, KaaS or other PaaS features from the platform.

Can Proxmox replace OpenStack?

Not entirely.

While Proxmox is a great virtualization tool that uses similar technologies under the hood (KVM, Ceph, OVS – like OpenStack) and can serve as a simple VMware alternative, it lacks the advanced SDN and LBaaS features, Kubernetes orchestration, and extensive storage driver support found in OpenStack. OpenStack's breadth comes from numerous integrations and drivers supporting all major vendors, and the OpenStack upstream development process with Zuul CI allows 3rd-party vendors to integrate their tests to ensure full driver compatibility across changes. With Proxmox the vendor and feature options are much more limited, which is why enterprises tend to pick OpenStack over Proxmox.

Openstack’s complexity comes with its benefits: OpenStack provides rich

·cloudificationgmbh.blogspot.com·

Configuring Ceph pg_autoscale with Rook: A Guide to Balanced Data Distribution

https://cloudificationgmbh.blogspot.com/2025/02/configuring-ceph-pgautoscale-with-rook.html

Configuring Ceph pg_autoscale with Rook for OpenStack Deployments: A Guide to Balanced Data Distribution

At Cloudification, we deploy private clouds based on OpenStack, leveraging Rook-Ceph as a highly available storage solution. During the installation process, one of the recurring issues we faced was the proper configuration of the Ceph cluster to ensure balanced data distribution across OSDs (Object Storage Daemons).

The Problem: PG Imbalance Alerts

Right after a fresh installation, we started receiving PGImbalance alerts from Prometheus, indicating poorly distributed data across hosts. PG stands for Placement Group, an abstraction beneath the storage pool: each individual object in a cluster is assigned to a PG. Since the number of objects in a cluster can reach hundreds of millions, PGs allow Ceph to operate and rebalance without having to address each object individually. Let's have a look at the Ceph placement groups in the cluster:

$ ceph pg dump ... OSD_STAT USED AVAIL USED_RAW TOTAL HB_PEERS PG_SUM PRIMARY_PG_SUM 23 33 GiB 1.7 TiB 33 GiB 1.7 TiB [0,2,5,7,8,10,16,18,19,22] 4 0 4 113 MiB 1.7 TiB 113 MiB 1.7 TiB [0,1,2,3,5,6,8,9,11,12,14,15,16,17,20,23] 2 1 1 49 GiB 1.7 TiB 49 GiB 1.7 TiB [0,2,5,6,9,10,12,13,15,16,17,18,21,22] 26 19 19 23 GiB 1.7 TiB 23 GiB 1.7 TiB [1,2,3,5,10,16,18,20,21,22] 15 17 22 19 GiB 1.7 TiB 19 GiB 1.7 TiB [4,5,6,11,15,17,19,20,21,23] 11 0 21 226 GiB 1.5 TiB 226 GiB 1.7 TiB [1,3,9,10,13,16,17,18,20,22] 108 17 20 117 MiB 1.7 TiB 117 MiB 1.7 TiB [0,4,7,12,14,17,18,19,21,22] 5 0 18 258 GiB 1.5 TiB 258 GiB 1.7 TiB [1,5,8,10,11,14,16,17,19,21,22,23] 122 19 17 34 GiB 1.7 TiB 34 GiB 1.7 TiB [0,1,2,3,5,6,8,9,11,12,13,15,16,18,20,21,22,23] 6 4 16 33 GiB 1.7 TiB 33 GiB 1.7 TiB [0,5,7,8,11,12,13,15,17,20] 23 2 15 109 MiB 1.7 TiB 109 MiB 1.7 TiB [2,10,12,14,16,18,19,21,22,23] 4 0 0 109 MiB 1.7 TiB 109 MiB 1.7 TiB [1,2,7,8,12,13,14,17,20,23] 5 1 13 111 MiB 1.7 TiB 111 MiB 1.7 TiB [0,1,2,3,8,9,12,14,15,17,19,21] 7 2 2 116 MiB 1.7 TiB 116 MiB 1.7 TiB [1,3,8,11,15,17,18,19,20,22] 3 0 3 33 GiB 1.7 TiB 33 GiB 1.7 TiB [2,4,5,7,8,9,10,11,16,23] 12 0 5 52 GiB 1.7 TiB 52 GiB 1.7 TiB [1,4,6,11,12,13,14,16,17,18,19,20,21,22,23] 16 2 6 23 GiB 1.7 TiB 23 GiB 1.7 TiB [4,5,7,9,10,11,15,19,20,22] 4 2 7 793 MiB 1.7 TiB 793 MiB 1.7 TiB [0,1,3,4,6,8,10,12,13,14,15,16,18,19,21,23] 4 20 8 34 GiB 1.7 TiB 34 GiB 1.7 TiB [0,5,7,9,12,13,14,18,20,22] 5 2 9 60 GiB 1.7 TiB 60 GiB 1.7 TiB [0,1,3,8,10,12,13,16,17,21] 5 2 10 216 GiB 1.5 TiB 216 GiB 1.7 TiB [1,3,4,5,6,7,9,11,12,14,15,16,18,19,21,22] 101 18 11 101 MiB 1.7 TiB 101 MiB 1.7 TiB [1,2,5,10,12,16,18,19,22,23] 4 1 12 54 GiB 1.7 TiB 54 GiB 1.7 TiB [0,1,3,5,6,7,8,9,10,11,13,14,18,20,21] 16 34 14 25 GiB 1.7 TiB 25 GiB 1.7 TiB [4,5,6,7,10,12,13,15,19,20,22] 5 2 sum 1.1 TiB 41 TiB 1.1 TiB 42 TiB

Let’s check how many PGs are configured for pools:

bash-5.1$ for pool in $(ceph osd lspools | awk '{print $2}') ; do echo "pool: $pool - pg_num: $(ceph osd pool get $pool pg_num)" ; done

pool: .rgw.root - pg_num: pg_num: 1
pool: replicapool - pg_num: pg_num: 1
pool: .mgr - pg_num: pg_num: 1
pool: rgw-data-pool - pg_num: pg_num: 1
pool: s3-store.rgw.log - pg_num: pg_num: 1
pool: s3-store.rgw.control - pg_num: pg_num: 1
pool: s3-store.rgw.buckets.index - pg_num: pg_num: 1
pool: s3-store.rgw.otp - pg_num: pg_num: 1
pool: s3-store.rgw.buckets.non-ec - pg_num: pg_num: 1
pool: s3-store.rgw.meta - pg_num: pg_num: 1
pool: rgw-meta-pool - pg_num: pg_num: 1
pool: s3-store.rgw.buckets.data - pg_num: pg_num: 1
pool: cephfs-metadata - pg_num: pg_num: 1
pool: cephfs-data0 - pg_num: pg_num: 1
pool: cinder.volumes.hdd - pg_num: pg_num: 1
pool: cinder.backups - pg_num: pg_num: 1
pool: glance.images - pg_num: pg_num: 1
pool: nova.ephemeral - pg_num: pg_num: 1

This directly correlates with the imbalanced OSD utilization: Ceph was creating only 1 placement group per pool, leading to inefficient data distribution.

To diagnose the issue, we used the rados df command to identify the pools consuming the most space and adjusted pg_num accordingly. The Ceph documentation describes how to calculate this number.

If we manually reconfigure the current number of PGs for several pools, for example Cinder, Nova, Glance and CephFS:

$ ceph osd pool set cinder.volumes.nvme pg_num 256
$ ceph osd pool set nova.ephemeral pg_num 16
$ ceph osd pool set glance.images pg_num 16
$ ceph osd pool set cephfs-data0 pg_num 16

This triggers rebalancing, resulting in more balanced usage and the resolution of the alert:

bash-5.1$ ceph -s
  cluster:
    id:     a6ab9446-2c0d-42f4-b009-514e989fd4a0
    health: HEALTH_OK

  services:
    mon: 3 daemons, quorum b,d,f (age 3d)
    mgr: b(active, since 3d), standbys: a
    mds: 1/1 daemons up, 1 hot standby
    osd: 24 osds: 24 up (since 3d), 24 in (since 3d)
    rgw: 3 daemons active (3 hosts, 1 zones)

  data:
    volumes: 1/1 healthy
    pools:   17 pools, 331 pgs
    objects: 101.81k objects, 371 GiB
    usage:   1.2 TiB used, 41 TiB / 42 TiB avail
    pgs:     331 active+clean

  io:
    client: 7.4 KiB/s rd, 1.7 MiB/s wr, 9 op/s rd, 166 op/s wr

...

OSD_STAT USED AVAIL USED_RAW TOTAL HB_PEERS PG_SUM PRIMARY_PG_SUM 23 68 GiB 1.7 TiB 68 GiB 1.7 TiB [0,1,2,3,4,5,6,10,11,12,13,14,16,17,18,19,22] 37 12 4 33 GiB 1.7 TiB 33 GiB 1.7 TiB [0,1,2,3,5,6,7,8,9,10,11,12,13,14,15,16,17,20,22,23] 34 13 1 37 GiB 1.7 TiB 37 GiB 1.7 TiB [0,2,3,5,6,7,9,10,11,12,13,14,15,16,17,18,20,21,22] 42 13 19 39 GiB 1.7 TiB 39 GiB 1.7 TiB [0,2,3,6,7,9,10,11,12,13,15,16,17,18,20,22,23] 41 12 22 36 GiB 1.7 TiB 36 GiB 1.7 TiB [0,1,2,3,4,5,6,7,8,9,10,11,12,15,16,19,21,23] 36 11 21 62 GiB 1.7 TiB 62 GiB 1.7 TiB [0,1,2,3,5,6,8,9,10,13,14,15,16,17,18,19,20,22] 37 9 20 35 GiB 1.7 TiB 35 GiB 1.7 TiB [0,1,4,6,7,8,10,12,14,15,16,17,18,19,21] 39 10 18 67 GiB 1.7 TiB 67 GiB 1.7 TiB [1,2,5,7,8,9,10,11,13,14,16,17,19,20,21,22,23] 37 12 17 65 GiB 1.7 TiB 65 GiB 1.7 TiB [0,1,2,3,4,5,6,8,9,11,12,13,15,16,18,19,20,21,22,23] 34 14 16 35 GiB 1.7 TiB 35 GiB 1.7 TiB [0,1,4,5,7,8,9,10,11,12,13,15,17,18,19,20,21,22,23] 39 13 15 39 GiB 1.7 TiB 39 GiB 1.7 TiB [1,2,6,10,12,13,14,16,18,19,21,23] 41 5 0 34 GiB 1.7 TiB 34 GiB 1.7 TiB [1,2,4,5,7,8,9,10,11,12,13,14,15,16,17,19,20,21,22,23] 37 13 13 31 GiB 1.7 TiB 31 GiB 1.7 TiB [0,1,2,3,4,5,6,7,8,9,12,14,15,16,17,18,19,20,21,22,23] 36 16 2 33 GiB 1.7 TiB 33 GiB 1.7 TiB [0,1,3,6,8,11,13,14,15,16,17,18,19,20,21,22] 34 11 3 33 GiB 1.7 TiB 33 GiB 1.7 TiB [2,4,5,7,8,9,10,12,13,15,16,17,19,21,22,23] 33 12 5 64 GiB 1.7 TiB 64 GiB 1.7 TiB [0,1,3,4,6,8,10,11,12,13,14,15,16,17,18,19,20,21,22,23] 37 9 6 54 GiB 1.7 TiB 54 GiB 1.7 TiB [1,4,5,7,8,9,10,11,12,13,14,15,16,19,20,21,22,23] 32 9 7 38 GiB 1.7 TiB 38 GiB 1.7 TiB [0,1,3,4,6,8,10,11,12,13,14,15,16,17,18,19,20,22,23] 39 11 8 65 GiB 1.7 TiB 65 GiB 1.7 TiB [0,3,5,6,7,9,10,12,13,14,15,17,18,20,22] 33 14 9 95 GiB 1.7 TiB 95 GiB 1.7 TiB [0,1,3,6,8,10,11,12,13,14,15,16,17,18,19,20,21,23] 36 11 10 62 GiB 1.7 TiB 62 GiB 1.7 TiB [0,3,4,5,6,7,8,9,11,14,15,16,17,18,19,20,21,22,23] 36 14 11 35 GiB 1.7 TiB 35 GiB 1.7 TiB [0,1,2,3,5,8,9,10,12,14,15,16,18,19,20,22,23] 37 14 12 58 GiB 1.7 TiB 58 GiB 1.7 TiB [0,1,3,4,5,6,7,8,9,11,13,14,15,17,18,19,20,21,23] 35 13 14 56 GiB 1.7 TiB 56 GiB 1.7 TiB [1,2,4,5,6,7,8,9,10,12,13,15,18,19,20,21,22,23] 34 15 sum 1.1 TiB 41 TiB 1.1 TiB 42 TiB

Why did this happen?

By default, Ceph might not create the optimal number of PGs for each pool, resulting in data skew and uneven utilization of storage devices. Manually setting the pg_num for each pool is not a sustainable solution, as data volume is expected to grow over time.

That mean
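For reference, the direction the article's title points to is letting Ceph size the placement groups itself. A hedged sketch of how the autoscaler is typically enabled with plain Ceph commands (Rook exposes equivalent settings through its pool custom resources):

# Enable the autoscaler for an existing pool and review its recommendations:
$ ceph osd pool set cinder.volumes.hdd pg_autoscale_mode on
$ ceph osd pool autoscale-status
# Make it the default for newly created pools:
$ ceph config set global osd_pool_default_pg_autoscale_mode on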

·cloudificationgmbh.blogspot.com·