
Week Ending August 24, 2025
https://lwkd.info/2025/20250827
Developer News
Kubernetes 1.34 is released! This version, named “Of Wind & Will”, includes DRA GA, KYAML spec, structured authentication config, better watch cache initialization, and much more.
Yuki Iwai is nominated as a new Working Group Batch lead, joining Marcin and Kevin, as Swati and Maciej step down. Raise any concerns before September 4, 2025.
Tim Hockin is stepping down as SIG Network co-chair and nominating Bowei Du as his replacement. He will remain a SIG Network Tech Lead. Lazy consensus closes on August 29, 2025.
Steering Committee Election
The Steering Committee election has started. This first stage is candidate nominations, to register potential new steering members. Have you considered working on the Steering Committee?
It is also time to verify that you are an eligible voter. If you are not, and should be, file a ballot exception.
Release Schedule
Next Deadline: Release day, 27 August
Kubernetes v1.34 is released.
A regression in kube-proxy v1.34.* that prevented startup on single-stack IPv4 or IPv6 hosts was identified and fixed ahead of the release cut. A huge thank you to all contributors, reviewers, and release team members whose efforts made this release possible!
The next scheduled patch releases are on September 9, 2025 (cherry pick deadline: September 5, 2025). As a reminder, Kubernetes 1.31 will enter maintenance mode on August 28, 2025, with End of Life (EOL) planned for October 28, 2025.
Featured PRs
133604: Fix storage counting all objects instead of objects for resource
This PR fixes a regression where apiserver_storage_objects was overcounted by counting all etcd objects (under /registry) instead of just the target resource (e.g., pods). It now counts only that resource's objects, giving accurate per-resource metrics and avoiding extra work when the watch cache is disabled.
KEP of the Week
KEP 24: Add AppArmor Support
This KEP introduces support for AppArmor within a cluster. AppArmor can enable users to run a more secure deployment and/or provide better auditing and monitoring of their systems. AppArmor support gives users an alternative to SELinux and provides an interface for those already maintaining a set of AppArmor profiles. This KEP proposes a minimal path to GA, per the no-perma-Beta requirement.
This KEP was released as Stable in 1.34
Other Merges
Count storage types accurately when filtering per type
Prevent data race around claimsToAllocate
Subprojects and Dependency Updates
cluster-api v1.11.0 adds support for Kubernetes v1.33 (management and workload clusters), introduces the v1beta2 API, and includes new providers (Scaleway, cdk8s)
kubespray v2.28.1 fixes etcd and kubeadm issues while improving Cilium, Hubble, and Calico networking stability
Shoutouts
Christian Schlotter (@chrischdi): Thanks to Fabrizio Pandini (@fabrizio.pandini) and Stefan Büringer (@sbueringer) for the huge amount of work they did for the latest cluster api :cluster-api: v1.11.0 release to set the stage for the v1beta2 api version, which benefits all users to have a more clear and consistent API as well as a better feedback loop!
via Last Week in Kubernetes Development https://lwkd.info/
August 27, 2025 at 05:50PM
Kubernetes v1.34: Of Wind & Will (O' WaW)
https://kubernetes.io/blog/2025/08/27/kubernetes-v1-34-release/
Editors: Agustina Barbetta, Alejandro Josue Leon Bellido, Graziano Casto, Melony Qin, Dipesh Rawat
Similar to previous releases, the release of Kubernetes v1.34 introduces new stable, beta, and alpha features. The consistent delivery of high-quality releases underscores the strength of our development cycle and the vibrant support from our community.
This release consists of 58 enhancements. Of those enhancements, 23 have graduated to Stable, 22 have entered Beta, and 13 have entered Alpha.
There are also some deprecations and removals in this release; make sure to read about those.
Release theme and logo
A release powered by the wind around us — and the will within us.
Every release cycle, we inherit winds that we don't really control — the state of our tooling, documentation, and the historical quirks of our project. Sometimes these winds fill our sails, sometimes they push us sideways or die down.
What keeps Kubernetes moving isn't the perfect winds, but the will of our sailors who adjust the sails, man the helm, chart the courses and keep the ship steady. The release happens not because conditions are always ideal, but because of the people who build it, the people who release it, and the bears ^, cats, dogs, wizards, and curious minds who keep Kubernetes sailing strong — no matter which way the wind blows.
This release, Of Wind & Will (O' WaW), honors the winds that have shaped us, and the will that propels us forward.
^ Oh, and you wonder why bears? Keep wondering!
Spotlight on key updates
Kubernetes v1.34 is packed with new features and improvements. Here are a few select updates the Release Team would like to highlight!
Stable: The core of DRA is GA
Dynamic Resource Allocation (DRA) enables more powerful ways to select, allocate, share, and configure GPUs, TPUs, NICs and other devices.
Since the v1.30 release, DRA has been based around claiming devices using structured parameters that are opaque to the core of Kubernetes. This enhancement took inspiration from dynamic provisioning for storage volumes. DRA with structured parameters relies on a set of supporting API kinds under resource.k8s.io: ResourceClaim, DeviceClass, ResourceClaimTemplate, and ResourceSlice, and it extends the Pod .spec with a new resourceClaims field.
The resource.k8s.io/v1 APIs have graduated to stable and are now available by default.
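As a rough sketch of how a workload consumes a claim (assuming a ResourceClaim named gpu-claim has already been created from a DeviceClass published by your device driver; all names and the image are illustrative), a Pod references the claim through the new resourceClaims field and each container opts in via resources.claims:

apiVersion: v1
kind: Pod
metadata:
  name: gpu-consumer
spec:
  resourceClaims:
  - name: gpu                      # local name used by the container below
    resourceClaimName: gpu-claim   # pre-existing resource.k8s.io/v1 ResourceClaim
  containers:
  - name: app
    image: registry.example.com/cuda-app:latest   # placeholder image
    resources:
      claims:
      - name: gpu                  # consume the device(s) allocated to that claim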
This work was done as part of KEP #4381 led by WG Device Management.
Beta: Projected ServiceAccount tokens for kubelet image credential providers
The kubelet credential providers, used for pulling private container images, traditionally relied on long-lived Secrets stored on the node or in the cluster. This approach increased security risks and management overhead, as these credentials were not tied to the specific workload and did not rotate automatically.
To solve this, the kubelet can now request short-lived, audience-bound ServiceAccount tokens for authenticating to container registries. This allows image pulls to be authorized based on the Pod's own identity rather than a node-level credential.
The primary benefit is a significant security improvement. It eliminates the need for long-lived Secrets for image pulls, reducing the attack surface and simplifying credential management for both administrators and developers.
This work was done as part of KEP #4412 led by SIG Auth and SIG Node.
Alpha: Support for KYAML, a Kubernetes dialect of YAML
KYAML aims to be a safer and less ambiguous subset of YAML, designed specifically for Kubernetes. Starting with v1.34, you can use KYAML as a new output format for kubectl, whatever version of Kubernetes your cluster runs.
KYAML addresses specific challenges with both YAML and JSON. YAML's significant whitespace requires careful attention to indentation and nesting, while its optional string-quoting can lead to unexpected type coercion (for example: "The Norway Bug"). Meanwhile, JSON lacks comment support and has strict requirements for trailing commas and quoted keys.
You can write KYAML and pass it as an input to any version of kubectl, because all KYAML files are also valid as YAML. With kubectl v1.34, you are also able to request KYAML output (as in kubectl get -o kyaml …) by setting environment variable KUBECTL_KYAML=true. If you prefer, you can still request the output in JSON or YAML format.
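For example (a sketch using the environment variable and output format described above; the deployment name and file name are placeholders):

# Ask kubectl v1.34+ to print KYAML instead of classic YAML or JSON
KUBECTL_KYAML=true kubectl get deployment my-app -o kyaml

# KYAML documents are valid YAML, so they can be fed back to any kubectl version
kubectl apply -f my-app.kyaml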
This work was done as part of KEP #5295 led by SIG CLI.
Features graduating to Stable
This is a selection of some of the improvements that are now stable following the v1.34 release.
Delayed creation of Job’s replacement Pods
By default, the Job controller creates replacement Pods as soon as a Pod starts terminating, so for a short time both Pods run simultaneously. This can cause resource contention in constrained clusters, where the replacement Pod may struggle to find available nodes until the original Pod fully terminates. It can also trigger unwanted cluster autoscaler scale-ups. Additionally, some machine learning frameworks like TensorFlow and JAX require only one Pod per index to run at a time, making simultaneous Pod execution problematic. This feature introduces .spec.podReplacementPolicy in Jobs. You may choose to create replacement Pods only when the Pod is fully terminated (has .status.phase: Failed). To do this, set .spec.podReplacementPolicy: Failed.
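A minimal sketch of a Job using this policy (the image and command are placeholders):

apiVersion: batch/v1
kind: Job
metadata:
  name: delayed-replacement
spec:
  podReplacementPolicy: Failed   # only create a replacement once the old Pod reaches phase Failed
  backoffLimit: 4
  template:
    spec:
      restartPolicy: Never
      containers:
      - name: worker
        image: busybox                            # placeholder image
        command: ["sh", "-c", "echo working; sleep 30"]   # placeholder workload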
Introduced as alpha in v1.28, this feature has graduated to stable in v1.34.
This work was done as part of KEP #3939 led by SIG Apps.
Recovery from volume expansion failure
This feature allows users to cancel volume expansions that are unsupported by the underlying storage provider, and retry volume expansion with smaller values that may succeed.
Introduced as alpha in v1.23, this feature has graduated to stable in v1.34.
This work was done as part of KEP #1790 led by SIG Storage.
VolumeAttributesClass for volume modification
VolumeAttributesClass has graduated to stable in v1.34. VolumeAttributesClass is a generic, Kubernetes-native API for modifying volume parameters like provisioned IO. It allows workloads to vertically scale their volumes on-line to balance cost and performance, if supported by their provider.
Like all new volume features in Kubernetes, this API is implemented via the container storage interface (CSI). Your provisioner-specific CSI driver must support the new ModifyVolume API which is the CSI side of this feature.
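As a sketch (the driver name and parameter keys below are hypothetical and depend entirely on your CSI driver), a VolumeAttributesClass and the PVC field that references it look roughly like this:

apiVersion: storage.k8s.io/v1
kind: VolumeAttributesClass
metadata:
  name: fast-io
driverName: csi.example.com        # hypothetical CSI driver; must support ModifyVolume
parameters:                        # driver-specific keys, shown for illustration only
  iops: "5000"
  throughput: "250"
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: data
spec:
  accessModes: ["ReadWriteOnce"]
  resources:
    requests:
      storage: 100Gi
  volumeAttributesClassName: fast-io   # change this value later to modify the volume on-line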
This work was done as part of KEP #3751 led by SIG Storage.
Structured authentication configuration
Kubernetes v1.29 introduced a configuration file format to manage API server client authentication, moving away from the previous reliance on a large set of command-line options. The AuthenticationConfiguration kind allows administrators to support multiple JWT authenticators, CEL expression validation, and dynamic reloading. This change significantly improves the manageability and auditability of the cluster's authentication settings - and has graduated to stable in v1.34.
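A trimmed sketch of such a configuration file (the issuer URL, audience, and claim values are placeholders), loaded by the kube-apiserver through its authentication configuration flag:

apiVersion: apiserver.config.k8s.io/v1
kind: AuthenticationConfiguration
jwt:
- issuer:
    url: https://issuer.example.com      # placeholder OIDC issuer
    audiences:
    - my-cluster
  claimMappings:
    username:
      claim: sub
      prefix: "oidc:"
  claimValidationRules:
  - expression: 'claims.hd == "example.com"'   # CEL-based validation example
    message: "token must come from the example.com org"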
This work was done as part of KEP #3331 led by SIG Auth.
Finer-grained authorization based on selectors
Kubernetes authorizers, including webhook authorizers and the built-in node authorizer, can now make authorization decisions based on field and label selectors in incoming requests. When you send list, watch or deletecollection requests with selectors, the authorization layer can now evaluate access with that additional context.
For example, you can write an authorization policy that only allows listing Pods bound to a specific .spec.nodeName. The client (perhaps the kubelet on a particular node) must specify the field selector that the policy requires, otherwise the request is forbidden. This change makes it feasible to set up least privilege rules, provided that the client knows how to conform to the restrictions you set. Kubernetes v1.34 now supports more granular control in environments like per-node isolation or custom multi-tenant setups.
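For instance, a client constrained by such a policy scopes its requests with the selector the policy requires (the node name here is illustrative):

# List only Pods bound to node-a; the authorization layer can now evaluate this selector
kubectl get pods --all-namespaces --field-selector spec.nodeName=node-a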
This work was done as part of KEP #4601 led by SIG Auth.
Restrict anonymous requests with fine-grained controls
Instead of fully enabling or disabling anonymous access, you can now configure a strict list of endpoints where unauthenticated requests are allowed. This provides a safer alternative for clusters that rely on anonymous access to health or bootstrap endpoints like /healthz, /readyz, or /livez.
With this feature, accidental RBAC misconfigurations that grant broad access to anonymous users can be avoided without requiring changes to external probes or bootstrapping tools.
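Configuration lives in the AuthenticationConfiguration file; a sketch that limits anonymous requests to the health endpoints mentioned above:

apiVersion: apiserver.config.k8s.io/v1
kind: AuthenticationConfiguration
anonymous:
  enabled: true
  conditions:          # anonymous requests are allowed only for these paths
  - path: /healthz
  - path: /readyz
  - path: /livez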
This work was done as part of KEP #4633 led by SIG Auth.
More efficient requeueing through plugin-specific callbacks
The kube-scheduler can now make more accurate decisions about when to retry scheduling Pods that were previously unschedulable. Each scheduling plugin can now register callback functions that tell the scheduler whether an incoming cluster event is likely to make a rejected Pod schedulable again.
This reduces unnecessary retries and improves overall scheduling throughput - especially in clusters using dynamic resource allocation. The feature also lets certain plugins skip the usual backoff delay when it is safe to do so, making scheduling faster in specific cases.
This work was done as part of KEP #4247 led by SIG Scheduling.
Ordered Namespace deletion
Semi-random resource deletion order can create security gaps or unintended behavior, such as Pods persisting after their associated NetworkPolicies are deleted.
This improvement introduces a more structured deletion process for Kubernetes namespaces to ensure secure and deterministic resource removal. By enforcing a structured deletion sequence that respects logical and security dependencies, this approach ensures Pods are removed before other resources.
This feature was introduced in Kubernetes v1.33 and graduated to stable in v1.34.
Ep33 - Ask Me Anything About Anything with Scott Rosenberg
There are no restrictions in this AMA session. You can ask anything about DevOps, AI, Cloud, Kubernetes, Platform Engineering, containers, or anything else. Scott Rosenberg, a regular guest, will be here to help us out.
▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬ Sponsor: Codefresh 🔗 GitOps Argo CD Certifications: https://learning.codefresh.io (use "viktor" for a 50% discount) ▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬
▬▬▬▬▬▬ 👋 Contact me 👋 ▬▬▬▬▬▬ ➡ BlueSky: https://vfarcic.bsky.social ➡ LinkedIn: https://www.linkedin.com/in/viktorfarcic/
▬▬▬▬▬▬ 🚀 Other Channels 🚀 ▬▬▬▬▬▬ 🎤 Podcast: https://www.devopsparadox.com/ 💬 Live streams: https://www.youtube.com/c/DevOpsParadox
via YouTube https://www.youtube.com/watch?v=7_-PoHIWVl4
Teaching Kubernetes to Scale with a MacBook Screen Lock, with Brian Donelan
Brian Donelan, VP Cloud Platform Engineering at JPMorgan Chase, shares his ingenious side project that automatically scales Kubernetes workloads based on whether his MacBook is open or closed.
By connecting macOS screen lock events to CloudWatch, KEDA, and Karpenter, he built a system that achieves 80% cost savings by scaling pods and nodes to zero when he's away from his laptop.
You will learn:
How KEDA differs from traditional Kubernetes HPA - including its scale-to-zero capabilities, event-driven scaling, and extensive ecosystem of 60+ built-in scalers
The technical architecture connecting macOS notifications through CloudWatch to trigger Kubernetes autoscaling using Swift, AWS SDKs, and custom metrics
Cost optimization strategies including how to calculate actual savings, account for API costs, and identify leading indicators of compute demand
Creative approaches to autoscaling signals beyond CPU and memory, including examples from financial services and e-commerce that could revolutionize workload management
Sponsor
This episode is brought to you by Testkube—the ultimate Continuous Testing Platform for Cloud Native applications. Scale fast, test continuously, and ship confidently. Check it out at testkube.io
More info
Find all the links and info for this episode here: https://ku.bz/sFd8TL1cS
Interested in sponsoring an episode? Learn more.
via KubeFM https://kube.fm
August 26, 2025 at 06:00AM
Stop Wasting Time: Turn AI Prompts Into Production Code
Spent three hours writing the perfect AI prompt only to watch it fail spectacularly? You're not alone. The problem isn't bad AI – it's that most developers treat prompts like throwaway commands instead of production code. This video reveals why context is everything in AI development, walking you through the evolution of a real prompt from 5 words to 500, and showing how proper prompt engineering can transform your team's productivity.
But here's the kicker: even perfect prompts are useless if your team can't share them effectively. I'll demonstrate how to turn your carefully crafted prompts into a shared asset using the Model Context Protocol (MCP), creating a system that evolves with your team and deploys like any other code. By the end, you'll understand why prompt management – not smarter models – is the real future of AI development, and you'll have the tools to build that future for your organization.
#AIPrompts #MCPProtocol #DevOpsAI
Consider joining the channel: https://www.youtube.com/c/devopstoolkit/join
▬▬▬▬▬▬ 🔗 Additional Info 🔗 ▬▬▬▬▬▬ ➡ Transcript and commands: https://devopstoolkit.live/ai/stop-wasting-time-turn-ai-prompts-into-production-code 🔗 Model Context Protocol: https://modelcontextprotocol.io
▬▬▬▬▬▬ 💰 Sponsorships 💰 ▬▬▬▬▬▬ If you are interested in sponsoring this channel, please visit https://devopstoolkit.live/sponsor for more information. Alternatively, feel free to contact me over Twitter or LinkedIn (see below).
▬▬▬▬▬▬ 👋 Contact me 👋 ▬▬▬▬▬▬ ➡ BlueSky: https://vfarcic.bsky.social ➡ LinkedIn: https://www.linkedin.com/in/viktorfarcic/
▬▬▬▬▬▬ 🚀 Other Channels 🚀 ▬▬▬▬▬▬ 🎤 Podcast: https://www.devopsparadox.com/ 💬 Live streams: https://www.youtube.com/c/DevOpsParadox
▬▬▬▬▬▬ ⏱ Timecodes ⏱ ▬▬▬▬▬▬ 00:00 Introduction to AI Context and MCPs 01:23 AI Context Management Explained 05:44 Prompt Engineering Best Practices 09:40 Sharing AI Prompts Across Teams 13:25 MCP for Prompt Distribution 16:03 Prompt Management Key Takeaways
via YouTube https://www.youtube.com/watch?v=XwWCFINXIoU
Week Ending August 17, 2025
https://lwkd.info/2025/20250820
Developer News
A medium-severity vulnerability (CVE-2025-5187, CVSS 6.7) affects Kubernetes clusters using the NodeRestriction admission controller without OwnerReferencesPermissionEnforcement. It allows a compromised node to delete its own Node object by patching OwnerReferences, then recreate it with altered taints or labels, bypassing normal delete restrictions. Update to the latest patch release (1.33.4, 1.32.8, or 1.31.12) to close this security hole.
Release Schedule
Next Deadline: Release day, 27 August
We are in the final week before releasing 1.34. Make sure to respond quickly to any blocker issues or test failures your SIG is tagged on.
Patch releases 1.33.4, 1.32.8, and 1.31.12 were published this week, built with Go 1.24.5 and 1.23.11 respectively. These patch releases primarily address an exploitable security hole, so admins should update at the next available downtime. Kubernetes 1.31 enters maintenance mode on Aug 28, 2025; the End of Life date for Kubernetes 1.31 is Oct 28, 2025.
Featured PRs
133409: Make podcertificaterequestcleaner role feature-gated
This PR restricts the creation of RBAC permissions for the podcertificaterequestcleaner controller behind a feature gate. The ClusterRole and ClusterRoleBinding for this controller are now only created when the related feature is enabled. This reduces unnecessary permissions in clusters where the controller is not in use and supports a more secure, minimal RBAC configuration by avoiding unused roles.
KEP of the Week
KEP 2340: Consistent Reads from Cache
This KEP introduces a mechanism to serve most reads from the watch cache while maintaining the same consistency guarantees as serving reads from etcd. Previously, the Get and List requests were guaranteed to be Consistent reads and were served from etcd using a “quorum read”. Serving reads from the watch cache is more performant and scalable than reading them from etcd, deserializing them, applying selectors, converting them to the desired version, and then garbage collecting all the objects that were allocated during the whole process.
This KEP is tracked for Stable in 1.34
Other Merges
Prevent data race around claimsToAllocate
Clarify staging repository READMEs
Version Updates
Bumped Go Version to 1.23.12 for publishing bot rules.
Bumped dependencies and images to Go 1.24.6 and distroless iptables
Subprojects and Dependency Updates
Ingress-NGINX v1.13.1 updates NGINX to v2.2.1, Go to v1.24.6, and includes bug fixes and improvements; Helm Chart v4.13.1 adds helm-test target and includes the updated controller
Shoutouts
Want to thank someone in the community? Drop a note in #shoutouts on Slack.
via Last Week in Kubernetes Development https://lwkd.info/
August 20, 2025 at 06:00PM
Tuning Linux Swap for Kubernetes: A Deep Dive
https://kubernetes.io/blog/2025/08/19/tuning-linux-swap-for-kubernetes-a-deep-dive/
The Kubernetes NodeSwap feature, likely to graduate to stable in the upcoming Kubernetes v1.34 release, allows swap usage: a significant shift from the conventional practice of disabling swap for performance predictability. This article focuses exclusively on tuning swap on Linux nodes, where this feature is available. By allowing Linux nodes to use secondary storage for additional virtual memory when physical RAM is exhausted, node swap support aims to improve resource utilization and reduce out-of-memory (OOM) kills.
However, enabling swap is not a "turn-key" solution. The performance and stability of your nodes under memory pressure depend critically on a set of Linux kernel parameters. Misconfiguration can lead to performance degradation and interfere with the kubelet's eviction logic.
In this blog post, I'll dive into the critical Linux kernel parameters that govern swap behavior. I will explore how these parameters influence Kubernetes workload performance, swap utilization, and crucial eviction mechanisms. I will present various test results showcasing the impact of different configurations, and share my findings on achieving optimal settings for stable and high-performing Kubernetes clusters.
Introduction to Linux swap
At a high level, the Linux kernel manages memory through pages, typically 4KiB in size. When physical memory becomes constrained, the kernel's page replacement algorithm decides which pages to move to swap space. While the exact logic is a sophisticated optimization, this decision-making process is influenced by certain key factors:
Page access patterns (how recently pages are accessed)
Page dirtiness (whether pages have been modified)
Memory pressure (how urgently the system needs free memory)
Anonymous vs File-backed memory
It is important to understand that not all memory pages are the same. The kernel distinguishes between anonymous and file-backed memory.
Anonymous memory: This is memory that is not backed by a specific file on the disk, such as a program's heap and stack. From the application's perspective this is private memory, and when the kernel needs to reclaim these pages, it must write them to a dedicated swap device.
File-backed memory: This memory is backed by a file on a filesystem. This includes a program's executable code, shared libraries, and filesystem caches. When the kernel needs to reclaim these pages, it can simply discard them if they have not been modified ("clean"). If a page has been modified ("dirty"), the kernel must first write the changes back to the file before it can be discarded.
While a system without swap can still reclaim clean file-backed pages under pressure by dropping them, it has no way to offload anonymous memory. Enabling swap provides this capability, allowing the kernel to move less-frequently accessed pages to disk, conserving memory and avoiding system-wide OOM kills.
Key kernel parameters for swap tuning
To effectively tune swap behavior, Linux provides several kernel parameters that can be managed via sysctl.
vm.swappiness: This is the most well-known parameter. It is a value from 0 to 200 (100 in older kernels) that controls the kernel's preference for swapping anonymous memory pages versus reclaiming file-backed memory pages (page cache).
High value (e.g. 90+): The kernel will aggressively swap out less-used anonymous memory to make room for file cache.
Low value (e.g. < 10): The kernel will strongly prefer dropping file-cache pages over swapping anonymous memory.
vm.min_free_kbytes: This parameter tells the kernel to keep a minimum amount of memory free as a buffer. When the amount of free memory drops below this safety buffer, the kernel starts reclaiming pages more aggressively (swapping, and eventually handling OOM kills).
Function: It acts as a safety lever to ensure the kernel has enough memory for critical allocation requests that cannot be deferred.
Impact on swap: Setting a higher min_free_kbytes effectively raises the floor for free memory, causing the kernel to initiate swapping earlier under memory pressure.
vm.watermark_scale_factor: This setting controls the gap between different watermarks: min, low and high, which are calculated based on min_free_kbytes.
Watermarks explained:
low: When free memory is below this mark, the kswapd kernel process wakes up to reclaim pages in the background. This is when a swapping cycle begins.
min: When free memory hits this minimum level, aggressive page reclamation blocks process allocations. Failing to reclaim pages at this point leads to OOM kills.
high: Memory reclamation stops once the free memory reaches this level.
Impact: A higher watermark_scale_factor creates a larger buffer between the low and min watermarks. This gives kswapd more time to reclaim memory gradually before the system hits a critical state.
In a typical server workload, you might have a long-running process whose memory gradually becomes 'cold'. A higher swappiness value can free up RAM for other active processes by swapping out that cold memory, letting those processes keep the file cache they benefit from.
Tuning the min_free_kbytes and watermark_scale_factor parameters to open the swapping window earlier gives kswapd more room to offload memory to disk and helps prevent OOM kills during sudden memory spikes.
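As a concrete starting point (a sketch only; the values echo the tuning discussed later in this post and are not universal recommendations), these parameters can be set with sysctl:

# Apply tuned values at runtime; persist them under /etc/sysctl.d/ to survive reboots
sysctl -w vm.swappiness=60                  # kernel default; raise or lower per the trade-offs above
sysctl -w vm.min_free_kbytes=524288         # 512 MiB safety buffer, start reclaim earlier
sysctl -w vm.watermark_scale_factor=2000    # widen the watermark gap so kswapd has more room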
Swap tests and results
To understand the real impact of these parameters, I designed a series of stress tests.
Test setup
Environment: GKE on Google Cloud
Kubernetes version: 1.33.2
Node configuration: n2-standard-2 (8GiB RAM, 50GB swap on a pd-balanced disk, without encryption), Ubuntu 22.04
Workload: A custom Go application designed to allocate memory at a configurable rate, generate file-cache pressure, and simulate different memory access patterns (random vs sequential).
Monitoring: A sidecar container capturing system metrics every second.
Protection: Critical system components (kubelet, container runtime, sshd) were prevented from swapping by setting memory.swap.max=0 in their respective cgroups.
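One way to apply that protection (a sketch, assuming the kubelet runs as a systemd service; systemd's MemorySwapMax= maps to the cgroup v2 memory.swap.max setting):

mkdir -p /etc/systemd/system/kubelet.service.d
cat <<'EOF' >/etc/systemd/system/kubelet.service.d/99-no-swap.conf
[Service]
MemorySwapMax=0
EOF
# Reload unit files and restart so the kubelet's cgroup gets memory.swap.max=0
systemctl daemon-reload && systemctl restart kubelet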
Test methodology
I ran a stress-test pod on nodes with different swappiness settings (0, 60, and 90) and varied the min_free_kbytes and watermark_scale_factor parameters to observe the outcomes under heavy memory allocation and I/O pressure.
Visualizing swap in action
The graph below, from a 100MBps stress test, shows swap in action. As free memory (in the "Memory Usage" plot) decreases, swap usage (Swap Used (GiB)) and swap-out activity (Swap Out (MiB/s)) increase. Critically, as the system relies more on swap, I/O activity and the corresponding wait time (IO Wait % in the "CPU Usage" plot) also rise, indicating CPU stress.
Findings
My initial tests with default kernel parameters (swappiness=60, min_free_kbytes=68MB, watermark_scale_factor=10) quickly led to OOM kills and even unexpected node restarts under high memory pressure. By selecting appropriate kernel parameters, a good balance of node stability and performance can be achieved.
The impact of swappiness
The swappiness parameter directly influences the kernel's choice between reclaiming anonymous memory (swapping) and dropping page cache. To observe this, I ran a test where one pod generated and held file-cache pressure, followed by a second pod allocating anonymous memory at 100MB/s, and watched which kind of memory the kernel preferred to reclaim:
My findings reveal a clear trade-off:
swappiness=90: The kernel proactively swapped out the inactive anonymous memory to keep the file cache. This resulted in high and sustained swap usage and significant I/O activity ("Blocks Out"), which in turn caused spikes in I/O wait on the CPU.
swappiness=0: The kernel favored dropping file-cache pages, delaying swap consumption. However, it's critical to understand that this does not disable swapping. When memory pressure was high, the kernel still swapped anonymous memory to disk.
The choice is workload-dependent. For workloads sensitive to I/O latency, a lower swappiness is preferable. For workloads that rely on a large and frequently accessed file cache, a higher swappiness may be beneficial, provided the underlying disk is fast enough to handle the load.
Tuning watermarks to prevent eviction and OOM kills
The most critical challenge I encountered was the interaction between rapid memory allocation and Kubelet's eviction mechanism. When my test pod, which was deliberately configured to overcommit memory, allocated it at a high rate (e.g., 300-500 MBps), the system quickly ran out of free memory.
With default watermarks, the buffer for reclamation was too small. Before kswapd could free up enough memory by swapping, the node would hit a critical state, leading to two potential outcomes:
Kubelet eviction: If the kubelet's eviction manager detected that memory.available was below its threshold, it would evict the pod.
OOM killer: In some high-rate scenarios, the OOM killer would activate before eviction could complete, sometimes killing higher-priority pods that were not the source of the pressure.
To mitigate this I tuned the watermarks:
Increased min_free_kbytes to 512MiB: This forces the kernel to start reclaiming memory much earlier, providing a larger safety buffer.
Increased watermark_scale_factor to 2000: This widened the gap between the low and high watermarks (from ≈337MB to ≈591MB in my test node's /proc/zoneinfo), effectively increasing the swapping window.
This combination gave kswapd a larger operational zone and more time to swap pages to disk during memory spikes, successfully preventing both premature evictions and OOM kills in my test runs.
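To verify the effect of such tuning, the per-zone watermarks can be read directly from /proc/zoneinfo (a quick sketch):

# Print the min/low/high watermark pages for each zone
awk '/^Node/ {zone=$0} /^ +(min|low|high) / {print zone, $1, $2}' /proc/zoneinfo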
Table (not fully reproduced here): watermark levels from /proc/zoneinfo (non-NUMA node), comparing min_free_kbytes=67584KiB with watermark_scale_factor=10 against min_free_kbytes=524288KiB with watermark_scale_factor=2000.
Building a Carbon and Price-Aware Kubernetes Scheduler, with Dave Masselink
Data centers consume over 4% of global electricity and this number is projected to triple in the next few years due to AI workloads.
Dave Masselink, founder of Compute Gardener, discusses how he built a Kubernetes scheduler that makes scheduling decisions based on real-time carbon intensity data from power grids.
You will learn:
How carbon-aware scheduling works - Using real-time grid data to shift workloads to periods when electricity generation has lower carbon intensity, without changing energy consumption
Technical implementation details - Building custom Kubernetes schedulers using the scheduler plugin framework, including pre-filter and filter stages for carbon and time-of-use pricing optimization
Energy measurement strategies - Approaches for tracking power consumption across CPUs, memory, and GPUs
Sponsor
This episode is brought to you by Testkube—the ultimate Continuous Testing Platform for Cloud Native applications. Scale fast, test continuously, and ship confidently. Check it out at testkube.io
More info
Find all the links and info for this episode here: https://ku.bz/zk2xM1lfW
Interested in sponsoring an episode? Learn more.
via KubeFM https://kube.fm
August 19, 2025 at 06:00AM
AI Will Replace Coders - But Not the Way You Think
After three decades in tech, I've never seen developers this terrified, and for good reason. AI can already write code faster than us, and it's rapidly approaching the point where it might write better code too. But here's what's driving me crazy: everyone is panicking about the wrong thing. They're worried AI will steal their jobs because it can code, which is like a chef fearing unemployment because someone invented a better knife.
Your real value was never in typing syntax or executing commands; that's just the mechanical stuff that happens after all the important thinking is done. The developers who will thrive aren't trying to out-code AI; they're the architects, problem-solvers, and domain experts who understand what needs to be built and why. Your deep knowledge of your industry, your business context, and the messy realities of how things actually work? That's your moat. AI doesn't know why your healthcare platform needs that weird HIPAA workaround, or why your e-commerce flow accommodates that legacy client system. Stop being a code monkey and start being the expert AI needs to not screw everything up. The choice is yours, but the clock is ticking.
▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬ Sponsor: Readdy 🔗 https://readdy.ai ▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬
#AIandDevelopers #FutureOfCoding #TechCareerAdvice
Consider joining the channel: https://www.youtube.com/c/devopstoolkit/join
▬▬▬▬▬▬ 🔗 Additional Info 🔗 ▬▬▬▬▬▬ ➡ Transcript and commands: https://devopstoolkit.live/ai/ai-will-replace-coders---but-not-the-way-you-think
▬▬▬▬▬▬ 💰 Sponsorships 💰 ▬▬▬▬▬▬ If you are interested in sponsoring this channel, please visit https://devopstoolkit.live/sponsor for more information. Alternatively, feel free to contact me over Twitter or LinkedIn (see below).
▬▬▬▬▬▬ 👋 Contact me 👋 ▬▬▬▬▬▬ ➡ BlueSky: https://vfarcic.bsky.social ➡ LinkedIn: https://www.linkedin.com/in/viktorfarcic/
▬▬▬▬▬▬ 🚀 Other Channels 🚀 ▬▬▬▬▬▬ 🎤 Podcast: https://www.devopsparadox.com/ 💬 Live streams: https://www.youtube.com/c/DevOpsParadox
▬▬▬▬▬▬ ⏱ Timecodes ⏱ ▬▬▬▬▬▬ 00:00 Introduction to Coding with AI 01:24 Sponsor (Readdy) 02:51 The Fear: AI Replacing Developers 06:50 The Truth: What Developers Really Do 13:54 The Secret Weapon: Your Domain Knowledge is Your Moat 19:16 The Adaptation: Thriving with AI
via YouTube https://www.youtube.com/watch?v=qBp8d6yBPPg