
1_r/devopsish
Ep23 - Ask Me Anything About Anything with Esmira Bayramova
There are no restrictions in this AMA session. You can ask anything about DevOps, Cloud, Kubernetes, Platform Engineering, containers, or anything else. We'll have special guest Esmira Bayramova to help us out.
▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬ Sponsor: Codefresh 🔗 GitOps Argo CD Certifications: https://learning.codefresh.io (use "viktor" for a 50% discount) ▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬
▬▬▬▬▬▬ 👋 Contact me 👋 ▬▬▬▬▬▬ ➡ BlueSky: https://vfarcic.bsky.social ➡ LinkedIn: https://www.linkedin.com/in/viktorfarcic/
▬▬▬▬▬▬ 🚀 Other Channels 🚀 ▬▬▬▬▬▬ 🎤 Podcast: https://www.devopsparadox.com/ 💬 Live streams: https://www.youtube.com/c/DevOpsParadox
via YouTube https://www.youtube.com/watch?v=b5pN35kcOkk
Week Ending May 25, 2025
https://lwkd.info/2025/20250527
Developer News
The Program Committee is now accepting applications for the Maintainer Summit North America 2025. Share your interest in joining the committee before Monday, July 7th.
Release Schedule
Next Deadline: PRR Freeze, June 12th
The Release Cycle for 1.34 has started, and the release team is actively collecting enhancements. SIG Leads should discuss enhancements and add the lead-opted-in label for KEPs going into v1.34.
Featured PRs
131842: Add metrics for compatibility version
This PR adds alpha metrics for the binary, emulation, and minimum compatibility versions tracked in componentGlobalsRegistry. A new AddMetrics method publishes these three versions for each component as Prometheus gauge metrics, so users can monitor version negotiation for kube-apiserver, scheduler, and controller-manager.
128748: feat: introduce pInfo.UnschedulableCount to make the backoff calculation more appropriate
This PR updates the scheduler to separate scheduling failures caused by plugin rejections from those caused by internal errors. It introduces UnschedulableCount to track only plugin-based rejections, ensuring that transient errors like API failures or network issues do not increase backoff time unfairly. This change improves scheduling fairness and responsiveness under cluster instability.
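The separation described above can be sketched as follows. This is a minimal illustration, not the scheduler's code: the QueuedPodInfo class, the field names, and the 1-second and 10-second defaults are assumptions made for the example. Treating a maximum of zero as "no backoff" mirrors the PodMaxBackoffDuration item listed under Other Merges.

```python
from dataclasses import dataclass

# Assumed values for illustration; real values come from scheduler configuration.
INITIAL_BACKOFF_SECONDS = 1.0
MAX_BACKOFF_SECONDS = 10.0


@dataclass
class QueuedPodInfo:
    """Hypothetical stand-in for the scheduler's per-pod queue record."""
    attempts: int = 0             # every scheduling attempt, whatever the outcome
    unschedulable_count: int = 0  # only attempts rejected by a plugin


def record_failure(pod: QueuedPodInfo, rejected_by_plugin: bool) -> None:
    """Count the attempt; bump unschedulable_count only for plugin rejections."""
    pod.attempts += 1
    if rejected_by_plugin:
        pod.unschedulable_count += 1


def backoff_seconds(pod: QueuedPodInfo,
                    initial: float = INITIAL_BACKOFF_SECONDS,
                    maximum: float = MAX_BACKOFF_SECONDS) -> float:
    """Exponential backoff driven by plugin rejections only."""
    if maximum == 0 or pod.unschedulable_count == 0:
        return 0.0
    # Closed form instead of a doubling loop; the exponent is capped so the
    # intermediate value stays small enough to convert to a float.
    exponent = min(pod.unschedulable_count - 1, 62)
    return min(initial * 2 ** exponent, maximum)
```

With this shape, a pod that fails three times because of API or network errors still has no backoff, while three plugin rejections yield four seconds under the assumed defaults.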
129983: feature(scheduler): Customizable pod selection and ordering in DefaultPreemption plugin
This PR introduces support for customizing pod selection and ordering in the DefaultPreemption plugin. It adds optional EligiblePods and OrderedPods function hooks, allowing scheduler integrations to override the default behavior without reimplementing the plugin. This enables more flexible preemption strategies while maintaining the existing plugin interface.
This PR adds support for EncryptionAlgorithmECDSAP384 in the kubeadm API types. Users can now choose ECDSA-P384 for generating PKI assets such as CA and component certificates during kubeadm init. The key generation logic for ECDSA P-384 keys is implemented in pkiutil (using elliptic.P384()), so the algorithm is handled correctly across pkiutil and cluster configuration paths.
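The idea of optional hooks with built-in fallbacks can be sketched like this. Everything here is hypothetical: the Pod type, the hook signatures, and the assumed defaults (evict only strictly lower-priority pods, lowest priority first) are stand-ins chosen for illustration, since the text does not show the plugin's real types.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass(frozen=True)
class Pod:
    name: str
    priority: int


# Hypothetical hook signatures, not the plugin's actual API.
EligibleFn = Callable[[Pod, Pod], bool]     # (preemptor, candidate) -> may evict?
OrderFn = Callable[[List[Pod]], List[Pod]]  # victims -> victims in eviction order


def default_eligible(preemptor: Pod, candidate: Pod) -> bool:
    """Assumed default: only strictly lower-priority pods may be evicted."""
    return candidate.priority < preemptor.priority


def default_order(victims: List[Pod]) -> List[Pod]:
    """Assumed default: evict the lowest-priority pods first."""
    return sorted(victims, key=lambda p: p.priority)


def select_victims(preemptor: Pod, running: List[Pod],
                   eligible: Optional[EligibleFn] = None,
                   order: Optional[OrderFn] = None) -> List[Pod]:
    """Use the caller's hooks when given, otherwise fall back to the defaults."""
    eligible = eligible or default_eligible
    order = order or default_order
    return order([p for p in running if eligible(preemptor, p)])
```

An integration that wants to protect certain pods would pass only an eligible hook and keep the default ordering, which is the point of making each hook optional.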
KEP of the Week
KEP 4369: Allow almost all printable ASCII characters in environment variables
This enhancement allowed all printable ASCII characters (with ASCII codes 32–126), except "=", to be used in environment variable names. Previously, Kubernetes imposed restrictions that could prevent certain applications from functioning as intended, especially when users couldn’t control the variable names. By lifting these constraints, the change improved compatibility with a broader range of applications and removed an adoption barrier, aligning Kubernetes behaviour more closely with real-world usage patterns.
This KEP is tracked for beta in v1.34.
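The relaxed rule can be expressed as a short check. This is a sketch of the rule as stated above, not the validation code in Kubernetes; the function name is made up, and rejecting the empty string is an assumption.

```python
def is_valid_env_var_name(name: str) -> bool:
    """Return True if name uses only printable ASCII (codes 32-126) except '='."""
    if not name:
        # Assumption: an empty name is never valid.
        return False
    return all(32 <= ord(ch) <= 126 and ch != "=" for ch in name)


if __name__ == "__main__":
    for candidate in ["MY.VAR-1", "with space", "A=B", "tab\there"]:
        print(repr(candidate), is_valid_env_var_name(candidate))
```

Names containing dots, dashes, or spaces pass under this rule, while "=", control characters, and non-ASCII characters are rejected.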
Other Merges
automatic_reloads of authz config metrics to beta
Pod backoff to be completely skipped when PodMaxBackoffDuration kube-scheduler option is set to zero
Shorthand for --output flag in kubectl explain which was accidentally deleted has been added back
Kubernetes is now built using Go 1.24.3
References to group resource in metrics unified
e2e: Shadowed error fixed in reboot test
Filter integration tests added for NodeAffinity plugin
AuthenticationConfiguration type has been promoted to apiserver.config.k8s.io/v1
Volumes on nodes to not be expanded if controller expansion is finished
Promotions
QueueingHint to GA
kuberc to beta
Version Updates
system-validators to v1.10.1
etcd to v3.6.0
Go for publishing bot rules to 1.23.9
Subprojects and Dependency Updates
minikube v1.36.0 delivers significantly faster vfkit networking on macOS with the --network vmnet-shared option, supports Kubernetes v1.33.1, enables addon configuration via a dedicated config file, and includes additional improvements.
vertical-pod-autoscaler v1.4.0 is out, with alpha support for in-place pod resource updates via the InPlaceOrRecreate Feature Gate, improved resource tracking from pod status, options for global maximum resource limits, and a set of bug fixes and dependency updates.
kubespray v2.28.0 is out with a bunch of version updates. Krew installation support is removed.
Shoutouts
No shoutouts this week. Want to thank someone for special efforts to improve Kubernetes? Tag them in the #shoutouts channel.
via Last Week in Kubernetes Development https://lwkd.info/
May 27, 2025 at 06:10PM
Performance testing Kubernetes workloads, with Stephan Schwarz
If you're tasked with performance testing Kubernetes workloads without much guidance, this episode offers clear, experience-based strategies that go beyond theory.
Stephan Schwarz, a DevOps engineer at iits-consulting, walks through his systematic approach to performance testing Kubernetes applications. He covers everything from defining what performance actually means, to the practical methodology of breaking individual pods to understand their limits, and navigating the complexities of Kubernetes-specific components that affect test results.
You will learn:
How to establish baseline performance metrics by systematically testing individual pods, disabling autoscaling features, and documenting each incremental change to understand real application limits
Why shared Kubernetes components skew results and how ingress controllers, service meshes, and monitoring stacks create testing challenges that require careful consideration of the entire request chain
Practical approaches to HPA configuration, including how to account for scaling latency, the time delays inherent in Kubernetes scaling operations, and planning for spare capacity based on your SLA requirements
The role of observability tools like OpenTelemetry in production environments where load testing isn't feasible, and how distributed tracing helps isolate performance bottlenecks across interdependent services
Sponsor
This episode is sponsored by Learnk8s — get started on your Kubernetes journey through comprehensive online, in-person or remote training.
More info
Find all the links and info for this episode here: https://ku.bz/yY-FnmGfH
Interested in sponsoring an episode? Learn more.
via KubeFM https://kube.fm
May 27, 2025 at 06:00AM
The Missing Link: How MCP Servers Supercharge Your AI Coding Assistant
Discover the power of Model Context Protocol (MCP) for AI-assisted software engineering! This video explores how MCP enhances Large Language Models and AI agents by providing crucial context. Learn about two essential MCP servers: Memory and Context7. See how they improve AI's ability to understand project specifics, retain information, and access up-to-date documentation. Witness practical demonstrations using Cursor, and learn how to integrate MCP servers into your workflow. Elevate your AI-assisted coding experience with MCP!
▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬ Sponsor: Stacklok Toolhive 🔗 https://github.com/stacklok/toolhive ▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬
#AIForDevelopers, #ModelContextProtocol, #LLMEnhancements
Consider joining the channel: https://www.youtube.com/c/devopstoolkit/join
▬▬▬▬▬▬ 🔗 Additional Info 🔗 ▬▬▬▬▬▬ ➡ Transcript and commands: https://devopstoolkit.live/ai/the-missing-link-how-mcp-servers-supercharge-your-ai-coding-assistant 🔗 Model Context Protocol: https://github.com/modelcontextprotocol 🎬 Outdated AI Responses? Context7 Solves LLMs' Biggest Flaw: https://youtu.be/F0MLnVgk4as
▬▬▬▬▬▬ 💰 Sponsorships 💰 ▬▬▬▬▬▬ If you are interested in sponsoring this channel, please visit https://devopstoolkit.live/sponsor for more information. Alternatively, feel free to contact me over Twitter or LinkedIn (see below).
▬▬▬▬▬▬ 👋 Contact me 👋 ▬▬▬▬▬▬ ➡ BlueSky: https://vfarcic.bsky.social ➡ LinkedIn: https://www.linkedin.com/in/viktorfarcic/
▬▬▬▬▬▬ 🚀 Other Channels 🚀 ▬▬▬▬▬▬ 🎤 Podcast: https://www.devopsparadox.com/ 💬 Live streams: https://www.youtube.com/c/DevOpsParadox
▬▬▬▬▬▬ ⏱ Timecodes ⏱ ▬▬▬▬▬▬ 00:00 MCP Intro 00:50 What is Model Context Protocol (MCP)? 06:53 Memory and Context7 MCP Servers 07:49 Stacklok Toolhive (sponsor) 09:04 Memory and Context7 MCP Servers (cont.)
via YouTube https://www.youtube.com/watch?v=n0dCFY6wMeI
more filename tips (in the shell)
filename tips (in the shell)
Week Ending May 18, 2025
https://lwkd.info/2025/20250521
Developer News
James Sturtevant and Amim Knabben are stepping down from their roles as technical leads in SIG Windows, and Yuanliang Zhang is nominated as the new lead.
Wenjia Zhang has stepped down as the co-chair of Kubernetes SIG etcd. Siyuan Zhang is nominated to take over Wenjia’s role as the co-chair.
SIG Contributor Experience has updated the help-wanted guidelines to remove the “low barrier to entry” requirement. This improves the distinction between “good first issue” and “help-wanted” and better aligns with other open source projects. The help-wanted issues still require clear tasks, goldilocks priority and must be up-to-date.
Release Schedule
Next Deadline: v1.34 cycle starts May 19
The v1.34 release cycle has officially started this week, with a planned release date of 27th August.
Patch releases v1.33.1, 1.32.5, 1.31.9 and 1.30.13 are available. These are mostly bugfix releases, with a Go update.
Featured PRs
131299: DRA: prevent admin access claims from getting duplicate devices
This PR fixes a bug where ResourceClaims with adminAccess could be allocated the same device multiple times within a single claim. The DRA allocator now checks that each device is used only once per claim, preventing invalid CDI specs and ensuring correct behavior for device sharing with Dynamic Resource Allocation.
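The once-per-claim check can be illustrated with a small sketch. The data shapes and the function name are hypothetical, and the greedy first-fit strategy is a simplification chosen for brevity; it says nothing about how the real DRA allocator searches for devices.

```python
from typing import Dict, List, Set


def allocate_for_claim(requests: List[str],
                       candidates: Dict[str, List[str]]) -> Dict[str, str]:
    """Pick one device per request, never reusing a device within the claim.

    requests: request names in the claim (hypothetical shape).
    candidates: request name -> device IDs that could satisfy it, in
        preference order.
    """
    used: Set[str] = set()  # devices already handed out for this claim
    result: Dict[str, str] = {}
    for request in requests:
        # First candidate that this claim has not used yet, if any.
        device = next(
            (d for d in candidates.get(request, []) if d not in used), None
        )
        if device is None:
            raise ValueError(f"no unused device available for request {request!r}")
        used.add(device)
        result[request] = device
    return result
```

Tracking the used set per claim is what prevents two requests in the same claim from resolving to the same device.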
131345: scheduler: return UnschedulableAndUnresolvable when node capacity is insufficient
This PR updates the NodeResourcesFit plugin to return UnschedulableAndUnresolvable when a pod’s resource requests exceed a node’s allocatable capacity, even if the node is empty. This avoids unnecessary preemption attempts for nodes that can never satisfy the request, improves scheduling efficiency in large clusters, and provides clearer signals for unschedulable pods.
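The distinction between the two failure statuses can be sketched as below. The Status enum and the fits function are illustrative names, and resources are modeled as plain integer maps; the real plugin works on richer types.

```python
from enum import Enum
from typing import Dict


class Status(Enum):
    SUCCESS = "Success"
    UNSCHEDULABLE = "Unschedulable"
    UNSCHEDULABLE_AND_UNRESOLVABLE = "UnschedulableAndUnresolvable"


def fits(request: Dict[str, int],
         allocatable: Dict[str, int],
         in_use: Dict[str, int]) -> Status:
    """Classify whether a pod's request can be placed on a node."""
    # The request is larger than the whole node: evicting pods cannot help,
    # so preemption should not be attempted for this node.
    for resource, amount in request.items():
        if amount > allocatable.get(resource, 0):
            return Status.UNSCHEDULABLE_AND_UNRESOLVABLE
    # The request fits the node in principle, but not alongside the pods
    # currently running; preemption might free enough room.
    for resource, amount in request.items():
        free = allocatable.get(resource, 0) - in_use.get(resource, 0)
        if amount > free:
            return Status.UNSCHEDULABLE
    return Status.SUCCESS
```

A pod asking for 64 CPUs on a 32-CPU node gets the unresolvable status even when the node is empty, which is the case the PR targets.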
KEP of the Week
KEP 4247: Per-plugin callback functions for efficient requeueing in the scheduling queue
This KEP introduced the QueueingHint functionality to the Kubernetes scheduler, enabling plugins to provide more precise suggestions for when to requeue Pods. By filtering out low-impact events, such as unnecessary Node updates for NodeAffinity, the scheduler reduced redundant retries and improved scheduling throughput. The KEP also allowed plugins like the DRA plugin to skip backoff in specific cases, enhancing performance for Pods requiring dynamic resource allocation by avoiding unnecessary delays while waiting for device driver updates.
This KEP is tracked for beta in v1.34.
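A per-plugin requeueing callback of the kind described above might look like the following sketch. The Hint enum, the label-equality matching, and the function names are assumptions for illustration; real node affinity supports richer selectors, and the scheduler's callback signature is different.

```python
from enum import Enum
from typing import Dict, Optional


class Hint(Enum):
    QUEUE = "Queue"  # the event may make the pod schedulable; retry it
    SKIP = "Skip"    # the event cannot help; leave the pod where it is


def matches(required: Dict[str, str], labels: Dict[str, str]) -> bool:
    """Simplified affinity: every required label must be present and equal."""
    return all(labels.get(key) == value for key, value in required.items())


def node_affinity_hint(required: Dict[str, str],
                       old_labels: Optional[Dict[str, str]],
                       new_labels: Dict[str, str]) -> Hint:
    """Requeue only when a node starts matching the pod's affinity.

    old_labels is None when the event is a node being added.
    """
    matched_before = old_labels is not None and matches(required, old_labels)
    matches_now = matches(required, new_labels)
    return Hint.QUEUE if matches_now and not matched_before else Hint.SKIP
```

A node update that changes an unrelated label returns SKIP, so the pending pod is not retried for an event that could not have changed the outcome.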
Other Merges
e2e tests for kuberc added
Scheduler improved the backoff calculation to O(1)
Response body closed after http calls in watch test
Error message improved when a pod with user namespaces is created and the runtime doesn’t support user namespaces
DRA: Reject NodePrepareResources if the cached claim UID doesn’t match resource claim
suggestChangeEmulationVersion to clarify how to test a locked feature for emulation version
kubelet removed the deprecated --cloud-config flag
Non-scheduling related errors to not lengthen the Pod scheduling backoff time
kube-log-runner adds log rotation
Scheduler introduced pInfo.GatingPlugin to filter out events more generally
Subprojects and Dependency Updates
etcd v3.6.0 is out, bringing bugfixes and features such as robust downgrade support, full migration to the v3store backend, Kubernetes-style feature gates, major memory optimizations, and new health check endpoints for improved cluster monitoring.
Shoutouts
Josh Berkus (@jberkus): A big TY to Benjamin Wang (@Benjamin Wang) and Wenjia Zhang (@wenjiaswe) for getting Etcd 3.6 out the door, and to Tim Bannister (@LMKTFY), Ryota Sawada (@Ryota), Mario Fahlandt (@Mario Fahlandt) and Kaslin Fields (@kaslin) for helping promote it!
via Last Week in Kubernetes Development https://lwkd.info/
May 21, 2025 at 04:00PM
Managing 100s of Kubernetes Clusters using Cluster API, with Zain Malik
Discover how to manage Kubernetes at scale with declarative infrastructure and automation principles.
Zain Malik shares his experience managing multi-tenant Kubernetes clusters with up to 30,000 pods across clusters capped at 950 nodes. He explains how his team transitioned from Terraform to Cluster API for declarative cluster lifecycle management, contributing upstream to improve AKS support while implementing GitOps workflows.
You will learn:
How to address challenges in large-scale Kubernetes operations, including node pool management inconsistencies and lengthy provisioning times
Why Cluster API provides a powerful foundation for multi-cloud cluster management, and how to extend it with custom operators for production-specific needs
How implementing GitOps principles eliminates manual intervention in critical operations like cluster upgrades
Strategies for handling production incidents and bugs when adopting emerging technologies like Cluster API
More info
Find all the links and info for this episode here: https://ku.bz/5PLksqVlk
Interested in sponsoring an episode? Learn more.
via KubeFM https://kube.fm
May 20, 2025 at 06:00AM
Ep22 - Ask Me Anything About Anything with Scott Rosenberg
There are no restrictions in this AMA session. You can ask anything about DevOps, Cloud, Kubernetes, Platform Engineering, containers, or anything else. We'll have special guests Scott Rosenberg and Ramiro Berrelleza to help us out.
via YouTube https://www.youtube.com/watch?v=7brdKxUiB9s
Outdated AI Responses? Context7 Solves LLMs' Biggest Flaw
Discover the power of AI-enhanced coding with Context7! This video explores how to overcome outdated LLM information using Context7, an MCP server that provides up-to-date documentation. See how Context7 integrates with AI agents, improving their ability to provide current, reliable information for over 11000 projects. Boost your development workflow and stay ahead with cutting-edge tools and techniques.
▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬ Sponsor: Korbit AI 🔗 https://korbit.ai ▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬
#AIAgents #Context7 #AIDocs
Consider joining the channel: https://www.youtube.com/c/devopstoolkit/join
▬▬▬▬▬▬ 🔗 Additional Info 🔗 ▬▬▬▬▬▬ ➡ Transcript and commands: https://devopstoolkit.live/ai/outdated-ai-responses?-context7-solves-llms-biggest-flaw 🔗 Context7: https://context7.com
▬▬▬▬▬▬ ⏱ Timecodes ⏱ ▬▬▬▬▬▬ 00:00 The Problem with Models (LLMs) 01:07 Korbit AI (sponsor) 02:13 The Problem with Models (LLMs) (cont.) 02:23 Agents Using LLM Alone 04:19 Agents with Context7 MCP 07:04 What Is Context7?
via YouTube https://www.youtube.com/watch?v=DeZ-gw_aop0