1_r/devopsish
Homebrew - Flox Docs
Using Flox to replace or augment Homebrew
·flox.dev·
chris-short/rust-spitter: A universal code minifier that supports multiple programming languages. This tool finds all relevant source files in a directory, minifies them, and outputs them in a standardized format.
A universal code minifier that supports multiple programming languages. This tool finds all relevant source files in a directory, minifies them, and outputs them in a standardized format. - chris-s...
·github.com·
WordPress CEO Rage Quits Community Slack After Court Injunction
Automattic is ordered to undo several of the actions of its CEO Matt Mullenweg in its ongoing legal battle with WP Engine. “It's hard to imagine wanting to continue to working on WordPress after this," Mullenweg said in a community Slack message.
·404media.co·
Kubernetes v1.32: QueueingHint Brings a New Possibility to Optimize Pod Scheduling

https://kubernetes.io/blog/2024/12/12/scheduler-queueinghint/

The Kubernetes scheduler is the core component that selects the nodes on which new Pods run. The scheduler processes these new Pods one by one. Therefore, the larger your clusters, the more important the throughput of the scheduler becomes.

Over the years, Kubernetes SIG Scheduling has improved the throughput of the scheduler through multiple enhancements. This blog post describes a major improvement to the scheduler in Kubernetes v1.32: a scheduling context element named QueueingHint. This page provides background knowledge of the scheduler and explains how QueueingHint improves scheduling throughput.

Scheduling queue

The scheduler stores all unscheduled Pods in an internal component called the scheduling queue.

The scheduling queue consists of the following data structures (a simplified sketch in Go follows the list):

ActiveQ: holds newly created Pods or Pods that are ready to be retried for scheduling.

BackoffQ: holds Pods that are ready to be retried but are waiting for a backoff period to end. The backoff period depends on the number of unsuccessful scheduling attempts performed by the scheduler on that Pod.

Unschedulable Pod Pool: holds Pods that the scheduler won't attempt to schedule for one of the following reasons:

The scheduler previously attempted and was unable to schedule the Pods. Since that attempt, the cluster hasn't changed in a way that could make those Pods schedulable.

The Pods are blocked from entering the scheduling cycle by PreEnqueue plugins; for example, they have a scheduling gate and are blocked by the scheduling gate plugin.
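To make these components concrete, here is a minimal sketch in Go. The types are drastically simplified stand-ins for illustration, not the actual kube-scheduler internals:

```go
// Illustrative only: a simplified model of the scheduling queue's three
// components, not the real kube-scheduler types.
package main

import (
	"fmt"
	"time"
)

type PodInfo struct {
	Name     string
	Attempts int // unsuccessful scheduling attempts so far
}

type SchedulingQueue struct {
	activeQ       []PodInfo          // ready to be scheduled (or retried)
	backoffQ      []PodInfo          // retryable, but waiting out a backoff period
	unschedulable map[string]PodInfo // parked until a relevant cluster event occurs
}

// backoffFor grows the wait with the number of failed attempts, mirroring
// how the real backoff period depends on prior scheduling failures.
func backoffFor(p PodInfo) time.Duration {
	return time.Duration(p.Attempts) * time.Second
}

func main() {
	q := &SchedulingQueue{unschedulable: map[string]PodInfo{}}
	q.activeQ = append(q.activeQ, PodInfo{Name: "pod-a"})
	q.unschedulable["pod-b"] = PodInfo{Name: "pod-b", Attempts: 3}
	fmt.Println("activeQ:", len(q.activeQ), "| parked:", len(q.unschedulable))
	fmt.Println("pod-b would wait:", backoffFor(q.unschedulable["pod-b"]))
}
```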

Scheduling framework and plugins

The Kubernetes scheduler is implemented following the Kubernetes scheduling framework.

All scheduling features are implemented as plugins (for example, Pod affinity is implemented in the InterPodAffinity plugin).

The scheduler processes pending Pods in phases called cycles as follows:

Scheduling cycle: the scheduler takes pending Pods from the activeQ component of the scheduling queue one by one. For each Pod, the scheduler runs the filtering/scoring logic from every scheduling plugin. The scheduler then decides on the best node for the Pod, or decides that the Pod can't be scheduled at that time.

If the scheduler decides that a Pod can't be scheduled, that Pod enters the Unschedulable Pod Pool component of the scheduling queue. However, if the scheduler decides to place the Pod on a node, the Pod goes to the binding cycle.

Binding cycle: the scheduler communicates the node placement decision to the Kubernetes API server. This operation binds the Pod to the selected node.

Aside from some exceptions, most unscheduled Pods enter the unschedulable pod pool after each scheduling cycle. The Unschedulable Pod Pool component is crucial because of how the scheduling cycle processes Pods one by one. If the scheduler had to constantly retry placing unschedulable Pods, instead of offloading those Pods to the Unschedulable Pod Pool, multiple scheduling cycles would be wasted on those Pods.
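That two-cycle flow can be sketched as a simple loop. This is a toy model: filterAndScore and bind below are placeholders for the plugin framework and the API server call, not real kube-scheduler code:

```go
// Illustrative only: the shape of the scheduler's main loop.
package main

import "fmt"

type Pod struct{ Name string }

// filterAndScore stands in for running every plugin's filter/score logic;
// it returns the chosen node, or "" if the Pod can't be scheduled right now.
func filterAndScore(p Pod) string { return "" } // pretend no node fits

// bind stands in for the binding cycle's call to the API server.
func bind(p Pod, node string) { fmt.Printf("bound %s to %s\n", p.Name, node) }

func main() {
	activeQ := []Pod{{Name: "pod-a"}, {Name: "pod-b"}}
	var unschedulablePool []Pod

	for len(activeQ) > 0 {
		p := activeQ[0] // scheduling cycle: take pending Pods one by one
		activeQ = activeQ[1:]
		if node := filterAndScore(p); node != "" {
			bind(p, node) // binding cycle: persist the placement decision
		} else {
			// Park the Pod instead of retrying it on every cycle.
			unschedulablePool = append(unschedulablePool, p)
		}
	}
	fmt.Println(len(unschedulablePool), "Pod(s) parked as unschedulable")
}
```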

Improvements to retrying Pod scheduling with QueueingHint

Unschedulable Pods only move back into the ActiveQ or BackoffQ components of the scheduling queue if changes in the cluster might allow the scheduler to place those Pods on nodes.

Prior to v1.32, each plugin registered, via EnqueueExtensions (EventsToRegister), which cluster changes could resolve its failures: an object creation, update, or deletion in the cluster (called cluster events). When such an event occurred, the scheduling queue retried any Pod that had been rejected in a previous scheduling cycle by a plugin registered for that event.

Additionally, we had an internal feature called preCheck, which helped filter events further for efficiency, based on Kubernetes core scheduling constraints; for example, preCheck could filter out node-related events when the node status is NotReady.

However, these approaches had two issues:

Requeueing with events was too broad and could lead to scheduling retries for no reason.

A newly scheduled Pod might resolve an InterPodAffinity failure, but not every new Pod does. For example, if a new Pod is created without a label matching the InterPodAffinity requirement of the unschedulable Pod, that Pod still can't be scheduled.

preCheck relied on the logic of in-tree plugins and was not extensible to custom plugins, as reported in issue #110175.

This is where QueueingHints come into play: a QueueingHint subscribes to a particular kind of cluster event and decides whether each incoming event could make the Pod schedulable.
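Conceptually, a QueueingHint is a per-plugin callback that inspects one event and returns a decision: requeue the Pod (Queue) or leave it parked (QueueSkip). The sketch below models that shape with simplified, self-contained types; the real framework types live in the kube-scheduler's framework package and carry additional parameters (such as a logger):

```go
// Illustrative stand-ins for the scheduler framework's QueueingHint types;
// simplified and self-contained, not the real definitions.
package main

import "fmt"

type QueueingHint int

const (
	QueueSkip QueueingHint = iota // this event can't help; keep the Pod parked
	Queue                         // this event may make the Pod schedulable; requeue it
)

type Pod struct{ Name string }

// A hint function inspects one cluster event (as old/new object snapshots)
// and decides whether it could make the rejected Pod schedulable.
type QueueingHintFn func(rejectedPod Pod, oldObj, newObj any) (QueueingHint, error)

func main() {
	// A hint that ignores every event (always skips) -- placeholder logic.
	var hint QueueingHintFn = func(rejectedPod Pod, oldObj, newObj any) (QueueingHint, error) {
		return QueueSkip, nil
	}
	h, _ := hint(Pod{Name: "pod-a"}, nil, nil)
	fmt.Println("requeue pod-a?", h == Queue)
}
```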

For example, consider a Pod named pod-a that has a required Pod affinity. pod-a was rejected in the scheduling cycle by the InterPodAffinity plugin because no node had an existing Pod that matched the Pod affinity specification for pod-a.

A diagram showing the scheduling queue and pod-a rejected by InterPodAffinity plugin

pod-a moves into the Unschedulable Pod Pool. The scheduling queue records which plugin caused the scheduling failure for the Pod. For pod-a, the scheduling queue records that the InterPodAffinity plugin rejected the Pod.

pod-a will never be schedulable until the InterPodAffinity failure is resolved. There are several scenarios in which the failure could be resolved; one example is that an existing running Pod gets a label update and now matches the Pod affinity. For this scenario, the InterPodAffinity plugin's QueueingHint callback function checks every Pod label update that occurs in the cluster. If a Pod's labels are updated to match the Pod affinity requirement of pod-a, the InterPodAffinity plugin's QueueingHint prompts the scheduling queue to move pod-a back into the ActiveQ or the BackoffQ component.

A diagram showing the scheduling queue and pod-a being moved by InterPodAffinity QueueingHint
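Here is a hedged, self-contained sketch of that decision for the pod-a example: requeue only when a label update turns a previously non-matching Pod into one that satisfies the affinity requirement. The types are simplified stand-ins, not the actual InterPodAffinity plugin code:

```go
// Illustrative only: the decision an InterPodAffinity-style QueueingHint
// makes when it observes a Pod label update.
package main

import "fmt"

type QueueingHint int

const (
	QueueSkip QueueingHint = iota
	Queue
)

type Pod struct {
	Name           string
	Labels         map[string]string
	AffinityLabels map[string]string // required Pod-affinity match labels
}

// matches reports whether labels satisfy every required key/value pair.
func matches(labels, required map[string]string) bool {
	for k, v := range required {
		if labels[k] != v {
			return false
		}
	}
	return true
}

// hintOnPodUpdate requeues the rejected Pod only when a label update turned
// a non-matching Pod into one that satisfies the affinity requirement.
func hintOnPodUpdate(rejected Pod, oldPod, newPod Pod) QueueingHint {
	if !matches(oldPod.Labels, rejected.AffinityLabels) &&
		matches(newPod.Labels, rejected.AffinityLabels) {
		return Queue
	}
	return QueueSkip
}

func main() {
	podA := Pod{Name: "pod-a", AffinityLabels: map[string]string{"app": "web"}}
	before := Pod{Name: "pod-b", Labels: map[string]string{"app": "db"}}
	after := Pod{Name: "pod-b", Labels: map[string]string{"app": "web"}}
	fmt.Println("requeue pod-a?", hintOnPodUpdate(podA, before, after) == Queue) // true
}
```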

QueueingHint's history and what's new in v1.32

At SIG Scheduling, we have been working on the development of QueueingHint since Kubernetes v1.28.

While QueueingHint isn't user-facing, we implemented the SchedulerQueueingHints feature gate as a safety measure when we originally added this feature. In v1.28, we implemented QueueingHints experimentally in a few in-tree plugins and enabled the feature gate by default.

However, users reported a memory leak, and consequently we disabled the feature gate in a patch release of v1.28. From v1.28 until v1.31, we kept working on the QueueingHint implementation in the rest of the in-tree plugins and on fixing bugs.

In v1.32, we re-enabled this feature by default. We finished implementing QueueingHints in all plugins and also identified the cause of the memory leak!

We thank all the contributors who participated in the development of this feature and those who reported and investigated the earlier issues.

Getting involved

These features are managed by Kubernetes SIG Scheduling.

Please join us and share your feedback.

How can I learn more?

KEP-4247: Per-plugin callback functions for efficient requeueing in the scheduling queue

via Kubernetes Blog https://kubernetes.io/

December 11, 2024 at 07:00PM

·kubernetes.io·
Kubernetes v1.32: Penelope

https://kubernetes.io/blog/2024/12/11/kubernetes-v1-32-release/

Editors: Matteo Bianchi, Edith Puclla, William Rizzo, Ryota Sawada, Rashan Smith

Announcing the release of Kubernetes v1.32: Penelope!

In line with previous releases, the release of Kubernetes v1.32 introduces new stable, beta, and alpha features. The consistent delivery of high-quality releases underscores the strength of our development cycle and the vibrant support from our community. This release consists of 44 enhancements in total. Of those enhancements, 13 have graduated to Stable, 12 are entering Beta, and 19 have entered Alpha.

Release theme and logo

The Kubernetes v1.32 Release Theme is "Penelope".

If Kubernetes is Ancient Greek for "pilot", in this release we start from that origin and reflect on the last 10 years of Kubernetes and our accomplishments: each release cycle is a journey. Just like Penelope in "The Odyssey" wove for 10 years -- each night removing parts of what she had done during the day -- so does each release add new features and remove others, albeit here with a much clearer purpose of constantly improving Kubernetes. With v1.32 being the last release in the year Kubernetes marks its first decade, we wanted to honour all of those who have been part of the global Kubernetes crew that roams the cloud-native seas through perils and challenges: may we continue to weave the future of Kubernetes together.

Updates to recent key features

A note on DRA enhancements

In this release, like the previous one, the Kubernetes project continues to propose a number of enhancements to Dynamic Resource Allocation (DRA), a key component of the Kubernetes resource management system. These enhancements aim to improve the flexibility and efficiency of resource allocation for workloads that require specialized hardware, such as GPUs, FPGAs, and network adapters. These features are particularly useful for use cases such as machine learning or high-performance computing applications. The core part of DRA, structured parameter support, was promoted to beta.

Quality of life improvements on nodes and sidecar containers update

SIG Node has the following highlights that go beyond KEPs:

The systemd watchdog capability is now used to restart the kubelet when its health check fails, while also limiting the maximum number of restarts within a given time period. This enhances the reliability of the kubelet. For more details, see pull request #127566.

In cases when an image pull back-off error is encountered, the message displayed in the Pod status has been improved to be more human-friendly and to indicate details about why the Pod is in this condition. When an image pull back-off occurs, the error is appended to the status.containerStatuses[*].state.waiting.message field in the Pod status with an ImagePullBackOff value in the reason field. This change provides you with more context and helps you to identify the root cause of the issue. For more details, see pull request #127918.

The sidecar containers feature is targeting graduation to Stable in v1.33. To view the remaining work items and feedback from users, see the comments in issue #753.

Highlights of features graduating to Stable

This is a selection of some of the improvements that are now stable following the v1.32 release.

Custom Resource field selectors

Custom resource field selectors allow developers to add field selectors to custom resources, mirroring the functionality available for built-in Kubernetes objects. This enables more efficient and precise filtering of custom resources, promoting better API design practices.

This work was done as a part of KEP #4358, by SIG API Machinery.
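As an illustration, clients can now filter custom resources server-side just like built-in objects. The sketch below is hypothetical: it assumes a widgets.example.com CRD whose manifest declares spec.environment under selectableFields; the group, version, resource, and field names are invented for the example:

```go
// Sketch under assumptions: a hypothetical "widgets" CRD that lists
// spec.environment in its selectableFields.
package main

import (
	"context"
	"fmt"
	"os"

	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/apimachinery/pkg/runtime/schema"
	"k8s.io/client-go/dynamic"
	"k8s.io/client-go/tools/clientcmd"
)

func main() {
	cfg, err := clientcmd.BuildConfigFromFlags("", os.Getenv("KUBECONFIG"))
	if err != nil {
		panic(err)
	}
	dyn, err := dynamic.NewForConfig(cfg)
	if err != nil {
		panic(err)
	}
	gvr := schema.GroupVersionResource{Group: "example.com", Version: "v1", Resource: "widgets"}
	// Server-side filtering on a custom field, mirroring built-in objects.
	list, err := dyn.Resource(gvr).Namespace("default").List(context.TODO(),
		metav1.ListOptions{FieldSelector: "spec.environment=production"})
	if err != nil {
		panic(err)
	}
	for _, item := range list.Items {
		fmt.Println(item.GetName())
	}
}
```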

Support to size memory backed volumes

This feature makes it possible to dynamically size memory-backed volumes based on Pod resource limits, improving the workload's portability and overall node resource utilization.

This work was done as a part of KEP #1967, by SIG Node.

Bound service account token improvement

The inclusion of the node name in the service account token claims allows users to use such information during authorization and admission (ValidatingAdmissionPolicy). Furthermore, this improvement keeps service account credentials from being a privilege escalation path for nodes.

This work was done as part of KEP #4193 by SIG Auth.

Structured authorization configuration

Multiple authorizers can be configured in the API server to allow for structured authorization decisions, with support for CEL match conditions in webhooks. This work was done as part of KEP #3221 by SIG Auth.

Auto remove PVCs created by StatefulSet

PersistentVolumeClaims (PVCs) created by StatefulSets get automatically deleted when no longer needed, while ensuring data persistence during StatefulSet updates and node maintenance. This feature simplifies storage management for StatefulSets and reduces the risk of orphaned PVCs.

This work was done as part of KEP #1847 by SIG Apps.

Highlights of features graduating to Beta

This is a selection of some of the improvements that are now beta following the v1.32 release.

Job API managed-by mechanism

The managedBy field for Jobs was promoted to beta in the v1.32 release. This feature enables external controllers (like Kueue) to manage Job synchronization, offering greater flexibility and integration with advanced workload management systems.

This work was done as a part of KEP #4368, by SIG Apps.

Only allow anonymous auth for configured endpoints

This feature lets admins specify which endpoints are allowed for anonymous requests. For example, the admin can choose to allow anonymous access only to health endpoints like /healthz, /livez, and /readyz, while preventing anonymous access to other cluster endpoints or resources, even if a user misconfigures RBAC.

This work was done as a part of KEP #4633, by SIG Auth.

Per-plugin callback functions for accurate requeueing in kube-scheduler enhancements

This feature enhances scheduling throughput by enabling more efficient scheduling retry decisions through per-plugin callback functions (QueueingHints). All plugins now have QueueingHints.

This work was done as a part of KEP #4247, by SIG Scheduling.

Recover from volume expansion failure

This feature lets users recover from volume expansion failure by retrying with a smaller size. This enhancement ensures that volume expansion is more resilient and reliable, reducing the risk of data loss or corruption during the process.

This work was done as a part of KEP #1790, by SIG Storage.

Volume group snapshot

This feature introduces a VolumeGroupSnapshot API, which lets users take a snapshot of multiple volumes together, ensuring data consistency across the volumes.

This work was done as a part of KEP #3476, by SIG Storage.

Structured parameter support

The core part of Dynamic Resource Allocation (DRA), the structured parameter support, got promoted to beta. This allows the kube-scheduler and Cluster Autoscaler to simulate claim allocation directly, without needing a third-party driver. These components can now predict whether resource requests can be fulfilled based on the cluster's current state without actually committing to the allocation. By eliminating the need for a third-party driver to validate or test allocations, this feature improves planning and decision-making for resource distribution, making the scheduling and scaling processes more efficient.

This work was done as a part of KEP #4381, by WG Device Management (a cross-functional team containing SIG Node, SIG Scheduling, and SIG Autoscaling).

Label and field selector authorization

Label and field selectors can be used in authorization decisions. The node authorizer automatically takes advantage of this to limit nodes to listing or watching only their own Pods. Webhook authorizers can be updated to limit requests based on the label or field selector used.

This work was done as part of KEP #4601 by SIG Auth.

Highlights of new features in Alpha

This is a selection of key improvements introduced as alpha features in the v1.32 release.

Asynchronous preemption in the Kubernetes Scheduler

The Kubernetes scheduler has been enhanced with Asynchronous Preemption, a feature that improves scheduling throughput by handling preemption operations asynchronously. Preemption ensures higher-priority pods get the resources they need by evicting lower-priority ones, but this process previously involved heavy operations like API calls to delete pods, slowing down the scheduler. With this enhancement, such tasks are now processed in parallel, allowing the scheduler to continue scheduling other pods without delays. This improvement is particularly beneficial in clusters with high Pod churn or frequent scheduling failures, ensuring a more efficient and resilient scheduling process.

This work was done as a part of KEP #4832 by SIG Scheduling.
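The underlying idea is a familiar Go concurrency pattern: move the slow eviction API calls off the scheduling loop's critical path. A toy sketch, not the actual scheduler implementation:

```go
// Illustrative only: run preemption's expensive API calls in the
// background so the scheduling loop can keep going.
package main

import (
	"fmt"
	"sync"
	"time"
)

// evictVictims stands in for the API calls that delete lower-priority Pods.
func evictVictims(victims []string, done *sync.WaitGroup) {
	defer done.Done()
	for _, v := range victims {
		time.Sleep(10 * time.Millisecond) // simulate a slow API call
		fmt.Println("evicted", v)
	}
}

func main() {
	var wg sync.WaitGroup
	wg.Add(1)
	// Evictions run asynchronously instead of blocking the scheduler.
	go evictVictims([]string{"low-prio-1", "low-prio-2"}, &wg)

	for _, p := range []string{"pod-x", "pod-y"} {
		fmt.Println("scheduling", p, "without waiting for preemption")
	}
	wg.Wait()
}
```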

Mutating admission policies using CEL expressions

The Kubernetes API server now supports Common Expression Language (CEL)-based Mutating Admission Policies, providing a lightweight, efficient alternative to mutating admission webhooks. With this enhancement, administrators can use CEL to declare mutations like setting labels, defaulting fields, or injecting sidecars with simple, declarative expressions. This approach reduces operational complexity, eliminates the need for webhooks, and integrates directly with the kube-apiserver, offering faster and more reliable in-process mutation handling.

This feature leverages CEL's object instantiation and JSON Patch strategies, combined with Server Side Apply’s merge algorithms. It simplifies policy definition, reduces mutation conflicts, and enhances admission control performance while laying a foundation for more robust, extensible policy frameworks in Kubernetes.

This work was done as a part of KEP #3962 by SIG API Machinery.

Pod-level resource specifications

This enhancement sim

·kubernetes.io·
What the EU’s new software legislation means for developers
The EU Cyber Resilience Act will introduce new cybersecurity requirements for software released in the EU. Learn what it means for your open source projects and what GitHub is doing to ensure the law will be a net win for open source maintainers.
·github.blog·
Tech predictions for 2025 and beyond
We've entered an era of unprecedented societal challenges and rapid technological advancements. Harnessing technology for good has become both an ethical imperative and a profitable endeavor. These are the areas where I see technology shaping society in 2025 and beyond—and it all starts with mission-driven work.
·allthingsdistributed.com·
The 6 Mistakes You’re Going to Make as a New Manager
Transitioning from an individual contributor to a manager is tough but rewarding. The key is to delegate, find new sources of fulfillment, focus on quality over quantity, maintain proper engagement…
·terriblesoftware.org·
Last Week in Kubernetes Development - Week Ending December 8 2024

Week Ending December 8, 2024

https://lwkd.info/2024/20241210

Developer News

Marko Mudrinic was nominated as TL of SIG K8s-Infra, and Mario Fahlandt to co-chair SIG-ContribEx.

Release Schedule

It’s 1.32 Release Week! Just to make sure you noticed, the release team put out an extra Release Candidate (also to fix two release-blocking issues). With that, here’s a little taste of the new/alpha features in 1.32 according to the Enhancements Board:

Mutating Admission Policies based on CEL

Allow splitting stdout and stderr in container log stream

Resource limits at the pod level

The Topology scheduler knows about shared L3 caches

Statusz page and Flagz page for all core components

Fine-grained Node API authorizations

Supporting external signers for service account tokens

Windows gets CPU and Memory affinity

CBOR data format as a JSON alternative

Of course, there are tons more enhancements, and 30 features are graduating to Beta or Stable. Find out more, and download and try, when Kubernetes 1.32 comes out tomorrow.

In the meantime, we have a bunch of patch releases: 1.29.12, 1.30.8, and 1.31.4 are now available, mainly containing a golang update.

Shoutouts

See the 2024 Kubernetes Contributor Award Recipients.

SIG Node wants to shout out the people who contributed extra time and effort to the 1.32 release coordination. SIG Node is a leader in the number of KEPs proposed and merged every release, and we addressed feedback from previous releases by introducing a new (currently informal) role: KEP wranglers. Please join me in thanking the wranglers: @Adrian Reber, @fromani, @haircommander, @Kevin Hannon, @Sohan, @Sreeram Venkitesh. And the approvers: @dawnchen, @derekwaynecarr, @klueska, @mrunalp, @Sergey Kanzhelev, @tallclair, @yujuhong

Ben gives a heartfelt thank you to @neolit123 for all of your help and contributions over the years. #kubeadm, #kind and more owe you a great debt. Thank you!

via Last Week in Kubernetes Development https://lwkd.info/

December 10, 2024 at 05:00PM

·lwkd.info·
MC LR Router and GoCast unpatched vulnerabilities
Cisco Talos' Vulnerability Research team recently discovered two vulnerabilities in the MC Technologies LR Router and three vulnerabilities in the GoCast service. These vulnerabilities have not been patched at the time of this posting. For Snort coverage that can detect the exploitation of these vulnerabilities, download the latest rule sets from Snort.
·blog.talosintelligence.com·
Your lying virtual eyes
Well, who you gonna believe, me or your own eyes? – Chico Marx (dressed as Groucho), from Duck Soup: In the ACM Queue article Above the Line, Below the Line, the late safety researcher Richar…
·surfingcomplexity.blog·
Ep02 - Ask Me Anything about DevOps, Cloud, Kubernetes, Platform Engineering,... w/Scott Rosenberg

Ask Me Anything about DevOps, Cloud, Kubernetes, Platform Engineering,... with Scott Rosenberg

There are no restrictions in this AMA session. You can ask anything about DevOps, Cloud, Kubernetes, Platform Engineering, containers, or anything else. We'll have a special guest, Scott Rosenberg, to help us out.

▬▬▬▬▬▬ 👋 Contact me 👋 ▬▬▬▬▬▬ ➡ BlueSky: https://vfarcic.bsky.social ➡ LinkedIn: https://www.linkedin.com/in/viktorfarcic/

▬▬▬▬▬▬ 🚀 Other Channels 🚀 ▬▬▬▬▬▬ 🎤 Podcast: https://www.devopsparadox.com/ 💬 Live streams: https://www.youtube.com/c/DevOpsParadox

via YouTube https://www.youtube.com/watch?v=jbVDksQo8KI

·youtube.com·
Exploring multi-tenancy for my Kubernetes learning platform with Stefan Roman

Exploring multi-tenancy for my Kubernetes learning platform, with Stefan Roman

https://kube.fm/multi-tenancy-stefan

Stefan Roman shares his experience building Labs4Grabs, a platform that gives students root access to Kubernetes clusters. He discusses the journey from evaluating simple namespace-based isolation to implementing full VM-based isolation with KubeVirt.

You will learn:

Why namespace isolation isn't sufficient for untrusted users and the limitations of tools like vCluster when running privileged workloads.

How to use KubeVirt to achieve complete workload isolation and the trade-offs.

Practical approaches to implementing network security with NetworkPolicies and managing resource allocation across multiple student environments.

Follow Stefan's journey from simple to complex isolation strategies, focusing on the technical decisions and trade-offs he encountered.

Sponsor

This episode is sponsored by Kusari — gain complete visibility into your software components and secure your supply chain through comprehensive tracking and analysis.

More info

Find all the links and info for this episode here: https://kube.fm/multi-tenancy-stefan

Interested in sponsoring an episode? Learn more.

via KubeFM https://kube.fm

December 10, 2024 at 05:00AM

·kube.fm·
Webb telescope confirms the universe is expanding at an unexpected rate
Fresh corroboration of the perplexing observation that the universe is expanding more rapidly than expected has scientists pondering the cause - perhaps some unknown factor involving the mysterious cosmic components dark energy and dark matter.
·apple.news·
US agencies brief House on Chinese Salt Typhoon telecom hacking
U.S. government agencies will hold a classified briefing for the House of Representatives on Tuesday on China's alleged efforts known as Salt Typhoon to infiltrate American telecommunications companies and steal data about U.S. calls, officials said on Monday.
·reuters.com·
The Biggest Shell Programs in the World
Oils is our upgrade path from bash to a better language and runtime. It's also for Python and JavaScript users who avoid shell! - oils-for-unix/oils
·github.com·