Intel Releases Updated Version Of Its Open-Source Font For Developers
Intel is well regarded for its vast open-source contributions, from being a major contributor to the Linux kernel and key projects like Mesa and GCC/glibc, down to niche projects like ConnMan and other smaller software.
ccfos/nightingale: An enterprise-level cloud-native observability solution, which can be used as drop-in replacement of Prometheus for alerting and Grafana for visualization.
A successful Arm debut could breathe new life into a depressed IPO market that has paused many venture-backed startups’ plans to go public for the past two years.
Blog: Kubernetes 1.28: Node podresources API Graduates to GA
Author:
Francesco Romani (Red Hat)
The podresources API is an API served by the kubelet locally on the node, which exposes the compute resources exclusively
allocated to containers. With the release of Kubernetes 1.28, that API is now Generally Available.
What problem does it solve?
The kubelet can allocate exclusive resources to containers, like
CPUs (granting exclusive access to full cores)
or memory (either memory regions or hugepages).
Workloads which require high performance, low latency, or both leverage these features.
The kubelet can also assign devices to containers.
Collectively, the features which enable exclusive assignments are known as "resource managers".
Without an API like podresources, the only way to learn about resource assignment was to read the state files the
resource managers use. While done out of necessity, the problem with this approach is that the path and the format of these files are
both internal implementation details. Albeit very stable, the project reserves the right to change them freely.
Consuming the content of the state files is thus fragile and unsupported, and projects doing so are recommended to consider
moving to the podresources API or to other supported APIs.
Overview of the API
The podresources API was initially proposed to enable device monitoring.
In order to enable monitoring agents, a key prerequisite is to enable introspection of device assignment, which is performed by the kubelet.
Serving this purpose was the initial goal of the API. The first iteration of the API implemented only a single function, List,
to return information about the assignment of devices to containers.
The API is used by multus CNI and by
GPU monitoring tools.
Since its inception, the podresources API has expanded its scope to cover resource managers other than the device manager.
Starting from Kubernetes 1.20, the List API also reports CPU cores and memory regions (including hugepages); the API also
reports the NUMA locality of the devices, while the locality of CPUs and memory can be inferred from the system.
In Kubernetes 1.21, the API gained
the GetAllocatableResources function.
This newer API complements the existing List API and enables monitoring agents to determine the unallocated resources,
thus enabling new features built on top of the podresources API, like a
NUMA-aware scheduler plugin.
Finally, in Kubernetes 1.27, another function, Get, was introduced to be friendlier to CNI meta-plugins, by making it simpler to access resources
allocated to a specific pod rather than having to filter through the resources for all pods on the node. The Get function is currently at alpha level.
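To illustrate how GetAllocatableResources complements List, here is a minimal Python sketch of the arithmetic a monitoring agent performs. The real podresources API is gRPC; the dicts and the `unallocated_devices` helper below are hypothetical, flattened stand-ins for the actual response messages, not the project's API.

```python
# Illustrative sketch: combine an (assumed, simplified) GetAllocatableResources
# result with an (assumed, simplified) List result to find unassigned devices.

def unallocated_devices(allocatable, list_response):
    """Devices the kubelet could assign but has not assigned to any container."""
    assigned = set()
    for pod in list_response:
        for container in pod["containers"]:
            assigned.update(container.get("device_ids", []))
    return set(allocatable) - assigned

allocatable = {"gpu-0", "gpu-1", "gpu-2", "gpu-3"}
pods = [
    {"containers": [{"device_ids": ["gpu-0"]}]},
    {"containers": [{"device_ids": ["gpu-2"]}]},
]
print(sorted(unallocated_devices(allocatable, pods)))  # ['gpu-1', 'gpu-3']
```

This is the computation that enables features like the NUMA-aware scheduler plugin: knowing not only what is assigned, but what is still free on the node.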
Consuming the API
The podresources API is served by the kubelet locally, on the same node on which it is running.
On Unix flavors, the endpoint is served over a Unix domain socket; the default path is /var/lib/kubelet/pod-resources/kubelet.sock.
On Windows, the endpoint is served over a named pipe; the default path is npipe://\\.\pipe\kubelet-pod-resources.
In order for a containerized monitoring application to consume the API, the socket should be mounted inside the container.
A good practice is to mount the directory in which the podresources socket endpoint sits, rather than the socket directly.
This ensures that after a kubelet restart, the containerized monitoring application will be able to re-connect to the socket.
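One way to act on that practice in the client: dial the socket by path each time, and wait for the socket file to reappear after a kubelet restart. A minimal Python sketch; the path matches the example mount below, and the timeout values are illustrative assumptions.

```python
import os
import time

# Illustrative: because the *directory* is mounted, the socket recreated by a
# restarted kubelet shows up at the same path inside the container.
SOCKET_PATH = "/host-podresources/kubelet.sock"  # assumed mount point

def wait_for_socket(path, timeout=60.0, interval=1.0):
    """Poll until the socket file exists, or return False when the timeout expires."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        if os.path.exists(path):
            return True
        time.sleep(interval)
    return False
```

A monitoring loop would call `wait_for_socket(SOCKET_PATH)` before (re)dialing the gRPC connection, so a kubelet restart costs only a short reconnect delay.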
An example manifest for a hypothetical monitoring agent consuming the podresources API and deployed as a DaemonSet could look like:
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: podresources-monitoring-app
  namespace: monitoring
spec:
  selector:
    matchLabels:
      name: podresources-monitoring
  template:
    metadata:
      labels:
        name: podresources-monitoring
    spec:
      containers:
      - args:
        - --podresources-socket=unix:///host-podresources/kubelet.sock
        command:
        - /bin/podresources-monitor
        image: podresources-monitor:latest # just for an example
        volumeMounts:
        - mountPath: /host-podresources
          name: host-podresources
      serviceAccountName: podresources-monitor
      volumes:
      - hostPath:
          path: /var/lib/kubelet/pod-resources
          type: Directory
        name: host-podresources
I hope you find it straightforward to consume the podresources API programmatically.
The kubelet API package provides the protocol file and the Go type definitions; however, a client package is not yet available from the project,
and the existing code should not be used directly.
The recommended approach is to reimplement the client in your project, copying and pasting the related functions, as for example
the multus project is doing.
When operating a containerized monitoring application consuming the podresources API, a few points are worth highlighting to prevent "gotcha" moments:
Even though the API only exposes data, and by design doesn't allow clients to mutate the kubelet state, the gRPC request/response model requires
read-write access to the podresources API socket. In other words, it is not possible to limit the container mount to ReadOnly.
Multiple clients are allowed to connect to the podresources socket and consume the API, since it is stateless.
The kubelet has built-in rate limits to mitigate local Denial-of-Service attacks from
misbehaving or malicious consumers. Consumers of the API must tolerate rate-limit errors returned by the server. The rate limit is currently
hardcoded and global, so misbehaving clients can consume all the quota and potentially starve correctly behaving clients.
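Tolerating those rate-limit errors usually means retrying with backoff. A generic Python sketch; the `RateLimited` exception is a stand-in of my own (in a real gRPC client you would instead check for a RESOURCE_EXHAUSTED status code), and the retry counts and delays are illustrative.

```python
import time

class RateLimited(Exception):
    """Stand-in for a rate-limit error returned by the kubelet."""

def call_with_backoff(fn, retries=5, base_delay=0.1):
    """Retry fn() on rate-limit errors with exponential backoff; re-raise on exhaustion."""
    for attempt in range(retries):
        try:
            return fn()
        except RateLimited:
            if attempt == retries - 1:
                raise
            time.sleep(base_delay * (2 ** attempt))
```

Because the limit is global, a well-behaved consumer should also keep its polling frequency modest rather than relying on retries alone.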
Future enhancements
For historical reasons, the podresources API has a less precise specification than typical Kubernetes APIs (such as the Kubernetes HTTP API, or the container runtime interface).
This leads to unspecified behavior in corner cases.
An effort is ongoing to rectify this and produce a more precise specification.
The Dynamic Resource Allocation (DRA) infrastructure
is a major overhaul of resource management.
The integration with the podresources API
is already ongoing.
An effort is ongoing to recommend or create a reference client package ready to be consumed.
Getting involved
This feature is driven by SIG Node.
Please join us to connect with the community and share your ideas and feedback around the above feature and
beyond. We look forward to hearing from you!
Why a Highly Mutated Coronavirus Variant Has Scientists on Alert
Research is under way to determine whether the mutation-laden lineage BA.2.86 is nothing to worry about — or has the potential to spread globally
soraro/kurt: A Kubernetes plugin that gives context to what is restarting in your Kubernetes cluster
Hotmail email delivery fails after Microsoft misconfigures DNS
Hotmail users worldwide have problems sending emails, with messages flagged as spam or not delivered after Microsoft misconfigured the domain's DNS SPF record.
It is fair to say that the DNF package manager is not the favorite tool of many Fedora users. It was brought in as a replacement for Yum but got off to a rather rocky start; DNF has stabilized over the years, though, and the complaints have subsided. That can only mean one thing: it must be time to throw it away and start over from the beginning. The replacement, called DNF5, was slated to be a part of the Fedora 39 release, due in October, but that is not going to happen.
How Nvidia Built a Competitive Moat Around A.I. Chips (Gift Article)
The most visible winner of the artificial intelligence boom achieved its dominance by becoming a one-stop shop for A.I. development, from chips to software to other services.
Arm's full-year revenue fell 1% ahead of IPO - source
SoftBank Group Corp's Arm Ltd is expected to report a revenue decline of about 1% in the year ended March, when the chip designer reveals its initial public offering (IPO) filing on Monday, according to a person familiar with the matter.
Oh Bufferapp devs… | Posting via the Bluesky API | AT Protocol
The Bluesky post record type has many features, including replies, quote-posts, embedded social cards, mentions, and images. Here's some example code for all the common post formats.
Blog: Kubernetes 1.28: Improved failure handling for Jobs
Authors: Kevin Hannon (G-Research), Michał Woźniak (Google)
This blog discusses two new features in Kubernetes 1.28 to improve Jobs for batch
users: Pod replacement policy
and Backoff limit per index.
These features continue the effort started by the
Pod failure policy
to improve the handling of Pod failures in a Job.
Pod replacement policy
By default, when a pod enters a terminating state (e.g. due to preemption or
eviction), Kubernetes immediately creates a replacement Pod. Therefore, both Pods are running
at the same time. In API terms, a pod is considered terminating when it has a
deletionTimestamp and its phase is Pending or Running.
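That definition can be written down as a small predicate. A Python sketch over a plain-dict pod representation; the field names mirror the API, but the dict shape is an illustrative simplification, not a client library.

```python
def is_terminating(pod):
    """A pod counts as terminating when it has a deletionTimestamp
    and its phase is still Pending or Running (not yet terminal)."""
    has_deletion_timestamp = pod.get("metadata", {}).get("deletionTimestamp") is not None
    phase = pod.get("status", {}).get("phase")
    return has_deletion_timestamp and phase in ("Pending", "Running")

print(is_terminating({"metadata": {"deletionTimestamp": "2023-08-15T10:00:00Z"},
                      "status": {"phase": "Running"}}))                  # True
print(is_terminating({"metadata": {}, "status": {"phase": "Running"}}))  # False
```

A pod that has already reached Failed or Succeeded has a terminal phase, so it is no longer "terminating" in this sense even if it carries a deletionTimestamp.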
The scenario in which two Pods run at the same time is problematic for
some popular machine learning frameworks, such as
TensorFlow and JAX, which require at most one Pod running at the same time,
for a given index.
TensorFlow gives the following error if two pods are running for a given index:
/job:worker/task:4: Duplicate task registration with task_name=/job:worker/replica:0/task:4
See more details in the issue.
Creating the replacement Pod before the previous one fully terminates can also
cause problems in clusters with scarce resources or with tight budgets, such as:
cluster resources can be difficult to obtain for Pods pending to be scheduled,
as Kubernetes might take a long time to find available nodes until the existing
Pods are fully terminated.
if cluster autoscaler is enabled, the replacement Pods might produce undesired
scale ups.
How can you use it?
This is an alpha feature, which you can enable by turning on the JobPodReplacementPolicy
feature gate in
your cluster.
Once the feature is enabled in your cluster, you can use it by creating a new Job that specifies a
podReplacementPolicy field as shown here:
apiVersion: batch/v1
kind: Job
metadata:
  name: new
  ...
spec:
  podReplacementPolicy: Failed
  ...
In that Job, the Pods would only be replaced once they reached the Failed phase,
and not when they are terminating.
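The effect of the policy can be sketched as a decision function. A Python sketch; the function shape is illustrative, and "TerminatingOrFailed" is assumed here as the name of the default behavior described above (replace as soon as a pod is terminating or failed).

```python
def needs_replacement(policy, phase, terminating):
    """Illustrative decision: should the Job controller create a replacement Pod?

    - "Failed": replace only once the pod has reached the Failed phase.
    - "TerminatingOrFailed" (assumed default): replace as soon as the pod
      is terminating or has failed.
    """
    if policy == "Failed":
        return phase == "Failed"
    return terminating or phase == "Failed"

# With podReplacementPolicy: Failed, a terminating-but-Running pod is NOT replaced yet:
print(needs_replacement("Failed", "Running", terminating=True))               # False
print(needs_replacement("Failed", "Failed", terminating=False))               # True
print(needs_replacement("TerminatingOrFailed", "Running", terminating=True))  # True
```

The first case is exactly the behavior change this feature buys you: no overlap between the old pod and its replacement.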
Additionally, you can inspect the .status.terminating field of a Job. The value
of the field is the number of Pods owned by the Job that are currently terminating.
kubectl get jobs/myjob -o=jsonpath='{.status.terminating}'
3 # three Pods are terminating and have not yet reached the Failed phase
This can be particularly useful for external queueing controllers, such as
Kueue, which track quota
from running Pods of a Job until the resources are reclaimed from
the currently terminating Job.
Note that podReplacementPolicy: Failed is the default when using a custom
Pod failure policy.
Backoff limit per index
By default, Pod failures for Indexed Jobs
are counted towards the global limit of retries, represented by .spec.backoffLimit .
This means that if there is a consistently failing index, it is restarted
repeatedly until it exhausts the limit. Once the limit is reached, the entire
Job is marked failed, and some indexes may never even be started.
This is problematic for use cases where you want to handle Pod failures for
every index independently. For example, if you use Indexed Jobs for running
integration tests where each index corresponds to a testing suite, then
you may want to account for possible flaky tests by allowing 1 or 2 retries per
suite. There might be some buggy suites, making the corresponding
indexes fail consistently. In that case you may prefer to limit retries for
the buggy suites, while allowing the other suites to complete.
The feature allows you to:
complete execution of all indexes, despite some indexes failing.
better utilize the computational resources by avoiding unnecessary retries of consistently failing indexes.
How can you use it?
This is an alpha feature, which you can enable by turning on the
JobBackoffLimitPerIndex
feature gate
in your cluster.
Once the feature is enabled in your cluster, you can create an Indexed Job with the
.spec.backoffLimitPerIndex field specified.
Example
The following example demonstrates how to use this feature to make sure the
Job executes all indexes (provided there is no other reason for the early Job
termination, such as reaching the activeDeadlineSeconds timeout, or being
manually deleted by the user), and the number of failures is controlled per index.
apiVersion: batch/v1
kind: Job
metadata:
  name: job-backoff-limit-per-index-execute-all
spec:
  completions: 8
  parallelism: 2
  completionMode: Indexed
  backoffLimitPerIndex: 1
  template:
    spec:
      restartPolicy: Never
      containers:
      - name: example  # this example container returns an error, and fails,
                       # when it is run as the second or third index in any Job
                       # (even after a retry)
        image: python
        command:
        - python3
        - -c
        - |
          import os, sys, time
          id = int(os.environ.get("JOB_COMPLETION_INDEX"))
          if id == 1 or id == 2:
            sys.exit(1)
          time.sleep(1)
Now, inspect the Pods after the job is finished:
kubectl get pods -l job-name=job-backoff-limit-per-index-execute-all
Returns output similar to this:
NAME READY STATUS RESTARTS AGE
job-backoff-limit-per-index-execute-all-0-b26vc 0/1 Completed 0 49s
job-backoff-limit-per-index-execute-all-1-6j5gd 0/1 Error 0 49s
job-backoff-limit-per-index-execute-all-1-6wd82 0/1 Error 0 37s
job-backoff-limit-per-index-execute-all-2-c66hg 0/1 Error 0 32s
job-backoff-limit-per-index-execute-all-2-nf982 0/1 Error 0 43s
job-backoff-limit-per-index-execute-all-3-cxmhf 0/1 Completed 0 33s
job-backoff-limit-per-index-execute-all-4-9q6kq 0/1 Completed 0 28s
job-backoff-limit-per-index-execute-all-5-z9hqf 0/1 Completed 0 28s
job-backoff-limit-per-index-execute-all-6-tbkr8 0/1 Completed 0 23s
job-backoff-limit-per-index-execute-all-7-hxjsq 0/1 Completed 0 22s
Additionally, you can take a look at the status for that Job:
kubectl get jobs job-backoff-limit-per-index-execute-all -o yaml
The output ends with a status similar to:
status:
  completedIndexes: 0,3-7
  failedIndexes: 1,2
  succeeded: 6
  failed: 4
  conditions:
  - message: Job has failed indexes
    reason: FailedIndexes
    status: "True"
    type: Failed
Here, indexes 1 and 2 were each retried once. After the second failure of each,
the specified .spec.backoffLimitPerIndex was exceeded, so
the retries were stopped. For comparison, if the per-index backoff were disabled,
then the buggy indexes would have retried until the global backoffLimit was exceeded,
and then the entire Job would have been marked failed, before some of the higher
indexes were started.
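The accounting above can be sketched as a small simulation: each index gets its initial run plus up to backoffLimitPerIndex retries, and the Job tracks completed indexes, failed indexes, and pod counters. The Python sketch below is an illustration of the described semantics, not the controller's actual code; names are mine.

```python
def simulate_indexed_job(completions, backoff_limit_per_index, index_fails):
    """Illustrative accounting: run each index, retrying failures up to the
    per-index backoff limit; return per-index outcomes and pod counters."""
    completed, failed_indexes = [], []
    succeeded_pods = failed_pods = 0
    for index in range(completions):
        attempts = 1 + backoff_limit_per_index  # initial run + allowed retries
        for _ in range(attempts):
            if index_fails(index):
                failed_pods += 1
            else:
                succeeded_pods += 1
                completed.append(index)
                break
        else:  # all attempts failed: the index is marked failed
            failed_indexes.append(index)
    return completed, failed_indexes, succeeded_pods, failed_pods

# Mirrors the example Job: 8 indexes, indexes 1 and 2 always fail, one retry each.
result = simulate_indexed_job(8, 1, lambda i: i in (1, 2))
print(result)  # ([0, 3, 4, 5, 6, 7], [1, 2], 6, 4)
```

The counters reproduce the status shown above: completedIndexes 0,3-7, failedIndexes 1,2, succeeded: 6, failed: 4.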
How can you learn more?
Read the user-facing documentation for Pod replacement policy,
Backoff limit per index, and
Pod failure policy.
Read the KEPs for Pod Replacement Policy,
Backoff limit per index, and
Pod failure policy.
Getting Involved
These features were sponsored by SIG Apps . Batch use cases are actively
being improved for Kubernetes users in the
batch working group .
Working groups are relatively short-lived initiatives focused on specific goals.
The goal of the WG Batch is to improve the experience for batch workload users, offer support for
batch processing use cases, and enhance the
Job API for common use cases. If that interests you, please join the working
group either by subscribing to our
mailing list or on
Slack.
Acknowledgments
As with any Kubernetes feature, multiple people contributed to getting this
done, from testing and filing bugs to reviewing code.
We would not have been able to achieve either of these features without Aldo
Culquicondor (Google) providing excellent domain knowledge and expertise
throughout the Kubernetes ecosystem.
How We Achieved Minimal Downtime During Our PostgreSQL Database Upgrade (English Version)
Hello everyone, I’m Kenny, a Backend Engineer from Dcard. Dcard is a social networking platform that allows everyone to share ideas with confidence, regardless of background, age, or interest. It is