Workload
A workload is an application that will run to completion. It can be composed of one or multiple Pods that, loosely or tightly coupled, complete a task as a whole. A workload is the unit of admission in Kueue.
The prototypical workload can be represented with a Kubernetes batch/v1.Job. For this reason, we sometimes use the word job to refer to any workload, and Job when we refer specifically to the Kubernetes API.
However, Kueue does not directly manipulate Job objects. Instead, Kueue manages Workload objects that represent the resource requirements of an arbitrary workload. Kueue automatically creates a Workload for each Job object and syncs the decisions and statuses.
The manifest for a Workload looks like the following:
apiVersion: kueue.x-k8s.io/v1beta1
kind: Workload
metadata:
  name: sample-job
  namespace: team-a
spec:
  active: true
  queueName: team-a-queue
  podSets:
  - count: 3
    name: main
    template:
      spec:
        containers:
        - image: gcr.io/k8s-staging-perf-tests/sleep:latest
          imagePullPolicy: Always
          name: container
          resources:
            requests:
              cpu: "1"
              memory: 200Mi
        restartPolicy: Never
Active
You can stop or resume a running workload by setting the active field. The active field determines whether a workload can be admitted into a queue or, if already admitted, continue running.
Changing .spec.active from true to false causes a running workload to be evicted and not requeued.
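As a minimal sketch, the fragment below shows the only field that needs to change in the sample Workload above to stop it. How you apply the change (for example, by editing the object with kubectl) is left open here.

```yaml
# Fragment of the sample Workload spec; only the active field is changed.
spec:
  active: false   # evicts the workload if it is running and prevents requeueing
```

Setting the field back to true makes the workload eligible for admission again.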
Queue name
To indicate in which LocalQueue you want your Workload to be enqueued, set the name of the LocalQueue in the .spec.queueName field.
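Because Kueue creates the Workload for a Job automatically, you normally select the queue on the Job rather than on the Workload. The following sketch shows a Job that could produce a Workload like the sample above. It assumes the kueue.x-k8s.io/queue-name label is how the Job integration picks up the queue name; check the documentation on running jobs for your Kueue version.

```yaml
apiVersion: batch/v1
kind: Job
metadata:
  name: sample-job
  namespace: team-a
  labels:
    # Assumed label read by Kueue to fill in .spec.queueName of the Workload.
    kueue.x-k8s.io/queue-name: team-a-queue
spec:
  parallelism: 3
  completions: 3
  suspend: true   # the Job starts suspended and runs once it is admitted
  template:
    spec:
      containers:
      - name: container
        image: gcr.io/k8s-staging-perf-tests/sleep:latest
        resources:
          requests:
            cpu: "1"
            memory: 200Mi
      restartPolicy: Never
```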
Pod sets
A Workload might be composed of multiple Pods with different pod specs.
Each item of the .spec.podSets list represents a set of homogeneous Pods and has the following fields:
- spec (nested under template in the manifest above) describes the pods using a v1/core.PodSpec.
- count is the number of pods that use the same spec.
- name is a human-readable identifier for the pod set. You can use the role of the Pods in the Workload, like driver, worker, parameter-server, etc.
Resource requests
Kueue uses the podSets resource requests to calculate the quota used by a Workload and decide if and when to admit a Workload.
Kueue calculates the total resource usage for a Workload as the sum of the resource requests for each podSet. The resource usage of a podSet is equal to the resource requests of the pod spec multiplied by the count.
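The calculation can be worked through on a small example. The fragment below is an illustrative Workload spec with two pod sets; the pod set names and the request values are made up for this sketch, and the arithmetic is shown in comments.

```yaml
# Illustrative fragment; names and values are assumptions for this example.
spec:
  podSets:
  - name: driver
    count: 1
    template:
      spec:
        containers:
        - name: container
          image: gcr.io/k8s-staging-perf-tests/sleep:latest
          resources:
            requests:
              cpu: "2"
              memory: 400Mi
        restartPolicy: Never
    # Usage of this podSet: 1 x (2 CPU, 400Mi) = 2 CPU, 400Mi
  - name: worker
    count: 4
    template:
      spec:
        containers:
        - name: container
          image: gcr.io/k8s-staging-perf-tests/sleep:latest
          resources:
            requests:
              cpu: "1"
              memory: 200Mi
        restartPolicy: Never
    # Usage of this podSet: 4 x (1 CPU, 200Mi) = 4 CPU, 800Mi
# Total usage of the Workload: 2 + 4 = 6 CPU, 400Mi + 800Mi = 1200Mi
```

This total is the amount of quota that would have to be available for the Workload to be admitted, before any of the adjustments described next.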
Requests values adjustment
Depending on the cluster setup, Kueue adjusts the resource usage of a Workload in the following cases:
- The cluster defines default values in Limit Ranges. The default values are used for anything not provided in the spec.
- The created pods are subject to a Runtime Class Overhead.
- The spec defines only resource limits, in which case the limit values are treated as requests.
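To illustrate the limits-only case, here is a hypothetical container fragment from a pod template. The values are made up; the comment states how the rule above would apply to it.

```yaml
# Hypothetical container fragment: only limits are set, no requests.
containers:
- name: container
  image: gcr.io/k8s-staging-perf-tests/sleep:latest
  resources:
    limits:
      cpu: "2"
      memory: 1Gi
    # With no requests given, the limit values are treated as requests,
    # so this container counts as cpu: "2" and memory: 1Gi.
```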
Requests values validation
In cases when the cluster defines Limit Ranges, the values resulting from the adjustment above will be validated against the ranges.
Kueue will mark the workload as Inadmissible if the range validation fails.
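As a sketch of how a Limit Range could feed into both the adjustment and the validation, consider the following object. The name and values are hypothetical, and the comments describe the expected effect under the rules stated above rather than a verified trace of Kueue's behavior.

```yaml
apiVersion: v1
kind: LimitRange
metadata:
  name: team-a-limits   # hypothetical name
  namespace: team-a
spec:
  limits:
  - type: Container
    defaultRequest:
      cpu: 500m   # used when a container omits its CPU request
    max:
      cpu: "2"    # a container asking for more than 2 CPUs is out of range
```

Under these assumptions, a Workload in team-a whose container requests cpu: "4" would fail range validation and be marked Inadmissible.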
Reserved resource names
In addition to the usual resource naming restrictions, you cannot use the pods resource name in a Pod spec, as it is reserved for internal Kueue use. You can use the pods resource name in a ClusterQueue to set quotas on the maximum number of pods.
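For example, a ClusterQueue that caps the number of pods might look like the sketch below. The queue name, the quota values, and the existence of a ResourceFlavor called default-flavor are assumptions for illustration; consult the ClusterQueue API reference for the authoritative schema.

```yaml
apiVersion: kueue.x-k8s.io/v1beta1
kind: ClusterQueue
metadata:
  name: cluster-queue   # hypothetical name
spec:
  namespaceSelector: {}   # match all namespaces
  resourceGroups:
  - coveredResources: ["cpu", "memory", "pods"]
    flavors:
    - name: default-flavor   # assumes this ResourceFlavor exists
      resources:
      - name: cpu
        nominalQuota: 9
      - name: memory
        nominalQuota: 36Gi
      - name: pods
        nominalQuota: 5   # at most 5 pods admitted through this queue
```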
Priority
Workloads have a priority that influences the order in which they are admitted by a ClusterQueue. There are two ways to set the Workload priority:
- Pod Priority: You can see the priority of the Workload in the field .spec.priority. For a batch/v1.Job, Kueue sets the priority of the Workload based on the pod priority of the Job’s pod template.
- WorkloadPriority: Sometimes developers would like to control a workload’s priority without affecting the pod’s priority. By using WorkloadPriority, you can independently manage the priority of workloads for queuing and preemption, separate from the pod’s priority.
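A minimal sketch of the WorkloadPriority approach follows. It assumes the WorkloadPriorityClass kind and the kueue.x-k8s.io/priority-class label as they appear in recent Kueue releases; the class name, value, and queue name are hypothetical.

```yaml
apiVersion: kueue.x-k8s.io/v1beta1
kind: WorkloadPriorityClass
metadata:
  name: sample-priority   # hypothetical name
value: 10000              # higher values mean higher priority
description: "Sample priority"
---
apiVersion: batch/v1
kind: Job
metadata:
  name: sample-job
  namespace: team-a
  labels:
    kueue.x-k8s.io/queue-name: team-a-queue
    # Assumed label that selects the WorkloadPriorityClass for this Job.
    kueue.x-k8s.io/priority-class: sample-priority
spec:
  suspend: true
  template:
    spec:
      containers:
      - name: container
        image: gcr.io/k8s-staging-perf-tests/sleep:latest
      restartPolicy: Never
```

The pod template carries no priorityClassName here, so the pods keep their default scheduling priority while the Workload is ordered in the queue by the value above.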
Custom Workloads
As described previously, Kueue has built-in support for workloads created with the Job API. But any custom workload API can integrate with Kueue by creating a corresponding Workload object for it.
Dynamic Reclaim
Dynamic reclaim is a mechanism that allows a currently Admitted workload to release a part of its Quota Reservation that is no longer needed.
Job integrations communicate this information by setting the reclaimablePods status field, enumerating the number of pods per podset for which the Quota Reservation is no longer needed.
status:
  reclaimablePods:
  - name: podset1
    count: 2
  - name: podset2
    count: 2
The count can only increase while the workload holds a Quota Reservation.
All-or-nothing semantics for Job Resource Assignment
This mechanism allows a Job to be evicted and re-queued if the job doesn’t become ready. Please refer to All-or-nothing with ready Pods for more details.
Exponential Backoff Requeueing
When a Workload is evicted with the PodsReadyTimeout reason, it is re-queued with backoff.
The Workload status allows you to know the following:
- .status.requeueState.count indicates the number of times a Workload has already been re-queued with backoff after an eviction with the PodsReadyTimeout reason.
- .status.requeueState.requeueAt indicates the time when a Workload will be re-queued next.
status:
  requeueState:
    count: 5
    requeueAt: 2024-02-11T04:51:03Z
When a Workload deactivated by All-or-nothing with ready Pods is re-activated, the requeueState (.status.requeueState) will be reset to null.
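The backoff behavior is tuned in the Kueue configuration rather than on the Workload. The sketch below assumes the waitForPodsReady.requeuingStrategy fields available in recent Kueue releases; the values are illustrative, and you should verify the field names against the configuration API reference for your version.

```yaml
apiVersion: config.kueue.x-k8s.io/v1beta1
kind: Configuration
waitForPodsReady:
  enable: true
  timeout: 10m              # how long to wait for pods to become ready
  requeuingStrategy:
    timestamp: Eviction     # order re-queued workloads by eviction time
    backoffLimitCount: 5    # assumed: re-queue attempts before deactivation
    backoffBaseSeconds: 60  # assumed: base delay for the exponential backoff
    backoffMaxSeconds: 3600 # assumed: upper bound on the delay
```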
Replicate labels from Jobs into Workloads
You can configure Kueue to copy labels, at Workload creation, into the new Workload from the underlying Job or Pod objects. This can be useful for Workload identification and debugging.
You can specify which labels should be copied by setting the labelKeysToCopy field in the configuration API (under integrations). By default, Kueue does not copy any Job or Pod label into the Workload.
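A configuration along these lines would copy two labels. The label keys and the enabled framework are placeholders chosen for this sketch.

```yaml
apiVersion: config.kueue.x-k8s.io/v1beta1
kind: Configuration
integrations:
  frameworks:
  - "batch/job"
  labelKeysToCopy:            # hypothetical label keys
  - team
  - app.kubernetes.io/name
```

With this in place, a Job labeled team: research would result in a Workload carrying the same team: research label, assuming the label is present when the Workload is created.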
Maximum execution time
You can configure a Workload’s maximum execution time by specifying the expected maximum number of seconds for it to run in:
spec:
  maximumExecutionTimeSeconds: n
If the workload spends more than n seconds in Admitted state, including the time spent as Admitted in previous “Admit/Evict” cycles, it gets automatically deactivated.
Once deactivated, the accumulated time spent as active in previous “Admit/Evict” cycles is set to 0.
If maximumExecutionTimeSeconds is not specified, the workload has no execution time limit.
You can configure the maximumExecutionTimeSeconds of the Workload associated with any supported Kueue Job by specifying the desired value as the kueue.x-k8s.io/max-exec-time-seconds label of the job.
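For instance, the metadata of a Job limited to ten minutes of admitted time could look like the fragment below. The Job name, queue name, and the 600-second value are illustrative; the value is quoted because Kubernetes label values must be strings.

```yaml
# Fragment of a batch/v1 Job; only the metadata is shown.
metadata:
  name: sample-job
  namespace: team-a
  labels:
    kueue.x-k8s.io/queue-name: team-a-queue
    kueue.x-k8s.io/max-exec-time-seconds: "600"   # 10 minutes
```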
What’s next
- Learn about workload priority class.
- Learn how to run jobs.
- Read the API reference for Workload.