Setting resource limits for your Kubernetes pods prevents a faulty container from affecting other workloads. With Kubernetes, you can limit resources, including CPU and memory usage. Pods can be terminated when their limits are exceeded, preserving the overall stability of the cluster.
Before defining limits, it’s worth noting how Kubernetes expresses resource availability.
CPU consumption is measured in terms of vCPUs used. A limit of 0.5 vCPUs means the pod can consume half of the available time of one vCPU. A vCPU is what you see advertised on the hosting pages of cloud providers. If you are using bare-metal hardware, it is one hyper-thread on your processor.
Memory is measured in bytes. You can specify it as an integer number of bytes, or as a friendlier quantity such as 512Mi or 1Gi.
Create a CPU limit
To add a CPU limit to a pod’s containers, include the resources:limits field in your container manifest:
apiVersion: v1
kind: Pod
metadata:
  name: demo
  namespace: demo
spec:
  containers:
    - name: my-container
      image: example/example
      resources:
        limits:
          cpu: "0.5"
The example above limits the container to 0.5 vCPUs. It is throttled so that it cannot consume more than half of a vCPU’s time within any 100 ms period.
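CPU quantities can also be written in millicpu units, where 1000m equals one vCPU. As a small sketch, the fragment below expresses the same 0.5 vCPU limit in that notation; it is an equivalent spelling of the limit in the manifest above, not an additional setting.

```yaml
resources:
  limits:
    # 500m (millicpu) is the same quantity as "0.5":
    # at most 50 ms of CPU time in each 100 ms period.
    cpu: "500m"
```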
Create a memory limit
Memory limits are created in a similar way. Change the limits:cpu field in the manifest to limits:memory and set its value, for example to 512Mi.
The container is limited to 512Mi of RAM. Unlike a request, a memory limit is a hard cap: spare capacity on the node does not let the container go beyond it. Exceeding the limit marks the container as a candidate for termination, and it is killed if it keeps allocating memory.
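For reference, a memory limit of 512Mi might look like this in a full manifest. This is a minimal sketch that reuses the pod name, namespace, and image from the CPU example; only the limits entry differs.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: demo
  namespace: demo
spec:
  containers:
    - name: my-container
      image: example/example
      resources:
        limits:
          # Mi is a power-of-two unit: 512Mi = 512 * 1024 * 1024 bytes.
          # The decimal form "512M" (512 * 1000 * 1000 bytes) is slightly smaller.
          memory: "512Mi"
```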
Create an ephemeral storage limit
All Kubernetes nodes have some amount of ephemeral (temporary) storage available. Pods use this storage for caches and logs. The ephemeral storage pool is also where the node keeps container images.
You can set limits on a pod’s ephemeral storage usage. This feature (beta in older Kubernetes releases) is intended to prevent a single pod’s cache from using the entire storage pool. Use the limits:ephemeral-storage container manifest field:
limits:
  ephemeral-storage: "1Gi"
This container is now limited to 1Gi of ephemeral storage. Pods that try to use more storage are evicted. If a pod contains multiple containers, it is also evicted when the combined storage usage of all its containers exceeds the pod’s total storage limit.
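The multi-container case can be sketched as follows. The second container (my-sidecar, with image example/sidecar) is hypothetical and added only for illustration; the pod-level limit is taken to be the sum of the per-container limits.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: demo
  namespace: demo
spec:
  containers:
    - name: my-container
      image: example/example
      resources:
        limits:
          ephemeral-storage: "1Gi"
    # Hypothetical second container, added only for illustration.
    - name: my-sidecar
      image: example/sidecar
      resources:
        limits:
          ephemeral-storage: "1Gi"
# Pod-level limit = 1Gi + 1Gi = 2Gi. The pod is evicted if combined
# usage exceeds 2Gi, or if either container alone exceeds its own 1Gi.
```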
Kubernetes usually tracks storage usage by periodically scanning the node’s ephemeral storage file system and adding up the usage of each pod and container. There is optional support for OS-level file system quotas, which allow more accurate monitoring.
You need a file system that supports project quotas, such as XFS or ext4. Make sure the file system is mounted with project quota tracking enabled, and then enable the LocalStorageCapacityIsolationFSQuotaMonitoring feature gate in the kubelet. Guidelines for configuring this system are provided in the Kubernetes documentation.
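One way to turn on the feature gate is through the kubelet configuration file. The fragment below is a sketch under assumptions: it presumes your kubelet reads a KubeletConfiguration file, whose location depends on how the cluster was installed, and it covers only the feature gate, not the file system mount options. Check the Kubernetes documentation for any extra requirements in your version.

```yaml
apiVersion: kubelet.config.k8s.io/v1beta1
kind: KubeletConfiguration
featureGates:
  # Gate name taken from the text above; support depends on the Kubernetes version.
  LocalStorageCapacityIsolationFSQuotaMonitoring: true
```

The kubelet has to be restarted on each node for the change to take effect.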
Resource requests
In addition to resource limits, you can set resource requests. These are available for CPU, memory, and ephemeral storage: change the limits field to requests in each of the examples above.
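As a sketch, here is the earlier demo pod with requests in place of limits. The values simply mirror the ones used in the limit examples and are not recommendations.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: demo
  namespace: demo
spec:
  containers:
    - name: my-container
      image: example/example
      resources:
        # Same structure as limits; only the field name changes.
        requests:
          cpu: "0.5"
          memory: "512Mi"
          ephemeral-storage: "1Gi"
```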
A resource request states how much of that resource you expect the container to use. Kubernetes takes this information into account when determining which node to schedule the pod to.
Using memory as an example, a request of 512Mi will cause the pod to be scheduled to a node with at least 512Mi of available memory. Availability is calculated by adding up the memory requests of all existing pods on the node and subtracting that sum from the node’s total memory capacity.
A node is not eligible to host a new container if the sum of the workload requests, including the request from the new container, exceeds the available capacity. This remains the case even if real-time memory usage is actually very low. The available capacity has already been allocated to the existing containers to ensure that their requests can be fulfilled.
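A worked example makes the calculation concrete. The numbers are hypothetical: a node with 4096Mi of memory capacity that already hosts two pods requesting 3072Mi and 768Mi.

```latex
\begin{align*}
\text{available} &= \text{capacity} - \sum_{i} \text{request}_i \\
                 &= 4096\,\text{Mi} - (3072\,\text{Mi} + 768\,\text{Mi}) = 256\,\text{Mi}
\end{align*}
```

A new container requesting 512Mi does not fit in the remaining 256Mi, so this node is not eligible, even if the two existing pods are actually using far less than they requested.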
Unlike with a limit, Kubernetes always allows containers to exceed their resource requests. They can use any spare resources, including amounts that other containers have requested but are not currently using.
Using requests and limits
The different behavior of requests and limits means that you should carefully consider the values you use. It is usually best to keep request values low, then set limits as high as possible without affecting the ability of your workloads to coexist.
Using a low value for resource requests gives your pods the best chance of being scheduled to a node. The scheduler has more flexibility in making allocation decisions because it is more likely that a particular node can host the container. The container can then use any surplus resources it needs beyond the request, up to the specified limit.
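A low request combined with a higher limit might look like the fragment below. The specific values are assumptions chosen for illustration; suitable numbers depend on your workload and on the capacity of your nodes.

```yaml
resources:
  # Low requests: easy for the scheduler to place.
  requests:
    cpu: "250m"
    memory: "256Mi"
  # Higher limits: headroom for bursts, still capped.
  limits:
    cpu: "1"
    memory: "512Mi"
```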
Every request and limit needs to be balanced to achieve the greatest effect. Consider the requests and limits of the other pods running in your cluster. Make sure you know the total amount of resources your nodes provide so that you do not set limits that are either too high (risking stability) or too low (wasting capacity).
You should always set resource limits for your Kubernetes workloads. Effective use of limits helps workloads coexist peacefully without compromising the health of your cluster.
This is especially important in the case of memory. Without limits, a container with a faulty process can quickly consume all the memory offered by its node. Such an out-of-memory scenario could take down other pods scheduled to that node, as the OS-level memory manager would start killing processes to reduce memory usage.
Setting a memory limit lets Kubernetes terminate the container before it affects other workloads in the cluster, let alone external processes. You will lose your workload, but the overall cluster will gain stability.