Configure Request and Limit of Resources (CPUs & Memory)
Hi everyone, in this material we discuss resource requests and limits for the containers in a Pod, covering:
- What are resource requests and limits?
- What are resource types?
- What are resource units?
- How does Kubernetes apply resource requests and limits?
- Prerequisites: before you begin
- Specify a memory request and a memory limit
- Exceed a Container’s memory limit
- Specify a memory request that is too big for your Nodes
- What if you do not specify a memory limit?
- Specify a CPU request and a CPU limit
- Specify a CPU request that is too big for your Nodes
- What if you do not specify a CPU limit?
- Motivation for requests and limits
OK, let's dive straight into the first topic.
What are resource requests and limits?
When you specify a Pod, you can optionally specify how much of each resource a container needs. The most common resources to specify are CPU and memory (RAM); there are others.
When you specify the resource request for containers in a Pod, the kube-scheduler uses this information to decide which node to place the Pod on. When you specify a resource limit for a container, the kubelet enforces those limits so that the running container is not allowed to use more of that resource than the limit you set. The kubelet also reserves at least the request amount of that system resource specifically for that container to use.
If the node where a Pod is running has enough of a resource available, it’s possible (and allowed) for a container to use more of that resource than its request specifies. However, a container is not allowed to use more than its resource limit.
For example, if you set a memory request of 256 MiB for a container, and that container is in a Pod scheduled to a Node with 8 GiB of memory and no other Pods, then the container can try to use more RAM. If you set a memory limit of 4 GiB for that container, the kubelet (and container runtime) enforce the limit. The runtime prevents the container from using more than the configured resource limit. For example: when a process in the container tries to consume more than the allowed amount of memory, the system kernel terminates the process that attempted the allocation, with an out-of-memory (OOM) error.
Limits can be implemented either reactively (the system intervenes once it sees a violation) or by enforcement (the system prevents the container from ever exceeding the limit). Different runtimes can have different ways to implement the same restrictions.
What are resource types?
CPU and memory are each a resource type. A resource type has a base unit. CPU represents compute processing and is specified in units of Kubernetes CPUs. Memory is specified in units of bytes. For Linux workloads, you can specify huge page resources. Huge pages are a Linux-specific feature where the node kernel allocates blocks of memory that are much larger than the default page size.
For example, on a system where the default page size is 4 KiB, you could specify a limit, hugepages-2Mi: 80Mi. If the container tries to allocate more than 40 2 MiB huge pages (a total of 80 MiB), that allocation fails.
CPU and memory are collectively referred to as compute resources, or resources. Compute resources are measurable quantities that can be requested, allocated, and consumed. They are distinct from API resources. API resources, such as Pods and Services are objects that can be read and modified through the Kubernetes API server.
For each container, you can specify resource limits and requests, including the following:
- spec.containers[].resources.limits.cpu
- spec.containers[].resources.limits.memory
- spec.containers[].resources.limits.hugepages-<size>
- spec.containers[].resources.requests.cpu
- spec.containers[].resources.requests.memory
- spec.containers[].resources.requests.hugepages-<size>
Although you can only specify requests and limits for individual containers, it is also useful to think about the overall resource requests and limits for a Pod. For a particular resource, a Pod resource request/limit is the sum of the resource requests/limits of that type for each container in the Pod.
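For instance, a hypothetical two-container Pod (the names and images below are purely illustrative, not part of this exercise) with these requests has an effective Pod-level request of 500m CPU and 500Mi memory:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: sum-demo              # hypothetical example pod
spec:
  containers:
  - name: app
    image: nginx
    resources:
      requests:
        cpu: 100m
        memory: 200Mi
  - name: sidecar
    image: busybox
    resources:
      requests:
        cpu: 400m
        memory: 300Mi
# Pod-level request: cpu = 100m + 400m = 500m; memory = 200Mi + 300Mi = 500Mi
```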
What are resource units?
Limits and requests for CPU resources are measured in cpu units. In Kubernetes, 1 CPU unit is equivalent to 1 physical CPU core, or 1 virtual core, depending on whether the node is a physical host or a virtual machine running inside a physical machine.
Fractional requests are allowed. When you define a container with spec.containers[].resources.requests.cpu set to 0.5, you are requesting half as much CPU time compared to if you asked for 1.0 CPU. For CPU resource units, the quantity expression 0.1 is equivalent to the expression 100m, which can be read as “one hundred millicpu”. Some people say “one hundred millicores”, and this is understood to mean the same thing.
CPU resource is always specified as an absolute amount of resource, never as a relative amount. For example, 500m CPU represents roughly the same amount of computing power whether that container runs on a single-core, dual-core, or 48-core machine.
Limits and requests for memory are measured in bytes. You can express memory as a plain integer or as a fixed-point number using one of these quantity suffixes: E, P, T, G, M, k. You can also use the power-of-two equivalents: Ei, Pi, Ti, Gi, Mi, Ki. For example, the following represent roughly the same value: 128974848, 129e6, 129M, 128974848000m, 123Mi.
Pay attention to the case of the suffixes. If you request 400m of memory, this is a request for 0.4 bytes. Someone who types that probably meant to ask for 400 mebibytes (400Mi) or 400 megabytes (400M).
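As an illustration of these units, the following sketch (the pod name and image are hypothetical, chosen only for the example) shows two equivalent spellings side by side:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: quantity-demo          # hypothetical pod, for illustration only
spec:
  containers:
  - name: app
    image: nginx               # any image would do here
    resources:
      requests:
        cpu: 500m              # same as cpu: "0.5" (half a CPU core)
        memory: 128974848      # plain bytes; roughly the same as 129M or 123Mi
```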
How does Kubernetes apply resource requests and limits?
When the kubelet starts a container as part of a Pod, the kubelet passes that container’s requests and limits for memory and CPU to the container runtime.
On Linux, the container runtime typically configures kernel cgroups that apply and enforce the limits you defined.
- The CPU limit defines a hard ceiling on how much CPU time the container can use. During each scheduling interval (time slice), the Linux kernel checks to see if this limit is exceeded; if so, the kernel waits before allowing that cgroup to resume execution.
- The CPU request typically defines a weighting. If several different containers (cgroups) want to run on a contended system, workloads with larger CPU requests are allocated more CPU time than workloads with small requests.
- The memory request is mainly used during (Kubernetes) Pod scheduling. On a node that uses cgroups v2, the container runtime might use the memory request as a hint to set memory.min and memory.low.
- The memory limit defines a memory limit for that cgroup. If the container tries to allocate more memory than this limit, the Linux kernel out-of-memory subsystem activates and, typically, intervenes by stopping one of the processes in the container that tried to allocate memory. If that process is the container’s PID 1, and the container is marked as restartable, Kubernetes restarts the container.
- The memory limit for the Pod or container can also apply to pages in memory-backed volumes, such as an emptyDir. The kubelet tracks tmpfs emptyDir volumes as container memory use, rather than as local ephemeral storage.
If a container exceeds its memory request and the node that it runs on becomes short of memory overall, it is likely that the Pod the container belongs to will be evicted.
A container might or might not be allowed to exceed its CPU limit for extended periods of time. However, container runtimes don’t terminate Pods or containers for excessive CPU usage.
The kubelet reports the resource usage of a Pod as part of the Pod status.
Prerequisites: before you begin
You need to have a Kubernetes cluster, and the kubectl command-line tool must be configured to communicate with your cluster. If you do not already have a cluster, you can create one by using minikube or you can use one of these Kubernetes playgrounds.
Each node in your cluster must have at least 300 MiB
of memory. A few of the steps on this page require you to run the metrics-server service in your cluster. If you have the metrics-server running, you can skip those steps.
If you are running Minikube, run the following command to enable the metrics-server:
➜ minikube addons enable metrics-server
To see whether the metrics-server is running, or another provider of the resource metrics API (metrics.k8s.io), run the following command:
➜ kubectl get apiservices
If the resource metrics API is available, the output includes a reference to metrics.k8s.io.
Also make sure that you can run kubectl top node and kubectl top pod:
➜ kubectl top node
NAME CPU(cores) CPU% MEMORY(bytes) MEMORY%
minikube 213m 10% 694Mi 17%
minikube-m02 51m 2% 169Mi 4%
➜ kubectl run webapp --image nginx --port 80
pod/webapp created
➜ kubectl get pod
NAME READY STATUS RESTARTS AGE
webapp 1/1 Running 0 9s
➜ kubectl top pod webapp
NAME CPU(cores) MEMORY(bytes)
webapp 0m 0Mi
Specify a memory request and a memory limit
To specify a memory request for a Container, include the resources:requests field in the Container’s resource manifest. To specify a memory limit, include resources:limits. For example, you create a Pod that has one Container with a memory request of 100 MiB and a memory limit of 200 MiB. Here’s the configuration file for the Pod:
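The manifest itself is not present in this capture, but it can be reconstructed from the kubectl describe output below (pod name, image, command, args, and the resources section all appear there); it would look roughly like this:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: pod-resource-memory
  labels:
    app: pod-resource-memory
spec:
  containers:
  - name: pod-resource-memory
    image: polinux/stress
    command: ["stress"]
    # Allocate one worker that holds ~150 MiB of memory
    args: ["--vm", "1", "--vm-bytes", "150M", "--vm-hang", "1"]
    resources:
      requests:
        memory: "100Mi"
      limits:
        memory: "200Mi"
```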
When run, the result looks like this:
➜ kubectl apply -f .\02-workloads\01-pod\pod-resource-memory.yaml
pod/pod-resource-memory created
➜ kubectl get pod
NAME READY STATUS RESTARTS AGE
pod-resource-memory 1/1 Running 0 90s
➜ kubectl describe pod pod-resource-memory
Name: pod-resource-memory
Namespace: default
Priority: 0
Node: minikube-m02/192.168.49.3
Start Time: Tue, 07 Jun 2022 05:43:20 +0700
Labels: app=pod-resource-memory
Annotations: <none>
Status: Running
IP: 10.244.1.2
IPs:
IP: 10.244.1.2
Containers:
pod-resource-memory:
Image: polinux/stress
Port: <none>
Host Port: <none>
Command:
stress
Args:
--vm
1
--vm-bytes
150M
--vm-hang
1
State: Running
Started: Tue, 07 Jun 2022 05:43:28 +0700
Ready: True
Restart Count: 0
Limits:
memory: 200Mi
Requests:
memory: 100Mi
Environment: <none>
Mounts:
/var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-dlp2d (ro)
Conditions:
Type Status
Initialized True
Ready True
ContainersReady True
PodScheduled True
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal Scheduled 2m4s default-scheduler Successfully assigned default/pod-resource-memory to minikube-m02
Normal Pulling 2m4s kubelet Pulling image "polinux/stress"
Normal Pulled 117s kubelet Successfully pulled image "polinux/stress" in 6.9189638s
Normal Created 117s kubelet Created container pod-resource-memory
Normal Started 117s kubelet Started container pod-resource-memory
➜ kubectl top node
NAME CPU(cores) CPU% MEMORY(bytes) MEMORY%
minikube 98m 4% 1285Mi 32%
minikube-m02 28m 1% 435Mi 11%
➜ kubectl top pod pod-resource-memory
NAME CPU(cores) MEMORY(bytes)
pod-resource-memory 16m 150Mi
The output shows that the Pod is using about 150 MiB of memory. This is greater than the Pod’s 100 MiB request, but within the Pod’s 200 MiB limit.
Exceed a Container’s memory limit
A Container can exceed its memory request if the Node has memory available. But a Container is not allowed to use more than its memory limit. If a Container allocates more memory than its limit, the Container becomes a candidate for termination. If the Container continues to consume memory beyond its limit, the Container is terminated. If a terminated Container can be restarted, the kubelet restarts it, as with any other type of runtime failure.
In this exercise, you create a Pod that attempts to allocate more memory than its limit. Here is the configuration file for a Pod that has one Container with a memory request of 50 MiB and a memory limit of 100 MiB:
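The manifest is missing from this capture, but the kubectl describe output below shows the pod name, image, args, and resources, so it presumably looks like this:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: pod-resource-memory-more-limit
spec:
  containers:
  - name: pod-resource-memory-more-limit
    image: polinux/stress
    command: ["stress"]
    # Try to allocate ~250 MiB, well above the 100Mi limit below
    args: ["--vm", "1", "--vm-bytes", "250M", "--vm-hang", "1"]
    resources:
      requests:
        memory: "50Mi"
      limits:
        memory: "100Mi"
```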
In the args section of the configuration file, you can see that the Container will attempt to allocate 250 MiB of memory, which is well above the 100 MiB limit.
If we try running it, the result looks like this:
➜ kubectl apply -f 02-workloads/01-pod/pod-resource-memory-more-limit.yaml
pod/pod-resource-memory-more-limit created
➜ kubectl get pod
NAME READY STATUS RESTARTS AGE
pod-resource-memory-more-limit 0/1 CrashLoopBackOff 5 (19s ago) 3m43s
➜ kubectl describe pod
Name: pod-resource-memory-more-limit
Namespace: default
Priority: 0
Node: minikube/192.168.59.101
Start Time: Sat, 11 Jun 2022 09:59:23 +0700
Labels: <none>
Annotations: <none>
Status: Running
IP: 172.17.0.4
IPs:
IP: 172.17.0.4
Containers:
pod-resource-memory-more-limit:
Image: polinux/stress
Port: <none>
Host Port: <none>
Command:
stress
Args:
--vm
1
--vm-bytes
250M
--vm-hang
1
State: Waiting
Reason: CrashLoopBackOff
Last State: Terminated
Reason: OOMKilled
Exit Code: 1
Started: Sat, 11 Jun 2022 10:02:47 +0700
Finished: Sat, 11 Jun 2022 10:02:47 +0700
Ready: False
Restart Count: 5
Limits:
memory: 100Mi
Requests:
memory: 50Mi
Environment: <none>
Mounts:
/var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-cg4sj (ro)
Conditions:
Type Status
Initialized True
Ready False
ContainersReady False
PodScheduled True
Volumes:
kube-api-access-cg4sj:
Type: Projected (a volume that contains injected data from multiple sources)
TokenExpirationSeconds: 3607
ConfigMapName: kube-root-ca.crt
ConfigMapOptional: <nil>
DownwardAPI: true
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal Scheduled 4m6s default-scheduler Successfully assigned default/pod-resource-memory-more-limit to minikube
Normal Pulled 3m57s kubelet Successfully pulled image "polinux/stress" in 7.376668096s
Normal Pulled 3m53s kubelet Successfully pulled image "polinux/stress" in 3.19955887s
Normal Pulled 3m36s kubelet Successfully pulled image "polinux/stress" in 3.142850674s
Normal Created 3m10s (x4 over 3m57s) kubelet Created container pod-resource-memory-more-limit
Normal Started 3m10s (x4 over 3m57s) kubelet Started container pod-resource-memory-more-limit
Normal Pulled 3m10s kubelet Successfully pulled image "polinux/stress" in 3.17531734s
Warning BackOff 2m34s (x8 over 3m52s) kubelet Back-off restarting failed container
Normal Pulling 2m19s (x5 over 4m5s) kubelet Pulling image "polinux/stress"
The output shows that the Container was killed because it is out of memory (OOM):
Last State: Terminated
Reason: OOMKilled
Exit Code: 1
Started: Sat, 11 Jun 2022 10:02:47 +0700
Finished: Sat, 11 Jun 2022 10:02:47 +0700
The output shows that the Container is killed, restarted, killed again, restarted again, and so on:
➜ kubectl get pod -w
NAME READY STATUS RESTARTS AGE
pod-resource-memory-more-limit 0/1 CrashLoopBackOff 1 (10s ago) 24s
pod-resource-memory-more-limit 0/1 OOMKilled 2 (20s ago) 34s
pod-resource-memory-more-limit 0/1 CrashLoopBackOff 2 (11s ago) 45s
pod-resource-memory-more-limit 0/1 OOMKilled 3 (30s ago) 64s
pod-resource-memory-more-limit 0/1 CrashLoopBackOff 3 (14s ago) 78s
Cleanup:
kubectl delete pod --all
Specify a memory request that is too big for your Nodes
Memory requests and limits are associated with Containers, but it is useful to think of a Pod as having a memory request and limit. The memory request for the Pod is the sum of the memory requests for all the Containers in the Pod. Likewise, the memory limit for the Pod is the sum of the limits of all the Containers in the Pod.
Pod scheduling is based on requests. A Pod is scheduled to run on a Node only if the Node has enough available memory to satisfy the Pod’s memory request.
In this exercise, you create a Pod that has a memory request so big that it exceeds the available capacity of the Nodes in your cluster. Here is the configuration file for a Pod that has one Container with a request for 2 GiB of memory, which exceeds the memory available on the Nodes in this example cluster:
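Reconstructed from the kubectl describe output below (note that the pod keeps the name pod-resource-memory-more-limit, as the apply output also shows), the manifest presumably looks like this:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: pod-resource-memory-more-limit
spec:
  containers:
  - name: pod-resource-memory-more-limit
    image: polinux/stress
    command: ["stress"]
    args: ["--vm", "1", "--vm-bytes", "250M", "--vm-hang", "1"]
    resources:
      # A 2Gi request: more memory than any node in this cluster can offer,
      # so the Pod stays Pending. No limit is set in this exercise.
      requests:
        memory: "2Gi"
```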
When run, it looks like this:
➜ kubectl apply -f 02-workloads/01-pod/pod-resource-memory-request-big-then-node.yaml
pod/pod-resource-memory-more-limit created
➜ kubectl get pod
NAME READY STATUS RESTARTS AGE
pod-resource-memory-more-limit 0/1 Pending 0 15s
➜ kubectl get pod
NAME READY STATUS RESTARTS AGE
pod-resource-memory-more-limit 0/1 Pending 0 15s
➜ kubectl describe pod pod-resource-memory-more-limit
Name: pod-resource-memory-more-limit
Namespace: default
Priority: 0
Node: <none>
Labels: <none>
Annotations: <none>
Status: Pending
IP:
IPs: <none>
Containers:
pod-resource-memory-more-limit:
Image: polinux/stress
Port: <none>
Host Port: <none>
Command:
stress
Args:
--vm
1
--vm-bytes
250M
--vm-hang
1
Requests:
memory: 2Gi
Environment: <none>
Mounts:
/var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-9wpjn (ro)
Conditions:
Type Status
PodScheduled False
QoS Class: Burstable
Node-Selectors: <none>
Tolerations: node.kubernetes.io/not-ready:NoExecute op=Exists for 300s
node.kubernetes.io/unreachable:NoExecute op=Exists for 300s
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Warning FailedScheduling 34s default-scheduler 0/1 nodes are available: 1 Insufficient memory
The output shows that the Pod status is Pending. That is, the Pod is not scheduled to run on any Node, and it will remain in the Pending state indefinitely.
What if you do not specify a memory limit?
If you do not specify a memory limit for a Container, one of the following situations applies:
- The Container has no upper bound on the amount of memory it uses. The Container could use all of the memory available on the Node where it is running which in turn could invoke the OOM Killer. Further, in case of an OOM Kill, a container with no resource limits will have a greater chance of being killed.
- The Container is running in a namespace that has a default memory limit, and the Container is automatically assigned the default limit. Cluster administrators can use a LimitRange to specify a default value for the memory limit.
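The manifest for this exercise is not shown in the capture; based on the pod name in the apply output and the roughly 251 Mi of usage reported by kubectl top below, it presumably looks something like this (the image and stress arguments are assumptions):

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: pod-resource-memory-no-limit
spec:
  containers:
  - name: pod-resource-memory-no-limit
    image: polinux/stress          # assumed: same image as the other memory exercises
    command: ["stress"]
    # Assumed from the ~251Mi usage shown by kubectl top
    args: ["--vm", "1", "--vm-bytes", "250M", "--vm-hang", "1"]
    # No resources block: the container has no memory request or limit
```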
When run, it looks like this:
➜ kubectl apply -f 02-workloads/01-pod/pod-resource-memory-no-limit.yaml
pod/pod-resource-memory-no-limit created
➜ kubectl get pod
NAME READY STATUS RESTARTS AGE
pod-resource-memory-no-limit 1/1 Running 0 15s
➜ kubectl top node
NAME CPU(cores) CPU% MEMORY(bytes) MEMORY%
latihan 281m 14% 1593Mi 80%
➜ kubectl top pod pod-resource-memory-no-limit
NAME CPU(cores) MEMORY(bytes)
pod-resource-memory-no-limit 106m 251Mi
Specify a CPU request and a CPU limit
This page shows how to assign a CPU request and a CPU limit to a container. Containers cannot use more CPU than the configured limit. Provided the system has CPU time free, a container is guaranteed to be allocated as much CPU as it requests.
Your cluster must have at least 1 CPU available for use to run the task examples:
In this exercise, you create a Pod that has one container. The container has a request of 0.5 CPU
and a limit of 1 CPU
. Here is the configuration file for the Pod:
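The manifest is not shown in this capture; based on the pod name in the apply output, the args described below, and the image used in the later CPU exercise (vish/stress, an assumption here), it presumably looks like this:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: pod-resource-cpu
spec:
  containers:
  - name: pod-resource-cpu
    image: vish/stress             # assumed: same image as the next CPU exercise
    # Tell the stress container to try to use 2 CPUs
    args: ["-cpus", "2"]
    resources:
      requests:
        cpu: "0.5"                 # equivalently: 500m
      limits:
        cpu: "1"
```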
The args section of the configuration file provides arguments for the container when it starts. The -cpus "2" argument tells the Container to attempt to use 2 CPUs.
When run, the result looks like this:
➜ kubectl apply -f 02-workloads/01-pod/pod-resource-cpu-unit.yaml
pod/pod-resource-cpu created
➜ kubectl get pod
NAME READY STATUS RESTARTS AGE
pod-resource-cpu 1/1 Running 0 9s
➜ kubectl top node
NAME CPU(cores) CPU% MEMORY(bytes) MEMORY%
latihan 1156m 57% 1350Mi 68%
➜ kubectl top pod pod-resource-cpu
NAME CPU(cores) MEMORY(bytes)
pod-resource-cpu 999m 1Mi
Recall that by setting -cpus "2", you configured the Container to attempt to use 2 CPUs, but the Container is only being allowed to use about 1 CPU. The container’s CPU use is being throttled, because the container is attempting to use more CPU resources than its limit.
Specify a CPU request that is too big for your Nodes
CPU requests and limits are associated with Containers, but it is useful to think of a Pod as having a CPU request and limit. The CPU request for a Pod is the sum of the CPU requests for all the Containers in the Pod. Likewise, the CPU limit for a Pod is the sum of the CPU limits for all the Containers in the Pod.
Pod scheduling is based on requests. A Pod is scheduled to run on a Node only if the Node has enough CPU resources available to satisfy the Pod CPU request.
In this exercise, you create a Pod that has a CPU request so big that it exceeds the capacity of any Node in your cluster. Here is the configuration file for a Pod that has one Container with a CPU request of 3 and a CPU limit of 5:
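The manifest can be reconstructed from the kubectl describe output below (container name pod-1, image vish/stress, the args, and the resources section all appear there):

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: pod-resource-cpu-more-than-node
spec:
  containers:
  - name: pod-1
    image: vish/stress
    args: ["-cpus", "2"]
    resources:
      # A 3-CPU request: more than any node in this cluster can offer,
      # so the Pod stays Pending
      requests:
        cpu: "3"
      limits:
        cpu: "5"
```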
When run, it looks like this:
➜ kubectl apply -f 02-workloads/01-pod/pod-resource-cpu-more-than-nodes.yaml
pod/pod-resource-cpu-more-than-node created
➜ kubectl get pod
NAME READY STATUS RESTARTS AGE
pod-resource-cpu-more-than-node 0/1 Pending 0 9s
➜ kubectl describe pod pod-resource-cpu-more-than-node
Name: pod-resource-cpu-more-than-node
Namespace: default
Priority: 0
Node: <none>
Labels: <none>
Annotations: <none>
Status: Pending
IP:
IPs: <none>
Containers:
pod-1:
Image: vish/stress
Port: <none>
Host Port: <none>
Args:
-cpus=2
Limits:
cpu: 5
Requests:
cpu: 3
Environment: <none>
Mounts:
/var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-rz4cm (ro)
Conditions:
Type Status
PodScheduled False
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Warning FailedScheduling 30s default-scheduler 0/1 nodes are available: 1 Insufficient cpu.
The output shows that the Pod status is Pending. That is, the Pod has not been scheduled to run on any Node, and it will remain in the Pending state indefinitely.
What if you do not specify a CPU limit?
If you do not specify a CPU limit for a Container, then one of these situations applies:
- The Container has no upper bound on the CPU resources it can use. The Container could use all of the CPU resources available on the Node where it is running.
- The Container is running in a namespace that has a default CPU limit, and the Container is automatically assigned the default limit. Cluster administrators can use a LimitRange to specify a default value for the CPU limit.
If you specify a CPU limit for a Container but do not specify a CPU request, Kubernetes automatically assigns a CPU request that matches the limit. Similarly, if a Container specifies its own memory limit, but does not specify a memory request, Kubernetes automatically assigns a memory request that matches the limit.
Motivation for requests and limits
By configuring the CPU & Memory requests and limits of the Containers that run in your cluster, you can make efficient use of the resources available on your cluster Nodes. By keeping a Pod CPU & Memory request low, you give the Pod a good chance of being scheduled. By having a CPU & Memory limit that is greater than the CPU & Memory request, you accomplish two things:
- The Pod can have bursts of activity where it makes use of CPU and memory resources that happen to be available.
- The amount of CPU & memory resources a Pod can use during a burst is limited to some reasonable amount.