Reserve resources for kubelet

During the operation of a kubernetes cluster, kubelet needs to send heartbeats to the API server regularly and report the node's status.

There is a real hazard hidden here.

Because kubelet shares the node's CPU and RAM with the pods running on it, in extreme cases kubelet may not get enough resources to run itself.

In other words, pods can consume all of the node's physical resources, so the heartbeat stops or arrives late, and the API server may mistakenly conclude that the node is offline.
Ideally, the pods deployed to a node should never consume 100% of its physical resources.
For example:

nine pods requesting 100m CPU each on a single 1000m core, so in theory 100m remains free.
However, cluster environments are extremely complex: there is no guarantee that the scheduler leaves headroom on every node, and no guarantee that every operator is aware of this problem.
So nodes often end up overloaded, and kubelet's heartbeat stops.
One solution is simple and crude:

Deploy a "placeholder" application and set its CPU request to 100m.
Because this placeholder application (an nginx, for example) serves no real traffic, it will not actually consume the CPU.
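A minimal sketch of such a placeholder deployment (the name and the nginx:alpine image here are illustrative, not taken from the original cluster):

apiVersion: apps/v1
kind: Deployment
metadata:
  name: cpu-placeholder          # hypothetical name
spec:
  replicas: 1
  selector:
    matchLabels:
      app: cpu-placeholder
  template:
    metadata:
      labels:
        app: cpu-placeholder
    spec:
      containers:
      - name: nginx
        image: nginx:alpine      # idle nginx; it serves nothing, so it barely uses CPU
        resources:
          requests:
            cpu: 100m            # the scheduler now counts 100m as taken on this node

To hold back 100m on every node rather than on a single one, the same pod spec could be run as a DaemonSet instead.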

With requests this is easy to control: a single 1000m core can hold nine normal applications at 100m each plus the 100m placeholder nginx.
However, requests only influence scheduling and the relative weight in CPU contention; they cannot guarantee that CPU will not actually be exhausted.
What about setting limits to 100m for each of the nine applications and 100m for the placeholder nginx? That caps real usage, but limits can be overcommitted, so there is no guarantee that every administrator will leave that 100m for you.
Conclusion:
It is best to write only limits; Kubernetes then defaults requests to the same value as limits.
Because requests cannot be overcommitted, once the placeholder nginx holds 100m, the nine normal applications that only declare limits cannot claim more than the remaining 900m.
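As an illustrative snippet (the names and image are hypothetical), an application container that declares only limits looks like this:

apiVersion: v1
kind: Pod
metadata:
  name: normal-app               # hypothetical name
spec:
  containers:
  - name: app
    image: example/app:latest    # placeholder image
    resources:
      limits:
        cpu: 100m                # no requests given: Kubernetes copies the limit into requests,
                                 # so the scheduler books a full 100m for this container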
The above is a folk remedy for this problem. Does kubernetes itself offer a native way to deal with it?
The answer is yes, but it takes more work:

  • --kube-reserved is used to configure the amount of resources reserved for kubernetes components (kubelet, kube-proxy, dockerd, etc.), for example --kube-reserved=cpu=1000m,memory=8Gi,ephemeral-storage=16Gi.
  • --kube-reserved-cgroup: if you set --kube-reserved, you must also set the corresponding cgroup, and the cgroup directory must be created in advance; kubelet will not create it automatically and will fail to start. For example, set it to --kube-reserved-cgroup=/kubelet.service. If this item is not set, --kube-reserved above will not take effect.
  • --system-reserved is used to configure the amount of resources reserved for system processes, for example --system-reserved=cpu=500m,memory=4Gi,ephemeral-storage=4Gi.
  • --system-reserved-cgroup: if --system-reserved is set, the corresponding cgroup must also be set and the cgroup directory created in advance; kubelet will not create it automatically and will fail to start. For example, set it to --system-reserved-cgroup=/system.slice. If this item is not set, --system-reserved above will not take effect.

These reservations are subtracted from the node's Allocatable, so the scheduler never hands that capacity out to pods. A kubelet systemd unit that sets these flags looks like this:
[Unit]
Description=Kubernetes Kubelet
Documentation=https://github.com/GoogleCloudPlatform/kubernetes
After=docker.service
Requires=docker.service

[Service]
WorkingDirectory=/var/lib/kubelet
ExecStartPre=-/bin/mkdir -p /sys/fs/cgroup/cpuset/system.slice/kubelet.service /sys/fs/cgroup/hugetlb/system.slice/kubelet.service
ExecStart=/opt/kubernetes/bin/kubelet \
  --eviction-hard=memory.available<1024Mi,nodefs.available<10%,nodefs.inodesFree<5% \
  --system-reserved=cpu=0.5,memory=1G \
  --kube-reserved=cpu=0.5,memory=1G \
  --kube-reserved-cgroup=/system.slice/kubelet.service \
  --system-reserved-cgroup=/system.slice \
  --enforce-node-allocatable=pods,kube-reserved,system-reserved \
  --address=192.168.0.101 \
  --hostname-override=192.168.0.101 \
  --cgroup-driver=cgroupfs \
  --pod-infra-container-image=hub.breezey.top/kubernetes/pause-amd64:3.0 \
  --experimental-bootstrap-kubeconfig=/opt/kubernetes/cfg/bootstrap.kubeconfig \
  --kubeconfig=/opt/kubernetes/cfg/kubelet.kubeconfig \
  --cert-dir=/opt/kubernetes/ssl \
  --cluster-dns=192.168.0.200 \
  --cluster-domain=k8s.breezey.top. \
  --hairpin-mode=promiscuous-bridge \
  --allow-privileged=true \
  --fail-swap-on=false \
  --serialize-image-pulls=false \
  --max-pods=60 \
  --logtostderr=true \
  --v=2 
Restart=on-failure
RestartSec=5

[Install]
WantedBy=multi-user.target

Unfortunately, a kubernetes cluster installed by RKE does not work this way.

By default there is no kubelet.service at all, because RKE runs kubelet as a container.

The reservation settings have to be added manually

and passed to kubelet through its parameters, for example as shown below.
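One possible way to pass these parameters in RKE (a sketch assuming the RKE v1 cluster.yml format; the values simply mirror the systemd unit above and are not taken from the original cluster) is the kubelet's extra_args section in cluster.yml:

# cluster.yml (fragment): kubelet runs as a container, so its flags are set here
services:
  kubelet:
    extra_args:
      kube-reserved: "cpu=0.5,memory=1G"
      system-reserved: "cpu=0.5,memory=1G"
      eviction-hard: "memory.available<1024Mi,nodefs.available<10%,nodefs.inodesFree<5%"
      # enforce-node-allocatable and the *-reserved-cgroup flags can be added as well,
      # but the referenced cgroup directories must then exist inside the kubelet container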

In a default RKE installation, the kubelet process runs under the following cgroup:

6:cpu,cpuacct:/docker/4b28805380958bf3efb5c0d0a9c93ba42fab4d36e1be9036f3b9c4b01ef3bd62
