1. The cluster is broken. We tried deploying an application but it's not working. Troubleshoot and fix the issue.
Start by looking at the deployments.
controlplane ~ ➜ k get all -A
NAMESPACE NAME READY STATUS RESTARTS AGE
default pod/app-5f58858856-hdm6s 0/1 Pending 0 45s
kube-flannel pod/kube-flannel-ds-rrw58 1/1 Running 0 2m37s
kube-system pod/coredns-768b85b76f-9xvz5 1/1 Running 0 2m37s
kube-system pod/coredns-768b85b76f-fprfj 1/1 Running 0 2m37s
kube-system pod/etcd-controlplane 1/1 Running 0 2m52s
kube-system pod/kube-apiserver-controlplane 1/1 Running 0 2m52s
kube-system pod/kube-controller-manager-controlplane 1/1 Running 0 2m52s
kube-system pod/kube-proxy-5lbmf 1/1 Running 0 2m37s
kube-system pod/kube-scheduler-controlplane 0/1 CrashLoopBackOff 2 (24s ago) 47s
controlplane ~ ✖ k describe po -n kube-system kube-scheduler-controlplane
Name: kube-scheduler-controlplane
Namespace: kube-system
Priority: 2000001000
Priority Class Name: system-node-critical
Node: controlplane/192.21.146.6
Start Time: Sun, 25 Aug 2024 12:50:42 +0000
Labels: component=kube-scheduler
...
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Warning Failed 2m57s (x4 over 3m52s) kubelet Error: failed to create containerd task: failed to create shim task: OCI runtime create failed: runc create failed: unable to start container process: exec: "kube-schedulerrrr": executable file not found in $PATH: unknown
Warning BackOff 2m28s (x11 over 3m50s) kubelet Back-off restarting failed container kube-scheduler in pod kube-scheduler-controlplane_kube-system(e4007249bf4f8d68970f15caac4f8d27)
Normal Pulled 2m17s (x5 over 3m52s) kubelet Container image "registry.k8s.io/kube-scheduler:v1.30.0" already present on machine
Normal Created 2m17s (x5 over 3m52s) kubelet Created container kube-scheduler
The kube-scheduler appears to be failing, which also explains why the app pod is stuck in Pending: nothing can be scheduled without it. To fix this, check the kube-scheduler manifest.
controlplane /etc/kubernetes ➜ vi manifests/kube-scheduler.yaml
---
apiVersion: v1
kind: Pod
metadata:
  creationTimestamp: null
  labels:
    component: kube-scheduler
    tier: control-plane
  name: kube-scheduler
  namespace: kube-system
spec:
  containers:
  - command:
    - kube-scheduler   ## typo fixed (was kube-schedulerrrr)
    - --authentication-kubeconfig=/etc/kubernetes/scheduler.conf
    - --authorization-kubeconfig=/etc/kubernetes/scheduler.conf
    - --bind-address=127.0.0.1
    - --kubeconfig=/etc/kubernetes/scheduler.conf
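Because static pod manifests under /etc/kubernetes/manifests are watched by the kubelet, saving the corrected file is enough; no kubectl apply is needed. As a sketch, the typo could also be fixed with a one-line sed (demonstrated here on a temporary copy rather than the live manifest):

```shell
# Demo on a temp copy; on the real node you would run sed against
# /etc/kubernetes/manifests/kube-scheduler.yaml and the kubelet
# would restart the static pod on its own.
cat > /tmp/kube-scheduler.yaml <<'EOF'
spec:
  containers:
  - command:
    - kube-schedulerrrr
    - --kubeconfig=/etc/kubernetes/scheduler.conf
EOF

# Replace the misspelled binary name with the correct one.
sed -i 's/kube-schedulerrrr/kube-scheduler/' /tmp/kube-scheduler.yaml

# Show the fixed command entry.
grep -- 'kube-scheduler$' /tmp/kube-scheduler.yaml
```

After a short back-off delay, `k get po -n kube-system` should show the scheduler Running again.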
2. Scale the deployment app to 2 pods.
controlplane /etc/kubernetes ➜ kubectl scale --replicas=2 deployment/app
deployment.apps/app scaled
3. Even though the deployment was scaled to 2, the number of PODs does not seem to increase. Investigate and fix the issue.
Inspect the component responsible for managing deployments and replicasets.
controlplane /etc/kubernetes ➜ k get pods -A
NAMESPACE NAME READY STATUS RESTARTS AGE
default app-5f58858856-hdm6s 1/1 Running 0 16m
kube-flannel kube-flannel-ds-rrw58 1/1 Running 0 17m
kube-system coredns-768b85b76f-9xvz5 1/1 Running 0 17m
kube-system coredns-768b85b76f-fprfj 1/1 Running 0 17m
kube-system etcd-controlplane 1/1 Running 0 18m
kube-system kube-apiserver-controlplane 1/1 Running 0 18m
kube-system kube-controller-manager-controlplane 0/1 CrashLoopBackOff 5 (2m26s ago) 5m29s
kube-system kube-proxy-5lbmf 1/1 Running 0 17m
kube-system kube-scheduler-controlplane 1/1 Running 0 7m15s
controlplane /etc/kubernetes ➜ k logs -n kube-system kube-controller-manager-controlplane
I0825 13:06:30.755473 1 serving.go:380] Generated self-signed cert in-memory
E0825 13:06:30.755606 1 run.go:74] "command failed" err="stat /etc/kubernetes/controller-manager-XXXX.conf: no such file or directory"
vi /etc/kubernetes/manifests/kube-controller-manager.yaml
---
apiVersion: v1
kind: Pod
metadata:
  creationTimestamp: null
  labels:
    component: kube-controller-manager
    tier: control-plane
  name: kube-controller-manager
  namespace: kube-system
spec:
  containers:
  - command:
    - kube-controller-manager
    - --allocate-node-cidrs=true
    - --authentication-kubeconfig=/etc/kubernetes/controller-manager.conf
    - --authorization-kubeconfig=/etc/kubernetes/controller-manager.conf
    - --bind-address=127.0.0.1
    - --client-ca-file=/etc/kubernetes/pki/ca.crt
    - --cluster-cidr=10.244.0.0/16
    - --cluster-name=kubernetes
    - --cluster-signing-cert-file=/etc/kubernetes/pki/ca.crt
    - --cluster-signing-key-file=/etc/kubernetes/pki/ca.key
    - --controllers=*,bootstrapsigner,tokencleaner
    - --kubeconfig=/etc/kubernetes/controller-manager.conf   ## typo fixed (was controller-manager-XXXX.conf)
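The error was a failed stat on a file path, so a quick way to triage this class of failure is to extract every absolute path the manifest's flags reference and `ls` each one; the path that fails is the typo. A minimal sketch, run against a trimmed copy of a few of the flags from the lab:

```shell
# Demo input: a few --flag=/path arguments from the manifest.
cat > /tmp/kcm-flags.txt <<'EOF'
- --authentication-kubeconfig=/etc/kubernetes/controller-manager.conf
- --client-ca-file=/etc/kubernetes/pki/ca.crt
- --kubeconfig=/etc/kubernetes/controller-manager-XXXX.conf
EOF

# grep -o keeps only the '=/path' part; cut strips the leading '='.
grep -o -- '=/[^ ]*' /tmp/kcm-flags.txt | cut -c2-
```

Running `ls` on each printed path would immediately single out /etc/kubernetes/controller-manager-XXXX.conf as the file that does not exist.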
4. Something is wrong with scaling again. We just tried scaling the deployment to 3 replicas. But it's not happening.
Investigate and fix the issue.
controlplane /etc/kubernetes ➜ k get po -A
NAMESPACE NAME READY STATUS RESTARTS AGE
default app-5f58858856-7lcsf 1/1 Running 0 2m33s
default app-5f58858856-hdm6s 1/1 Running 0 21m
kube-flannel kube-flannel-ds-rrw58 1/1 Running 0 23m
kube-system coredns-768b85b76f-9xvz5 1/1 Running 0 23m
kube-system coredns-768b85b76f-fprfj 1/1 Running 0 23m
kube-system etcd-controlplane 1/1 Running 0 23m
kube-system kube-apiserver-controlplane 1/1 Running 0 23m
kube-system kube-controller-manager-controlplane 0/1 CrashLoopBackOff 4 (30s ago) 2m8s
kube-system kube-proxy-5lbmf 1/1 Running 0 23m
kube-system kube-scheduler-controlplane 1/1 Running 0 12m
controlplane /etc/kubernetes ➜ k logs -n kube-system kube-controller-manager-controlplane
I0825 13:15:13.683980 1 serving.go:380] Generated self-signed cert in-memory
E0825 13:15:14.073764 1 run.go:74] "command failed" err="unable to load client CA provider: open /etc/kubernetes/pki/ca.crt: no such file or directory"
controlplane /etc/kubernetes/manifests ✖ vi /etc/kubernetes/manifests/kube-controller-manager.yaml
---
...
  volumes:
  - hostPath:
      path: /etc/ssl/certs
      type: DirectoryOrCreate
    name: ca-certs
  - hostPath:
      path: /etc/ca-certificates
      type: DirectoryOrCreate
    name: etc-ca-certificates
  - hostPath:
      path: /usr/libexec/kubernetes/kubelet-plugins/volume/exec
      type: DirectoryOrCreate
    name: flexvolume-dir
  - hostPath:
      path: /etc/kubernetes/pki   ## fixed (was /etc/kubernetes/WRONG-PKI-DIRECTORY)
      type: DirectoryOrCreate
...
## The volume's hostPath was pointing at the wrong directory; it must be /etc/kubernetes/pki.
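The same sed approach works here; since the pattern contains slashes, using `|` as the sed delimiter avoids escaping them (again sketched on a temp copy, not the live manifest):

```shell
# Demo copy of the broken volume entry.
cat > /tmp/kcm-volume.yaml <<'EOF'
- hostPath:
    path: /etc/kubernetes/WRONG-PKI-DIRECTORY
    type: DirectoryOrCreate
EOF

# '|' as the delimiter keeps the slash-heavy paths readable.
sed -i 's|/etc/kubernetes/WRONG-PKI-DIRECTORY|/etc/kubernetes/pki|' /tmp/kcm-volume.yaml

# Confirm the corrected hostPath.
grep 'path:' /tmp/kcm-volume.yaml
```

Once the manifest is saved, the kubelet restarts kube-controller-manager, the ReplicaSet controller resumes, and the deployment reaches 3 replicas.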