1. We have a working Kubernetes cluster with a set of web applications running. Let us first explore the setup.
How many deployments exist in the cluster in the default namespace?
controlplane ~ ➜ k get deployments.apps
NAME   READY   UP-TO-DATE   AVAILABLE   AGE
blue   3/3     3            3           77s
red    2/2     2            2           77s
answer : 2
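If you prefer not to count rows by eye, they can be counted directly (a quick one-liner, not required by the lab):
controlplane ~ ➜ k get deployments.apps --no-headers | wc -l
2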
2. What is the version of ETCD running on the cluster?
Check the ETCD Pod or Process
controlplane ~ ➜ k describe po -n kube-system etcd-controlplane
Name:             etcd-controlplane
Namespace:        kube-system
..
Containers:
  etcd:
    Container ID:   containerd://7ac8255e51393f0ab3d8aef904c401ff030606158b1140cb0c38f9d1d97bdde0
    Image:          registry.k8s.io/etcd:3.5.12-0
..
answer : 3.5.12
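Alternatively, the image tag can be extracted directly with jsonpath; this assumes etcd is the first (and only) container in the Pod spec:
controlplane ~ ➜ k get po -n kube-system etcd-controlplane -o jsonpath='{.spec.containers[0].image}'
registry.k8s.io/etcd:3.5.12-0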
3. At what address can you reach the ETCD cluster from the controlplane node?
Check the etcd command-line flags in the etcd Pod spec:
Command:
etcd
--advertise-client-urls=https://192.25.191.9:2379
--cert-file=/etc/kubernetes/pki/etcd/server.crt
--client-cert-auth=true
--data-dir=/var/lib/etcd
--experimental-initial-corrupt-check=true
--experimental-watch-progress-notify-interval=5s
--initial-advertise-peer-urls=https://192.25.191.9:2380
--initial-cluster=controlplane=https://192.25.191.9:2380
--key-file=/etc/kubernetes/pki/etcd/server.key
--listen-client-urls=https://127.0.0.1:2379,https://192.25.191.9:2379
--listen-metrics-urls=http://127.0.0.1:2381
--listen-peer-urls=https://192.25.191.9:2380
--name=controlplane
--peer-cert-file=/etc/kubernetes/pki/etcd/peer.crt
--peer-client-cert-auth=true
--peer-key-file=/etc/kubernetes/pki/etcd/peer.key
--peer-trusted-ca-file=/etc/kubernetes/pki/etcd/ca.crt
--snapshot-count=10000
--trusted-ca-file=/etc/kubernetes/pki/etcd/ca.crt
"Reaching" etcd from the control plane node means the client URLs etcd is listening on, i.e. the --listen-client-urls flag.
answer : https://127.0.0.1:2379
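The same flag can also be read straight from the static Pod manifest on the control plane node:
controlplane ~ ➜ grep listen-client-urls /etc/kubernetes/manifests/etcd.yaml
    - --listen-client-urls=https://127.0.0.1:2379,https://192.25.191.9:2379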
4. Where is the ETCD server certificate file located? Note this path down as you will need to use it later
From the flags above, this is the --cert-file path.
answer : /etc/kubernetes/pki/etcd/server.crt
5. Where is the ETCD CA Certificate file located? Note this path down as you will need to use it later.
From the flags above, this is the --trusted-ca-file path (the ca.crt file).
answer : /etc/kubernetes/pki/etcd/ca.crt
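Both paths (questions 4 and 5) can be pulled from the manifest in one command; note the pattern also matches the peer-certificate flags:
controlplane ~ ➜ grep -E 'cert-file|trusted-ca-file' /etc/kubernetes/manifests/etcd.yaml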
6. The master node in our cluster is planned for a regular maintenance reboot tonight. While we do not anticipate anything to go wrong, we are required to take the necessary backups. Take a snapshot of the ETCD database using the built-in snapshot functionality. Store the backup file at location /opt/snapshot-pre-boot.db
controlplane ~ ➜ ETCDCTL_API=3 etcdctl --endpoints=https://127.0.0.1:2379 \
--cacert=/etc/kubernetes/pki/etcd/ca.crt \
--cert=/etc/kubernetes/pki/etcd/server.crt \
--key=/etc/kubernetes/pki/etcd/server.key \
snapshot save /opt/snapshot-pre-boot.db
Snapshot saved at /opt/snapshot-pre-boot.db
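Before the reboot it is worth verifying the snapshot. etcdctl snapshot status prints its hash, revision and size (on etcd 3.5 this subcommand is deprecated in favour of etcdutl, but it still works):
controlplane ~ ➜ ETCDCTL_API=3 etcdctl snapshot status /opt/snapshot-pre-boot.db --write-out=table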
7. Great! Let us now wait for the maintenance window to finish. Go get some sleep. (Don't go for real)
Click Ok to Continue
8. Wake up! We have a conference call! After the reboot the master nodes came back online, but none of our applications are accessible. Check the status of the applications on the cluster. What's wrong?
controlplane ~ ➜ k get pods
No resources found in default namespace.
controlplane ~ ➜ k get svc
NAME         TYPE        CLUSTER-IP   EXTERNAL-IP   PORT(S)   AGE
kubernetes   ClusterIP   10.96.0.1    <none>        443/TCP   70s
controlplane ~ ➜ k get deployments.apps
No resources found in default namespace.
Across the board, none of the application resources exist anymore: no Pods, no Deployments, and only the default kubernetes Service.
answer : All of the above
9. Luckily we took a backup. Restore the original state of the cluster using the backup file.
ETCDCTL_API=3 etcdctl --data-dir /var/lib/etcd-from-backup snapshot restore /opt/snapshot-pre-boot.db
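Note that the restore only unpacks the snapshot file into a new data directory; it does not contact the live cluster, so no endpoint or certificate flags are required here. On etcd 3.5+ the same operation is also available through etcdutl:
etcdutl snapshot restore /opt/snapshot-pre-boot.db --data-dir /var/lib/etcd-from-backup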
After the restore, edit the etcd static Pod manifest so that the etcd-data volume points at the restored directory.
controlplane ~ ✖ vi /etc/kubernetes/manifests/etcd.yaml
...
apiVersion: v1
kind: Pod
metadata:
  annotations:
    kubeadm.kubernetes.io/etcd.advertise-client-urls: https://192.25.191.9:2379
  creationTimestamp: null
  labels:
    component: etcd
    tier: control-plane
  name: etcd
  namespace: kube-system
...
  volumes:
  - hostPath:
      path: /etc/kubernetes/pki/etcd
      type: DirectoryOrCreate
    name: etcd-certs
  - hostPath:
      path: /var/lib/etcd        # -> change to the restored path: /var/lib/etcd-from-backup
      type: DirectoryOrCreate
    name: etcd-data
status: {}
Once the file is saved, the kubelet detects the manifest change and automatically recreates the etcd static Pod against the restored data directory.
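The Pod can take a minute or two to come back, and kubectl may be briefly unavailable while etcd and the API server restart; in that case you can watch the container runtime directly, and restart the kubelet if nothing happens:
controlplane ~ ➜ watch crictl ps
controlplane ~ ➜ systemctl restart kubelet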
controlplane ~ ➜ k get pods
NAME                   READY   STATUS    RESTARTS   AGE
blue-fffb6db8d-2crql   1/1     Running   0          32m
blue-fffb6db8d-bc9nf   1/1     Running   0          32m
blue-fffb6db8d-hzl87   1/1     Running   0          32m
red-85c9fd5d6f-fqpq9   1/1     Running   0          32m
red-85c9fd5d6f-mm6j4   1/1     Running   0          32m
controlplane ~ ➜ k get svc
NAME           TYPE        CLUSTER-IP       EXTERNAL-IP   PORT(S)        AGE
blue-service   NodePort    10.111.169.151   <none>        80:30082/TCP   32m
kubernetes     ClusterIP   10.96.0.1        <none>        443/TCP        33m
red-service    NodePort    10.101.1.98      <none>        80:30080/TCP   32m
controlplane ~ ➜ k get deployments.apps
NAME   READY   UP-TO-DATE   AVAILABLE   AGE
blue   3/3     3            3           32m
red    2/2     2            2           32m