
[따배씨] 1. ETCD Backup & Restore / CKA Exam Question Study

by 조청유곽 2025. 1. 29.

This post is a study note written while following along with the YouTube channel "따배".

 

 

 

[Related Theory]

 

 

What is Kubernetes etcd?
etcd is the core data store of Kubernetes: a distributed key-value store.

It stores the Kubernetes cluster's state and configuration information.

 

 

The role of etcd
In Kubernetes, etcd is the central store that holds the cluster's state.
Main data stored in etcd:

Node information
Pod and Deployment information
Service and Ingress information
ConfigMap and Secret data
Role-based Access Control (RBAC) information
Cluster state and event logs
In short, all of Kubernetes' configuration and state data lives in etcd.
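
To see this in practice, the keys held in etcd can be listed directly; the sketch below assumes etcdctl v3 is installed and uses the certificate paths that appear later in this post.

ETCDCTL_API=3 etcdctl --endpoints=https://127.0.0.1:2379 \
  --cacert=/etc/kubernetes/pki/etcd/ca.crt \
  --cert=/etc/kubernetes/pki/etcd/server.crt \
  --key=/etc/kubernetes/pki/etcd/server.key \
  get / --prefix --keys-only | head -20   # keys such as /registry/pods/... and /registry/services/...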

 

 

[Precondition]

(1) Test environment

(1.1) Rocky Linux Cluster 

: provisioned manually

[root@k8s-master ~]# k get nodes -o wide
NAME         STATUS   ROLES           AGE   VERSION   INTERNAL-IP     EXTERNAL-IP   OS-IMAGE                            KERNEL-VERSION                  CONTAINER-RUNTIME
k8s-master   Ready    control-plane   30d   v1.27.2   192.168.56.30   <none>        Rocky Linux 8.10 (Green Obsidian)   4.18.0-553.33.1.el8_10.x86_64   containerd://1.6.32
k8s-node1    Ready    <none>          30d   v1.27.2   192.168.56.31   <none>        Rocky Linux 8.8 (Green Obsidian)    4.18.0-477.10.1.el8_8.x86_64    containerd://1.6.21
k8s-node2    Ready    <none>          30d   v1.27.2   192.168.56.32   <none>        Rocky Linux 8.8 (Green Obsidian)    4.18.0-477.10.1.el8_8.x86_64    containerd://1.6.21
[root@k8s-master ~]#

 

(1.2) Ubuntu Cluster 

: uses the kodekloud lab environment

controlplane ~ ➜  kubectl get nodes -o wide
NAME           STATUS   ROLES           AGE     VERSION   INTERNAL-IP   EXTERNAL-IP   OS-IMAGE             KERNEL-VERSION   CONTAINER-RUNTIME
controlplane   Ready    control-plane   9m6s    v1.31.0   192.6.94.6    <none>        Ubuntu 22.04.4 LTS   5.4.0-1106-gcp   containerd://1.6.26
node01         Ready    <none>          8m31s   v1.31.0   192.6.94.9    <none>        Ubuntu 22.04.4 LTS   5.4.0-1106-gcp   containerd://1.6.26

https://learn.kodekloud.com/user/courses/udemy-labs-certified-kubernetes-administrator-with-practice-tests

 

(2) Required pre-configuration

   : N/A


 

 

[Question]

k8s-master
First, create a snapshot of the existing etcd instance running at https://127.0.0.1:2379, 
saving the snapshot to /data/etcd-snapshot.db
Next, restore an existing, previous snapshot located at /data/etcd-snapshot-previous.db

The following TLS certificates/key are supplied for connecting to the server with etcdctl:
CA certificate: /etc/kubernetes/pki/etcd/ca.crt
Client certificate: /etc/kubernetes/pki/etcd/server.crt
Client key: /etc/kubernetes/pki/etcd/server.key

 

 

[Solve]

(1) Check etcd.yaml

: check the information needed for the snapshot (certificate and key file paths) in /etc/kubernetes/manifests/etcd.yaml

controlplane ~ ➜  cat /etc/kubernetes/manifests/
etcd.yaml                     kube-controller-manager.yaml  kube-scheduler.yaml
kube-apiserver.yaml           .kubelet-keep                 

controlplane ~ ➜  cat /etc/kubernetes/manifests/etcd.yaml | grep -i file
    - --cert-file=/etc/kubernetes/pki/etcd/server.crt
    - --key-file=/etc/kubernetes/pki/etcd/server.key
    - --peer-cert-file=/etc/kubernetes/pki/etcd/peer.crt
    - --peer-key-file=/etc/kubernetes/pki/etcd/peer.key
    - --peer-trusted-ca-file=/etc/kubernetes/pki/etcd/ca.crt
    - --trusted-ca-file=/etc/kubernetes/pki/etcd/ca.crt
    seccompProfile:

controlplane ~ ➜
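
Besides the certificate paths, it can also help to confirm the data directory and the client endpoint in the same manifest (a quick check, assuming the standard kubeadm layout):

grep -E 'data-dir|listen-client-urls' /etc/kubernetes/manifests/etcd.yaml
# --data-dir is normally /var/lib/etcd and the client URL includes https://127.0.0.1:2379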

 

 

(2) Create the etcd snapshot

: the question provides the ca.crt / server.crt / server.key paths; grep etcd.yaml for "file" and compare them (the paths match)

controlplane ~ ➜  ETCDCTL_API=3 etcdctl --endpoints=https://127.0.0.1:2379 \
  --cacert=/etc/kubernetes/pki/etcd/ca.crt --cert=/etc/kubernetes/pki/etcd/server.crt --key=/etc/kubernetes/pki/etcd/server.key \
  snapshot save /data/etcd-snapshot.db

Snapshot saved at /data/etcd-snapshot.db
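
Before moving on, the snapshot file can be sanity-checked; one option, assuming etcdctl v3 (newer releases also offer etcdutl for this), is:

ETCDCTL_API=3 etcdctl snapshot status /data/etcd-snapshot.db --write-out=table
# prints the hash, revision, total keys, and size of the snapshot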

 

 

(3) Restore the etcd snapshot

: A. If the restore target path is set to the same path as the existing data directory, stop the etcd process before restoring
     (a minimal sketch of this case follows right after this list).

  B. If the restore target path is different from the existing one, change the etcd-data

       volumes.hostPath.path in etcd.yaml to the restore target path.

       Then restart kubelet.service.
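
For case A, which is not exercised in this lab, a minimal sketch of the procedure (assuming the default kubeadm data directory /var/lib/etcd) would look like this:

# stop the etcd static pod by moving its manifest out of the watched directory
mv /etc/kubernetes/manifests/etcd.yaml /tmp/

# keep the old data directory as a backup, then restore into the original path
mv /var/lib/etcd /var/lib/etcd.bak
ETCDCTL_API=3 etcdctl --data-dir /var/lib/etcd snapshot restore /data/etcd-snapshot-previous.db

# put the manifest back so the kubelet recreates the etcd pod
mv /tmp/etcd.yaml /etc/kubernetes/manifests/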

 

 

 

: Unlike in the video lecture, tree is not installed in this test environment, so the restored files are checked with ls instead.

: The test environment has no separate original etcd-snapshot-previous.db to restore from, so the backed-up etcd-snapshot.db is

  copied and used as the source file for the restore.

controlplane /data ➜  ls -al
total 3388
drwxr-xr-x 2 root root    4096 Jan 29 10:07 .
dr-xr-xr-x 1 root root    4096 Jan 29 09:49 ..
-rw-r--r-- 1 root root 3457056 Jan 29 09:49 etcd-snapshot.db

controlplane /data ➜  cp etcd-snapshot.db etcd-snapshot-previous.db

controlplane /data ➜  ls -al
total 6768
drwxr-xr-x 2 root root    4096 Jan 29 10:07 .
dr-xr-xr-x 1 root root    4096 Jan 29 09:49 ..
-rw-r--r-- 1 root root 3457056 Jan 29 09:49 etcd-snapshot.db
-rw-r--r-- 1 root root 3457056 Jan 29 10:07 etcd-snapshot-previous.db

controlplane /data ➜  export ETCDCTL_API=3
etcdctl --data-dir /data/new snapshot restore /data/etcd-snapshot-previous.db
2025-01-29 10:08:51.732198 I | mvcc: restore compact to 2730
2025-01-29 10:08:51.737049 I | etcdserver/membership: added member 8e9e05c52164694d [http://localhost:2380] to cluster cdf818194e3a8c32

controlplane /data ➜  

controlplane /data ➜  cd new

controlplane /data/new ➜  ls -al
total 12
drwx------ 3 root root 4096 Jan 29 10:08 .
drwxr-xr-x 3 root root 4096 Jan 29 10:08 ..
drwx------ 4 root root 4096 Jan 29 10:08 member

controlplane /data/new ➜  ls -al member/
snap/ wal/  

controlplane /data/new ➜  ls -al member/snap/
0000000000000001-0000000000000001.snap  db

controlplane /data/new ➜  ls -al member/wal/0000000000000000-0000000000000000.wal ^C

controlplane /data/new ✖
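
For reference, etcd v3.5+ recommends performing offline restores with etcdutl rather than etcdctl; if that binary is available in the environment, the equivalent command would be:

etcdutl snapshot restore /data/etcd-snapshot-previous.db --data-dir /data/new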

 

 

 

: Edit the etcd-data hostPath in etcd.yaml to the restore target path /data/new.

controlplane /data/new ➜  vi /etc/kubernetes/manifests/etcd.yaml 

  volumes:
  - hostPath:
      path: /etc/kubernetes/pki/etcd
      type: DirectoryOrCreate
    name: etcd-certs
  - hostPath:
      path: /data/new
      type: DirectoryOrCreate
    name: etcd-data
status: {}
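
Only the hostPath on the node changes here; inside the container the volume is still mounted at the original mountPath (normally /var/lib/etcd), so the --data-dir flag itself does not need editing. This can be confirmed in the same manifest (assuming the standard kubeadm layout):

grep -A4 'volumeMounts' /etc/kubernetes/manifests/etcd.yaml
# the etcd-data volume should still be mounted at mountPath: /var/lib/etcd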

 

 

: Restart kubelet.service.

controlplane /data/new ➜  systemctl restart kubelet.service 

controlplane /data/new ➜  systemctl status kubelet.service 
● kubelet.service - kubelet: The Kubernetes Node Agent
     Loaded: loaded (/lib/systemd/system/kubelet.service; enabled; vendor preset: enabled)
    Drop-In: /usr/lib/systemd/system/kubelet.service.d
             └─10-kubeadm.conf
     Active: active (running) since Wed 2025-01-29 10:18:18 UTC; 8s ago
       Docs: https://kubernetes.io/docs/
   Main PID: 52602 (kubelet)
      Tasks: 21 (limit: 77143)
     Memory: 28.1M
     CGroup: /system.slice/kubelet.service
             └─52602 /usr/bin/kubelet --bootstrap-kubeconfig=/etc/kubernetes/bootstrap-kubelet.conf --k>

Jan 29 10:18:19 controlplane kubelet[52602]: I0129 10:18:19.304208   52602 reconciler_common.go:245] "o>
Jan 29 10:18:19 controlplane kubelet[52602]: I0129 10:18:19.304226   52602 reconciler_common.go:245] "o>
Jan 29 10:18:19 controlplane kubelet[52602]: I0129 10:18:19.304280   52602 reconciler_common.go:245] "o>
Jan 29 10:18:19 controlplane kubelet[52602]: I0129 10:18:19.304304   52602 reconciler_common.go:245] "o>
Jan 29 10:18:19 controlplane kubelet[52602]: I0129 10:18:19.304365   52602 reconciler_common.go:245] "o>
Jan 29 10:18:19 controlplane kubelet[52602]: I0129 10:18:19.304516   52602 reconciler_common.go:245] "o>
Jan 29 10:18:19 controlplane kubelet[52602]: I0129 10:18:19.304603   52602 reconciler_common.go:245] "o>
Jan 29 10:18:19 controlplane kubelet[52602]: E0129 10:18:19.410776   52602 kubelet.go:1915] "Failed cre>
Jan 29 10:18:19 controlplane kubelet[52602]: E0129 10:18:19.410787   52602 kubelet.go:1915] "Failed cre>
Jan 29 10:18:19 controlplane kubelet[52602]: E0129 10:18:19.410868   52602 kubelet.go:1915] "Failed cre>
lines 1-22/22 (END)
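
After the kubelet restart it can take a minute or two for the static pods to settle; a few checks (a sketch, not captured in this lab run) to confirm the restore worked:

crictl ps | grep etcd              # the etcd container should have been recreated
kubectl -n kube-system get pods    # control-plane pods return to Running once etcd is back up
kubectl get deploy,svc -A          # the resources should now reflect the restored snapshot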


 

 

[Command Summary]

ETCDCTL_API=3 etcdctl --endpoints=https://127.0.0.1:2379 \
  --cacert=/etc/kubernetes/pki/etcd/ca.crt --cert=/etc/kubernetes/pki/etcd/server.crt --key=/etc/kubernetes/pki/etcd/server.key \
  snapshot save /data/etcd-snapshot.db

 

export ETCDCTL_API=3
etcdctl --data-dir /data/new snapshot restore /data/etcd-snapshot-previous.db

 

systemctl restart kubelet.service

systemctl status kubelet.service 

 
