etcd

Overview

DOK provides commands for etcd backup and restore; run dok -h for usage details. This page explains how the backup/restore function works. For backing up and restoring Kubernetes etcd cluster data in general, you can also refer to the official k8s documentation or the VMware documentation.

Tools

DOK pre-installs the etcdctl tool on every control-plane node at /usr/bin/etcdctl. Its version matches the etcd version that ships with the installed k8s release (3.4.13-0).

Steps

The procedure below follows the approach described in this article.

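# Install etcdctl only if it is missing (DOK pre-installs it on control-plane nodes)
yum -y install etcd

# 1. Take a snapshot of the local etcd member
ETCDCTL_API=3 etcdctl snapshot save /tmp/snap.db \
  --endpoints=https://127.0.0.1:2379 \
  --cacert=/etc/kubernetes/pki/etcd/ca.crt \
  --cert=/etc/kubernetes/pki/etcd/server.crt \
  --key=/etc/kubernetes/pki/etcd/server.key

# snap.db is the data snapshot file
# 2. Stop kube-apiserver and etcd by moving their static Pod manifests away,
#    then set the old data directory aside
mv /etc/kubernetes/manifests/{kube-apiserver.yaml,etcd.yaml} /tmp/
mv /var/lib/etcd /var/lib/etcd.bak

# 3. Restore the snapshot into a fresh data directory
ETCDCTL_API=3 etcdctl snapshot restore /tmp/snap.db --data-dir=/var/lib/etcd

# 4. Move the manifests back so that etcd and kube-apiserver start again
mv /tmp/{kube-apiserver.yaml,etcd.yaml} /etc/kubernetes/manifests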
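
Between saving the snapshot and restoring it, it can help to confirm that the snapshot file is usable. The following is a minimal sketch and not part of DOK: the helper name check_snapshot is hypothetical, and it assumes etcdctl is on the PATH.

```shell
# Hypothetical helper: refuse to continue if the snapshot file is missing or
# empty, otherwise print its hash, revision, key count and size.
check_snapshot() {
  local snap="$1"
  if [ ! -s "$snap" ]; then
    echo "snapshot $snap is missing or empty" >&2
    return 1
  fi
  ETCDCTL_API=3 etcdctl snapshot status "$snap" --write-out=table
}

# Usage:
#   check_snapshot /tmp/snap.db
```

Running the check before moving any manifests means a bad snapshot is caught while etcd is still serving the original data.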

Backup

Because the etcd cluster keeps its data synchronized across all nodes, a backup only needs to be taken from one node. Keep the snapshot file safe, because a backup that is lost to a node problem cannot be retrieved.
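
A single-node backup can be sketched as one timestamped snapshot save. This is an illustration under assumptions, not the script DOK runs: the backup directory /var/backups/etcd and both function names are hypothetical, while the endpoint and certificate paths are the ones used in Steps.

```shell
# Hypothetical backup location; override by setting BACKUP_DIR.
BACKUP_DIR="${BACKUP_DIR:-/var/backups/etcd}"

# Build a timestamped file name of the form snap-YYYYMMDD-HHMMSS.db
snapshot_name() {
  echo "snap-$(date +%Y%m%d-%H%M%S).db"
}

# Save a snapshot of the local member into BACKUP_DIR and print its path.
backup_etcd() {
  mkdir -p "$BACKUP_DIR" || return 1
  local target
  target="$BACKUP_DIR/$(snapshot_name)"
  ETCDCTL_API=3 etcdctl snapshot save "$target" \
    --endpoints=https://127.0.0.1:2379 \
    --cacert=/etc/kubernetes/pki/etcd/ca.crt \
    --cert=/etc/kubernetes/pki/etcd/server.crt \
    --key=/etc/kubernetes/pki/etcd/server.key >&2 || return 1
  echo "$target"
}

# Usage (on one control-plane node):
#   backup_etcd
```

Timestamped names keep successive snapshots from overwriting each other, which also makes it straightforward to copy a specific one off the node.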

Restore

To recover data, copy the backup file to every node and run the restore on each of them. During the restore, kube-apiserver and etcd are stopped on all nodes, the etcd data is restored node by node, etcd is then restarted, and kube-apiserver is started last.
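
The ordering described above can be sketched per node as follows. This is an illustration, not DOK's implementation: all function names are hypothetical, and the member name, peer URL and member list are placeholders. In a multi-member cluster, etcdctl snapshot restore is usually given --name, --initial-cluster and --initial-advertise-peer-urls so that the restored members form one cluster; whether DOK passes these flags is an assumption.

```shell
# Set DRY_RUN=1 to print the commands instead of executing them.
run() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

MANIFESTS=/etc/kubernetes/manifests

# Step 1 (every node): stop kube-apiserver and etcd by moving their manifests away.
stop_control_plane() {
  run mv "$MANIFESTS/kube-apiserver.yaml" "$MANIFESTS/etcd.yaml" /tmp/
}

# Step 2 (every node, one by one): set the old data aside and restore the snapshot.
# Arguments: snapshot file, this member's name, its peer URL, and the full
# "name=url,name=url" member list (all placeholders in this sketch).
restore_member() {
  local snap="$1" name="$2" peer_url="$3" cluster="$4"
  run mv /var/lib/etcd /var/lib/etcd.bak
  run env ETCDCTL_API=3 etcdctl snapshot restore "$snap" \
    --name="$name" \
    --initial-cluster="$cluster" \
    --initial-advertise-peer-urls="$peer_url" \
    --data-dir=/var/lib/etcd
}

# Step 3 (every node): bring etcd back first, then kube-apiserver.
start_control_plane() {
  run mv /tmp/etcd.yaml "$MANIFESTS/"
  run mv /tmp/kube-apiserver.yaml "$MANIFESTS/"
}

# Usage on one node (placeholder names and addresses):
#   stop_control_plane
#   restore_member /tmp/snap.db master0 https://10.0.0.1:2380 \
#     "master0=https://10.0.0.1:2380,master1=https://10.0.0.2:2380"
#   start_control_plane
```

Running with DRY_RUN=1 first prints every command, which makes it possible to review the sequence before touching a live node.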

Regular Backup

DOK automatically deploys an etcd crontab task on the control-plane node that regularly backs up etcd and cleans up outdated backups. By default, a backup is taken every 60 minutes, and backups are kept for up to 1 day.

# check crontab list on master0
crontab -l
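
The default schedule (a backup every 60 minutes, kept for up to 1 day) could be reproduced with something like the following. This is a sketch, not the crontab entry that DOK installs: the script path, the backup directory and the function name are hypothetical.

```shell
# Hypothetical crontab line: run a backup script at minute 0 of every hour.
#   0 * * * * /usr/local/bin/etcd-backup.sh

# Delete snapshot files directly inside the given directory that were last
# modified more than 1 day (1440 minutes) ago.
prune_old_backups() {
  local dir="$1"
  find "$dir" -maxdepth 1 -type f -name '*.db' -mmin +1440 -delete
}

# Usage:
#   prune_old_backups /var/backups/etcd
```

Pruning by modification time keeps the retention rule independent of how the snapshot files are named.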