# K8s安装过程(Debian12+Kubernetes1.28+Containerd+Calico用apt-get在线搭建1master+2node的K8s集群) > wandoubaba / 2024-10-21 截止本文发布时,kubernetes最新版本是`v1.31`,而补丁已经全部打完的最新稳定版本是`v1.28`,本文基于后者。 ## 安装准备 ### 资源准备 |准备项|内容| |---|---| |操作系统|Debian 12(bookworm)| |系统内核|6.1.0-23-amd64| |容器运行时|containerd CRI| ### 主机清单 |IP(假设)|主机名|CPU|内存| |---|---|---|---| |172.31.0.11|k8s-master01|8c|8G| |172.31.0.14|k8s-master04|8c|16G| |172.31.0.15|k8s-master05|8c|16G| ## 过程 ### 确认主机基本信息(每台主机) ```sh # 查看IP地址,确认设置为静态地址 ip addr | awk '/inet /{split($2, ip, "/"); print ip[1]}' # 查看MAC地址,确保每台主机的MAC唯一 ip link | awk '/state UP/ {getline; print $2}' # 查看主机的UUID,确保product_uuid的唯一性 sudo cat /sys/class/dmi/id/product_uuid # 查看内核版本 uname -r # 查看操作系统发行版本信息 cat /etc/os-release # 确认cpu核心数量 lscpu -p | grep -v "^#" | wc -l # 确认内存容量 free -h | awk '/Mem/{print $2}' # 确认磁盘空间 lsblk ``` ### 设置主机名并更新/etc/hosts文件(每台主机) #### 设置主机名 ```sh # 在主控制节点(k8s-master01)上执行 sudo hostnamectl set-hostname k8s-master01 # 在工作节点(k8s-node01、k8s-node02)上执行 sudo hostnamectl set-hostname k8s-node01 sudo hostnamectl set-hostname k8s-node02 ``` 主机名设置成功后可以`exit`退出终端再重新连接,或者直接执行`bash`,都可以看到修改后的效果 #### 修改/etc/hosts文件 要根据自己的真实IP进行修改 ```sh sudo bash -c 'cat <> /etc/hosts 172.31.0.11 k8s-master01 172.31.0.14 k8s-node01 172.31.0.15 k8s-node02 EOF' ``` ### 设置时区并安装时间服务(每台主机) ```sh sudo timedatectl set-timezone Asia/Shanghai sudo apt-get update && sudo apt-get install -y chrony ``` #### 配置阿里云时间服务器(可选) ```conf pool ntp1.aliyun.com iburst maxsources 4 ``` 提示:在/etc/chrony/chrony.conf中加入上述配置,将其他pool开头的配置注释掉; 重启chrony,并验证 ```sh sudo systemctl restart chrony sudo systemctl status chrony sudo chronyc sources ``` ### 禁用swap(每台主机) ```sh sudo swapoff -a ``` 还要在`/etc/fstab`文件中注释关于swapr挂载的行。 ### 禁用防火墙(每台主机) ```sh sudo ufw disable sudo apt-get remove ufw ``` ### 优化内核参数 ```sh sudo bash -c 'cat > /etc/sysctl.d/kubernetes.conf < /etc/modules-load.d/kubernetes.conf << EOF # /etc/modules-load.d/kubernetes.conf # Linux 网桥支持 br_netfilter # IPVS 加载均衡器 ip_vs ip_vs_rr ip_vs_wrr ip_vs_sh # IPv4 连接跟踪 nf_conntrack_ipv4 # IP 表规则 ip_tables EOF' # 添加可执行权限 sudo chmod a+x /etc/modules-load.d/kubernetes.conf ``` ### 关闭安全策略服务(每台主机) ```sh # 停止 AppArmor 服务 sudo systemctl stop apparmor.service # 禁用 AppArmor 服务 sudo systemctl disable apparmor.service ``` ### 关闭防火墙(每台主机) ```sh # 禁用ufw sudo ufw disable sudo systemctl stop ufw.service sudo systemctl disable ufw.service ``` ### 安装容器运行时(每台主机) #### 下载 在查看最新版本,然后选择对应的`cri-containerd-x.x.x-linux-platform`文件下载: ```sh curl -L -O https://github.com/containerd/containerd/releases/download/v1.7.23/cri-containerd-1.7.23-linux-amd64.tar.gz ``` #### 安装 ```sh sudo tar xf cri-containerd-1.7.23-linux-amd64.tar.gz -C / ``` #### 配置 ```sh sudo mkdir /etc/containerd sudo bash -c 'containerd config default > /etc/containerd/config.toml' sudo sed -i '/sandbox_image/s/3.8/3.9/' /etc/containerd/config.toml sudo sed -i '/SystemdCgroup/s/false/true/' /etc/containerd/config.toml ``` #### 启动 ```sh # 启用并立即启动containerd服务 sudo systemctl enable --now containerd.service # 检查containerd服务的当前状态 sudo systemctl status containerd.service ``` #### 验证 ```sh # 检查containerd的版本 containerd --version # 与CRI(Container Runtime Interface)兼容的容器运行时交互的命令行工具 crictl --version # 运行符合 OCI(Open Container Initiative)标准的容器 sudo runc --version ``` ### 安装docker(每个主机,在k8s中可选,仅用于构建镜像) ```sh # Add Docker's official GPG key: sudo apt-get update sudo apt-get install -y ca-certificates curl sudo install -m 0755 -d /etc/apt/keyrings sudo curl -fsSL https://download.docker.com/linux/debian/gpg -o /etc/apt/keyrings/docker.asc sudo chmod a+r /etc/apt/keyrings/docker.asc # Add the repository to Apt sources: echo \ "deb [arch=$(dpkg --print-architecture) signed-by=/etc/apt/keyrings/docker.asc] https://download.docker.com/linux/debian \ $(. /etc/os-release && echo "$VERSION_CODENAME") stable" | \ sudo tee /etc/apt/sources.list.d/docker.list > /dev/null sudo apt-get update sudo apt-get install -y docker-ce docker-ce-cli docker-buildx-plugin docker-compose-plugin ``` ### 安装k8s组件(每个主机) 指的是安装`kubelet`、`kubeadm`、`kubectl` ```sh sudo apt-get update # apt-transport-https may be a dummy package; if so, you can skip that package sudo apt-get install -y apt-transport-https ca-certificates curl gpg # If the directory `/etc/apt/keyrings` does not exist, it should be created before the curl command, read the note below. # sudo mkdir -p -m 755 /etc/apt/keyrings curl -fsSL https://pkgs.k8s.io/core:/stable:/v1.28/deb/Release.key | sudo gpg --dearmor -o /etc/apt/keyrings/kubernetes-apt-keyring.gpg # This overwrites any existing configuration in /etc/apt/sources.list.d/kubernetes.list echo 'deb [signed-by=/etc/apt/keyrings/kubernetes-apt-keyring.gpg] https://pkgs.k8s.io/core:/stable:/v1.28/deb/ /' | sudo tee /etc/apt/sources.list.d/kubernetes.list sudo apt-get update sudo apt-get install -y kubelet kubeadm kubectl sudo apt-mark hold kubelet kubeadm kubectl sudo systemctl enable --now kubelet ``` #### 配置kubelet ```sh sudo bash -c 'cat > /etc/default/kubelet << EOF # 该参数指定了 kubelet 使用 systemd 作为容器运行时的 cgroup 驱动程序 KUBELET_EXTRA_ARGS="--cgroup-driver=systemd" EOF' # 这里先设置kubelet为开机自启 sudo systemctl enable kubelet ``` ### 初始化master01主机 #### 查看k82镜像(可选) ```sh sudo kubeadm config images list ``` 应该能列出以下信息: ```sh registry.k8s.io/kube-apiserver:v1.31.1 registry.k8s.io/kube-controller-manager:v1.31.1 registry.k8s.io/kube-scheduler:v1.31.1 registry.k8s.io/kube-proxy:v1.31.1 registry.k8s.io/coredns/coredns:v1.11.3 registry.k8s.io/pause:3.10 registry.k8s.io/etcd:3.5.15-0 ``` > 如果看到类似`remote version is much newer: v1.31.1; falling back to: stable-1.28`的提示说版本低,忽略它就行了。 > k8s的镜像默认是谷歌仓库地址,需要代理才可以正常访问;如果你没有代理,请使用阿里云仓库也是可以的;用--image-repository="registry.aliyuncs.com/google_containers"来指定使用阿里云镜像仓库中的镜像部署k8s集群。 #### 下载镜像(可选) ```sh sudo kubeadm config images pull ``` #### 创建k8s集群 ##### 初始化master01节点 - 要把下面的`--apiserver-advertise-address`参数换成实际的`k8s-master01`主机IP地址. - `--pod-network-cidr`参数指的是本k8s集群中要让pod使用的网段。 ```sh sudo kubeadm init --control-plane-endpoint=k8s-master01 --pod-network-cidr=10.244.0.0/16 --apiserver-advertise-address=172.31.0.11 --cri-socket unix:///run/containerd/containerd.sock ``` 如果执行顺利的话,应该会看到下面的信息(每次的具体参数应该不同) ```sh Your Kubernetes control-plane has initialized successfully! To start using your cluster, you need to run the following as a regular user: mkdir -p $HOME/.kube sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config sudo chown $(id -u):$(id -g) $HOME/.kube/config Alternatively, if you are the root user, you can run: export KUBECONFIG=/etc/kubernetes/admin.conf You should now deploy a pod network to the cluster. Run "kubectl apply -f [podnetwork].yaml" with one of the options listed at: https://kubernetes.io/docs/concepts/cluster-administration/addons/ You can now join any number of control-plane nodes by copying certificate authorities and service account keys on each node and then running the following as root: kubeadm join k8s-master01:6443 --token 1ahq7i.sv3pqgcss8v5oecj \ --discovery-token-ca-cert-hash sha256:8bea18bff8c86d0bc23214974d6b2045c90760448cd4731c94546a9ae836e9ca \ --control-plane Then you can join any number of worker nodes by running the following on each as root: kubeadm join k8s-master01:6443 --token 1ahq7i.sv3pqgcss8v5oecj \ --discovery-token-ca-cert-hash sha256:8bea18bff8c86d0bc23214974d6b2045c90760448cd4731c94546a9ae836e9ca ``` 接下来我们就先配置一下kubectl的配置文件 ```sh mkdir -p $HOME/.kube sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config sudo chown $(id -u):$(id -g) $HOME/.kube/config ``` 接下来可以查看节点状态: ```sh kubectl get nodes -o wide ``` 应该能看到类似下面的结果: ```sh # 查看节点状态 kubectl get nodes -o wide # 结果类似 NAME STATUS ROLES AGE VERSION INTERNAL-IP EXTERNAL-IP OS-IMAGE KERNEL-VERSION CONTAINER-RUNTIME k8s-master01 NotReady control-plane 25m v1.28.14 172.31.0.11 Debian GNU/Linux 12 (bookworm) 6.1.0-23-cloud-amd64 containerd://1.7.23 # 查看集群信息 kubectl cluster-info # 结果类似 Kubernetes control plane is running at https://k8s-master01:6443 CoreDNS is running at https://k8s-master01:6443/api/v1/namespaces/kube-system/services/kube-dns:dns/proxy To further debug and diagnose cluster problems, use 'kubectl cluster-info dump'. # 列出所有CRI容器列表 sudo crictl ps -a # 结果类似(其中STATE一列应该都是Running) CONTAINER IMAGE CREATED STATE NAME ATTEMPT POD ID POD 6177ae20a68e6 6a89d0ef825cb 29 minutes ago Running kube-proxy 0 e8d15cb2bcd1a kube-proxy-jlndc a1a43a29df5c2 6cbf215f8d44e 30 minutes ago Running kube-scheduler 1 3163922b00a0e kube-scheduler-k8s-master01 19dfb26520340 7abec2d806048 30 minutes ago Running kube-controller-manager 1 f6df8f333fcf0 kube-controller-manager-k8s-master01 b4c7a5f9c967f 3438637c2f3ae 30 minutes ago Running kube-apiserver 0 b05316fac4cad kube-apiserver-k8s-master01 8a4c587d9b8d9 2e96e5913fc06 30 minutes ago Running etcd 0 9a8c10ea30b80 etcd-k8s-master01 ``` ### 添加worker节点(把k8s-node01和k8s-node02添加到集群) 先在`k8s-master01`上得到添加节点命令(添加`k8s-node01`和`k8s-node02`之前分别是在`k8s-master01`上执行一次) ```sh sudo kubeadm token create --print-join-command # 结果与下面的类似(每一次的token应该都是不一样的),把下面的结果复制下来,准备到worker节点上去执行 kubeadm join k8s-master01:6443 --token epvxya.fh4qmay5uwc8628a --discovery-token-ca-cert-hash sha256:8bea18bff8c86d0bc23214974d6b2045c90760448cd4731c94546a9ae836e9ca ``` 下面的操作主要在`k8s-node01`和`k8s-node02`上分别执行 ```sh # 安装nmap用于在worker节点上验证master节点上的api-server服务端口的连通性 sudo apt-get install nmap -y # 把下面的ip地址换成实际master节点主机ip nmap -p 6443 -Pn 10.31.0.11 # 结果 Starting Nmap 7.93 ( https://nmap.org ) at 2024-10-21 18:50 CST Nmap scan report for k8s-master01 (172.31.0.11) Host is up (0.00081s latency). PORT STATE SERVICE 6443/tcp open sun-sr-https Nmap done: 1 IP address (1 host up) scanned in 0.03 seconds # 把刚才在master节得得到的join命令粘贴过来执行(建议用非root用户,在前面加上sudo) sudo kubeadm join k8s-master01:6443 --token epvxya.fh4qmay5uwc8628a --discovery-token-ca-cert-hash sha256:8bea18bff8c86d0bc23214974d6b2045c90760448cd4731c94546a9ae836e9ca # 结果类似下面 This node has joined the cluster: * Certificate signing request was sent to apiserver and a response was received. * The Kubelet was informed of the new secure connection details. Run 'kubectl get nodes' on the control-plane to see this node join the cluster. ``` 再到`k8s-master01`上验证一下 ```sh kubectl get nodes # 结果类似 NAME STATUS ROLES AGE VERSION k8s-master01 NotReady control-plane 41m v1.28.14 k8s-node01 NotReady 2m6s v1.28.14 ``` 然后在master节点上再执行一次`sudo kubeadm token create --print-join-command`,在`k8s-node02`上重复上面的过程。 最后在`k8s-master01`上查看节点信息 ```sh kubectl get nodes # 结果类似下面 NAME STATUS ROLES AGE VERSION k8s-master01 NotReady control-plane 43m v1.28.14 k8s-node01 NotReady 4m26s v1.28.14 k8s-node02 NotReady 40s v1.28.14 # 查看k8s的pod信息 kubectl get pods -n kube-system # 结果类似 NAME READY STATUS RESTARTS AGE coredns-5dd5756b68-4btx5 0/1 Pending 0 45m coredns-5dd5756b68-8v2z8 0/1 Pending 0 45m etcd-k8s-master01 1/1 Running 0 45m kube-apiserver-k8s-master01 1/1 Running 0 45m kube-controller-manager-k8s-master01 1/1 Running 1 45m kube-proxy-5tqw2 1/1 Running 0 6m33s kube-proxy-864zg 1/1 Running 0 2m47s kube-proxy-jlndc 1/1 Running 0 45m kube-scheduler-k8s-master01 1/1 Running 1 45m ``` 注意到每一个node的`STATUS`列都是`NotReady`,这是因为还没有安装配置网络插件,pod间的通信有问题。 ### 安装calico网络插件(master节点) 参考文档 #### 安装Tigera Calico operator ```sh kubectl create -f https://raw.githubusercontent.com/projectcalico/calico/v3.28.2/manifests/tigera-operator.yaml # 结果类似下面 serviceaccount/tigera-operator created clusterrole.rbac.authorization.k8s.io/tigera-operator created clusterrolebinding.rbac.authorization.k8s.io/tigera-operator created deployment.apps/tigera-operator created # 查看集群命名空间 kubectl get ns # 结果类似下面 NAME STATUS AGE default Active 63m kube-node-lease Active 63m kube-public Active 63m kube-system Active 63m tigera-operator Active 13s # 查看tigera-operator下的pod kubectl get pods -n tigera-operator # 结果 NAME READY STATUS RESTARTS AGE tigera-operator-5cfff76b77-tdswm 1/1 Running 0 3m46s ``` #### 安装Calico ```sh curl -L -O https://raw.githubusercontent.com/projectcalico/calico/v3.26.3/manifests/custom-resources.yaml # 修改ip池,需与初始化时一致 sed -i 's/192.168.0.0/10.244.0.0/' custom-resources.yaml # 安装calico kubectl create -f custom-resources.yaml # 结果 installation.operator.tigera.io/default created apiserver.operator.tigera.io/default created ``` 再执行`watch`命令等到所有pod的`STATUS`都变成`Running` ```sh watch kubectl get pods -n calico-system # 结果 Every 2.0s: kubectl get pods -n calico-system k8s-master01: Mon Oct 21 19:22:54 2024 NAME READY STATUS RESTARTS AGE calico-kube-controllers-5846f6d55d-87n88 1/1 Running 0 85s calico-node-4mhxj 1/1 Running 0 85s calico-node-6c64k 1/1 Running 0 85s calico-node-sbzwz 1/1 Running 0 85s calico-typha-6c76968df6-lcjm6 1/1 Running 0 84s calico-typha-6c76968df6-xbnk5 1/1 Running 0 85s csi-node-driver-2vrg7 2/2 Running 0 85s csi-node-driver-gmb7m 2/2 Running 0 85s csi-node-driver-mnqvx 2/2 Running 0 85s ``` 确认pod运行状态直到所有的pod状态都是`Running` ```sh watch kubectl get pods -n calico-system # 结果类似 Every 2.0s: kubectl get pods -n calico-system k8s-master01: Mon Oct 21 19:23:47 2024 NAME READY STATUS RESTARTS AGE calico-kube-controllers-5846f6d55d-87n88 1/1 Running 0 2m18s calico-node-4mhxj 1/1 Running 0 2m18s calico-node-6c64k 1/1 Running 0 2m18s calico-node-sbzwz 1/1 Running 0 2m18s calico-typha-6c76968df6-lcjm6 1/1 Running 0 2m17s calico-typha-6c76968df6-xbnk5 1/1 Running 0 2m18s csi-node-driver-2vrg7 2/2 Running 0 2m18s csi-node-driver-gmb7m 2/2 Running 0 2m18s csi-node-driver-mnqvx 2/2 Running 0 2m18s ``` 用`ctrl+c`退出`watch`状态,再查看k8s系统的pod状态 ```sh kubectl get pods -n kube-system -o wide # 结果类似 NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES coredns-5dd5756b68-4btx5 1/1 Running 0 72m 10.244.58.196 k8s-node02 coredns-5dd5756b68-8v2z8 1/1 Running 0 72m 10.244.58.193 k8s-node02 etcd-k8s-master01 1/1 Running 0 72m 172.31.0.11 k8s-master01 kube-apiserver-k8s-master01 1/1 Running 0 72m 172.31.0.11 k8s-master01 kube-controller-manager-k8s-master01 1/1 Running 1 72m 172.31.0.11 k8s-master01 kube-proxy-5tqw2 1/1 Running 0 33m 172.31.0.14 k8s-node01 kube-proxy-864zg 1/1 Running 0 29m 172.31.0.15 k8s-node02 kube-proxy-jlndc 1/1 Running 0 72m 172.31.0.11 k8s-master01 kube-scheduler-k8s-master01 1/1 Running 1 72m 172.31.0.11 k8s-master01 ``` 清理taints(污点) ```sh kubectl taint nodes --all node-role.kubernetes.io/control-plane- ``` 再次确认集群节点 ```sh kubectl get nodes -o wide # 结果类似 NAME STATUS ROLES AGE VERSION INTERNAL-IP EXTERNAL-IP OS-IMAGE KERNEL-VERSION CONTAINER-RUNTIME k8s-master01 Ready control-plane 75m v1.28.14 172.31.0.11 Debian GNU/Linux 12 (bookworm) 6.1.0-23-cloud-amd64 containerd://1.7.23 k8s-node01 Ready 35m v1.28.14 172.31.0.14 Debian GNU/Linux 12 (bookworm) 6.1.0-23-cloud-amd64 containerd://1.7.23 k8s-node02 Ready 32m v1.28.14 172.31.0.15 Debian GNU/Linux 12 (bookworm) 6.1.0-23-cloud-amd64 containerd://1.7.23 ```