

k8s Installation 5: Cluster Core and CNI

V. Initializing the k8s Cluster

Installing the Kubernetes components

Run the following on all nodes.
The Kubernetes package sources are hosted abroad and are slow to reach, so switch to a domestic mirror.
Create the file /etc/yum.repos.d/kubernetes.repo with the following content:

cat > /etc/yum.repos.d/kubernetes.repo << EOF
[kubernetes] 
name=Kubernetes 
baseurl=http://mirrors.aliyun.com/kubernetes/yum/repos/kubernetes-el7-x86_64 
enabled=1 
gpgcheck=0 
repo_gpgcheck=0 
gpgkey=http://mirrors.aliyun.com/kubernetes/yum/doc/yum-key.gpg 
       http://mirrors.aliyun.com/kubernetes/yum/doc/rpm-package-key.gpg 
EOF

wget -P /etc/yum.repos.d/ http://mirrors.aliyun.com/repo/epel-archive-6.repo

Installing kubeadm, kubelet and kubectl

Here the version is pinned to 1.28.2; without an explicit version the latest release would be installed.

yum install --setopt=obsoletes=0 kubelet-1.28.2  kubeadm-1.28.2  kubectl-1.28.2 

Configuring the kubelet cgroup driver

# Edit /etc/sysconfig/kubelet and add the following configuration
cat > /etc/sysconfig/kubelet << EOF
KUBELET_CGROUP_ARGS="--cgroup-driver=systemd" 
KUBE_PROXY_MODE="ipvs" 
EOF

Enable kubelet to start on boot

systemctl enable kubelet
kubeadm version
kubeadm version: &version.Info{Major:"1", Minor:"28", GitVersion:"v1.28.2", GitCommit:"89a4ea3e1e4ddd7f7572286090359983e0387b2f", GitTreeState:"clean", BuildDate:"2023-09-13T09:34:32Z", GoVersion:"go1.20.8", Compiler:"gc", Platform:"linux/amd64"}

Preparing the cluster images

The following steps are performed on the master node. If you can reach http://k8s.gcr.io directly, you can skip this step.

The idea is to download the images in advance: the default pull address k8s.gcr.io is not reachable from mainland China, so prepare the images the cluster needs before installing Kubernetes. The required images can be listed with the command below.

# List the image versions that will be downloaded
kubeadm config images list
registry.k8s.io/kube-apiserver:v1.28.2
registry.k8s.io/kube-controller-manager:v1.28.2
registry.k8s.io/kube-scheduler:v1.28.2
registry.k8s.io/kube-proxy:v1.28.2
registry.k8s.io/pause:3.9
registry.k8s.io/etcd:3.5.9-0
registry.k8s.io/coredns/coredns:v1.10.1

Method 1

It is recommended to do the image work on the Harbor registry host; working with docker there is more convenient than with containerd.
It saves bandwidth, time and disk space, avoids exposing credentials on every node, and the commands are simpler.

docker images

Add the private registry address to docker

vi /etc/docker/daemon.json
"registry-mirrors":["http://hub-mirror.c.163.com"],
"insecure-registries":["repo.k8s.local:5100"],

Restart docker

systemctl daemon-reload
systemctl restart docker
docker info
Client: Docker Engine - Community
 Version:    24.0.6

Log in to Harbor

Log in with a developer or maintainer account created in Harbor.

docker login  http://repo.k8s.local:5100               
Username: k8s_user1
Password: 
WARNING! Your password will be stored unencrypted in /root/.docker/config.json.
Configure a credential helper to remove this warning. See
https://docs.docker.com/engine/reference/commandline/login/#credentials-store

Login Succeeded

Once the login succeeds, you will not need to log in again next time.
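As the warning above notes, the credential is stored (base64-encoded, not encrypted) in /root/.docker/config.json; you can confirm it with:

cat /root/.docker/config.json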

Test

Manually push to the private registry
docker pull registry.aliyuncs.com/google_containers/kube-apiserver:v1.28.2 
docker tag registry.aliyuncs.com/google_containers/kube-apiserver:v1.28.2  repo.k8s.local:5100/google_containers/kube-apiserver:v1.28.2 
docker push repo.k8s.local:5100/google_containers/kube-apiserver:v1.28.2

docker pull busybox
docker tag busybox repo.k8s.local:5100/google_containers/busybox:9.9
docker images |tail -1
docker push repo.k8s.local:5100/google_containers/busybox:9.9

Check in Harbor whether the corresponding images are present; if they are, the push succeeded.
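You can also check from the command line; assuming Harbor exposes the standard Docker Registry v2 API on this port, listing the tags of a pushed repository looks roughly like this:

curl -u 'k8s_user1:<password>' http://repo.k8s.local:5100/v2/google_containers/kube-apiserver/tags/list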

On the master, test pulling from the private registry

ctr -n k8s.io i pull -k --plain-http repo.k8s.local:5100/google_containers/busybox:9.9
ctr -n k8s.io images ls  |grep busybox
repo.k8s.local:5100/google_containers/busybox:9.9                                                                                       application/vnd.docker.distribution.manifest.v2+json      sha256:023917ec6a886d0e8e15f28fb543515a5fcd8d938edb091e8147db4efed388ee 2.1 MiB   linux/amd64 
Batch retagging

Use this if you don't want to change the deployment yml files.

# Pull the images, retag them with the official names, then delete the Aliyun-tagged originals
#!/bin/bash
images=$(kubeadm config images list --kubernetes-version=1.28.2 | awk -F'/' '{print $NF}')
for imageName in ${images[@]} ; do
    docker pull registry.cn-hangzhou.aliyuncs.com/google_containers/$imageName 
    docker tag registry.cn-hangzhou.aliyuncs.com/google_containers/$imageName registry.k8s.io/$imageName
    docker rmi registry.cn-hangzhou.aliyuncs.com/google_containers/$imageName
done
Batch retag and push to the private registry

The google_containers project has already been created under repo.k8s.local:5100.
If you run this on the Harbor host, export the image list on the master to a local images.txt, comment out the kubeadm line in the local script, and read images.txt directly.

# Pull the images, retag them for the private registry, push them there, then delete the Aliyun-tagged originals
vi docker_images.sh
#!/bin/bash
imagesfile=images.txt
$(kubeadm config images list --kubernetes-version=1.28.2 | awk -F'/' '{print $NF}' > ${imagesfile})
images=$(cat ${imagesfile})
for i in ${images}
do
echo ${i}
docker pull registry.aliyuncs.com/google_containers/$i
docker tag registry.aliyuncs.com/google_containers/$i repo.k8s.local:5100/google_containers/$i
docker push repo.k8s.local:5100/google_containers/$i
docker rmi registry.aliyuncs.com/google_containers/$i
done
chmod +x docker_images.sh
sh docker_images.sh

kubeadm config images list --kubernetes-version=1.28.2 | awk -F'/' '{print $NF}' > images.txt

Check the downloaded images
docker images

Method 2

Work with containerd directly on the master and the nodes.
The containerd image import commands differ from docker's; when pushing to and pulling from the private registry you need --all-platforms, which consumes a lot of disk space and bandwidth.

# After importing, the images can be listed with crictl; ctr can list them too, but its output is less readable
crictl images
ctr -n k8s.io images ls  
ctr -n k8s.io image import <image-tar>   # or push to your own registry and pull from there

# Test: retag with the official name
ctr -n k8s.io i pull -k registry.cn-hangzhou.aliyuncs.com/google_containers/kube-apiserver:v1.28.2
ctr -n k8s.io i tag  registry.cn-hangzhou.aliyuncs.com/google_containers/kube-apiserver:v1.28.2 registry.k8s.io/kube-apiserver:v1.28.2
ctr -n k8s.io i rm registry.cn-hangzhou.aliyuncs.com/google_containers/kube-apiserver:v1.28.2 

Test pushing to the private registry; add --all-platforms when pulling
ctr -n k8s.io i pull -k --all-platforms registry.cn-hangzhou.aliyuncs.com/google_containers/kube-apiserver:v1.28.2
ctr -n k8s.io i tag  registry.cn-hangzhou.aliyuncs.com/google_containers/kube-apiserver:v1.28.2 repo.k8s.local:5100/google_containers/kube-apiserver:v1.28.2
ctr -n k8s.io i push -k --user k8s_user1:k8s_Uus1 --plain-http repo.k8s.local:5100/google_containers/kube-apiserver:v1.28.2

Error 1

ctr: content digest sha256:07742a71be5e2ac5dc434618fa720ba38bebb463e3bdc0c58b600b4f7716bc3d: not found

Add --all-platforms both when pulling and when exporting images.

Error 2

ctr: failed to do request: Head "https://repo.k8s.local:5100/v2/google_containers/kube-apiserver/blobs/sha256:2248d40e3af29ab47f33357e4ecdc9dca9a89daea07ac3a5a76de583ed06c776": http: server gave HTTP response to HTTPS client

If Harbor is not serving HTTPS, add --user ${harboruser}:${harborpwd} --plain-http when pushing.

# Pull the images, retag them for the private registry, push them there, then delete the Aliyun-tagged originals
vi ctr_images.sh
#!/bin/bash
harboruser=k8s_user1
harborpwd=k8s_Uus1
images=$(kubeadm config images list --kubernetes-version=1.28.2 | awk -F'/' '{print $NF}')
for imageName in ${images[@]} ; do
    ctr -n k8s.io i pull -k --all-platforms registry.cn-hangzhou.aliyuncs.com/google_containers/$imageName
    #ctr -n k8s.io i tag registry.cn-hangzhou.aliyuncs.com/google_containers/$imageName registry.k8s.io/$imageName
    ctr -n k8s.io i tag registry.cn-hangzhou.aliyuncs.com/google_containers/$imageName repo.k8s.local:5100/google_containers/$imageName
    ctr -n k8s.io i push --user ${harboruser}:${harborpwd} --plain-http repo.k8s.local:5100/google_containers/$imageName
    #ctr -n k8s.io i rm registry.cn-hangzhou.aliyuncs.com/google_containers/$imageName
done
chmod +x ctr_images.sh
./ctr_images.sh
crictl images
FATA[0000] validate service connection: CRI v1 image API is not implemented for endpoint "unix:///var/run/containerd/containerd.sock": rpc error: code = Unimplemented desc = unknown service runtime.v1.ImageService 

Fixing the error

# Find runtime_type and set it to "io.containerd.runtime.v1.linux"

vi /etc/containerd/config.toml
      [plugins."io.containerd.grpc.v1.cri".containerd.default_runtime]
        base_runtime_spec = ""
        cni_conf_dir = ""
        cni_max_conf_num = 0
        container_annotations = []
        pod_annotations = []
        privileged_without_host_devices = false
        privileged_without_host_devices_all_devices_allowed = false
        runtime_engine = ""
        runtime_path = ""
        runtime_root = ""
        runtime_type = "io.containerd.runtime.v1.linux"
        sandbox_mode = ""
        snapshotter = ""
systemctl restart containerd
crictl images               
IMAGE               TAG                 IMAGE ID            SIZE

Another approach is to disable the cri plugin; I have not tried it yet.

cat /etc/containerd/config.toml |grep cri
sed -i -r '/cri/s/(.*)/#\1/' /etc/containerd/config.toml

Creating the cluster

There are currently two main ways to deploy a production Kubernetes cluster:

    1. kubeadm
      Kubeadm is a K8s deployment tool that provides kubeadm init and kubeadm join for standing up a Kubernetes cluster quickly.
      Official docs: https://kubernetes.io/docs/reference/setup-tools/kubeadm/kubeadm/
    2. Binary packages
      Download the release binaries from GitHub and deploy every component by hand to assemble the cluster.
      Kubeadm lowers the barrier to entry but hides many details, which makes problems hard to troubleshoot. If you want more control, deploying from binary packages is recommended: it is more work, but you learn a lot about how the pieces fit together, which also helps with later maintenance.

kubeadm installation

Initializing the cluster

The following steps only need to be run on the master node.

systemctl start kubelet && systemctl enable kubelet && systemctl is-active kubelet
kubeadm config print init-defaults | tee kubernetes-init.yaml

Creating the cluster with kubeadm

Reference: https://kubernetes.io/zh-cn/docs/reference/setup-tools/kubeadm/kubeadm-init/

A. Command line:
kubeadm init \
--kubernetes-version=v1.28.2 \
--image-repository registry.aliyuncs.com/google_containers \
--pod-network-cidr=10.244.0.0/16 \
--service-cidr=10.96.0.0/16 \
--apiserver-advertise-address=192.168.244.4 \
--cri-socket unix:///var/run/containerd/containerd.sock
kubeadm init option reference

--kubernetes-version string (default: "stable-1")
    Choose a specific Kubernetes version for the control plane.
--image-repository string (default: "registry.k8s.io")
    Choose the container registry to pull control-plane images from.
--pod-network-cidr string
    The IP address range usable by the Pod network. If set, the control plane automatically allocates a CIDR to every node.
--service-cidr string (default: "10.96.0.0/12")
    Use an alternative IP address range for the service virtual IPs.
--apiserver-advertise-address string
    The IP address the API server advertises that it is listening on. If unset, the default network interface is used.
--cri-socket string
    Path of the CRI socket to connect to. If empty, kubeadm tries to auto-detect it; set this only when several CRIs are installed or the socket path is non-standard.

--dry-run
    Do not apply any changes; only print what would be done.
--node-name string
    Specify the node name.
--service-dns-domain string (default: "cluster.local")
    Use an alternative domain for services, e.g. "myorg.internal".

B. Configuration file

Passing everything on the command line quickly becomes verbose, especially with many parameters. Instead, write all the settings into a configuration file and point the deployment at that file; here it is assumed to be named kubernetes-init.yaml.
Write the configuration file, then run kubeadm init against it:

vi kubernetes-init.yaml

apiVersion: kubeadm.k8s.io/v1beta3
bootstrapTokens:
- groups:
  - system:bootstrappers:kubeadm:default-node-token
  token: abcdef.0123456789abcdef
  ttl: 24h0m0s
  usages:
  - signing
  - authentication
kind: InitConfiguration
localAPIEndpoint:
  advertiseAddress: 192.168.244.4
  bindPort: 6443
nodeRegistration:
  criSocket: unix:///var/run/containerd/containerd.sock
  imagePullPolicy: IfNotPresent
  name: node
  taints: null
---
apiServer:
  timeoutForControlPlane: 4m0s
apiVersion: kubeadm.k8s.io/v1beta3
certificatesDir: /etc/kubernetes/pki
clusterName: kubernetes
controllerManager: {}
dns: {}
etcd:
  local:
    dataDir: /var/lib/etcd
imageRepository: registry.aliyuncs.com/google_containers
kind: ClusterConfiguration
kubernetesVersion: 1.28.2
networking:
  dnsDomain: cluster.local
  podSubnet: 10.244.0.0/16   # match --pod-network-cidr and the flannel Network value below
  serviceSubnet: 10.96.0.0/16
scheduler: {}

kubeadm init --config kubernetes-init.yaml

Inspect the kubelet systemd drop-in

cat /usr/lib/systemd/system/kubelet.service.d/10-kubeadm.conf
# Note: This dropin only works with kubeadm and kubelet v1.11+
[Service]
Environment="KUBELET_KUBECONFIG_ARGS=--bootstrap-kubeconfig=/etc/kubernetes/bootstrap-kubelet.conf --kubeconfig=/etc/kubernetes/kubelet.conf"
Environment="KUBELET_CONFIG_ARGS=--config=/var/lib/kubelet/config.yaml"
# This is a file that "kubeadm init" and "kubeadm join" generates at runtime, populating the KUBELET_KUBEADM_ARGS variable dynamically
EnvironmentFile=-/var/lib/kubelet/kubeadm-flags.env
# This is a file that the user can use for overrides of the kubelet args as a last resort. Preferably, the user should use
# the .NodeRegistration.KubeletExtraArgs object in the configuration files instead. KUBELET_EXTRA_ARGS should be sourced from this file.
EnvironmentFile=-/etc/sysconfig/kubelet
ExecStart=
ExecStart=/usr/bin/kubelet $KUBELET_KUBECONFIG_ARGS $KUBELET_CONFIG_ARGS $KUBELET_KUBEADM_ARGS $KUBELET_EXTRA_ARGS

Preventing resource avalanches: constrain resources for Pods, k8s system components and Linux system daemons

https://kubernetes.io/zh-cn/docs/concepts/scheduling-eviction/node-pressure-eviction/
When a worker node's disk or memory usage reaches the configured thresholds, Pod eviction is triggered.
On a worker node, editing /var/lib/kubelet/config.yaml alone has no effect; edit /var/lib/kubelet/kubeadm-flags.env instead (a sketch follows below).

The default eviction thresholds are:
--eviction-hard default:
imagefs.available<15%,memory.available<100Mi,nodefs.available<10%,nodefs.inodesFree<5%
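A minimal sketch of the kubeadm-flags.env route, assuming you keep whatever flags kubeadm already wrote in that file and only append the kubelet eviction options (the thresholds mirror the config.yaml example below and are only illustrative):

vi /var/lib/kubelet/kubeadm-flags.env
# append to the existing KUBELET_KUBEADM_ARGS value, e.g.:
#   --eviction-hard=memory.available<300Mi,nodefs.available<10%
#   --eviction-soft=memory.available<800Mi,nodefs.available<15%
#   --eviction-soft-grace-period=memory.available=120s,nodefs.available=120s
systemctl restart kubelet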

vi /var/lib/kubelet/config.yaml
This only constrains Pods; in production, raise memory.available to leave memory headroom for kube and system components.
For example, hard eviction at 1 GiB and soft eviction at 2 GiB.

enforceNodeAllocatable:
- pods
evictionHard:
  memory.available: "300Mi"
  nodefs.available: "10%"
  nodefs.inodesFree: "5%"
  imagefs.available: "10%"
evictionMinimumReclaim:
  memory.available: "0Mi"
  nodefs.available: "500Mi"
  imagefs.available: "2Gi"
evictionSoft:
  memory.available: "800Mi"
  nodefs.available: "15%"
  nodefs.inodesFree: "10%"
  imagefs.available: "15%"
evictionSoftGracePeriod:
  memory.available: "120s"
  nodefs.available: "120s"
  nodefs.inodesFree: "120s"
  imagefs.available: "120s"
evictionMaxPodGracePeriod: 30

systemctl restart kubelet
systemctl status kubelet

Error
grace period must be specified for the soft eviction threshold nodefs.available"
If evictionSoft is configured, the corresponding evictionSoftGracePeriod must be configured as well.

evictionSoft:
  nodefs.available: "15%"
  nodefs.inodesFree: "10%"
  imagefs.available: "15%"
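Each signal listed under evictionSoft needs a matching entry under evictionSoftGracePeriod; for the example above that would be something like:

evictionSoftGracePeriod:
  nodefs.available: "120s"
  nodefs.inodesFree: "120s"
  imagefs.available: "120s"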
  • enforceNodeAllocatable ([]string)
    This flag sets the node-allocatable enforcement policies the kubelet applies. It accepts a list of options; the accepted options are none, pods, system-reserved and kube-reserved.
    If none is set, the list must not contain any other option.
    If the list contains system-reserved, systemReservedCgroup must be set.
    If the list contains kube-reserved, kubeReservedCgroup must be set.
    This field is only honored when cgroupsPerQOS is set to true.
    Default: ["pods"]
    (A reservation sketch follows after this list.)

  • evictionHard (map[string]string)
    evictionHard is a map of signal names to hard eviction thresholds, for example {"memory.available": "300Mi"}. To disable a threshold explicitly, set it to 0% or 100% for that resource.
    Defaults:
    memory.available: "100Mi"
    nodefs.available: "10%"
    nodefs.inodesFree: "5%"
    imagefs.available: "15%"

  • evictionSoft (map[string]string)
    evictionSoft is a map of signal names to soft eviction thresholds, for example {"memory.available": "300Mi"}.
    Default: nil

  • evictionSoftGracePeriod (map[string]string)
    evictionSoftGracePeriod is a map of signal names to the grace period for each soft eviction signal, for example {"memory.available": "30s"}.
    Default: nil

  • evictionPressureTransitionPeriod
    evictionPressureTransitionPeriod is how long the kubelet must wait before leaving an eviction pressure condition.
    Default: "5m"

  • evictionMaxPodGracePeriod (int32)
    evictionMaxPodGracePeriod is the maximum grace period (in seconds) granted to Pods terminated because a soft eviction threshold was reached. It effectively caps the terminationGracePeriodSeconds a Pod gets during a soft eviction.
    Note: due to Issue #64530 there is a defect where this value overrides a Pod's own grace period during soft eviction, possibly extending the grace period originally set on the Pod. This will be fixed in a future release.
    Default: 0

  • evictionMinimumReclaim (map[string]string)
    evictionMinimumReclaim is a map of signal names to minimum reclaim amounts: when Pods are evicted under resource pressure, it is the minimum amount of the given resource the kubelet will reclaim, for example {"imagefs.available": "2Gi"}.
    Default: nil
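A minimal sketch of enforcing kube/system reservations in addition to pods. The field names are from KubeletConfiguration; the cgroup names and reservation sizes are assumptions for illustration, and the named cgroups must already exist on the node (the kubelet does not create them):

enforceNodeAllocatable:
- pods
- kube-reserved
- system-reserved
kubeReservedCgroup: "/kube.slice"       # assumed cgroup, create it beforehand
systemReservedCgroup: "/system.slice"   # default systemd slice for system daemons
kubeReserved:
  cpu: "200m"
  memory: "512Mi"
systemReserved:
  cpu: "200m"
  memory: "512Mi"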

Create the required files as prompted

mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config
export KUBECONFIG=/etc/kubernetes/admin.conf

After this step, run docker images to list the images on the system; you can see that all the required images are present.

Note down this output; it is needed later when joining nodes.

kubeadm join 192.168.244.4:6443 --token rynvll.2rdb5z78if3mtlgd \
        --discovery-token-ca-cert-hash sha256:42739a7aaff927af9dc3b77a5e684a93d1c6485a79b8d23c33d978476c6902e2 

Deleting or re-initializing the cluster

kubeadm reset
rm -fr ~/.kube/ /etc/kubernetes/* /var/lib/etcd/*

The worker nodes can also run kubectl; if you reset the cluster, delete their copies of the config as well.

scp -r ~/.kube node01.k8s.local:~/
scp -r ~/.kube node02.k8s.local:~/

Joining nodes to the cluster

Join the worker nodes to the cluster on the master (these commands are run on the worker nodes).

yum install --setopt=obsoletes=0 kubelet-1.28.2  kubeadm-1.28.2  kubectl-1.28.2 

 Package                                                    Arch                                       Version                                           Repository                                      Size
==============================================================================================================================================================================================================
Installing:
 kubeadm                                                    x86_64                                     1.28.2-0                                          kubernetes                                      11 M
 kubectl                                                    x86_64                                     1.28.2-0                                          kubernetes                                      11 M
 kubelet                                                    x86_64                                     1.28.2-0                                          kubernetes                                      21 M
Installing for dependencies:
 conntrack-tools                                            x86_64                                     1.4.4-7.el7                                       base                                           187 k
 cri-tools                                                  x86_64                                     1.26.0-0                                          kubernetes                                     8.6 M
 kubernetes-cni                                             x86_64                                     1.2.0-0                                           kubernetes                                      17 M
 libnetfilter_cthelper                                      x86_64                                     1.0.0-11.el7                                      base                                            18 k
 libnetfilter_cttimeout                                     x86_64                                     1.0.0-7.el7                                       base                                            18 k
 libnetfilter_queue                                         x86_64                                     1.0.2-2.el7_2                                     base                                            23 k
 socat                                                      x86_64                                     1.7.3.2-2.el7                                     base                                           290 k

Transaction Summary
==============================================================================================================================================================================================================
systemctl enable --now kubelet
Created symlink from /etc/systemd/system/multi-user.target.wants/kubelet.service to /usr/lib/systemd/system/kubelet.service.

Join node1 to the cluster

If you do not have a token, you can get one by running the following on the control-plane node:

kubeadm token list
# By default a token expires after 24 hours. To join a node after the current token has expired, create a new one:
#kubeadm token create
kubeadm token create --print-join-command

kubeadm join 192.168.244.4:6443 --token rynvll.2rdb5z78if3mtlgd \
        --discovery-token-ca-cert-hash sha256:42739a7aaff927af9dc3b77a5e684a93d1c6485a79b8d23c33d978476c6902e2 

[preflight] Running pre-flight checks
[preflight] Reading configuration from the cluster...
[preflight] FYI: You can look at this config file with 'kubectl -n kube-system get cm kubeadm-config -o yaml'
[kubelet-start] Writing kubelet configuration to file "/var/lib/kubelet/config.yaml"
[kubelet-start] Writing kubelet environment file with flags to file "/var/lib/kubelet/kubeadm-flags.env"
[kubelet-start] Starting the kubelet
[kubelet-start] Waiting for the kubelet to perform the TLS Bootstrap...

This node has joined the cluster:
* Certificate signing request was sent to apiserver and a response was received.
* The Kubelet was informed of the new secure connection details.

Run 'kubectl get nodes' on the control-plane to see this node join the cluster.

将node2节点加入集群

kubeadm join 192.168.244.4:6443 --token rynvll.2rdb5z78if3mtlgd \
        --discovery-token-ca-cert-hash sha256:42739a7aaff927af9dc3b77a5e684a93d1c6485a79b8d23c33d978476c6902e2 

Check the cluster status. The nodes show NotReady at this point because no network plugin has been configured yet.

kubectl get nodes
NAME                 STATUS     ROLES           AGE    VERSION
master01.k8s.local   NotReady   control-plane   5m2s   v1.28.2
node01.k8s.local     NotReady   <none>          37s    v1.28.2
kubectl get pods -A
NAMESPACE     NAME                                         READY   STATUS    RESTARTS   AGE
kube-system   coredns-66f779496c-9x9cf                     0/1     Pending   0          5m21s
kube-system   coredns-66f779496c-xwgtx                     0/1     Pending   0          5m21s
kube-system   etcd-master01.k8s.local                      1/1     Running   0          5m34s
kube-system   kube-apiserver-master01.k8s.local            1/1     Running   0          5m34s
kube-system   kube-controller-manager-master01.k8s.local   1/1     Running   0          5m34s
kube-system   kube-proxy-58mfp                             1/1     Running   0          72s
kube-system   kube-proxy-z9tpc                             1/1     Running   0          5m21s
kube-system   kube-scheduler-master01.k8s.local            1/1     Running   0          5m36s

Removing a worker node (run on the master)

# Suppose we are removing worker node node3
kubectl drain node3 --delete-emptydir-data --force --ignore-daemonsets
kubectl delete node node3

Then, on the removed worker, reset k8s (the reset deletes some configuration files); here this is done on node3.

kubeadm reset

Then, on the removed worker, manually delete the k8s config files, the flannel network configuration and the flannel network interfaces:

rm -rf /etc/cni/net.d/
rm -rf /root/.kube/config
# Remove the cni network interfaces
ifconfig cni0 down
ip link delete cni0
ifconfig flannel.1 down
ip link delete flannel.1

Installing a Pod network plugin

flannel/calico/cilium

To switch from the flannel network to the calico network:
the kubelet configuration must add the --network-plugin=cni option;
kube-proxy must not be started with --masquerade-all, because that conflicts with Calico policy, and it needs --proxy-mode=ipvs (ipvs mode); --masquerade-all=true means the ipvs proxier masquerades all traffic destined for service cluster IPs. (A kube-proxy sketch follows below.)
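On a kubeadm cluster, a minimal sketch of switching kube-proxy to ipvs mode (the ConfigMap name and config.conf key are what kubeadm generates; verify against your cluster):

kubectl -n kube-system edit configmap kube-proxy
# in the config.conf section set:
#   mode: "ipvs"
# then recreate the kube-proxy pods so they pick up the change
kubectl -n kube-system rollout restart daemonset kube-proxy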

Installing flannel

# It is best to pull the images ahead of time (on all nodes)
wget -k https://raw.githubusercontent.com/coreos/flannel/master/Documentation/kube-flannel.yml

---
kind: Namespace
apiVersion: v1
metadata:
  name: kube-flannel
  labels:
    k8s-app: flannel
    pod-security.kubernetes.io/enforce: privileged
---
kind: ClusterRole
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  labels:
    k8s-app: flannel
  name: flannel
rules:
- apiGroups:
  - ""
  resources:
  - pods
  verbs:
  - get
- apiGroups:
  - ""
  resources:
  - nodes
  verbs:
  - get
  - list
  - watch
- apiGroups:
  - ""
  resources:
  - nodes/status
  verbs:
  - patch
- apiGroups:
  - networking.k8s.io
  resources:
  - clustercidrs
  verbs:
  - list
  - watch
---
kind: ClusterRoleBinding
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  labels:
    k8s-app: flannel
  name: flannel
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: flannel
subjects:
- kind: ServiceAccount
  name: flannel
  namespace: kube-flannel
---
apiVersion: v1
kind: ServiceAccount
metadata:
  labels:
    k8s-app: flannel
  name: flannel
  namespace: kube-flannel
---
kind: ConfigMap
apiVersion: v1
metadata:
  name: kube-flannel-cfg
  namespace: kube-flannel
  labels:
    tier: node
    k8s-app: flannel
    app: flannel
data:
  cni-conf.json: |
    {
      "name": "cbr0",
      "cniVersion": "0.3.1",
      "plugins": [
        {
          "type": "flannel",
          "delegate": {
            "hairpinMode": true,
            "isDefaultGateway": true
          }
        },
        {
          "type": "portmap",
          "capabilities": {
            "portMappings": true
          }
        }
      ]
    }
  net-conf.json: |
    {
      "Network": "10.244.0.0/16",
      "Backend": {
        "Type": "vxlan"
      }
    }
---
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: kube-flannel-ds
  namespace: kube-flannel
  labels:
    tier: node
    app: flannel
    k8s-app: flannel
spec:
  selector:
    matchLabels:
      app: flannel
  template:
    metadata:
      labels:
        tier: node
        app: flannel
    spec:
      affinity:
        nodeAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
            nodeSelectorTerms:
            - matchExpressions:
              - key: kubernetes.io/os
                operator: In
                values:
                - linux
      hostNetwork: true
      priorityClassName: system-node-critical
      tolerations:
      - operator: Exists
        effect: NoSchedule
      serviceAccountName: flannel
      initContainers:
      - name: install-cni-plugin
        image: docker.io/flannel/flannel-cni-plugin:v1.2.0
        command:
        - cp
        args:
        - -f
        - /flannel
        - /opt/cni/bin/flannel
        volumeMounts:
        - name: cni-plugin
          mountPath: /opt/cni/bin
      - name: install-cni
        image: docker.io/flannel/flannel:v0.22.3
        command:
        - cp
        args:
        - -f
        - /etc/kube-flannel/cni-conf.json
        - /etc/cni/net.d/10-flannel.conflist
        volumeMounts:
        - name: cni
          mountPath: /etc/cni/net.d
        - name: flannel-cfg
          mountPath: /etc/kube-flannel/
      containers:
      - name: kube-flannel
        image: docker.io/flannel/flannel:v0.22.3
        command:
        - /opt/bin/flanneld
        args:
        - --ip-masq
        - --kube-subnet-mgr
        resources:
          requests:
            cpu: "100m"
            memory: "50Mi"
        securityContext:
          privileged: false
          capabilities:
            add: ["NET_ADMIN", "NET_RAW"]
        env:
        - name: POD_NAME
          valueFrom:
            fieldRef:
              fieldPath: metadata.name
        - name: POD_NAMESPACE
          valueFrom:
            fieldRef:
              fieldPath: metadata.namespace
        - name: EVENT_QUEUE_DEPTH
          value: "5000"
        volumeMounts:
        - name: run
          mountPath: /run/flannel
        - name: flannel-cfg
          mountPath: /etc/kube-flannel/
        - name: xtables-lock
          mountPath: /run/xtables.lock
      volumes:
      - name: run
        hostPath:
          path: /run/flannel
      - name: cni-plugin
        hostPath:
          path: /opt/cni/bin
      - name: cni
        hostPath:
          path: /etc/cni/net.d
      - name: flannel-cfg
        configMap:
          name: kube-flannel-cfg
      - name: xtables-lock
        hostPath:
          path: /run/xtables.lock
          type: FileOrCreate

Change the value of the Network field to match the --pod-network-cidr value.

vxlan / host-gw
host-gw loses only about 10% performance, while all VXLAN "tunnel"-based network schemes lose around 20%-30%.
Flannel host-gw mode requires layer-2 connectivity between the cluster hosts, i.e. node1 and node2 are on the same LAN and can reach each other via ARP.
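For reference, a host-gw backend would be configured in net-conf.json roughly like this (only usable when the nodes share an L2 network):

  net-conf.json: |
    {
      "Network": "10.244.0.0/16",
      "Backend": {
        "Type": "host-gw"
      }
    }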

The preferred mode is vxlan with DirectRouting:

  net-conf.json: |
    {
      "Network": "10.244.0.0/16",
      "Backend": {
        "Type": "vxlan",
        "Directrouting": true
      }
    }

Preparing the images

Because of intermittent network issues in mainland China, you may need to change the image addresses, replacing every docker.io with dockerproxy.com, or pull the images first and retag them.

Method 1: change the addresses

Edit the image addresses in kube-flannel.yml to ones you can reach:

image: docker.io/flannel/flannel-cni-plugin:v1.2.0
image: docker.io/flannel/flannel:v0.22.3
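A one-line sketch of that substitution, using the dockerproxy.com mirror mentioned above (swap in whatever mirror you actually use):

sed -i 's#docker.io/flannel#dockerproxy.com/flannel#g' kube-flannel.yml
grep 'image:' kube-flannel.yml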
Method 2: download the images locally
docker pull docker.io/flannel/flannel-cni-plugin:v1.2.0
docker pull docker.io/flannel/flannel:v0.22.3

Download locally with containerd:

ctr -n k8s.io i pull -k docker.io/flannel/flannel-cni-plugin:v1.2.0
ctr -n k8s.io i pull -k docker.io/flannel/flannel:v0.22.3
Method 3: download the images, then push them to the local private registry

Batch retagging.
If you run this on the Harbor host, you can export the list to images.txt first and then run the script.

# cat docker_flannel.sh 
#!/bin/bash
imagesfile=images.txt
$(grep image kube-flannel.yml | grep -v '#' | awk -F '/' '{print $NF}' > ${imagesfile})
images=$(cat ${imagesfile})

for i in ${images}
do
docker pull flannel/$i
docker tag flannel/$i repo.k8s.local:5100/google_containers/$i
docker push repo.k8s.local:5100/google_containers/$i
docker rmi flannel/$i
done
# Run the script:
sh docker_flannel.sh
Check the images
crictl images 
IMAGE                                                                         TAG                 IMAGE ID            SIZE
docker.io/flannel/flannel-cni-plugin                                          v1.2.0              a55d1bad692b7       3.88MB
docker.io/flannel/flannel                                                     v0.22.3             e23f7ca36333c       27MB
Apply the manifest
kubectl apply -f kube-flannel.yml

namespace/kube-flannel created
clusterrole.rbac.authorization.k8s.io/flannel created
clusterrolebinding.rbac.authorization.k8s.io/flannel created
serviceaccount/flannel created
configmap/kube-flannel-cfg created
daemonset.apps/kube-flannel-ds created
Check the pods
kubectl get pods -n kube-flannel
NAME                    READY   STATUS    RESTARTS   AGE
kube-flannel-ds-flgpn   1/1     Running   0          58s
kube-flannel-ds-qdw64   1/1     Running   0          58s
Verify the installation. Run this on the master only; with a binary tar.gz installation it has to be done on every node.
kubectl get pods -n kube-system
NAME                                         READY   STATUS    RESTARTS   AGE
coredns-66f779496c-9x9cf                     1/1     Running   0          118m
coredns-66f779496c-xwgtx                     1/1     Running   0          118m
etcd-master01.k8s.local                      1/1     Running   0          118m
kube-apiserver-master01.k8s.local            1/1     Running   0          118m
kube-controller-manager-master01.k8s.local   1/1     Running   0          118m
kube-proxy-58mfp                             1/1     Running   0          114m
kube-proxy-z9tpc                             1/1     Running   0          118m
kube-scheduler-master01.k8s.local            1/1     Running   0          118m

systemctl status containerd
systemctl status kubelet
systemctl restart kubelet
kubectl get nodes

run.go:74] "command failed" err="failed to run Kubelet: validate service connection: validate CRI v1 runtime API for endpoint \"unix:///var/run/containerd/containerd.sock\

Check the containerd service for errors

failed to load plugin io.containerd.grpc.v1.cri" error="invalid plugin config: `mirrors` cannot be set when `config_path` is provided"

In /etc/containerd/config.toml, configure either config_path or registry.mirrors, not both:

    [plugins."io.containerd.grpc.v1.cri".registry]
      #config_path = "/etc/containerd/certs.d"

      [plugins."io.containerd.grpc.v1.cri".registry.mirrors."docker.io"]
         endpoint = ["https://registry.cn-hangzhou.aliyuncs.com"]
transport: Error while dialing dial unix /run/containerd/containerd.sock: connect: no such file or directory

In /usr/lib/systemd/system/kubelet.service, change the Unit section's After= to After=containerd.service (or to After=docker.service if you use docker).
On my machine the value before the change was After=network-online.target.
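A sketch of the edited unit, assuming the stock file layout; only the After= line changes, and systemd must be reloaded afterwards:

# /usr/lib/systemd/system/kubelet.service
[Unit]
After=containerd.service    # was After=network-online.target; use docker.service when running docker

systemctl daemon-reload
systemctl restart kubelet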

err="failed to run Kubelet: running with swap on is not supported, please disable swap

Permanently disable swap

swapoff -a && sed -ri 's/.*swap.*/#&/' /etc/fstab
"Failed to run kubelet" err="failed to run Kubelet: misconfiguration: kubelet cgroup driver: \"systemd\" is different from docker cgroup driver: \"cgroupfs\""

Analysis shows this happens because Kubernetes defaults its cgroup driver to systemd while the docker service uses cgroupfs. There are two ways to resolve it: either change docker's configuration to match Kubernetes, or change the Kubernetes configuration to cgroupfs. The first approach is used here.
Edit the docker service configuration file /etc/docker/daemon.json and add:
"exec-opts": ["native.cgroupdriver=systemd"]

Let the worker nodes run kubectl

scp -r ~/.kube node02.k8s.local:~/

Skip this for now; configure it after installing shell completion.

vi ~/.bashrc

source <(kubectl completion bash)
command -v kubecolor >/dev/null 2>&1 && alias k="kubecolor"
complete -o default -F __start_kubectl k
export PATH="/root/.krew/bin:/usr/local/sbin:/usr/local/bin:/sbin:/bin:/usr/sbin:/usr/bin:/root/bin"

[ -f ~/.kube/aliases.sh ] && source ~/.kube/aliases.sh

source ~/.bashrc
