5. Kubernetes Cluster Initialization
Install the Kubernetes components
Run this on all nodes.
The default Kubernetes package repository is hosted overseas and is slow to reach from within China, so switch to a domestic mirror.
Create /etc/yum.repos.d/kubernetes.repo with the following content:
cat > /etc/yum.repos.d/kubernetes.repo << EOF
[kubernetes]
name=Kubernetes
baseurl=http://mirrors.aliyun.com/kubernetes/yum/repos/kubernetes-el7-x86_64
enabled=1
gpgcheck=0
repo_gpgcheck=0
gpgkey=http://mirrors.aliyun.com/kubernetes/yum/doc/yum-key.gpg
http://mirrors.aliyun.com/kubernetes/yum/doc/rpm-package-key.gpg
EOF
wget -P /etc/yum.repos.d/ http://mirrors.aliyun.com/repo/epel-archive-6.repo
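After adding the repo, it helps to refresh the yum cache and confirm that the pinned version is actually available; an optional sanity check:
yum clean all && yum makecache fast
yum list kubelet --showduplicates | grep 1.28.2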
Install kubeadm, kubelet and kubectl
The version is pinned to 1.28.2 here; without an explicit version, yum installs the latest available release.
yum install --setopt=obsoletes=0 kubelet-1.28.2 kubeadm-1.28.2 kubectl-1.28.2
Configure the kubelet cgroup driver
# Edit /etc/sysconfig/kubelet and add the following configuration
cat > /etc/sysconfig/kubelet << EOF
KUBELET_CGROUP_ARGS="--cgroup-driver=systemd"
KUBE_PROXY_MODE="ipvs"
EOF
Enable kubelet to start on boot
systemctl enable kubelet
kubeadm version
kubeadm version: &version.Info{Major:"1", Minor:"28", GitVersion:"v1.28.2", GitCommit:"89a4ea3e1e4ddd7f7572286090359983e0387b2f", GitTreeState:"clean", BuildDate:"2023-09-13T09:34:32Z", GoVersion:"go1.20.8", Compiler:"gc", Platform:"linux/amd64"}
Prepare the cluster images
The following steps are performed on the master node. If you can reach http://k8s.gcr.io directly, you can skip this step.
The idea is to download the images in advance: the default pull address k8s.gcr.io is not reachable from inside China,
so prepare the images the cluster needs before installing it. The required images can be listed with the command below.
# List the image versions that need to be downloaded
kubeadm config images list
registry.k8s.io/kube-apiserver:v1.28.2
registry.k8s.io/kube-controller-manager:v1.28.2
registry.k8s.io/kube-scheduler:v1.28.2
registry.k8s.io/kube-proxy:v1.28.2
registry.k8s.io/pause:3.9
registry.k8s.io/etcd:3.5.9-0
registry.k8s.io/coredns/coredns:v1.10.1
Option 1
It is recommended to do the image work on the Harbor registry host: docker is more convenient to operate than containerd.
It saves bandwidth, time and disk space, does not expose passwords, and the commands are simpler.
docker images
Add the private registry address to docker's configuration (these keys go inside the top-level JSON object of daemon.json):
vi /etc/docker/daemon.json
"registry-mirrors":["http://hub-mirror.c.163.com"],
"insecure-registries":["repo.k8s.local:5100"],
Restart docker:
systemctl daemon-reload
systemctl restart docker
docker info
Client: Docker Engine - Community
Version: 24.0.6
Log in to Harbor
Log in with a developer or maintainer account created in Harbor:
docker login http://repo.k8s.local:5100
Username: k8s_user1
Password:
WARNING! Your password will be stored unencrypted in /root/.docker/config.json.
Configure a credential helper to remove this warning. See
https://docs.docker.com/engine/reference/commandline/login/#credentials-store
Login Succeeded
Once logged in successfully, you will not need to log in again next time.
Test
Manually push to the private registry:
docker pull registry.aliyuncs.com/google_containers/kube-apiserver:v1.28.2
docker tag registry.aliyuncs.com/google_containers/kube-apiserver:v1.28.2 repo.k8s.local:5100/google_containers/kube-apiserver:v1.28.2
docker push repo.k8s.local:5100/google_containers/kube-apiserver:v1.28.2
docker pull busybox
docker tag busybox repo.k8s.local:5100/google_containers/busybox:9.9
docker images |tail -1
docker push repo.k8s.local:5100/google_containers/busybox:9.9
Check in Harbor whether the corresponding image is there; if it is, the push succeeded.
On the master, test pulling from the private registry:
ctr -n k8s.io i pull -k --plain-http repo.k8s.local:5100/google_containers/busybox:9.9
ctr -n k8s.io images ls |grep busybox
repo.k8s.local:5100/google_containers/busybox:9.9 application/vnd.docker.distribution.manifest.v2+json sha256:023917ec6a886d0e8e15f28fb543515a5fcd8d938edb091e8147db4efed388ee 2.1 MiB linux/amd64
Batch retagging
Use this if you do not want to modify the deployment yml files.
# Pull the images, retag them with the official names, then delete the Aliyun-tagged originals
#!/bin/bash
images=$(kubeadm config images list --kubernetes-version=1.28.2 | awk -F'/' '{print $NF}')
for imageName in ${images[@]} ; do
docker pull registry.cn-hangzhou.aliyuncs.com/google_containers/$imageName
docker tag registry.cn-hangzhou.aliyuncs.com/google_containers/$imageName registry.k8s.io/$imageName
docker rmi registry.cn-hangzhou.aliyuncs.com/google_containers/$imageName
done
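An optional quick check that the retagged images now carry the registry.k8s.io names kubeadm expects:
docker images | grep registry.k8s.io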
Batch retag and push to the private registry
The google_containers project has already been created under repo.k8s.local:5100.
If you run this on the Harbor host, export the image list from the master into a local images.txt, comment out the kubeadm line in the script, and read images.txt directly.
# Pull the images, retag them for the private registry, push them, then delete the Aliyun-tagged originals
vi docker_images.sh
#!/bin/bash
imagesfile=images.txt
$(kubeadm config images list --kubernetes-version=1.28.2 | awk -F'/' '{print $NF}' > ${imagesfile})
images=$(cat ${imagesfile})
for i in ${images}
do
echo ${i}
docker pull registry.aliyuncs.com/google_containers/$i
docker tag registry.aliyuncs.com/google_containers/$i repo.k8s.local:5100/google_containers/$i
docker push repo.k8s.local:5100/google_containers/$i
docker rmi registry.aliyuncs.com/google_containers/$i
done
chmod +x docker_images.sh
sh docker_images.sh
kubeadm config images list --kubernetes-version=1.28.2 | awk -F'/' '{print $NF}' > images.txt
Check the downloaded images
docker images
Option 2
Work with containerd directly on the master and node hosts.
containerd's image import commands differ from docker's; pulling images for a push to the private registry requires --all-platforms, which consumes a lot of disk space and bandwidth.
# After importing, the images can be listed with crictl; ctr can list them too, but its output is less readable
crictl images
ctr -n k8s.io images ls
ctr -n k8s.io image import <image-tarball>   # or push to your own image registry and pull from there
# Test retagging to the official name
ctr -n k8s.io i pull -k registry.cn-hangzhou.aliyuncs.com/google_containers/kube-apiserver:v1.28.2
ctr -n k8s.io i tag registry.cn-hangzhou.aliyuncs.com/google_containers/kube-apiserver:v1.28.2 registry.k8s.io/kube-apiserver:v1.28.2
ctr -n k8s.io i rm registry.cn-hangzhou.aliyuncs.com/google_containers/kube-apiserver:v1.28.2
Test pushing to the private registry; the pull needs --all-platforms:
ctr -n k8s.io i pull -k --all-platforms registry.cn-hangzhou.aliyuncs.com/google_containers/kube-apiserver:v1.28.2
ctr -n k8s.io i tag registry.cn-hangzhou.aliyuncs.com/google_containers/kube-apiserver:v1.28.2 repo.k8s.local:5100/google_containers/kube-apiserver:v1.28.2
ctr -n k8s.io i push -k --user k8s_user1:k8s_Uus1 --plain-http repo.k8s.local:5100/google_containers/kube-apiserver:v1.28.2
Error 1
ctr: content digest sha256:07742a71be5e2ac5dc434618fa720ba38bebb463e3bdc0c58b600b4f7716bc3d: not found
Add --all-platforms when pulling and when exporting images.
Error 2
ctr: failed to do request: Head "https://repo.k8s.local:5100/v2/google_containers/kube-apiserver/blobs/sha256:2248d40e3af29ab47f33357e4ecdc9dca9a89daea07ac3a5a76de583ed06c776": http: server gave HTTP response to HTTPS client
If Harbor does not have HTTPS enabled, add --user ${harboruser}:${harborpwd} --plain-http to the push.
# Pull the images, retag them for the private registry, push them, then delete the Aliyun-tagged originals
vi ctr_images.sh
#!/bin/bash
harboruser=k8s_user1
harborpwd=k8s_Uus1
images=$(kubeadm config images list --kubernetes-version=1.28.2 | awk -F'/' '{print $NF}')
for imageName in ${images[@]} ; do
ctr -n k8s.io i pull -k --all-platforms registry.cn-hangzhou.aliyuncs.com/google_containers/$imageName
#ctr -n k8s.io i tag registry.cn-hangzhou.aliyuncs.com/google_containers/$imageName registry.k8s.io/$imageName
ctr -n k8s.io i tag registry.cn-hangzhou.aliyuncs.com/google_containers/$imageName repo.k8s.local:5100/google_containers/$imageName
ctr -n k8s.io i push --user ${harboruser}:${harborpwd} --plain-http repo.k8s.local:5100/google_containers/$imageName
#ctr -n k8s.io i rm registry.cn-hangzhou.aliyuncs.com/google_containers/$imageName
done
chmod +x ctr_images.sh
./ctr_images.sh
crictl images
FATA[0000] validate service connection: CRI v1 image API is not implemented for endpoint "unix:///var/run/containerd/containerd.sock": rpc error: code = Unimplemented desc = unknown service runtime.v1.ImageService
Fix the error
# Find runtime_type and set it to "io.containerd.runtime.v1.linux"
vi /etc/containerd/config.toml
[plugins."io.containerd.grpc.v1.cri".containerd.default_runtime]
base_runtime_spec = ""
cni_conf_dir = ""
cni_max_conf_num = 0
container_annotations = []
pod_annotations = []
privileged_without_host_devices = false
privileged_without_host_devices_all_devices_allowed = false
runtime_engine = ""
runtime_path = ""
runtime_root = ""
runtime_type = "io.containerd.runtime.v1.linux"
sandbox_mode = ""
snapshotter = ""
systemctl restart containerd
crictl images
IMAGE TAG IMAGE ID SIZE
There is another approach: disable the cri plugin entirely. I have not tried it yet.
cat /etc/containerd/config.toml |grep cri
sed -i -r '/cri/s/(.*)/#\1/' /etc/containerd/config.toml
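To see whether the cri plugin actually loaded after editing config.toml, an optional check (assuming containerd is running) is:
ctr plugins ls | grep cri
# STATUS ok means the plugin loaded; error means kubelet cannot use this containerd as its CRI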
Create the cluster
There are currently two main ways to deploy a production Kubernetes cluster:
- kubeadm
Kubeadm is a Kubernetes deployment tool that provides kubeadm init and kubeadm join for quickly standing up a Kubernetes cluster.
Official docs: https://kubernetes.io/docs/reference/setup-tools/kubeadm/kubeadm/
- Binary packages
Download the release binaries from GitHub and deploy each component by hand to assemble the Kubernetes cluster.
Kubeadm lowers the barrier to entry but hides many details, which makes problems harder to troubleshoot. If you want more control, deploying from binary packages is recommended: it is more manual work, but you learn a lot about how the components fit together, which also helps later maintenance.
kubeadm installation
Initialize the cluster
The following operations only need to be run on the master node.
systemctl start kubelet && systemctl enable kubelet && systemctl is-active kubelet
kubeadm config print init-defaults | tee kubernetes-init.yaml
Creating the cluster with kubeadm
Reference: https://kubernetes.io/zh-cn/docs/reference/setup-tools/kubeadm/kubeadm-init/
A. Command line:
kubeadm init \
--kubernetes-version=v1.28.2 \
--image-repository registry.aliyuncs.com/google_containers \
--pod-network-cidr=10.244.0.0/16 \
--service-cidr=10.96.0.0/16 \
--apiserver-advertise-address=192.168.244.4 \
--cri-socket unix:///var/run/containerd/containerd.sock
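Optionally, the control-plane images can be pre-pulled through the same mirror before running init, so the init itself goes faster:
kubeadm config images pull --kubernetes-version=v1.28.2 --image-repository registry.aliyuncs.com/google_containers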
kubeadm init option reference
--kubernetes-version string   Default: "stable-1"
Choose a specific Kubernetes version for the control plane.
--image-repository string   Default: "registry.k8s.io"
Choose the container registry from which the control-plane images are pulled.
--pod-network-cidr string
Specify the IP address range for the Pod network. If set, the control plane automatically allocates a CIDR to each node.
--service-cidr string   Default: "10.96.0.0/12"
Use an alternative IP address range for Service virtual IPs.
--apiserver-advertise-address string
The IP address the API server advertises it is listening on. If not set, the default network interface is used.
--cri-socket string
Path of the CRI socket to connect to. If empty, kubeadm tries to auto-detect it; only set this when multiple CRIs are installed or the socket path is non-standard.
--dry-run
Do not apply any changes; just print what would be done.
--node-name string
Specify the node name.
--service-dns-domain string   Default: "cluster.local"
Use an alternative domain for services, for example "myorg.internal".
B. Configuration file
Specifying everything on the command line gets verbose, especially with many parameters. Instead, write all the settings into a configuration file and point the deployment at it; here the file is named kubernetes-init.yaml.
Once the configuration file is written, run kubeadm init against it (see below):
vi kubernetes-init.yaml
apiVersion: kubeadm.k8s.io/v1beta3
bootstrapTokens:
- groups:
- system:bootstrappers:kubeadm:default-node-token
token: abcdef.0123456789abcdef
ttl: 24h0m0s
usages:
- signing
- authentication
kind: InitConfiguration
localAPIEndpoint:
advertiseAddress: 192.168.244.4
bindPort: 6443
nodeRegistration:
criSocket: unix:///var/run/containerd/containerd.sock
imagePullPolicy: IfNotPresent
name: node
taints: null
---
apiServer:
timeoutForControlPlane: 4m0s
apiVersion: kubeadm.k8s.io/v1beta3
certificatesDir: /etc/kubernetes/pki
clusterName: kubernetes
controllerManager: {}
dns: {}
etcd:
local:
dataDir: /var/lib/etcd
imageRepository: registry.aliyuncs.com/google_containers
kind: ClusterConfiguration
kubernetesVersion: 1.28.2
networking:
dnsDomain: cluster.local
serviceSubnet: 10.96.0.0/16
scheduler: {}
kubeadm init --config kubernetes-init.yaml
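To validate the configuration file without changing the node, kubeadm also supports a dry run:
kubeadm init --config kubernetes-init.yaml --dry-run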
Check the kubelet service startup configuration
cat /usr/lib/systemd/system/kubelet.service.d/10-kubeadm.conf
# Note: This dropin only works with kubeadm and kubelet v1.11+
[Service]
Environment="KUBELET_KUBECONFIG_ARGS=--bootstrap-kubeconfig=/etc/kubernetes/bootstrap-kubelet.conf --kubeconfig=/etc/kubernetes/kubelet.conf"
Environment="KUBELET_CONFIG_ARGS=--config=/var/lib/kubelet/config.yaml"
# This is a file that "kubeadm init" and "kubeadm join" generates at runtime, populating the KUBELET_KUBEADM_ARGS variable dynamically
EnvironmentFile=-/var/lib/kubelet/kubeadm-flags.env
# This is a file that the user can use for overrides of the kubelet args as a last resort. Preferably, the user should use
# the .NodeRegistration.KubeletExtraArgs object in the configuration files instead. KUBELET_EXTRA_ARGS should be sourced from this file.
EnvironmentFile=-/etc/sysconfig/kubelet
ExecStart=
ExecStart=/usr/bin/kubelet $KUBELET_KUBECONFIG_ARGS $KUBELET_CONFIG_ARGS $KUBELET_KUBEADM_ARGS $KUBELET_EXTRA_ARGS
Guard against resource-exhaustion cascades by constraining pods, Kubernetes system components and Linux system daemons at the same time.
https://kubernetes.io/zh-cn/docs/concepts/scheduling-eviction/node-pressure-eviction/
When disk or memory usage on a worker node reaches the configured thresholds, pod eviction is triggered.
On the worker nodes, editing /var/lib/kubelet/config.yaml did not take effect; /var/lib/kubelet/kubeadm-flags.env had to be modified instead.
The default eviction thresholds are:
--eviction-hard defaults:
imagefs.available<15%,memory.available<100Mi,nodefs.available<10%,nodefs.inodesFree<5%
vi /var/lib/kubelet/config.yaml
This only constrains pods; in production, raise memory.available to leave memory headroom for kube and system components.
Example: hard eviction at 1G, soft eviction at 2G.
enforceNodeAllocatable:
- pods
evictionHard:
memory.available: "300Mi"
nodefs.available: "10%"
nodefs.inodesFree: "5%"
imagefs.available: "10%"
evictionMinimumReclaim:
memory.available: "0Mi"
nodefs.available: "500Mi"
imagefs.available: "2Gi"
evictionSoft:
memory.available: "800Mi"
nodefs.available: "15%"
nodefs.inodesFree: "10%"
imagefs.available: "15%"
evictionSoftGracePeriod:
memory.available: "120s"
nodefs.available: "120s"
nodefs.inodesFree: "120s"
imagefs.available: "120s"
evictionMaxPodGracePeriod: 30
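Beyond the pod-level eviction settings above, memory and CPU headroom for the kubelet/runtime and for OS daemons can be reserved in the same /var/lib/kubelet/config.yaml. A minimal sketch with illustrative values to tune per node size (this only subtracts from the node's allocatable; it does not enforce cgroup limits):
kubeReserved:
  cpu: "200m"
  memory: "512Mi"
systemReserved:
  cpu: "200m"
  memory: "512Mi"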
systemctl restart kubelet
systemctl status kubelet
Error
grace period must be specified for the soft eviction threshold nodefs.available"
If evictionSoft is configured, the corresponding evictionSoftGracePeriod must also be configured:
evictionSoft:
nodefs.available: "15%"
nodefs.inodesFree: "10%"
imagefs.available: "15%"
Field reference (from the KubeletConfiguration documentation):
- enforceNodeAllocatable ([]string)
Sets the node-allocatable enforcement policies the kubelet applies. Accepts a list of options: none, pods, system-reserved and kube-reserved.
If none is set, no other options may be included.
If the list contains system-reserved, systemReservedCgroup must be set.
If the list contains kube-reserved, kubeReservedCgroup must be set.
This field is only honored when cgroupsPerQOS is set to true.
Default: ["pods"]
- evictionHard (map[string]string)
A map from signal names to hard eviction thresholds, for example {"memory.available": "300Mi"}. To disable a threshold explicitly, set it to 0% or 100% for that resource.
Defaults:
memory.available: "100Mi"
nodefs.available: "10%"
nodefs.inodesFree: "5%"
imagefs.available: "15%"
- evictionSoft (map[string]string)
A map from signal names to soft eviction thresholds, for example {"memory.available": "300Mi"}.
Default: nil
- evictionSoftGracePeriod (map[string]string)
A map from signal names to the grace period for each soft eviction signal, for example {"memory.available": "30s"}.
Default: nil
- evictionPressureTransitionPeriod
How long the kubelet must wait before leaving an eviction pressure condition.
Default: "5m"
- evictionMaxPodGracePeriod (int32)
The maximum grace period (in seconds) granted to pods terminated because a soft eviction threshold was reached. This effectively caps the terminationGracePeriodSeconds a pod can get during a soft eviction.
Note: due to Issue #64530, there is a defect where the value set here overrides the pod's own grace period during soft eviction, possibly even lengthening it. This will be fixed in a future release.
Default: 0
- evictionMinimumReclaim (map[string]string)
A map from signal names to minimum reclaim amounts: when pods are evicted under resource pressure, the kubelet reclaims at least this much of the given resource, for example {"imagefs.available": "2Gi"}.
Default: nil
Create the required files as instructed by the init output
mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config
export KUBECONFIG=/etc/kubernetes/admin.conf
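The export only lasts for the current shell; to keep kubectl working after re-login, one common option is:
echo 'export KUBECONFIG=/etc/kubernetes/admin.conf' >> ~/.bashrc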
After this step completes, list the images on the system with docker images (or crictl images on hosts that only run containerd); all the required images should now be present.
Record the join command printed at the end of the init output; it is needed later when joining nodes.
kubeadm join 192.168.244.4:6443 --token rynvll.2rdb5z78if3mtlgd \
--discovery-token-ca-cert-hash sha256:42739a7aaff927af9dc3b77a5e684a93d1c6485a79b8d23c33d978476c6902e2
Delete or reconfigure the cluster
kubeadm reset
rm -fr ~/.kube/ /etc/kubernetes/* /var/lib/etcd/*
The worker nodes can also run kubectl; if you reset the cluster, delete their copies as well:
scp -r ~/.kube node01.k8s.local:~/
scp -r ~/.kube node02.k8s.local:~/
Join the worker nodes to the cluster
Join the node hosts to the cluster created on the master (these commands are run on the worker nodes).
yum install --setopt=obsoletes=0 kubelet-1.28.2 kubeadm-1.28.2 kubectl-1.28.2
Package Arch Version Repository Size
==============================================================================================================================================================================================================
Installing:
kubeadm x86_64 1.28.2-0 kubernetes 11 M
kubectl x86_64 1.28.2-0 kubernetes 11 M
kubelet x86_64 1.28.2-0 kubernetes 21 M
Installing for dependencies:
conntrack-tools x86_64 1.4.4-7.el7 base 187 k
cri-tools x86_64 1.26.0-0 kubernetes 8.6 M
kubernetes-cni x86_64 1.2.0-0 kubernetes 17 M
libnetfilter_cthelper x86_64 1.0.0-11.el7 base 18 k
libnetfilter_cttimeout x86_64 1.0.0-7.el7 base 18 k
libnetfilter_queue x86_64 1.0.2-2.el7_2 base 23 k
socat x86_64 1.7.3.2-2.el7 base 290 k
Transaction Summary
==============================================================================================================================================================================================================
systemctl enable --now kubelet
Created symlink from /etc/systemd/system/multi-user.target.wants/kubelet.service to /usr/lib/systemd/system/kubelet.service.
Join node1 to the cluster
If you do not have the token, obtain one by running the following on the control-plane node:
kubeadm token list
# By default, tokens expire after 24 hours. To join a node to the cluster after the current token has expired, create a new one:
#kubeadm token create
kubeadm token create --print-join-command
kubeadm join 192.168.244.4:6443 --token rynvll.2rdb5z78if3mtlgd \
--discovery-token-ca-cert-hash sha256:42739a7aaff927af9dc3b77a5e684a93d1c6485a79b8d23c33d978476c6902e2
[preflight] Running pre-flight checks
[preflight] Reading configuration from the cluster...
[preflight] FYI: You can look at this config file with 'kubectl -n kube-system get cm kubeadm-config -o yaml'
[kubelet-start] Writing kubelet configuration to file "/var/lib/kubelet/config.yaml"
[kubelet-start] Writing kubelet environment file with flags to file "/var/lib/kubelet/kubeadm-flags.env"
[kubelet-start] Starting the kubelet
[kubelet-start] Waiting for the kubelet to perform the TLS Bootstrap...
This node has joined the cluster:
* Certificate signing request was sent to apiserver and a response was received.
* The Kubelet was informed of the new secure connection details.
Run 'kubectl get nodes' on the control-plane to see this node join the cluster.
Join node2 to the cluster
kubeadm join 192.168.244.4:6443 --token rynvll.2rdb5z78if3mtlgd \
--discovery-token-ca-cert-hash sha256:42739a7aaff927af9dc3b77a5e684a93d1c6485a79b8d23c33d978476c6902e2
Check the cluster status. At this point the nodes show NotReady because no network plugin has been configured yet.
kubectl get nodes
NAME STATUS ROLES AGE VERSION
master01.k8s.local NotReady control-plane 5m2s v1.28.2
node01.k8s.local NotReady <none> 37s v1.28.2
kubectl get pods -A
NAMESPACE NAME READY STATUS RESTARTS AGE
kube-system coredns-66f779496c-9x9cf 0/1 Pending 0 5m21s
kube-system coredns-66f779496c-xwgtx 0/1 Pending 0 5m21s
kube-system etcd-master01.k8s.local 1/1 Running 0 5m34s
kube-system kube-apiserver-master01.k8s.local 1/1 Running 0 5m34s
kube-system kube-controller-manager-master01.k8s.local 1/1 Running 0 5m34s
kube-system kube-proxy-58mfp 1/1 Running 0 72s
kube-system kube-proxy-z9tpc 1/1 Running 0 5m21s
kube-system kube-scheduler-master01.k8s.local 1/1 Running 0 5m36s
Remove a worker node (performed on the master node)
# Suppose we are removing the node3 worker here
kubectl drain node3 --delete-emptydir-data --force --ignore-daemonsets
kubectl delete node node3
Then reset Kubernetes on the removed node (the reset deletes some configuration files); here this is done on node3.
kubeadm reset
Then, still on the removed node, manually delete the Kubernetes config files, the flannel network configuration and the flannel network interfaces:
rm -rf /etc/cni/net.d/
rm -rf /root/.kube/config
# Delete the cni network interfaces
ifconfig cni0 down
ip link delete cni0
ifconfig flannel.1 down
ip link delete flannel.1
Install a Pod network plugin
flannel / calico / cilium
If you need to switch from the flannel network to Calico:
the kubelet configuration must add the --network-plugin=cni option;
kube-proxy must not be started with --masquerade-all, because that conflicts with Calico policy, and it needs --proxy-mode=ipvs (IPVS mode). (--masquerade-all=true means the ipvs proxier masquerades all traffic destined for service cluster IPs.)
Install flannel
# It is best to pull the images in advance (on all nodes)
wget -k https://raw.githubusercontent.com/coreos/flannel/master/Documentation/kube-flannel.yml
---
kind: Namespace
apiVersion: v1
metadata:
name: kube-flannel
labels:
k8s-app: flannel
pod-security.kubernetes.io/enforce: privileged
---
kind: ClusterRole
apiVersion: rbac.authorization.k8s.io/v1
metadata:
labels:
k8s-app: flannel
name: flannel
rules:
- apiGroups:
- ""
resources:
- pods
verbs:
- get
- apiGroups:
- ""
resources:
- nodes
verbs:
- get
- list
- watch
- apiGroups:
- ""
resources:
- nodes/status
verbs:
- patch
- apiGroups:
- networking.k8s.io
resources:
- clustercidrs
verbs:
- list
- watch
---
kind: ClusterRoleBinding
apiVersion: rbac.authorization.k8s.io/v1
metadata:
labels:
k8s-app: flannel
name: flannel
roleRef:
apiGroup: rbac.authorization.k8s.io
kind: ClusterRole
name: flannel
subjects:
- kind: ServiceAccount
name: flannel
namespace: kube-flannel
---
apiVersion: v1
kind: ServiceAccount
metadata:
labels:
k8s-app: flannel
name: flannel
namespace: kube-flannel
---
kind: ConfigMap
apiVersion: v1
metadata:
name: kube-flannel-cfg
namespace: kube-flannel
labels:
tier: node
k8s-app: flannel
app: flannel
data:
cni-conf.json: |
{
"name": "cbr0",
"cniVersion": "0.3.1",
"plugins": [
{
"type": "flannel",
"delegate": {
"hairpinMode": true,
"isDefaultGateway": true
}
},
{
"type": "portmap",
"capabilities": {
"portMappings": true
}
}
]
}
net-conf.json: |
{
"Network": "10.244.0.0/16",
"Backend": {
"Type": "vxlan"
}
}
---
apiVersion: apps/v1
kind: DaemonSet
metadata:
name: kube-flannel-ds
namespace: kube-flannel
labels:
tier: node
app: flannel
k8s-app: flannel
spec:
selector:
matchLabels:
app: flannel
template:
metadata:
labels:
tier: node
app: flannel
spec:
affinity:
nodeAffinity:
requiredDuringSchedulingIgnoredDuringExecution:
nodeSelectorTerms:
- matchExpressions:
- key: kubernetes.io/os
operator: In
values:
- linux
hostNetwork: true
priorityClassName: system-node-critical
tolerations:
- operator: Exists
effect: NoSchedule
serviceAccountName: flannel
initContainers:
- name: install-cni-plugin
image: docker.io/flannel/flannel-cni-plugin:v1.2.0
command:
- cp
args:
- -f
- /flannel
- /opt/cni/bin/flannel
volumeMounts:
- name: cni-plugin
mountPath: /opt/cni/bin
- name: install-cni
image: docker.io/flannel/flannel:v0.22.3
command:
- cp
args:
- -f
- /etc/kube-flannel/cni-conf.json
- /etc/cni/net.d/10-flannel.conflist
volumeMounts:
- name: cni
mountPath: /etc/cni/net.d
- name: flannel-cfg
mountPath: /etc/kube-flannel/
containers:
- name: kube-flannel
image: docker.io/flannel/flannel:v0.22.3
command:
- /opt/bin/flanneld
args:
- --ip-masq
- --kube-subnet-mgr
resources:
requests:
cpu: "100m"
memory: "50Mi"
securityContext:
privileged: false
capabilities:
add: ["NET_ADMIN", "NET_RAW"]
env:
- name: POD_NAME
valueFrom:
fieldRef:
fieldPath: metadata.name
- name: POD_NAMESPACE
valueFrom:
fieldRef:
fieldPath: metadata.namespace
- name: EVENT_QUEUE_DEPTH
value: "5000"
volumeMounts:
- name: run
mountPath: /run/flannel
- name: flannel-cfg
mountPath: /etc/kube-flannel/
- name: xtables-lock
mountPath: /run/xtables.lock
volumes:
- name: run
hostPath:
path: /run/flannel
- name: cni-plugin
hostPath:
path: /opt/cni/bin
- name: cni
hostPath:
path: /etc/cni/net.d
- name: flannel-cfg
configMap:
name: kube-flannel-cfg
- name: xtables-lock
hostPath:
path: /run/xtables.lock
type: FileOrCreate
Change the Network value to match the --pod-network-cidr passed to kubeadm init (both are 10.244.0.0/16 here, so no change is needed).
vxlan / host-gw
host-gw loses roughly 10% in performance, while all the VXLAN "tunnel"-based schemes lose roughly 20%-30%.
Flannel's host-gw mode requires layer-2 connectivity between the cluster hosts, i.e. node1 and node2 are on the same LAN and can reach each other via ARP.
The preferred mode is vxlan with DirectRouting:
net-conf.json: |
{
"Network": "10.244.0.0/16",
"Backend": {
"Type": "vxlan",
"Directrouting": true
}
}
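After changing net-conf.json, flannel has to re-read it. If the change is made in the live ConfigMap rather than in kube-flannel.yml before applying, restarting the DaemonSet (names as in the manifest above) picks it up:
kubectl -n kube-flannel rollout restart daemonset kube-flannel-ds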
Prepare the images
Because of network problems inside China, you may need to change the image addresses: replace every docker.io with dockerproxy.com, or pull the images and retag them locally.
Option 1: change the addresses
Edit kube-flannel.yml so the image addresses point at a registry you can reach:
image: docker.io/flannel/flannel-cni-plugin:v1.2.0
image: docker.io/flannel/flannel:v0.22.3
Option 2: pull the images locally
docker pull docker.io/flannel/flannel-cni-plugin:v1.2.0
docker pull docker.io/flannel/flannel:v0.22.3
Pulling locally with containerd:
ctr -n k8s.io i pull -k docker.io/flannel/flannel-cni-plugin:v1.2.0
ctr -n k8s.io i pull -k docker.io/flannel/flannel:v0.22.3
Option 3: pull and push to the local private registry
Batch retagging
If running on the Harbor host, you can export the list to images.txt first and then run the script.
# cat docker_flannel.sh
#!/bin/bash
imagesfile=images.txt
$(grep image kube-flannel.yml | grep -v '#' | awk -F '/' '{print $NF}' > ${imagesfile})
images=$(cat ${imagesfile})
for i in ${images}
do
docker pull flannel/$i
docker tag flannel/$i repo.k8s.local:5100/google_containers/$i
docker push repo.k8s.local:5100/google_containers/$i
docker rmi flannel/$i
done
# Run the script:
sh docker_flannel.sh
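If the images were pushed to the private registry (option 3), point kube-flannel.yml at it before applying; a hedged example assuming the google_containers project used above:
sed -i 's#docker.io/flannel#repo.k8s.local:5100/google_containers#g' kube-flannel.yml
grep 'image:' kube-flannel.yml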
Check the images
crictl images
IMAGE TAG IMAGE ID SIZE
docker.io/flannel/flannel-cni-plugin v1.2.0 a55d1bad692b7 3.88MB
docker.io/flannel/flannel v0.22.3 e23f7ca36333c 27MB
Apply the manifest
kubectl apply -f kube-flannel.yml
namespace/kube-flannel created
clusterrole.rbac.authorization.k8s.io/flannel created
clusterrolebinding.rbac.authorization.k8s.io/flannel created
serviceaccount/flannel created
configmap/kube-flannel-cfg created
daemonset.apps/kube-flannel-ds created
Check the pods
kubectl get pods -n kube-flannel
NAME READY STATUS RESTARTS AGE
kube-flannel-ds-flgpn 1/1 Running 0 58s
kube-flannel-ds-qdw64 1/1 Running 0 58s
Verify the installation result. Run this on the master node only; with the binary tar.gz installation method it has to be done on every node.
kubectl get pods -n kube-system
NAME READY STATUS RESTARTS AGE
coredns-66f779496c-9x9cf 1/1 Running 0 118m
coredns-66f779496c-xwgtx 1/1 Running 0 118m
etcd-master01.k8s.local 1/1 Running 0 118m
kube-apiserver-master01.k8s.local 1/1 Running 0 118m
kube-controller-manager-master01.k8s.local 1/1 Running 0 118m
kube-proxy-58mfp 1/1 Running 0 114m
kube-proxy-z9tpc 1/1 Running 0 118m
kube-scheduler-master01.k8s.local 1/1 Running 0 118m
systemctl status containerd
systemctl status kubelet
systemctl restart kubelet
kubectl get nodes
run.go:74] "command failed" err="failed to run Kubelet: validate service connection: validate CRI v1 runtime API for endpoint \"unix:///var/run/containerd/containerd.sock\
Check the containerd service for errors
failed to load plugin io.containerd.grpc.v1.cri" error="invalid plugin config: `mirrors` cannot be set when `config_path` is provided"
In /etc/containerd/config.toml, configure either config_path or registry.mirrors, not both:
[plugins."io.containerd.grpc.v1.cri".registry]
#config_path = "/etc/containerd/certs.d"
[plugins."io.containerd.grpc.v1.cri".registry.mirrors."docker.io"]
endpoint = ["https://registry.cn-hangzhou.aliyuncs.com"]
transport: Error while dialing dial unix /run/containerd/containerd.sock: connect: no such file or directory
In /usr/lib/systemd/system/kubelet.service, change the [Unit] After= line to After=containerd.service (or After=docker.service if you use docker).
In my case, the value before the change was After=network-online.target.
err="failed to run Kubelet: running with swap on is not supported, please disable swap
Permanently disable swap:
swapoff -a && sed -ri 's/.*swap.*/#&/' /etc/fstab
"Failed to run kubelet" err="failed to run Kubelet: misconfiguration: kubelet cgroup driver: \"systemd\" is different from docker cgroup driver: \"cgroupfs\""
Analysis shows that Kubernetes defaults its cgroup driver to systemd while the docker service uses cgroupfs. There are two ways to fix this: change docker's configuration to match Kubernetes, or change the Kubernetes configuration to cgroupfs. The first approach is used here.
Edit the docker service configuration file /etc/docker/daemon.json and add:
"exec-opts": ["native.cgroupdriver=systemd"]
Run kubectl from the worker nodes
scp -r ~/.kube node02.k8s.local:~/
You can skip this for now; set it up after installing shell auto-completion:
vi ~/.bashrc
source <(kubectl completion bash)
command -v kubecolor >/dev/null 2>&1 && alias k="kubecolor"
complete -o default -F __start_kubectl k
export PATH="/root/.krew/bin:/usr/local/sbin:/usr/local/bin:/sbin:/bin:/usr/sbin:/usr/bin:/root/bin"
[ -f ~/.kube/aliases.sh ] && source ~/.kube/aliases.sh
source ~/.bashrc