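Installing Kubernetes with kubeadm
- 1: Installing Kubernetes on Debian 13
- 1.1: Online installation of Kubernetes
- 1.1.1: Preparation
- 1.1.2: Installing the command-line tools
- 1.1.3: Initializing the cluster
- 1.1.4: Installing the dashboard
- 1.1.5: Installing metrics server
- 1.1.6: Installing monitoring
- 1.2: Pre-warmed installation of Kubernetes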
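1 - Installing Kubernetes on Debian 13
There are three ways to install:
- Online installation: the most standard method. Its biggest problem is that it needs internet access plus a proxy to get past network blocking; it is slow, and an install can fail midway when a site is blocked or DNS is polluted.
- Pre-warmed installation: builds on the online installation by preparing the installation files and images in advance. It is fast and does not need an image registry, but it requires thorough preparation beforehand and works best together with a PVE template.
- Offline installation: all required files are downloaded in advance to the local machine or a local image registry. It is fast, but it also requires thorough preparation beforehand and needs an image registry such as Harbor.
1.1 - Online installation of Kubernetes
Reference (official documentation):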
https://kubernetes.io/docs/setup/production-environment/tools/kubeadm/install-kubeadm/
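1.1.1 - Preparation
System update
Bring the Debian system fully up to date, remove packages that are no longer needed, and clean out stale package files: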
sudo apt update && sudo apt full-upgrade -y
sudo apt autoremove
sudo apt autoclean
If the kernel was updated, it is best to reboot.
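Swap partition
By default the kubelet refuses to start while swap is enabled, so swap should be turned off on the machine before installing Kubernetes.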
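The swap check can be sketched as follows, assuming the usual /proc/swaps and /etc/fstab layout. The helper names list_swap and comment_out_swap are hypothetical and not part of any tool; swapoff -a is the standard util-linux command and is only shown in a comment because it changes the running system.

```shell
#!/bin/sh
# List active swap devices by reading /proc/swaps (the first line is a header).
# An alternative file can be passed as $1.
list_swap() {
  awk 'NR > 1 { print $1 }' "${1:-/proc/swaps}"
}

# Print an fstab with every uncommented swap entry commented out.
# Nothing is modified in place; the result goes to stdout.
comment_out_swap() {
  awk '$1 !~ /^#/ && $3 == "swap" { print "#" $0; next } { print }' "${1:-/etc/fstab}"
}

if [ -n "$(list_swap 2>/dev/null)" ]; then
  echo "swap is active:"
  list_swap
else
  echo "no active swap"
fi

# To turn swap off now and keep it off after a reboot (review the result first):
#   sudo swapoff -a
#   comment_out_swap /etc/fstab | sudo tee /etc/fstab.new >/dev/null
#   sudo mv /etc/fstab.new /etc/fstab
```

Writing to a temporary file first keeps /etc/fstab intact if the output turns out to be wrong.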
Enable kernel modules
cat <<EOF | sudo tee /etc/modules-load.d/k8s.conf
overlay
br_netfilter
EOF
sudo modprobe overlay
sudo modprobe br_netfilter
# sysctl params required by setup, params persist across reboots
cat <<EOF | sudo tee /etc/sysctl.d/k8s.conf
net.bridge.bridge-nf-call-iptables = 1
net.bridge.bridge-nf-call-ip6tables = 1
net.ipv4.ip_forward = 1
EOF
# Apply sysctl params without reboot
sudo sysctl --system
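Whether the settings took effect can be checked with a short script. This is a minimal sketch: read_sysctl and check_k8s_sysctls are hypothetical helper names, and the values are read directly from /proc/sys. The two bridge keys only exist once br_netfilter is loaded (lsmod | grep br_netfilter shows whether it is), so a "missing" result usually points at the module rather than at sysctl.

```shell
#!/bin/sh
# Read a sysctl value straight from /proc/sys (dots in the key become slashes).
# $2 optionally overrides the /proc/sys root, which is handy for testing.
read_sysctl() {
  cat "${2:-/proc/sys}/$(printf '%s' "$1" | tr . /)" 2>/dev/null
}

# Report whether each setting written to /etc/sysctl.d/k8s.conf is 1.
# Returns non-zero if any of them is missing or has another value.
check_k8s_sysctls() {
  rc=0
  for key in net.bridge.bridge-nf-call-iptables \
             net.bridge.bridge-nf-call-ip6tables \
             net.ipv4.ip_forward; do
    val=$(read_sysctl "$key" "$1")
    if [ "$val" = "1" ]; then
      echo "ok      $key = 1"
    else
      echo "NOT OK  $key = ${val:-missing}"
      rc=1
    fi
  done
  return $rc
}

check_k8s_sysctls || echo "some settings are not active; re-check the steps above"
```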
Container runtime
Kubernetes supports several container runtimes; for now this guide keeps using Docker Engine + cri-dockerd.
Reference:
https://kubernetes.io/docs/setup/production-environment/container-runtimes/
Installing Docker
For installing Docker, see:
https://skyao.net/learning-docker/docs/installation/debian13/
The most convenient way is to install from the offline packages.
Installing containerd
TODO: consider switching to containerd later.
Installing Helm
Reference:
https://helm.sh/docs/intro/install/#from-apt-debianubuntu
Install Helm with the official script (this avoids the apt repository, which may be unreachable):
curl https://raw.githubusercontent.com/helm/helm/main/scripts/get-helm-3 | bash
Output:
% Total % Received % Xferd Average Speed Time Time Time Current
Dload Upload Total Spent Left Speed
100 11929 100 11929 0 0 48738 0 --:--:-- --:--:-- --:--:-- 48889
Downloading https://get.helm.sh/helm-v3.20.2-linux-amd64.tar.gz
Verifying checksum... Done.
Preparing to install helm into /usr/local/bin
helm installed into /usr/local/bin/helm
Verify the version:
helm version
The version shown is v3.20.2:
version.BuildInfo{Version:"v3.20.2", GitCommit:"8fb76d6ab555577e98e23b7500009537a471feee", GitTreeState:"clean", GoVersion:"go1.25.9"}
1.1.2 - Installing the command-line tools
Reference: https://kubernetes.io/docs/setup/production-environment/tools/kubeadm/install-kubeadm/
Installing kubeadm / kubelet / kubectl
sudo apt-get update
sudo apt-get install -y apt-transport-https ca-certificates curl gpg
Assume the Kubernetes version to install is 1.35:
export K8S_VERSION=1.35
# sudo mkdir -p -m 755 /etc/apt/keyrings
curl -fsSL https://pkgs.k8s.io/core:/stable:/v${K8S_VERSION}/deb/Release.key | sudo gpg --dearmor -o /etc/apt/keyrings/kubernetes-apt-keyring.gpg
echo "deb [signed-by=/etc/apt/keyrings/kubernetes-apt-keyring.gpg] https://pkgs.k8s.io/core:/stable:/v${K8S_VERSION}/deb/ /" | sudo tee /etc/apt/sources.list.d/kubernetes.list
Install kubelet, kubeadm and kubectl:
sudo apt-get update
sudo apt-get install -y kubelet kubeadm kubectl
Hold these three packages so that they are not upgraded automatically:
sudo apt-mark hold kubelet kubeadm kubectl
Verify the installation:
kubectl version --client && echo && kubeadm version
Output:
Client Version: v1.35.4
Kustomize Version: v5.7.1
kubeadm version: &version.Info{Major:"1", Minor:"35", EmulationMajor:"", EmulationMinor:"", MinCompatibilityMajor:"", MinCompatibilityMinor:"", GitVersion:"v1.35.4", GitCommit:"7b8c6cf0edd376b3d7c2f255142977c7f93db258", GitTreeState:"clean", BuildDate:"2026-04-15T18:03:27Z", GoVersion:"go1.25.9", Compiler:"gc", Platform:"linux/amd64"}
Before running kubeadm, enable and start the kubelet service:
sudo systemctl enable --now kubelet
Post-installation configuration
zsh tweaks
vi ~/.zshrc
Add the following:
# k8s auto complete
alias k=kubectl
complete -F __start_kubectl k
Run:
source ~/.zshrc
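After that, the alias k can be used to run kubectl commands with auto-completion as well, which is very convenient. This assumes kubectl completion is already set up in the shell, for example with source <(kubectl completion zsh).
Disabling updates
There is no need to keep kubeadm / kubelet / kubectl at the latest version, so their apt source can be disabled.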
sudo vi /etc/apt/sources.list.d/kubernetes.list
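Comment out its contents.
Note: after apt-mark hold the packages are no longer upgraded, but the repository still slows down apt update, so it is still worth commenting it out by hand.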
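Commenting out the source can also be done without an editor. The sketch below assumes the one-line list file created earlier in this section; disable_apt_source is a hypothetical helper name. It only prints the result, so the file can be reviewed before it is replaced.

```shell
#!/bin/sh
# Comment out every active "deb" line of an apt source list file.
# Prints the result to stdout; the file itself is left untouched.
disable_apt_source() {
  sed -E 's/^[[:space:]]*deb /# &/' "$1"
}

# Preview the change if the list file exists on this machine.
list=/etc/apt/sources.list.d/kubernetes.list
if [ -r "$list" ]; then
  disable_apt_source "$list"
fi

# To apply it:
#   disable_apt_source "$list" | sudo tee "$list.new" >/dev/null
#   sudo mv "$list.new" "$list"
```

Lines that are already commented out do not match, so running it twice gives the same result.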
Common problems
prod-cdn.packages.k8s.io is unreachable
Occasionally prod-cdn.packages.k8s.io cannot be reached, and the error looks like this:
sudo apt-get update
Hit:1 http://mirrors.ustc.edu.cn/debian bookworm InRelease
Hit:2 http://mirrors.ustc.edu.cn/debian bookworm-updates InRelease
Hit:3 http://security.debian.org/debian-security bookworm-security InRelease
Ign:4 https://prod-cdn.packages.k8s.io/repositories/isv:/kubernetes:/core:/stable:/v1.32/deb InRelease
Ign:4 https://prod-cdn.packages.k8s.io/repositories/isv:/kubernetes:/core:/stable:/v1.32/deb InRelease
Ign:4 https://prod-cdn.packages.k8s.io/repositories/isv:/kubernetes:/core:/stable:/v1.32/deb InRelease
Err:4 https://prod-cdn.packages.k8s.io/repositories/isv:/kubernetes:/core:/stable:/v1.32/deb InRelease
Could not connect to prod-cdn.packages.k8s.io:443 (221.228.32.13), connection timed out
Reading package lists... Done
W: Failed to fetch https://pkgs.k8s.io/core:/stable:/v1.32/deb/InRelease Could not connect to prod-cdn.packages.k8s.io:443 (221.228.32.13), connection timed out
W: Some index files failed to download. They have been ignored, or old ones used instead.
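A general network problem was ruled out first, because the host stayed unreachable even with a working network proxy configured.
It then turned out that prod-cdn.packages.k8s.io resolves to different IP addresses on machines in different regions: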
$ ping prod-cdn.packages.k8s.io
Pinging dkhzw6k7x6ord.cloudfront.net [108.139.10.84] with 32 bytes of data:
Reply from 108.139.10.84: bytes=32 time=164ms TTL=242
Reply from 108.139.10.84: bytes=32 time=166ms TTL=242
......
# this is the address that cannot be reached
$ ping prod-cdn.packages.k8s.io
PING dkhzw6k7x6ord.cloudfront.net (221.228.32.13) 56(84) bytes of data.
64 bytes from 221.228.32.13 (221.228.32.13): icmp_seq=1 ttl=57 time=9.90 ms
64 bytes from 221.228.32.13 (221.228.32.13): icmp_seq=2 ttl=57 time=11.4 ms
......
The DNS resolution problem can therefore be worked around by editing /etc/hosts:
sudo vi /etc/hosts
Add the following:
108.139.10.84 prod-cdn.packages.k8s.io
On the affected machine this forces prod-cdn.packages.k8s.io to resolve to 108.139.10.84, and the repository becomes reachable again.
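The hosts entry can be added idempotently with a small helper. This is a sketch only: add_host_entry is a hypothetical name, and the IP address is the one observed above. Since it belongs to a CDN it may change over time, so the entry is best treated as a temporary workaround.

```shell
#!/bin/sh
# Append "IP HOST" to a hosts file unless HOST already has an active entry.
# $1 = hosts file, $2 = IP address, $3 = host name
add_host_entry() {
  if awk -v h="$3" '$1 !~ /^#/ { for (i = 2; i <= NF; i++) if ($i == h) found = 1 }
                    END { exit !found }' "$1"; then
    echo "entry for $3 already present in $1"
  else
    printf '%s %s\n' "$2" "$3" >> "$1"
    echo "added: $2 $3"
  fi
}

# Run as root, because /etc/hosts is only writable by root:
#   add_host_entry /etc/hosts 108.139.10.84 prod-cdn.packages.k8s.io
```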
1.1.3 - Initializing the cluster
Reference (official documentation):
https://kubernetes.io/docs/setup/production-environment/tools/kubeadm/create-cluster-kubeadm/
Points to note
CoreDNS version issue
[sudo] password for sky:
[init] Using Kubernetes version: v1.34.2
[preflight] Running pre-flight checks
[preflight] Pulling images required for setting up a Kubernetes cluster
[preflight] This might take a minute or two, depending on the speed of your internet connection
[preflight] You can also perform this action beforehand using 'kubeadm config images pull'
W1127 11:14:12.643662 17241 checks.go:827] detected that the sandbox image "registry.k8s.io/pause:3.10" of the container runtime is inconsistent with that used by kubeadm. It is recommended to use "192.168.3.193:5000/k8s-proxy/pause:3.10.1" as the CRI sandbox image.
[preflight] Some fatal errors occurred:
[ERROR ImagePull]: failed to pull image 192.168.3.193:5000/k8s-proxy/coredns:v1.12.1: failed to pull image 192.168.3.193:5000/k8s-proxy/coredns:v1.12.1: Error response from daemon: failed to resolve reference "192.168.3.193:5000/k8s-proxy/coredns:v1.12.1": 192.168.3.193:5000/k8s-proxy/coredns:v1.12.1: not found
[preflight] If you know what you are doing, you can make a check non-fatal with `--ignore-preflight-errors=...`
error: error execution phase preflight: preflight checks failed
To see the stack trace of this error execute with --v=5 or higher
The CoreDNS image fails to pull here. kubeadm config images pull shows which images are needed:
$ kubeadm config images pull
[config/images] Pulled registry.k8s.io/kube-apiserver:v1.35.4
[config/images] Pulled registry.k8s.io/kube-controller-manager:v1.35.4
[config/images] Pulled registry.k8s.io/kube-scheduler:v1.35.4
[config/images] Pulled registry.k8s.io/kube-proxy:v1.35.4
[config/images] Pulled registry.k8s.io/coredns/coredns:v1.13.1
[config/images] Pulled registry.k8s.io/pause:3.10.1
[config/images] Pulled registry.k8s.io/etcd:3.6.6-0
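Most core Kubernetes component images (kube-apiserver, kube-scheduler, etcd, pause and so on) use a single-level path, whereas CoreDNS uses a two-level path, coredns/coredns. This causes an error when a proxy registry is used: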
$ kubeadm config images pull --image-repository=192.168.3.193:5000/k8s-proxy
[config/images] Pulled 192.168.3.193:5000/k8s-proxy/kube-apiserver:v1.34.2
[config/images] Pulled 192.168.3.193:5000/k8s-proxy/kube-controller-manager:v1.34.2
[config/images] Pulled 192.168.3.193:5000/k8s-proxy/kube-scheduler:v1.34.2
[config/images] Pulled 192.168.3.193:5000/k8s-proxy/kube-proxy:v1.34.2
error: failed to pull image "192.168.3.193:5000/k8s-proxy/coredns:v1.12.1": failed to pull image 192.168.3.193:5000/k8s-proxy/coredns:v1.12.1: Error response from daemon: failed to resolve reference "192.168.3.193:5000/k8s-proxy/coredns:v1.12.1": 192.168.3.193:5000/k8s-proxy/coredns:v1.12.1: not found
To see the stack trace of this error execute with --v=5 or higher
To fix this, kubeadm has to be told which proxy repository to use for CoreDNS.
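Judging from the outputs above and from the image list in section 1.2, the CoreDNS reference appears to be built as repository/coredns:tag, with the extra coredns/ level added only for the default registry.k8s.io. The sketch below mimics that rule to show why pointing the DNS repository at the proxy's coredns sub-path yields a reference the proxy can serve. coredns_image is a hypothetical helper written for illustration; it is not kubeadm code.

```shell
#!/bin/sh
# Hypothetical helper mirroring the naming rule observed in the outputs above.
# $1 = image repository, $2 = CoreDNS tag
coredns_image() {
  case "$1" in
    registry.k8s.io) echo "$1/coredns/coredns:$2" ;;  # default registry: two-level path
    *)               echo "$1/coredns:$2" ;;          # custom repository: one level appended
  esac
}

coredns_image registry.k8s.io v1.13.1                        # upstream reference
coredns_image 192.168.3.193:5000/k8s-proxy v1.13.1           # the reference that was "not found"
coredns_image 192.168.3.193:5000/k8s-proxy/coredns v1.13.1   # reference with a dedicated DNS repository
```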
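Initializing the cluster
Preparing kubeadm.yaml
For pod-network-cidr, prefer the 10.244.0.0/16 range; otherwise some network plugins need extra configuration.
For cri-socket: because Docker Engine and cri-dockerd were installed earlier, cri-socket must be set to "unix:///var/run/cri-dockerd.sock".
apiserver-advertise-address should be the IP address of the current node. Since this is a single-node cluster, use the machine's own IP address, for example 192.168.3.100.
Create a kubeadm.yaml file: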
mkdir -p ~/work/soft/k8s/
cd ~/work/soft/k8s/
vi kubeadm.yaml
Contents:
apiVersion: kubeadm.k8s.io/v1beta4
kind: InitConfiguration
nodeRegistration:
  criSocket: unix:///var/run/cri-dockerd.sock
---
apiVersion: kubeadm.k8s.io/v1beta4
kind: ClusterConfiguration
kubernetesVersion: v1.35.4
imageRepository: 192.168.3.193:5000/k8s-proxy
networking:
  podSubnet: 10.244.0.0/16
dns:
  imageRepository: 192.168.3.193:5000/k8s-proxy/coredns
  imageTag: v1.13.1
etcd:
  local:
    imageRepository: 192.168.3.193:5000/k8s-proxy
    imageTag: 3.6.6-0
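The file above does not set the advertise address, so kubeadm chooses one itself. In a config file the counterpart of apiserver-advertise-address is localAPIEndpoint.advertiseAddress in the InitConfiguration document. The sketch below prints such a document; the function name init_configuration is hypothetical, 192.168.3.100 is only the example address mentioned above, and 6443 is the default API server port.

```shell
#!/bin/sh
# Example address taken from the text above; replace it with the node's real IP.
ADVERTISE_ADDRESS=${ADVERTISE_ADDRESS:-192.168.3.100}

# Print an InitConfiguration document that pins the API server address.
# $1 = address the API server should advertise
init_configuration() {
  cat <<EOF
apiVersion: kubeadm.k8s.io/v1beta4
kind: InitConfiguration
localAPIEndpoint:
  advertiseAddress: $1
  bindPort: 6443
nodeRegistration:
  criSocket: unix:///var/run/cri-dockerd.sock
EOF
}

init_configuration "$ADVERTISE_ADDRESS"
```

The output would replace the first document of kubeadm.yaml (the part before the --- separator) if a fixed address is wanted.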
Running init
Run:
sudo kubeadm init --config=kubeadm.yaml
Output:
[init] Using Kubernetes version: v1.35.4
[preflight] Running pre-flight checks
[preflight] Pulling images required for setting up a Kubernetes cluster
[preflight] This might take a minute or two, depending on the speed of your internet connection
[preflight] You can also perform this action beforehand using 'kubeadm config images pull'
[certs] Using certificateDir folder "/etc/kubernetes/pki"
[certs] Generating "ca" certificate and key
[certs] Generating "apiserver" certificate and key
[certs] apiserver serving cert is signed for DNS names [debian13 kubernetes kubernetes.default kubernetes.default.svc kubernetes.default.svc.cluster.local] and IPs [10.96.0.1 192.168.3.232]
[certs] Generating "apiserver-kubelet-client" certificate and key
[certs] Generating "front-proxy-ca" certificate and key
[certs] Generating "front-proxy-client" certificate and key
[certs] Generating "etcd/ca" certificate and key
[certs] Generating "etcd/server" certificate and key
[certs] etcd/server serving cert is signed for DNS names [debian13 localhost] and IPs [192.168.3.232 127.0.0.1 ::1]
[certs] Generating "etcd/peer" certificate and key
[certs] etcd/peer serving cert is signed for DNS names [debian13 localhost] and IPs [192.168.3.232 127.0.0.1 ::1]
[certs] Generating "etcd/healthcheck-client" certificate and key
[certs] Generating "apiserver-etcd-client" certificate and key
[certs] Generating "sa" key and public key
[kubeconfig] Using kubeconfig folder "/etc/kubernetes"
[kubeconfig] Writing "admin.conf" kubeconfig file
[kubeconfig] Writing "super-admin.conf" kubeconfig file
[kubeconfig] Writing "kubelet.conf" kubeconfig file
[kubeconfig] Writing "controller-manager.conf" kubeconfig file
[kubeconfig] Writing "scheduler.conf" kubeconfig file
[etcd] Creating static Pod manifest for local etcd in "/etc/kubernetes/manifests"
[control-plane] Using manifest folder "/etc/kubernetes/manifests"
[control-plane] Creating static Pod manifest for "kube-apiserver"
[control-plane] Creating static Pod manifest for "kube-controller-manager"
[control-plane] Creating static Pod manifest for "kube-scheduler"
[kubelet-start] Writing kubelet environment file with flags to file "/var/lib/kubelet/kubeadm-flags.env"
[kubelet-start] Writing kubelet configuration to file "/var/lib/kubelet/instance-config.yaml"
[patches] Applied patch of type "application/strategic-merge-patch+json" to target "kubeletconfiguration"
[kubelet-start] Writing kubelet configuration to file "/var/lib/kubelet/config.yaml"
[kubelet-start] Starting the kubelet
[wait-control-plane] Waiting for the kubelet to boot up the control plane as static Pods from directory "/etc/kubernetes/manifests"
[kubelet-check] Waiting for a healthy kubelet at http://127.0.0.1:10248/healthz. This can take up to 4m0s
[kubelet-check] The kubelet is healthy after 500.416007ms
[control-plane-check] Waiting for healthy control plane components. This can take up to 4m0s
[control-plane-check] Checking kube-apiserver at https://192.168.3.232:6443/livez
[control-plane-check] Checking kube-controller-manager at https://127.0.0.1:10257/healthz
[control-plane-check] Checking kube-scheduler at https://127.0.0.1:10259/livez
[control-plane-check] kube-controller-manager is healthy after 1.001317452s
[control-plane-check] kube-scheduler is healthy after 1.502089525s
[control-plane-check] kube-apiserver is healthy after 3.000477876s
[upload-config] Storing the configuration used in ConfigMap "kubeadm-config" in the "kube-system" Namespace
[kubelet] Creating a ConfigMap "kubelet-config" in namespace kube-system with the configuration for the kubelets in the cluster
[upload-certs] Skipping phase. Please see --upload-certs
[mark-control-plane] Marking the node debian13 as control-plane by adding the labels: [node-role.kubernetes.io/control-plane node.kubernetes.io/exclude-from-external-load-balancers]
[mark-control-plane] Marking the node debian13 as control-plane by adding the taints [node-role.kubernetes.io/control-plane:NoSchedule]
[bootstrap-token] Using token: ki33kg.5vpalsrlaa9bjvn2
[bootstrap-token] Configuring bootstrap tokens, cluster-info ConfigMap, RBAC Roles
[bootstrap-token] Configured RBAC rules to allow Node Bootstrap tokens to get nodes
[bootstrap-token] Configured RBAC rules to allow Node Bootstrap tokens to post CSRs in order for nodes to get long term certificate credentials
[bootstrap-token] Configured RBAC rules to allow the csrapprover controller automatically approve CSRs from a Node Bootstrap Token
[bootstrap-token] Configured RBAC rules to allow certificate rotation for all node client certificates in the cluster
[bootstrap-token] Creating the "cluster-info" ConfigMap in the "kube-public" namespace
[kubelet-finalize] Updating "/etc/kubernetes/kubelet.conf" to point to a rotatable kubelet client certificate and key
[addons] Applied essential addon: CoreDNS
[addons] Applied essential addon: kube-proxy
Your Kubernetes control-plane has initialized successfully!
To start using your cluster, you need to run the following as a regular user:
mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config
Alternatively, if you are the root user, you can run:
export KUBECONFIG=/etc/kubernetes/admin.conf
You should now deploy a pod network to the cluster.
Run "kubectl apply -f [podnetwork].yaml" with one of the options listed at:
https://kubernetes.io/docs/concepts/cluster-administration/addons/
Then you can join any number of worker nodes by running the following on each as root:
kubeadm join 192.168.3.232:6443 --token ki33kg.5vpalsrlaa9bjvn2 \
--discovery-token-ca-cert-hash sha256:dba4e956a1e678da2319e29ce1b880bc4b21771cd5635dfa3117b7c2f409af29
Follow the instructions in the output:
mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config
For a single-node test cluster, remove the master/control-plane taint:
kubectl taint nodes --all node-role.kubernetes.io/control-plane-
Run:
kubectl get node
At this point the node status is NotReady:
NAME STATUS ROLES AGE VERSION
debian13 NotReady control-plane 53s v1.35.4
Run:
kubectl describe node debian13
The node's error information is shown:
Conditions:
Type Status LastHeartbeatTime LastTransitionTime Reason Message
---- ------ ----------------- ------------------ ------ -------
MemoryPressure False Wed, 22 Apr 2026 15:24:55 +0800 Wed, 22 Apr 2026 15:24:53 +0800 KubeletHasSufficientMemory kubelet has sufficient memory available
DiskPressure False Wed, 22 Apr 2026 15:24:55 +0800 Wed, 22 Apr 2026 15:24:53 +0800 KubeletHasNoDiskPressure kubelet has no disk pressure
PIDPressure False Wed, 22 Apr 2026 15:24:55 +0800 Wed, 22 Apr 2026 15:24:53 +0800 KubeletHasSufficientPID kubelet has sufficient PID available
Ready False Wed, 22 Apr 2026 15:24:55 +0800 Wed, 22 Apr 2026 15:24:53 +0800 KubeletNotReady container runtime network not ready: NetworkReady=false reason:NetworkPluginNotReady message:docker: network plugin is not ready: cni config uninitialized
A network plugin still has to be installed.
Installing a network plugin
Installing flannel
Reference (official documentation): https://github.com/flannel-io/flannel#deploying-flannel-with-kubectl
kubectl apply -f https://github.com/flannel-io/flannel/releases/latest/download/kube-flannel.yml
If all goes well, every pod in the cluster starts up and reaches the Running state:
$ k get pods -A
NAMESPACE NAME READY STATUS RESTARTS AGE
kube-flannel kube-flannel-ds-mdzgj 1/1 Running 0 39s
kube-system coredns-7f58b9688b-9rdzj 1/1 Running 0 2m44s
kube-system coredns-7f58b9688b-lv4jr 1/1 Running 0 2m44s
kube-system etcd-debian13 1/1 Running 0 2m51s
kube-system kube-apiserver-debian13 1/1 Running 1 (5m8s ago) 2m52s
kube-system kube-controller-manager-debian13 1/1 Running 0 2m51s
kube-system kube-proxy-qzhkw 1/1 Running 0 2m44s
kube-system kube-scheduler-debian13 1/1 Running 0 2m51s
If the kube-flannel-ds pod stays in CrashLoopBackOff:
k get pods -A
NAMESPACE NAME READY STATUS RESTARTS AGE
kube-flannel kube-flannel-ds-ts6n8 0/1 CrashLoopBackOff 2 (22s ago) 42s
Look at the pod's detailed error information:
k describe pods -n kube-flannel kube-flannel-ds-ts6n8
The error reported is "Back-off restarting failed container kube-flannel in pod kube-flannel":
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal Scheduled 117s default-scheduler Successfully assigned kube-flannel/kube-flannel-ds-ts6n8 to debian12
Normal Pulled 116s kubelet Container image "ghcr.io/flannel-io/flannel-cni-plugin:v1.6.2-flannel1" already present on machine
Normal Created 116s kubelet Created container: install-cni-plugin
Normal Started 116s kubelet Started container install-cni-plugin
Normal Pulled 115s kubelet Container image "ghcr.io/flannel-io/flannel:v0.26.4" already present on machine
Normal Created 115s kubelet Created container: install-cni
Normal Started 115s kubelet Started container install-cni
Normal Pulled 28s (x5 over 114s) kubelet Container image "ghcr.io/flannel-io/flannel:v0.26.4" already present on machine
Normal Created 28s (x5 over 114s) kubelet Created container: kube-flannel
Normal Started 28s (x5 over 114s) kubelet Started container kube-flannel
Warning BackOff 2s (x10 over 110s) kubelet Back-off restarting failed container kube-flannel in pod kube-flannel-ds-ts6n8_kube-flannel(1e03c200-2062-4838
In that case, check whether anything from the "Enable kernel modules" part of the preparation was missed.
Once that is fixed, the kube-flannel-ds pod runs normally:
k get pods -A
NAMESPACE NAME READY STATUS RESTARTS AGE
kube-flannel kube-flannel-ds-ts6n8 1/1 Running 7 (9m27s ago) 15m
1.1.4 - Installing the dashboard
This project is now archived and no longer maintained due to lack of active maintainers and contributors.
Installing the dashboard
Reference:
https://github.com/kubernetes/dashboard/#installation
Check the current dashboard version at:
https://github.com/kubernetes/dashboard/releases
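Choose the dashboard version according to its compatibility with the Kubernetes version:
- kubernetes-dashboard-7.14.0: no compatibility notes were found, but it presumably supports the latest k8s 1.34
Recent versions have to be installed with Helm: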
helm repo add kubernetes-dashboard https://kubernetes.github.io/dashboard/
helm upgrade --install kubernetes-dashboard kubernetes-dashboard/kubernetes-dashboard --create-namespace --namespace kubernetes-dashboard
Output:
"kubernetes-dashboard" has been added to your repositories
Release "kubernetes-dashboard" does not exist. Installing it now.
NAME: kubernetes-dashboard
LAST DEPLOYED: Thu Nov 27 14:09:11 2025
NAMESPACE: kubernetes-dashboard
STATUS: deployed
REVISION: 1
TEST SUITE: None
NOTES:
*************************************************************************************************
*** PLEASE BE PATIENT: Kubernetes Dashboard may need a few minutes to get up and become ready ***
*************************************************************************************************
Congratulations! You have just installed Kubernetes Dashboard in your cluster.
To access Dashboard run:
kubectl -n kubernetes-dashboard port-forward svc/kubernetes-dashboard-kong-proxy 8443:443
NOTE: In case port-forward command does not work, make sure that kong service name is correct.
Check the services in Kubernetes Dashboard namespace using:
kubectl -n kubernetes-dashboard get svc
Dashboard will be available at:
https://localhost:8443
The dashboard services and pods at this point:
kubectl -n kubernetes-dashboard get services
Output:
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
kubernetes-dashboard-api ClusterIP 10.102.87.32 <none> 8000/TCP 108s
kubernetes-dashboard-auth ClusterIP 10.108.108.108 <none> 8000/TCP 108s
kubernetes-dashboard-kong-proxy ClusterIP 10.111.46.43 <none> 443/TCP 108s
kubernetes-dashboard-metrics-scraper ClusterIP 10.98.190.255 <none> 8000/TCP 107s
kubernetes-dashboard-web ClusterIP 10.103.159.121 <none> 8000/TCP 108s
Check the pods:
kubectl -n kubernetes-dashboard get pods
After two or three minutes the pods are up, and the output is:
NAME READY STATUS RESTARTS AGE
kubernetes-dashboard-api-7994c5cb69-bhdnj 1/1 Running 0 16s
kubernetes-dashboard-auth-764494db59-fqt88 1/1 Running 0 16s
kubernetes-dashboard-kong-9849c64bd-jvwv2 1/1 Running 0 16s
kubernetes-dashboard-metrics-scraper-7685fd8b77-gcq8c 1/1 Running 0 16s
kubernetes-dashboard-web-5c9f966b98-nbjxv 1/1 Running 0 16s
For convenience, expose the dashboard through a NodePort. Run:
kubectl -n kubernetes-dashboard edit service kubernetes-dashboard-kong-proxy
Change type: ClusterIP to type: NodePort, then check which node port was assigned:
kubectl -n kubernetes-dashboard get service kubernetes-dashboard-kong-proxy
Output:
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
kubernetes-dashboard-kong-proxy NodePort 10.107.131.211 <none> 443:30652/TCP 95s
The dashboard can now be opened directly in a browser:
https://192.168.3.100:30652/

Creating a user and logging in to the dashboard
Create the admin-user user:
mkdir -p ~/work/soft/k8s
cd ~/work/soft/k8s
vi dashboard-adminuser.yaml
Contents:
apiVersion: v1
kind: ServiceAccount
metadata:
  name: admin-user
  namespace: kubernetes-dashboard
Run:
k create -f dashboard-adminuser.yaml
Then bind the role:
vi dashboard-adminuser-binding.yaml
Contents:
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: admin-user
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: cluster-admin
subjects:
- kind: ServiceAccount
  name: admin-user
  namespace: kubernetes-dashboard
Run:
k create -f dashboard-adminuser-binding.yaml
Then create a token:
kubectl -n kubernetes-dashboard create token admin-user
Output:
eyJhbGciOiJSUzI1NiIsImtpZCI6Ik9sWnJsTk5UNE9JVlVmRFMxMUpwNC1tUlVndTl5Zi1WQWtmMjIzd2hDNmcifQ.eyJhdWQiOlsiaHR0cHM6Ly9rdWJlcm5ldGVzLmRlZmF1bHQuc3ZjLmNsdXN0ZXIubG9jYWwiXSwiZXhwIjoxNzQxMTEyNDg4LCJpYXQiOjE3NDExMDg4ODgsImlzcyI6Imh0dHBzOi8va3ViZXJuZXRlcy5kZWZhdWx0LnN2Yy5jbHVzdGVyLmxvY2FsIiwianRpIjoiNDU5ZGQxNjctNWI5OS00MWIzLTgzZWEtNGIxMGY3MTc5ZjEyIiwia3ViZXJuZXRlcy5pbyI6eyJuYW1lc3BhY2UiOiJrdWJlcm5ldGVzLWRhc2hib2FyZCIsInNlcnZpY2VhY2NvdW50Ijp7Im5hbWUiOiJhZG1pbi11c2VyIiwidWlkIjoiZjMxN2VhZTItNTNiNi00MGZhLWI3MWYtMzZiNDI1YmY4YWQ0In19LCJuYmYiOjE3NDExMDg4ODgsInN1YiI6InN5c3RlbTpzZXJ2aWNlYWNjb3VudDprdWJlcm5ldGVzLWRhc2hib2FyZDphZG1pbi11c2VyIn0.TYzOdrMFXcSEeVMbc1ewIA13JVi4FUYoRN7rSH5OstbVfKIF48X_o1RWxOGM_AurhgLxuKZHzmns3K_pX_OR3u1URfK6-gGos4iAQY-H1yntfRmzzsip_FbZh95EYFGTN43gw21jTyfem3OKBXXLgzsnVT_29uMnJzSnCDnrAciVKMoCEUP6x2RSHQhp6PrxrIrx_NMB3vojEZYq3AysQoNqYYjRDd4MnDRClm03dNvW5lvKSgNCVmZFje_EEa2EhI2X6d3X8zx6tHwT5M4-T3hMmyIpzHUwf3ixeZR85rhorMbskNVvRpH6VLH6BXP31c3NMeSgYk3BG8d7UjCYxQ
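This token can be used on the kubernetes-dashboard login page.
For convenience, create a Secret of type kubernetes.io/service-account-token, which holds a long-lived token for admin-user: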
vi dashboard-adminuser-secret.yaml
Contents:
apiVersion: v1
kind: Secret
metadata:
  name: admin-user
  namespace: kubernetes-dashboard
  annotations:
    kubernetes.io/service-account.name: "admin-user"
type: kubernetes.io/service-account-token
Run:
k create -f dashboard-adminuser-secret.yaml
After that the token can be retrieved at any time with:
kubectl get secret admin-user -n kubernetes-dashboard -o jsonpath="{.data.token}" | base64 -d
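Note: when copying the token, do not include the trailing % character, otherwise the login fails. zsh prints that % to mark output that does not end with a newline; it is not part of the token.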
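The stray % can be avoided altogether by ending the decoded output with a newline. A minimal sketch, assuming the GNU coreutils base64 found on Debian; decode_b64_line is a hypothetical helper name.

```shell
#!/bin/sh
# Decode base64 from stdin and finish the output with a newline,
# so the shell has no reason to print its "missing newline" marker.
decode_b64_line() {
  base64 -d
  echo
}

# Demo with a harmless value ("YWJj" is base64 for "abc"):
printf 'YWJj' | decode_b64_line
```

With the function defined in the shell, the command above becomes kubectl get secret admin-user -n kubernetes-dashboard -o jsonpath="{.data.token}" | decode_b64_line.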

1.1.5 - Installing metrics server
Reference: https://github.com/kubernetes-sigs/metrics-server/#installation
Installing metrics server
Download:
mkdir -p ~/work/soft/k8s
cd ~/work/soft/k8s
wget https://github.com/kubernetes-sigs/metrics-server/releases/latest/download/components.yaml
Edit the downloaded components.yaml: add --kubelet-insecure-tls and change --kubelet-preferred-address-types:
  template:
    metadata:
      labels:
        k8s-app: metrics-server
    spec:
      containers:
      - args:
        - --cert-dir=/tmp
        - --secure-port=10250
        - --kubelet-preferred-address-types=InternalIP # change this line; the default is InternalIP,ExternalIP,Hostname
        - --kubelet-use-node-status-port
        - --metric-resolution=15s
        - --kubelet-insecure-tls # add this line
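The two changes can also be scripted. The sketch below assumes the argument lines look like the excerpt above, in particular that --metric-resolution is present and is the line after which the new flag should go. patch_metrics_server is a hypothetical helper name, and the result is written to a separate file so the original stays available for comparison.

```shell
#!/bin/sh
# Print a patched copy of components.yaml to stdout:
#  - set --kubelet-preferred-address-types to InternalIP
#  - add --kubelet-insecure-tls right after the --metric-resolution argument
# Not idempotent: running it on an already patched file adds the flag twice.
patch_metrics_server() {
  sed -E \
    -e 's|(--kubelet-preferred-address-types=).*|\1InternalIP|' \
    -e '/--metric-resolution=/{p;s|--metric-resolution=.*|--kubelet-insecure-tls|;}' \
    "$1"
}

if [ -r components.yaml ]; then
  patch_metrics_server components.yaml > components.patched.yaml
  # diff exits with 1 when the files differ, which is the expected case here
  diff components.yaml components.patched.yaml || true
fi
```

The second sed expression prints the matching line unchanged and then rewrites a copy of it, which keeps the YAML indentation of the new argument identical to its neighbours.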
Then install:
k apply -f components.yaml
Wait a moment and check whether it has started:
$ kubectl get pod -n kube-system | grep metrics-server
metrics-server-7c9977449d-h4psq 1/1 Running 0 34s
To verify, look at the service:
$ kubectl describe svc metrics-server -n kube-system
Name: metrics-server
Namespace: kube-system
Labels: k8s-app=metrics-server
Annotations: <none>
Selector: k8s-app=metrics-server
Type: ClusterIP
IP Family Policy: SingleStack
IP Families: IPv4
IP: 10.97.226.236
IPs: 10.97.226.236
Port: https 443/TCP
TargetPort: https/TCP
Endpoints: 10.244.0.9:10250
Session Affinity: None
Internal Traffic Policy: Cluster
Events: <none>
A quick check of basic usage:
kubectl top nodes
kubectl top pods -n kube-system
Normally the output looks like this:
$ kubectl top nodes
NAME CPU(cores) CPU(%) MEMORY(bytes) MEMORY(%)
debian13 161m 4% 1040Mi 13%
$ kubectl top pods -n kube-system
NAME CPU(cores) MEMORY(bytes)
coredns-848fbff4f8-2lx6w 1m 15Mi
coredns-848fbff4f8-lgr6d 1m 16Mi
etcd-debian13 7m 47Mi
kube-apiserver-debian13 13m 241Mi
kube-controller-manager-debian13 6m 53Mi
kube-proxy-xc4mn 1m 17Mi
kube-scheduler-debian13 3m 23Mi
metrics-server-7c9977449d-h4psq 1m 18Mi
If the following error appears:
error: Metrics API not available
wait a moment until metrics-server has started, then try again.
1.1.6 - Installing monitoring
Reference: https://github.com/prometheus-operator/prometheus-operator
https://computingforgeeks.com/setup-prometheus-and-grafana-on-kubernetes/
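1.2 - Pre-warmed installation of Kubernetes
How it works
A pre-warmed installation builds on the online installation: before kubeadm init is run, all installation files and images are prepared in advance and the machine is turned into a PVE template.
The template can then be reused: create a virtual machine from it when needed and run kubeadm init inside the VM to install Kubernetes quickly.
In principle, every preparation step before kubeadm init can follow the online installation. The installation work after kubeadm init can only be sped up by preparing the installation files and pulling the images in advance. To maximize speed, all images are pulled through a Harbor proxy.
Preparation
- Install Docker: see https://skyao.net/learning-docker/docs/installation/debian13/ ; either the online or the offline installation works.
- Install kubeadm: follow the online installation above, or use the offline installation described later, so that cri-dockerd / helm and kubeadm / kubelet / kubectl are installed.
Pre-downloading images
k8s cluster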
kubeadm config images pull --cri-socket unix:///var/run/cri-dockerd.sock --config=kubeadm.yaml
This pre-downloads the images that kubeadm init needs:
[config/images] Pulled 192.168.3.193:5000/k8s-proxy/kube-apiserver:v1.35.4
[config/images] Pulled 192.168.3.193:5000/k8s-proxy/kube-controller-manager:v1.35.4
[config/images] Pulled 192.168.3.193:5000/k8s-proxy/kube-scheduler:v1.35.4
[config/images] Pulled 192.168.3.193:5000/k8s-proxy/kube-proxy:v1.35.4
[config/images] Pulled 192.168.3.193:5000/k8s-proxy/coredns/coredns:v1.13.1
[config/images] Pulled 192.168.3.193:5000/k8s-proxy/pause:3.10.1
[config/images] Pulled 192.168.3.193:5000/k8s-proxy/etcd:3.6.6-0
Keep a kubeadm.yaml file at hand; its contents are the same as in the online installation:
apiVersion: kubeadm.k8s.io/v1beta4
kind: InitConfiguration
nodeRegistration:
  criSocket: unix:///var/run/cri-dockerd.sock
---
apiVersion: kubeadm.k8s.io/v1beta4
kind: ClusterConfiguration
kubernetesVersion: v1.35.4
imageRepository: 192.168.3.193:5000/k8s-proxy
networking:
  podSubnet: 10.244.0.0/16
dns:
  imageRepository: 192.168.3.193:5000/k8s-proxy/coredns
  imageTag: v1.13.1
etcd:
  local:
    imageRepository: 192.168.3.193:5000/k8s-proxy
    imageTag: 3.6.6-0
flannel
Download the original kube-flannel.yml file:
wget https://github.com/flannel-io/flannel/releases/latest/download/kube-flannel.yml
Edit kube-flannel.yml, changing the image addresses from
image: ghcr.io/flannel-io/flannel:v0.28.4
image: ghcr.io/flannel-io/flannel-cni-plugin:v1.9.1-flannel1
to:
image: 192.168.3.193:5000/ghcr.io/flannel-io/flannel:v0.28.4
image: 192.168.3.193:5000/ghcr.io/flannel-io/flannel-cni-plugin:v1.9.1-flannel1
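The image rewrite can be done with sed instead of by hand. This is a sketch under the assumption that every flannel image in the manifest starts with ghcr.io/ directly after image:, as in the lines above; use_proxy_registry is a hypothetical helper name, and the registry argument must not contain the characters | or &.

```shell
#!/bin/sh
# Print a copy of a manifest with every ghcr.io image routed through a proxy registry.
# $1 = manifest file, $2 = proxy registry (host:port)
use_proxy_registry() {
  sed -E "/image:[[:space:]]*ghcr\.io\//s|ghcr\.io/|$2/ghcr.io/|" "$1"
}

# Preview the rewritten image lines if the manifest is in the current directory.
if [ -r kube-flannel.yml ]; then
  use_proxy_registry kube-flannel.yml 192.168.3.193:5000 | grep 'image:' || true
fi
```

Lines that already carry the proxy prefix are left alone, because ghcr.io no longer follows image: directly on them.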
Pull the images flannel needs:
docker pull 192.168.3.193:5000/ghcr.io/flannel-io/flannel-cni-plugin:v1.9.1-flannel1
docker pull 192.168.3.193:5000/ghcr.io/flannel-io/flannel:v0.28.4
metrics-server
Download the original components.yaml file and save it as metrics-server-components.yaml:
wget https://github.com/kubernetes-sigs/metrics-server/releases/latest/download/components.yaml
mv components.yaml metrics-server-components.yaml
Change it as follows:
    spec:
      containers:
      - args:
        - --cert-dir=/tmp
        - --secure-port=10250
        - --kubelet-preferred-address-types=InternalIP # changed
        - --kubelet-use-node-status-port
        - --metric-resolution=15s
        - --kubelet-insecure-tls # added
        image: 192.168.3.193:5000/registry.k8s.io/metrics-server/metrics-server:v0.8.1 # changed
Pull the image metrics-server needs:
docker pull 192.168.3.193:5000/registry.k8s.io/metrics-server/metrics-server:v0.8.1
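If extra images were pulled while building the template, the following command removes all local images first; the required ones can then be pulled again: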
docker rmi -f $(docker images -q)
Installation
Manual installation
Run kubeadm init:
cd ~/work/soft/k8s/
sudo kubeadm init --config=kubeadm.yaml
Configure kubeconfig:
mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config
Set up the flannel network:
kubectl apply -f ~/work/soft/k8s/kube-flannel.yml
Remove the taint:
kubectl taint nodes --all node-role.kubernetes.io/control-plane-
Install metrics-server:
kubectl apply -f ~/work/soft/k8s/metrics-server-components.yaml
kubectl wait --namespace kube-system \
--for=condition=Ready \
--selector=k8s-app=metrics-server \
--timeout=300s pod
echo "metrics-server installed, have a try:"
echo
echo "kubectl top nodes"
echo
kubectl top nodes
echo
echo "kubectl top pods -n kube-system"
echo
kubectl top pods -n kube-system
Automated installation by script
cd ~/work/soft/k8s/
vi install_k8s_prewarm.zsh
Contents:
#!/usr/bin/env zsh
# Automated Kubernetes installation script (Debian 13 + Helm + Metrics Server)
# Usage: sudo ./install_k8s_prewarm.zsh

# Absolute path of the directory that contains this script
K8S_INSTALL_PATH=$(cd "$(dirname "$0")"; pwd)
# kubeadm.yaml, kube-flannel.yml and metrics-server-components.yaml are read from this subdirectory
MANIFESTS_PATH="$K8S_INSTALL_PATH/menifests"
echo "🔍 Installation directory detected: $K8S_INSTALL_PATH"

# Make sure the script runs as root
if [[ $EUID -ne 0 ]]; then
  echo "❌ This script must be run as root"
  exit 1
fi

# Installation log
mkdir -p "$K8S_INSTALL_PATH/logs"
LOG_FILE="$K8S_INSTALL_PATH/logs/k8s_install_$(date +%Y%m%d_%H%M%S).log"
exec > >(tee -a "$LOG_FILE") 2>&1
echo "📅 Starting Kubernetes cluster installation - $(date)"
echo "📁 Resource directory: $K8S_INSTALL_PATH"

# Step 1: kubeadm init
echo "🚀 Initializing the Kubernetes control plane..."
kubeadm_init() {
  sudo kubeadm init --config="$MANIFESTS_PATH/kubeadm.yaml"
  if [[ $? -ne 0 ]]; then
    echo "❌ kubeadm init failed"
    exit 1
  fi
}
kubeadm_init
sleep 3

# Step 2: configure kubectl
echo "⚙️ Configuring kubectl for the root user..."
mkdir -p "$HOME/.kube"
cp /etc/kubernetes/admin.conf "$HOME/.kube/config"
chown "$(id -u):$(id -g)" "$HOME/.kube/config"

# Only done when the script was started through sudo; SUDO_USER is empty otherwise
if [[ -n "$SUDO_USER" ]]; then
  echo "⚙️ Configuring kubectl for user $SUDO_USER..."
  CURRENT_USER_HOME=$(getent passwd "$SUDO_USER" | cut -d: -f6)
  mkdir -p "$CURRENT_USER_HOME/.kube"
  cp /etc/kubernetes/admin.conf "$CURRENT_USER_HOME/.kube/config"
  # Hand the whole directory to the user so kubectl can write its cache there
  chown -R "$(id -u "$SUDO_USER"):$(id -g "$SUDO_USER")" "$CURRENT_USER_HOME/.kube"
fi

# Step 3: install the Flannel network plugin
echo "🌐 Installing the Flannel network..."
kubectl apply -f "$MANIFESTS_PATH/kube-flannel.yml" || {
  echo "❌ Flannel installation failed"
  exit 1
}
sleep 3

# Step 4: remove the control-plane taint
echo "✨ Removing the control-plane taint..."
kubectl taint nodes --all node-role.kubernetes.io/control-plane- || {
  echo "⚠️ Failed to remove the taint (this may not affect functionality)"
}

# Step 5: install Metrics Server
echo "📈 Installing Metrics Server..."
kubectl apply -f "$MANIFESTS_PATH/metrics-server-components.yaml" || {
  echo "❌ Metrics Server installation failed"
  exit 1
}

# Wait for Metrics Server to become ready
echo "⏳ Waiting for Metrics Server to become ready (up to 5 minutes)..."
kubectl rollout status deployment metrics-server \
  --namespace kube-system \
  --timeout=300s || {
  echo "❌ Metrics Server did not start in time"
  exit 1
}

# Verify the installation
echo "✅ Installation complete!"
sleep 5
echo ""
echo "🛠️ Verification commands:"
echo "kubectl top nodes"
kubectl top nodes
echo ""
echo "kubectl top pods -n kube-system"
kubectl top pods -n kube-system
echo ""
echo "Installation log: $LOG_FILE"
Make the script executable:
chmod +x install_k8s_prewarm.zsh
Run:
sudo ./install_k8s_prewarm.zsh