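Installing Kubernetes on Debian 13

Installing Kubernetes on Debian 13 with kubeadm

There are three ways to install:

  1. Online installation: the most standard method. Its main drawback is that it needs internet access, typically through a proxy, so it is slow, and downloads can be blocked or hit by DNS poisoning partway through.

  2. Pre-warmed installation: builds on the online installation by preparing the installation files and images in advance. It is fast and does not need an image registry. It requires thorough preparation and works best combined with a PVE template.

  3. Offline installation: all required files are downloaded beforehand to local disk or to a local image registry. It is fast, but it also requires thorough preparation and needs an image registry such as Harbor.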

1 - Online installation of Kubernetes

Installing Kubernetes online on Debian 13 with kubeadm

Official documentation:

https://kubernetes.io/docs/setup/production-environment/tools/kubeadm/install-kubeadm/

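1.1 - Preparation

Preparation before installing Kubernetes on Debian 13

System update

Bring the Debian system fully up to date, remove packages that are no longer needed, and clean up stale package files: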

sudo apt update && sudo apt full-upgrade -y
sudo apt autoremove
sudo apt autoclean

If the kernel was updated, reboot before continuing.

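Swap partition

By default the kubelet fails to start when swap is detected on the node, so swap should be turned off before installing Kubernetes (unless the kubelet is explicitly configured to tolerate swap).

Reference: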

https://kubernetes.io/docs/setup/production-environment/tools/kubeadm/install-kubeadm/#swap-configuration
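
The section above gives no commands, so here is a minimal sketch of one way to turn swap off. It assumes swap is configured through /etc/fstab; the helper name comment_out_swap is made up for this example, and it only prints the modified file so it can be reviewed before anything is overwritten.

```shell
# Sketch: print an fstab-style file with active swap entries commented out.
# comment_out_swap is a hypothetical helper name, not part of any tool.
comment_out_swap() {
  # Match non-comment lines that contain a whitespace-delimited "swap" field.
  sed -E '/^[[:space:]]*[^#[:space:]].*[[:space:]]swap[[:space:]]/ s/^/# /' "$1"
}

# Typical use on a real node (left commented out because it needs root):
#   sudo swapoff -a                       # turn swap off for the running system
#   comment_out_swap /etc/fstab > /tmp/fstab.noswap
#   sudo cp /etc/fstab /etc/fstab.bak && sudo cp /tmp/fstab.noswap /etc/fstab
#   swapon --show                         # prints nothing when no swap is active
```

If swap is provided some other way (for example a systemd swap unit or zram), the fstab edit alone is not enough; that case is outside this sketch.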

Enable kernel modules

cat <<EOF | sudo tee /etc/modules-load.d/k8s.conf
overlay
br_netfilter
EOF

sudo modprobe overlay
sudo modprobe br_netfilter

# sysctl params required by setup, params persist across reboots
cat <<EOF | sudo tee /etc/sysctl.d/k8s.conf
net.bridge.bridge-nf-call-iptables  = 1
net.bridge.bridge-nf-call-ip6tables = 1
net.ipv4.ip_forward                 = 1
EOF

# Apply sysctl params without reboot
sudo sysctl --system
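
To confirm that the settings took effect, something like the following sketch can be used. The helper name check_k8s_sysctl and the PROC_SYS override are assumptions made for this example; the helper simply reads the three values from /proc/sys.

```shell
# Sketch: report whether the three kernel settings configured above are active.
# check_k8s_sysctl is a hypothetical helper; PROC_SYS is an assumed override
# (default /proc/sys) so the check can be tried against a test directory.
check_k8s_sysctl() {
  local root="${PROC_SYS:-/proc/sys}" rc=0 key value
  for key in net/bridge/bridge-nf-call-iptables \
             net/bridge/bridge-nf-call-ip6tables \
             net/ipv4/ip_forward; do
    value=$(cat "$root/$key" 2>/dev/null)
    if [ "$value" = "1" ]; then
      echo "ok      $key = 1"
    else
      echo "MISSING $key (current: ${value:-unset})"
      rc=1
    fi
  done
  return $rc
}

# On a real node, the loaded modules can be checked as well:
#   lsmod | grep -E 'overlay|br_netfilter'
#   check_k8s_sysctl
```

The bridge entries only exist under /proc/sys once br_netfilter is loaded, so a MISSING line for them usually points at the module rather than at the sysctl file.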

Container runtime

Kubernetes supports several container runtimes. For now this guide keeps using Docker Engine with cri-dockerd.

Reference:

https://kubernetes.io/docs/setup/production-environment/container-runtimes/

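Install Docker

For installing Docker, see:

https://skyao.net/learning-docker/docs/installation/debian13/

The most convenient way is to install from the offline packages.

Install containerd

TODO: consider switching to containerd later.

Install Helm

Reference: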

https://helm.sh/docs/intro/install/#from-apt-debianubuntu

Install Helm with the official script (this avoids the apt repository, which may be unreachable):

curl https://raw.githubusercontent.com/helm/helm/main/scripts/get-helm-3 | bash

Output:

  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100 11929  100 11929    0     0  48738      0 --:--:-- --:--:-- --:--:-- 48889
Downloading https://get.helm.sh/helm-v3.20.2-linux-amd64.tar.gz
Verifying checksum... Done.
Preparing to install helm into /usr/local/bin
helm installed into /usr/local/bin/helm

Check the version:

helm version

The reported version is v3.20.2:

version.BuildInfo{Version:"v3.20.2", GitCommit:"8fb76d6ab555577e98e23b7500009537a471feee", GitTreeState:"clean", GoVersion:"go1.25.9"}

1.2 - Installing the command-line tools

Installing kubeadm / kubelet / kubectl on Debian 13

Reference: https://kubernetes.io/docs/setup/production-environment/tools/kubeadm/install-kubeadm/

Install kubeadm / kubelet / kubectl

sudo apt-get update
sudo apt-get install -y apt-transport-https ca-certificates curl gpg

Assuming the Kubernetes version to install is 1.35:

export K8S_VERSION=1.35

# sudo mkdir -p -m 755 /etc/apt/keyrings
curl -fsSL https://pkgs.k8s.io/core:/stable:/v${K8S_VERSION}/deb/Release.key | sudo gpg --dearmor -o /etc/apt/keyrings/kubernetes-apt-keyring.gpg

echo "deb [signed-by=/etc/apt/keyrings/kubernetes-apt-keyring.gpg] https://pkgs.k8s.io/core:/stable:/v${K8S_VERSION}/deb/ /" | sudo tee /etc/apt/sources.list.d/kubernetes.list

Install kubelet, kubeadm, and kubectl:

sudo apt-get update
sudo apt-get install -y kubelet kubeadm kubectl

Hold the three packages so they are not upgraded automatically:

sudo apt-mark hold kubelet kubeadm kubectl

Verify the installation:

kubectl version --client && echo && kubeadm version

Output:

Client Version: v1.35.4
Kustomize Version: v5.7.1

kubeadm version: &version.Info{Major:"1", Minor:"35", EmulationMajor:"", EmulationMinor:"", MinCompatibilityMajor:"", MinCompatibilityMinor:"", GitVersion:"v1.35.4", GitCommit:"7b8c6cf0edd376b3d7c2f255142977c7f93db258", GitTreeState:"clean", BuildDate:"2026-04-15T18:03:27Z", GoVersion:"go1.25.9", Compiler:"gc", Platform:"linux/amd64"}

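Before running kubeadm, enable and start the kubelet service (until kubeadm init has run, the kubelet keeps restarting, which is expected):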

sudo systemctl enable --now kubelet

Post-installation configuration

zsh completion

vi ~/.zshrc

Add the following:

# k8s auto complete (zsh; assumes compinit has already been loaded, e.g. by oh-my-zsh)
source <(kubectl completion zsh)
alias k=kubectl
compdef k=kubectl

Then run:

source ~/.zshrc

After that, kubectl commands run through the alias k also get tab completion, which is very convenient.

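Disable updates

There is no need to keep kubeadm / kubelet / kubectl at the latest version, so the apt repository for them can be disabled.

sudo vi /etc/apt/sources.list.d/kubernetes.list

Comment out its contents.

Note: after apt-mark hold the packages are no longer upgraded, but the repository still slows down apt update, so commenting it out by hand is still worthwhile.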
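
The same edit can be done without an editor. The following is a minimal sketch; disable_apt_source is a made-up helper name, and it prints the result instead of editing in place so the change can be reviewed first.

```shell
# Sketch: print a sources.list file with active "deb" / "deb-src" entries commented out.
# disable_apt_source is a hypothetical helper name.
disable_apt_source() {
  sed -E 's/^([[:space:]]*deb(-src)?[[:space:]])/# \1/' "$1"
}

# Real use (needs root), with the file created earlier in this section:
#   f=/etc/apt/sources.list.d/kubernetes.list
#   disable_apt_source "$f" > /tmp/kubernetes.list && sudo cp /tmp/kubernetes.list "$f"
#   sudo apt-get update
```

Lines that are already commented out are left untouched, so running it twice does no harm. To upgrade Kubernetes later, the entry has to be uncommented again.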

Common problems

prod-cdn.packages.k8s.io is unreachable

Occasionally prod-cdn.packages.k8s.io cannot be reached, and apt reports errors like these:

sudo apt-get update
Hit:1 http://mirrors.ustc.edu.cn/debian bookworm InRelease
Hit:2 http://mirrors.ustc.edu.cn/debian bookworm-updates InRelease
Hit:3 http://security.debian.org/debian-security bookworm-security InRelease
Ign:4 https://prod-cdn.packages.k8s.io/repositories/isv:/kubernetes:/core:/stable:/v1.32/deb  InRelease
Ign:4 https://prod-cdn.packages.k8s.io/repositories/isv:/kubernetes:/core:/stable:/v1.32/deb  InRelease
Ign:4 https://prod-cdn.packages.k8s.io/repositories/isv:/kubernetes:/core:/stable:/v1.32/deb  InRelease
Err:4 https://prod-cdn.packages.k8s.io/repositories/isv:/kubernetes:/core:/stable:/v1.32/deb  InRelease
  Could not connect to prod-cdn.packages.k8s.io:443 (221.228.32.13), connection timed out
Reading package lists... Done
W: Failed to fetch https://pkgs.k8s.io/core:/stable:/v1.32/deb/InRelease  Could not connect to prod-cdn.packages.k8s.io:443 (221.228.32.13), connection timed out
W: Some index files failed to download. They have been ignored, or old ones used instead.

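A general network problem can be ruled out first, because access still fails even with a working network proxy configured.

It later turned out that prod-cdn.packages.k8s.io resolves to different IP addresses on machines in different regions: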

$ ping prod-cdn.packages.k8s.io

Pinging dkhzw6k7x6ord.cloudfront.net [108.139.10.84] with 32 bytes of data:
Reply from 108.139.10.84: bytes=32 time=164ms TTL=242
Reply from 108.139.10.84: bytes=32 time=166ms TTL=242
......

# this address is unreachable
$ ping prod-cdn.packages.k8s.io
PING dkhzw6k7x6ord.cloudfront.net (221.228.32.13) 56(84) bytes of data.
64 bytes from 221.228.32.13 (221.228.32.13): icmp_seq=1 ttl=57 time=9.90 ms
64 bytes from 221.228.32.13 (221.228.32.13): icmp_seq=2 ttl=57 time=11.4 ms
......

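So the workaround is to edit /etc/hosts and bypass DNS resolution:

sudo vi /etc/hosts

Add the following line:

108.139.10.84 prod-cdn.packages.k8s.io

On the affected machine this forces prod-cdn.packages.k8s.io to resolve to 108.139.10.84, after which the repository is reachable again.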
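
For repeated setups (for example when preparing several machines), the hosts entry can be added idempotently. This is a minimal sketch: pin_host is a made-up helper name, and the IP is the one observed above, which may differ or change elsewhere because it belongs to a CDN.

```shell
# Sketch: append "IP HOST" to a hosts file only if HOST has no active entry yet.
# pin_host is a hypothetical helper; the file is a parameter so it can be tried on a copy.
pin_host() {
  local ip="$1" host="$2" file="${3:-/etc/hosts}"
  # Look for HOST as an exact name field on any non-comment line.
  if awk -v h="$host" '$1 !~ /^#/ { for (i = 2; i <= NF; i++) if ($i == h) found = 1 }
                       END { exit !found }' "$file"; then
    echo "entry for $host already present in $file"
  else
    printf '%s %s\n' "$ip" "$host" >> "$file"
  fi
}

# Real use (work on a copy, then install it as root):
#   cp /etc/hosts /tmp/hosts.new
#   pin_host 108.139.10.84 prod-cdn.packages.k8s.io /tmp/hosts.new
#   sudo cp /tmp/hosts.new /etc/hosts
#   getent hosts prod-cdn.packages.k8s.io    # shows the address now in effect
```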

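1.3 - Initializing the cluster

Initializing a Kubernetes cluster on Debian 13

Official documentation:

https://kubernetes.io/docs/setup/production-environment/tools/kubeadm/create-cluster-kubeadm/

Things to watch out for

CoreDNS image problem

When images are pulled through a proxy registry, kubeadm init can fail in the preflight phase: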

[sudo] password for sky: 
[init] Using Kubernetes version: v1.34.2
[preflight] Running pre-flight checks
[preflight] Pulling images required for setting up a Kubernetes cluster
[preflight] This might take a minute or two, depending on the speed of your internet connection
[preflight] You can also perform this action beforehand using 'kubeadm config images pull'
W1127 11:14:12.643662   17241 checks.go:827] detected that the sandbox image "registry.k8s.io/pause:3.10" of the container runtime is inconsistent with that used by kubeadm. It is recommended to use "192.168.3.193:5000/k8s-proxy/pause:3.10.1" as the CRI sandbox image.
[preflight] Some fatal errors occurred:
	[ERROR ImagePull]: failed to pull image 192.168.3.193:5000/k8s-proxy/coredns:v1.12.1: failed to pull image 192.168.3.193:5000/k8s-proxy/coredns:v1.12.1: Error response from daemon: failed to resolve reference "192.168.3.193:5000/k8s-proxy/coredns:v1.12.1": 192.168.3.193:5000/k8s-proxy/coredns:v1.12.1: not found
[preflight] If you know what you are doing, you can make a check non-fatal with `--ignore-preflight-errors=...`
error: error execution phase preflight: preflight checks failed
To see the stack trace of this error execute with --v=5 or higher

CoreDNS fails here. Running kubeadm config images pull shows the images that are required:

$ kubeadm config images pull

[config/images] Pulled registry.k8s.io/kube-apiserver:v1.35.4
[config/images] Pulled registry.k8s.io/kube-controller-manager:v1.35.4
[config/images] Pulled registry.k8s.io/kube-scheduler:v1.35.4
[config/images] Pulled registry.k8s.io/kube-proxy:v1.35.4
[config/images] Pulled registry.k8s.io/coredns/coredns:v1.13.1
[config/images] Pulled registry.k8s.io/pause:3.10.1
[config/images] Pulled registry.k8s.io/etcd:3.6.6-0

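Most core Kubernetes component images (kube-apiserver, kube-scheduler, etcd, pause, and so on) use a single-level path, whereas CoreDNS uses a two-level path, coredns/coredns. This causes an error when pulling through a proxy registry: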

$ kubeadm config images pull --image-repository=192.168.3.193:5000/k8s-proxy

[config/images] Pulled 192.168.3.193:5000/k8s-proxy/kube-apiserver:v1.34.2
[config/images] Pulled 192.168.3.193:5000/k8s-proxy/kube-controller-manager:v1.34.2
[config/images] Pulled 192.168.3.193:5000/k8s-proxy/kube-scheduler:v1.34.2
[config/images] Pulled 192.168.3.193:5000/k8s-proxy/kube-proxy:v1.34.2
error: failed to pull image "192.168.3.193:5000/k8s-proxy/coredns:v1.12.1": failed to pull image 192.168.3.193:5000/k8s-proxy/coredns:v1.12.1: Error response from daemon: failed to resolve reference "192.168.3.193:5000/k8s-proxy/coredns:v1.12.1": 192.168.3.193:5000/k8s-proxy/coredns:v1.12.1: not found
To see the stack trace of this error execute with --v=5 or higher

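To fix this, kubeadm has to be told which proxy repository to use for CoreDNS. This is what the dns.imageRepository setting in the kubeadm.yaml below does.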
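
The path difference can be made concrete with a small sketch. It only reproduces the image names seen in the logs in this document (repository + "/" + image name + ":" + tag); it is not kubeadm's actual code, and proxy_image_ref is a made-up helper name.

```shell
# Sketch of the image naming seen in the logs (not kubeadm's implementation).
# proxy_image_ref is a hypothetical helper: repository, image name, tag.
proxy_image_ref() {
  printf '%s/%s:%s\n' "$1" "$2" "$3"
}

repo=192.168.3.193:5000/k8s-proxy

# With only a custom image repository, CoreDNS is requested one level too high:
proxy_image_ref "$repo" coredns v1.13.1

# Upstream keeps the image under coredns/coredns, so the proxy path needs the
# extra "coredns" level, which is what a separate DNS repository setting provides:
proxy_image_ref "$repo/coredns" coredns v1.13.1

# On a real node, the names kubeadm will use can be listed with:
#   kubeadm config images list --config kubeadm.yaml
```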

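Initialize the cluster

Prepare kubeadm.yaml

For pod-network-cidr, prefer the range 10.244.0.0/16; otherwise some network plugins need extra configuration.

For the cri-socket setting, see:

https://kubernetes.io/docs/setup/production-environment/tools/kubeadm/install-kubeadm/#installing-runtime

Because Docker Engine and cri-dockerd were installed earlier, cri-socket has to be set to "unix:///var/run/cri-dockerd.sock".

apiserver-advertise-address should be the IP address of the current node. Since this is a single-node setup, that is the machine's own IP, for example 192.168.3.100. The kubeadm.yaml below does not set it (the config-file equivalent is localAPIEndpoint.advertiseAddress in InitConfiguration), so kubeadm uses the address of the default network interface.

Create a new kubeadm.yaml file: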

mkdir -p ~/work/soft/k8s/
cd ~/work/soft/k8s/
vi kubeadm.yaml

With this content:

apiVersion: kubeadm.k8s.io/v1beta4
kind: InitConfiguration
nodeRegistration:
  criSocket: unix:///var/run/cri-dockerd.sock

---
apiVersion: kubeadm.k8s.io/v1beta4
kind: ClusterConfiguration
kubernetesVersion: v1.35.4
imageRepository: 192.168.3.193:5000/k8s-proxy
networking:
  podSubnet: 10.244.0.0/16
dns:
  imageRepository: 192.168.3.193:5000/k8s-proxy/coredns
  imageTag: v1.13.1
etcd:
  local:
    imageRepository: 192.168.3.193:5000/k8s-proxy
    imageTag: 3.6.6-0

Run init

Run:

sudo kubeadm init --config=kubeadm.yaml

Output:

[init] Using Kubernetes version: v1.35.4
[preflight] Running pre-flight checks
[preflight] Pulling images required for setting up a Kubernetes cluster
[preflight] This might take a minute or two, depending on the speed of your internet connection
[preflight] You can also perform this action beforehand using 'kubeadm config images pull'
[certs] Using certificateDir folder "/etc/kubernetes/pki"
[certs] Generating "ca" certificate and key
[certs] Generating "apiserver" certificate and key
[certs] apiserver serving cert is signed for DNS names [debian13 kubernetes kubernetes.default kubernetes.default.svc kubernetes.default.svc.cluster.local] and IPs [10.96.0.1 192.168.3.232]
[certs] Generating "apiserver-kubelet-client" certificate and key
[certs] Generating "front-proxy-ca" certificate and key
[certs] Generating "front-proxy-client" certificate and key
[certs] Generating "etcd/ca" certificate and key
[certs] Generating "etcd/server" certificate and key
[certs] etcd/server serving cert is signed for DNS names [debian13 localhost] and IPs [192.168.3.232 127.0.0.1 ::1]
[certs] Generating "etcd/peer" certificate and key
[certs] etcd/peer serving cert is signed for DNS names [debian13 localhost] and IPs [192.168.3.232 127.0.0.1 ::1]
[certs] Generating "etcd/healthcheck-client" certificate and key
[certs] Generating "apiserver-etcd-client" certificate and key
[certs] Generating "sa" key and public key
[kubeconfig] Using kubeconfig folder "/etc/kubernetes"
[kubeconfig] Writing "admin.conf" kubeconfig file
[kubeconfig] Writing "super-admin.conf" kubeconfig file
[kubeconfig] Writing "kubelet.conf" kubeconfig file
[kubeconfig] Writing "controller-manager.conf" kubeconfig file
[kubeconfig] Writing "scheduler.conf" kubeconfig file
[etcd] Creating static Pod manifest for local etcd in "/etc/kubernetes/manifests"
[control-plane] Using manifest folder "/etc/kubernetes/manifests"
[control-plane] Creating static Pod manifest for "kube-apiserver"
[control-plane] Creating static Pod manifest for "kube-controller-manager"
[control-plane] Creating static Pod manifest for "kube-scheduler"
[kubelet-start] Writing kubelet environment file with flags to file "/var/lib/kubelet/kubeadm-flags.env"
[kubelet-start] Writing kubelet configuration to file "/var/lib/kubelet/instance-config.yaml"
[patches] Applied patch of type "application/strategic-merge-patch+json" to target "kubeletconfiguration"
[kubelet-start] Writing kubelet configuration to file "/var/lib/kubelet/config.yaml"
[kubelet-start] Starting the kubelet
[wait-control-plane] Waiting for the kubelet to boot up the control plane as static Pods from directory "/etc/kubernetes/manifests"
[kubelet-check] Waiting for a healthy kubelet at http://127.0.0.1:10248/healthz. This can take up to 4m0s
[kubelet-check] The kubelet is healthy after 500.416007ms
[control-plane-check] Waiting for healthy control plane components. This can take up to 4m0s
[control-plane-check] Checking kube-apiserver at https://192.168.3.232:6443/livez
[control-plane-check] Checking kube-controller-manager at https://127.0.0.1:10257/healthz
[control-plane-check] Checking kube-scheduler at https://127.0.0.1:10259/livez
[control-plane-check] kube-controller-manager is healthy after 1.001317452s
[control-plane-check] kube-scheduler is healthy after 1.502089525s
[control-plane-check] kube-apiserver is healthy after 3.000477876s
[upload-config] Storing the configuration used in ConfigMap "kubeadm-config" in the "kube-system" Namespace
[kubelet] Creating a ConfigMap "kubelet-config" in namespace kube-system with the configuration for the kubelets in the cluster
[upload-certs] Skipping phase. Please see --upload-certs
[mark-control-plane] Marking the node debian13 as control-plane by adding the labels: [node-role.kubernetes.io/control-plane node.kubernetes.io/exclude-from-external-load-balancers]
[mark-control-plane] Marking the node debian13 as control-plane by adding the taints [node-role.kubernetes.io/control-plane:NoSchedule]
[bootstrap-token] Using token: ki33kg.5vpalsrlaa9bjvn2
[bootstrap-token] Configuring bootstrap tokens, cluster-info ConfigMap, RBAC Roles
[bootstrap-token] Configured RBAC rules to allow Node Bootstrap tokens to get nodes
[bootstrap-token] Configured RBAC rules to allow Node Bootstrap tokens to post CSRs in order for nodes to get long term certificate credentials
[bootstrap-token] Configured RBAC rules to allow the csrapprover controller automatically approve CSRs from a Node Bootstrap Token
[bootstrap-token] Configured RBAC rules to allow certificate rotation for all node client certificates in the cluster
[bootstrap-token] Creating the "cluster-info" ConfigMap in the "kube-public" namespace
[kubelet-finalize] Updating "/etc/kubernetes/kubelet.conf" to point to a rotatable kubelet client certificate and key
[addons] Applied essential addon: CoreDNS
[addons] Applied essential addon: kube-proxy

Your Kubernetes control-plane has initialized successfully!

To start using your cluster, you need to run the following as a regular user:

  mkdir -p $HOME/.kube
  sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
  sudo chown $(id -u):$(id -g) $HOME/.kube/config

Alternatively, if you are the root user, you can run:

  export KUBECONFIG=/etc/kubernetes/admin.conf

You should now deploy a pod network to the cluster.
Run "kubectl apply -f [podnetwork].yaml" with one of the options listed at:
  https://kubernetes.io/docs/concepts/cluster-administration/addons/

Then you can join any number of worker nodes by running the following on each as root:

kubeadm join 192.168.3.232:6443 --token ki33kg.5vpalsrlaa9bjvn2 \
	--discovery-token-ca-cert-hash sha256:dba4e956a1e678da2319e29ce1b880bc4b21771cd5635dfa3117b7c2f409af29 

Follow the instructions:

mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config

For a single-node test cluster, remove the control-plane taint so that regular pods can be scheduled on the node:

kubectl taint nodes --all node-role.kubernetes.io/control-plane-

Run:

kubectl get node  

At this point the node status is NotReady:

NAME       STATUS     ROLES           AGE   VERSION
debian13   NotReady   control-plane   53s   v1.35.4

Run:

kubectl describe node debian13

The node conditions show the cause:

Conditions:
  Type             Status  LastHeartbeatTime                 LastTransitionTime                Reason                       Message
  ----             ------  -----------------                 ------------------                ------                       -------
  MemoryPressure   False   Wed, 22 Apr 2026 15:24:55 +0800   Wed, 22 Apr 2026 15:24:53 +0800   KubeletHasSufficientMemory   kubelet has sufficient memory available
  DiskPressure     False   Wed, 22 Apr 2026 15:24:55 +0800   Wed, 22 Apr 2026 15:24:53 +0800   KubeletHasNoDiskPressure     kubelet has no disk pressure
  PIDPressure      False   Wed, 22 Apr 2026 15:24:55 +0800   Wed, 22 Apr 2026 15:24:53 +0800   KubeletHasSufficientPID      kubelet has sufficient PID available
  Ready            False   Wed, 22 Apr 2026 15:24:55 +0800   Wed, 22 Apr 2026 15:24:53 +0800   KubeletNotReady              container runtime network not ready: NetworkReady=false reason:NetworkPluginNotReady message:docker: network plugin is not ready: cni config uninitialized

A network plugin still has to be installed.

Installing a network plugin

Install flannel

Official documentation: https://github.com/flannel-io/flannel#deploying-flannel-with-kubectl

kubectl apply -f https://github.com/flannel-io/flannel/releases/latest/download/kube-flannel.yml

If everything goes well, all pods in the cluster start up and reach the Running state:

$ k get pods -A

NAMESPACE      NAME                               READY   STATUS    RESTARTS       AGE
kube-flannel   kube-flannel-ds-mdzgj              1/1     Running   0              39s
kube-system    coredns-7f58b9688b-9rdzj           1/1     Running   0              2m44s
kube-system    coredns-7f58b9688b-lv4jr           1/1     Running   0              2m44s
kube-system    etcd-debian13                      1/1     Running   0              2m51s
kube-system    kube-apiserver-debian13            1/1     Running   1 (5m8s ago)   2m52s
kube-system    kube-controller-manager-debian13   1/1     Running   0              2m51s
kube-system    kube-proxy-qzhkw                   1/1     Running   0              2m44s
kube-system    kube-scheduler-debian13            1/1     Running   0              2m51s
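
Instead of polling by hand, the wait can be scripted. This is a minimal sketch: wait_for_cluster is a made-up helper name, and KUBECTL is an assumed override (defaulting to kubectl) that allows a dry run with echo. The namespace kube-flannel is the one shown in the output above.

```shell
# Sketch: block until all nodes and the flannel pods report Ready.
# wait_for_cluster and the KUBECTL override are assumptions made for this example.
wait_for_cluster() {
  local timeout="${1:-300s}"
  "${KUBECTL:-kubectl}" wait --for=condition=Ready node --all --timeout="$timeout" &&
  "${KUBECTL:-kubectl}" wait --for=condition=Ready pod --all -n kube-flannel --timeout="$timeout"
}

# Real use:
#   wait_for_cluster 120s && kubectl get pods -A
# Dry run that only prints the commands:
#   KUBECTL=echo wait_for_cluster
```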

If the kube-flannel-ds pod stays in CrashLoopBackOff:

 k get pods -A
NAMESPACE      NAME                               READY   STATUS              RESTARTS        AGE
kube-flannel   kube-flannel-ds-ts6n8              0/1     CrashLoopBackOff    2 (22s ago)     42s

Inspect the pod for the specific error:

k describe pods -n kube-flannel kube-flannel-ds-ts6n8

The events report "Back-off restarting failed container kube-flannel in pod kube-flannel":

Events:
  Type     Reason     Age                 From               Message
  ----     ------     ----                ----               -------
  Normal   Scheduled  117s                default-scheduler  Successfully assigned kube-flannel/kube-flannel-ds-ts6n8 to debian12
  Normal   Pulled     116s                kubelet            Container image "ghcr.io/flannel-io/flannel-cni-plugin:v1.6.2-flannel1" already present on machine
  Normal   Created    116s                kubelet            Created container: install-cni-plugin
  Normal   Started    116s                kubelet            Started container install-cni-plugin
  Normal   Pulled     115s                kubelet            Container image "ghcr.io/flannel-io/flannel:v0.26.4" already present on machine
  Normal   Created    115s                kubelet            Created container: install-cni
  Normal   Started    115s                kubelet            Started container install-cni
  Normal   Pulled     28s (x5 over 114s)  kubelet            Container image "ghcr.io/flannel-io/flannel:v0.26.4" already present on machine
  Normal   Created    28s (x5 over 114s)  kubelet            Created container: kube-flannel
  Normal   Started    28s (x5 over 114s)  kubelet            Started container kube-flannel
  Warning  BackOff    2s (x10 over 110s)  kubelet            Back-off restarting failed container kube-flannel in pod kube-flannel-ds-ts6n8_kube-flannel(1e03c200-2062-4838

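In that case, go back to the kernel module and sysctl step in the preparation section (overlay, br_netfilter) and check whether something was missed.

Once that is fixed, the kube-flannel-ds pod runs normally: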

k get pods -A
NAMESPACE      NAME                               READY   STATUS    RESTARTS        AGE
kube-flannel   kube-flannel-ds-ts6n8              1/1     Running   7 (9m27s ago)   15m

1.4 - Installing the dashboard

Installing the Kubernetes dashboard

This project is now archived and no longer maintained due to lack of active maintainers and contributors.


Install the dashboard

Reference:

https://github.com/kubernetes/dashboard/#installation

Check the current dashboard releases at:

https://github.com/kubernetes/dashboard/releases

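Pick the dashboard version according to its compatibility with your Kubernetes version:

  • kubernetes-dashboard-7.14.0: no compatibility statement was found, but it presumably supports the latest Kubernetes 1.34.

Recent versions have to be installed with Helm: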

helm repo add kubernetes-dashboard https://kubernetes.github.io/dashboard/
helm upgrade --install kubernetes-dashboard kubernetes-dashboard/kubernetes-dashboard --create-namespace --namespace kubernetes-dashboard

Output:

"kubernetes-dashboard" has been added to your repositories
Release "kubernetes-dashboard" does not exist. Installing it now.
NAME: kubernetes-dashboard
LAST DEPLOYED: Thu Nov 27 14:09:11 2025
NAMESPACE: kubernetes-dashboard
STATUS: deployed
REVISION: 1
TEST SUITE: None
NOTES:
*************************************************************************************************
*** PLEASE BE PATIENT: Kubernetes Dashboard may need a few minutes to get up and become ready ***
*************************************************************************************************

Congratulations! You have just installed Kubernetes Dashboard in your cluster.

To access Dashboard run:
  kubectl -n kubernetes-dashboard port-forward svc/kubernetes-dashboard-kong-proxy 8443:443

NOTE: In case port-forward command does not work, make sure that kong service name is correct.
      Check the services in Kubernetes Dashboard namespace using:
        kubectl -n kubernetes-dashboard get svc

Dashboard will be available at:
  https://localhost:8443

The dashboard services and pods at this point:

kubectl -n kubernetes-dashboard get services

Output:

NAME                                   TYPE        CLUSTER-IP       EXTERNAL-IP   PORT(S)    AGE
kubernetes-dashboard-api               ClusterIP   10.102.87.32     <none>        8000/TCP   108s
kubernetes-dashboard-auth              ClusterIP   10.108.108.108   <none>        8000/TCP   108s
kubernetes-dashboard-kong-proxy        ClusterIP   10.111.46.43     <none>        443/TCP    108s
kubernetes-dashboard-metrics-scraper   ClusterIP   10.98.190.255    <none>        8000/TCP   107s
kubernetes-dashboard-web               ClusterIP   10.103.159.121   <none>        8000/TCP   108s

Check the pods:

kubectl -n kubernetes-dashboard get pods

After two or three minutes the pods are up, and the output is:

NAME                                                    READY   STATUS    RESTARTS   AGE
kubernetes-dashboard-api-7994c5cb69-bhdnj               1/1     Running   0          16s
kubernetes-dashboard-auth-764494db59-fqt88              1/1     Running   0          16s
kubernetes-dashboard-kong-9849c64bd-jvwv2               1/1     Running   0          16s
kubernetes-dashboard-metrics-scraper-7685fd8b77-gcq8c   1/1     Running   0          16s
kubernetes-dashboard-web-5c9f966b98-nbjxv               1/1     Running   0          16s

For convenience, expose the dashboard through a NodePort by running:

kubectl -n kubernetes-dashboard edit service kubernetes-dashboard-kong-proxy

Then change type: ClusterIP to type: NodePort, and check which node port was assigned:

kubectl -n kubernetes-dashboard get service kubernetes-dashboard-kong-proxy

Output:

NAME                              TYPE       CLUSTER-IP      EXTERNAL-IP   PORT(S)         AGE
kubernetes-dashboard-kong-proxy   NodePort   10.107.131.211   <none>        443:30652/TCP   95s

The dashboard can now be opened directly in a browser:

https://192.168.3.100:30652/
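
As a non-interactive alternative to kubectl edit, the service type can be changed with kubectl patch. This is a minimal sketch: expose_as_nodeport is a made-up helper name, and KUBECTL is an assumed override (defaulting to kubectl) so the commands can be previewed with echo.

```shell
# Sketch: switch a Service to NodePort without opening an editor, then print the assigned port.
# expose_as_nodeport and the KUBECTL override are assumptions made for this example.
expose_as_nodeport() {
  local ns="$1" svc="$2"
  "${KUBECTL:-kubectl}" patch service "$svc" -n "$ns" --type merge \
    -p '{"spec":{"type":"NodePort"}}' &&
  "${KUBECTL:-kubectl}" get service "$svc" -n "$ns" \
    -o jsonpath='{.spec.ports[0].nodePort}'
}

# Real use:
#   expose_as_nodeport kubernetes-dashboard kubernetes-dashboard-kong-proxy
# Dry run that only prints the commands:
#   KUBECTL=echo expose_as_nodeport kubernetes-dashboard kubernetes-dashboard-kong-proxy
```

Note that a later helm upgrade of the chart may reset the service type, in which case the patch has to be applied again.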

Creating a user and logging in to the dashboard

Reference: Creating sample user

Create the admin-user account:

mkdir -p ~/work/soft/k8s
cd ~/work/soft/k8s

vi dashboard-adminuser.yaml

With this content:

apiVersion: v1
kind: ServiceAccount
metadata:
  name: admin-user
  namespace: kubernetes-dashboard

Run:

k create -f dashboard-adminuser.yaml

Then bind the role:

vi dashboard-adminuser-binding.yaml

With this content:

apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: admin-user
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: cluster-admin
subjects:
- kind: ServiceAccount
  name: admin-user
  namespace: kubernetes-dashboard

Run:

k create -f dashboard-adminuser-binding.yaml

Then create a token:

kubectl -n kubernetes-dashboard create token admin-user

Output:

eyJhbGciOiJSUzI1NiIsImtpZCI6Ik9sWnJsTk5UNE9JVlVmRFMxMUpwNC1tUlVndTl5Zi1WQWtmMjIzd2hDNmcifQ.eyJhdWQiOlsiaHR0cHM6Ly9rdWJlcm5ldGVzLmRlZmF1bHQuc3ZjLmNsdXN0ZXIubG9jYWwiXSwiZXhwIjoxNzQxMTEyNDg4LCJpYXQiOjE3NDExMDg4ODgsImlzcyI6Imh0dHBzOi8va3ViZXJuZXRlcy5kZWZhdWx0LnN2Yy5jbHVzdGVyLmxvY2FsIiwianRpIjoiNDU5ZGQxNjctNWI5OS00MWIzLTgzZWEtNGIxMGY3MTc5ZjEyIiwia3ViZXJuZXRlcy5pbyI6eyJuYW1lc3BhY2UiOiJrdWJlcm5ldGVzLWRhc2hib2FyZCIsInNlcnZpY2VhY2NvdW50Ijp7Im5hbWUiOiJhZG1pbi11c2VyIiwidWlkIjoiZjMxN2VhZTItNTNiNi00MGZhLWI3MWYtMzZiNDI1YmY4YWQ0In19LCJuYmYiOjE3NDExMDg4ODgsInN1YiI6InN5c3RlbTpzZXJ2aWNlYWNjb3VudDprdWJlcm5ldGVzLWRhc2hib2FyZDphZG1pbi11c2VyIn0.TYzOdrMFXcSEeVMbc1ewIA13JVi4FUYoRN7rSH5OstbVfKIF48X_o1RWxOGM_AurhgLxuKZHzmns3K_pX_OR3u1URfK6-gGos4iAQY-H1yntfRmzzsip_FbZh95EYFGTN43gw21jTyfem3OKBXXLgzsnVT_29uMnJzSnCDnrAciVKMoCEUP6x2RSHQhp6PrxrIrx_NMB3vojEZYq3AysQoNqYYjRDd4MnDRClm03dNvW5lvKSgNCVmZFje_EEa2EhI2X6d3X8zx6tHwT5M4-T3hMmyIpzHUwf3ixeZR85rhorMbskNVvRpH6VLH6BXP31c3NMeSgYk3BG8d7UjCYxQ

This token can now be used on the kubernetes-dashboard login page.

For convenience, create a Secret that holds a long-lived token for this service account:

vi dashboard-adminuser-secret.yaml

With this content:

apiVersion: v1
kind: Secret
metadata:
  name: admin-user
  namespace: kubernetes-dashboard
  annotations:
    kubernetes.io/service-account.name: "admin-user"   
type: kubernetes.io/service-account-token

Run:

k create -f dashboard-adminuser-secret.yaml

The token can then be retrieved at any time with:

kubectl get secret admin-user -n kubernetes-dashboard -o jsonpath="{.data.token}" | base64 -d

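Note: when copying the token, do not include the trailing % character (zsh prints it to mark output that does not end with a newline); otherwise login fails.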

1.5 - Installing metrics server

Installing the Kubernetes metrics server

Reference: https://github.com/kubernetes-sigs/metrics-server/#installation

Install metrics server

Download:

mkdir -p ~/work/soft/k8s
cd ~/work/soft/k8s
wget https://github.com/kubernetes-sigs/metrics-server/releases/latest/download/components.yaml

Edit the downloaded components.yaml: add --kubelet-insecure-tls and change --kubelet-preferred-address-types:

  template:
    metadata:
      labels:
        k8s-app: metrics-server
    spec:
      containers:
      - args:
        - --cert-dir=/tmp
        - --secure-port=10250
        - --kubelet-preferred-address-types=InternalIP   # change this line; the default is InternalIP,ExternalIP,Hostname
        - --kubelet-use-node-status-port
        - --metric-resolution=15s
        - --kubelet-insecure-tls  # add this line
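
The two manual changes can also be scripted, which helps when the manifest is downloaded again later. This is a minimal sketch: patch_metrics_server is a made-up helper name, and it assumes the argument list looks like the snippet above (in particular that a --metric-resolution line exists to anchor the insertion).

```shell
# Sketch: apply both argument changes to a downloaded components.yaml and print the result.
# patch_metrics_server is a hypothetical helper name.
patch_metrics_server() {
  local file="$1"
  # Duplicate the --metric-resolution line and turn the copy into the new flag,
  # which keeps the YAML indentation identical.
  local add_flag='/^[[:space:]]*- --metric-resolution=/{p;s/--metric-resolution=.*/--kubelet-insecure-tls/;}'
  # Skip the insertion when the flag is already present, so re-running is harmless.
  if grep -q -- '--kubelet-insecure-tls' "$file"; then
    add_flag=''
  fi
  sed -E \
    -e 's/^([[:space:]]*- --kubelet-preferred-address-types=).*/\1InternalIP/' \
    -e "$add_flag" \
    "$file"
}

# Real use:
#   patch_metrics_server components.yaml > components.patched.yaml
#   diff components.yaml components.patched.yaml    # review the two changes
#   mv components.patched.yaml components.yaml
```

Keep in mind that --kubelet-insecure-tls disables verification of the kubelet serving certificates, which is acceptable for a test cluster but not something to carry into production unreviewed.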

Then install:

k apply -f components.yaml

Wait a moment and check that it has started:

$ kubectl get pod -n kube-system | grep metrics-server

metrics-server-7c9977449d-h4psq    1/1     Running   0          34s

To verify, look at the service:

$ kubectl describe svc metrics-server -n kube-system

Name:                     metrics-server
Namespace:                kube-system
Labels:                   k8s-app=metrics-server
Annotations:              <none>
Selector:                 k8s-app=metrics-server
Type:                     ClusterIP
IP Family Policy:         SingleStack
IP Families:              IPv4
IP:                       10.97.226.236
IPs:                      10.97.226.236
Port:                     https  443/TCP
TargetPort:               https/TCP
Endpoints:                10.244.0.9:10250
Session Affinity:         None
Internal Traffic Policy:  Cluster
Events:                   <none>

A quick check of basic usage:

kubectl top nodes
kubectl top pods -n kube-system 

Normally the output looks like this:

$ kubectl top nodes
NAME       CPU(cores)   CPU(%)   MEMORY(bytes)   MEMORY(%)   
debian13   161m         4%       1040Mi          13%   

$ kubectl top pods -n kube-system 
NAME                               CPU(cores)   MEMORY(bytes)   
coredns-848fbff4f8-2lx6w           1m           15Mi            
coredns-848fbff4f8-lgr6d           1m           16Mi            
etcd-debian13                      7m           47Mi            
kube-apiserver-debian13            13m          241Mi           
kube-controller-manager-debian13   6m           53Mi            
kube-proxy-xc4mn                   1m           17Mi            
kube-scheduler-debian13            3m           23Mi            
metrics-server-7c9977449d-h4psq    1m           18Mi

If the following error appears:

error: Metrics API not available

wait a moment until metrics-server has started, then try again.

1.6 - Installing monitoring

Installing Prometheus and Grafana to monitor the Kubernetes cluster

Reference: https://github.com/prometheus-operator/prometheus-operator

https://computingforgeeks.com/setup-prometheus-and-grafana-on-kubernetes/

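2 - Pre-warmed installation of Kubernetes

Pre-warmed installation of Kubernetes on Debian 13 with kubeadm

How it works

A pre-warmed installation builds on the online installation: before kubeadm init is run, all installation files and images are prepared in advance, and the result is turned into a PVE template.

The template can then be reused: create a virtual machine from it when needed and run kubeadm init inside it to install Kubernetes quickly.

In principle, everything that happens before kubeadm init can follow the online installation. What happens after kubeadm init can only be sped up by preparing the installation files and pulling the images beforehand. To maximize speed, all images are pulled through a Harbor proxy.

Preparation

Pre-pulling images

k8s cluster

kubeadm config images pull --cri-socket unix:///var/run/cri-dockerd.sock --config=kubeadm.yaml

This pulls in advance the images that kubeadm init needs: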

[config/images] Pulled 192.168.3.193:5000/k8s-proxy/kube-apiserver:v1.35.4
[config/images] Pulled 192.168.3.193:5000/k8s-proxy/kube-controller-manager:v1.35.4
[config/images] Pulled 192.168.3.193:5000/k8s-proxy/kube-scheduler:v1.35.4
[config/images] Pulled 192.168.3.193:5000/k8s-proxy/kube-proxy:v1.35.4
[config/images] Pulled 192.168.3.193:5000/k8s-proxy/coredns/coredns:v1.13.1
[config/images] Pulled 192.168.3.193:5000/k8s-proxy/pause:3.10.1
[config/images] Pulled 192.168.3.193:5000/k8s-proxy/etcd:3.6.6-0

Keep a kubeadm.yaml file ready; its content is the same as for the online installation:

apiVersion: kubeadm.k8s.io/v1beta4
kind: InitConfiguration
nodeRegistration:
  criSocket: unix:///var/run/cri-dockerd.sock

---
apiVersion: kubeadm.k8s.io/v1beta4
kind: ClusterConfiguration
kubernetesVersion: v1.35.4
imageRepository: 192.168.3.193:5000/k8s-proxy
networking:
  podSubnet: 10.244.0.0/16
dns:
  imageRepository: 192.168.3.193:5000/k8s-proxy/coredns
  imageTag: v1.13.1
etcd:
  local:
    imageRepository: 192.168.3.193:5000/k8s-proxy
    imageTag: 3.6.6-0

flannel

Download the original kube-flannel.yml:

wget https://github.com/flannel-io/flannel/releases/latest/download/kube-flannel.yml

Edit kube-flannel.yml and change the image references from

image: ghcr.io/flannel-io/flannel:v0.28.4
image: ghcr.io/flannel-io/flannel-cni-plugin:v1.9.1-flannel1

to:

image: 192.168.3.193:5000/ghcr.io/flannel-io/flannel:v0.28.4
image: 192.168.3.193:5000/ghcr.io/flannel-io/flannel-cni-plugin:v1.9.1-flannel1
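
The rewrite of the image references can be scripted as well. This is a minimal sketch: use_proxy_registry is a made-up helper name, and the proxy address is the one used throughout this document.

```shell
# Sketch: print a manifest with ghcr.io image references routed through a proxy registry.
# use_proxy_registry is a hypothetical helper name.
use_proxy_registry() {
  local proxy="$1" manifest="$2"
  # Only lines of the form "image: ghcr.io/..." are touched, so running the
  # helper on an already rewritten file changes nothing.
  sed -E "s#^([[:space:]]*image:[[:space:]]*)ghcr\.io/#\1${proxy}/ghcr.io/#" "$manifest"
}

# Real use:
#   use_proxy_registry 192.168.3.193:5000 kube-flannel.yml > kube-flannel.proxy.yml
#   grep 'image:' kube-flannel.proxy.yml    # lists the images that need to be pulled
#   mv kube-flannel.proxy.yml kube-flannel.yml
```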

Pull the images flannel needs:

docker pull 192.168.3.193:5000/ghcr.io/flannel-io/flannel-cni-plugin:v1.9.1-flannel1
docker pull 192.168.3.193:5000/ghcr.io/flannel-io/flannel:v0.28.4

metrics-server

Download the original components.yaml and save it as metrics-server-components.yaml:

wget https://github.com/kubernetes-sigs/metrics-server/releases/latest/download/components.yaml 
mv components.yaml metrics-server-components.yaml

Change it as follows:

    spec:
      containers:
      - args:
        - --cert-dir=/tmp
        - --secure-port=10250
        - --kubelet-preferred-address-types=InternalIP   # changed
        - --kubelet-use-node-status-port
        - --metric-resolution=15s
        - --kubelet-insecure-tls # added
        image: 192.168.3.193:5000/registry.k8s.io/metrics-server/metrics-server:v0.8.1 # changed

Pull the image metrics-server needs:

docker pull 192.168.3.193:5000/registry.k8s.io/metrics-server/metrics-server:v0.8.1

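If extra images were pulled while building the template, the following command removes all local images so that only the required ones can be pulled again: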

docker rmi -f $(docker images -q)

Installation

Manual installation

Run kubeadm init:

cd ~/work/soft/k8s/

sudo kubeadm init --config=kubeadm.yaml

Configure kubeconfig:

  mkdir -p $HOME/.kube
  sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
  sudo chown $(id -u):$(id -g) $HOME/.kube/config

Set up the flannel network:

kubectl apply -f ~/work/soft/k8s/kube-flannel.yml

Remove the taint:

kubectl taint nodes --all node-role.kubernetes.io/control-plane-

Install metrics-server:

kubectl apply -f ~/work/soft/k8s/metrics-server-components.yaml

kubectl wait --namespace kube-system \
  --for=condition=Ready \
  --selector=k8s-app=metrics-server \
  --timeout=300s pod
echo "metrics-server installed, have a try:"
echo
echo "kubectl top nodes"
echo
kubectl top nodes
echo
echo "kubectl top pods -n kube-system"
echo
kubectl top pods -n kube-system

Automated installation with a script

cd ~/work/soft/k8s/
vi install_k8s_prewarm.zsh

With this content:

#!/usr/bin/env zsh

# Automated Kubernetes installation script (Debian 13 + Helm + Metrics Server)
# Usage: sudo ./install_k8s_prewarm.zsh

# Absolute path of the directory that contains this script
K8S_INSTALL_PATH=$(cd "$(dirname "$0")"; pwd)
# kubeadm.yaml, kube-flannel.yml and metrics-server-components.yaml are expected in this subdirectory
MANIFESTS_PATH="$K8S_INSTALL_PATH/menifests"
echo "🔍 Detected installation directory: $K8S_INSTALL_PATH"

# Make sure the script runs as root
if [[ $EUID -ne 0 ]]; then
  echo "❌ This script must be run as root"
  exit 1
fi

# Installation log
mkdir -p "$K8S_INSTALL_PATH/logs"
LOG_FILE="$K8S_INSTALL_PATH/logs/k8s_install_$(date +%Y%m%d_%H%M%S).log"
exec > >(tee -a "$LOG_FILE") 2>&1

echo "📅 Starting Kubernetes cluster installation - $(date)"
echo "📁 Resource directory: $K8S_INSTALL_PATH"

# Step 1: kubeadm init
echo "🚀 Initializing the Kubernetes control plane..."
kubeadm_init() {
  sudo kubeadm init --config="$MANIFESTS_PATH/kubeadm.yaml"

  if [[ $? -ne 0 ]]; then
    echo "❌ kubeadm init failed"
    exit 1
  fi
}
kubeadm_init
sleep 3

# Step 2: configure kubectl
echo "⚙️ Configuring kubectl for root..."
mkdir -p "$HOME/.kube"
cp /etc/kubernetes/admin.conf "$HOME/.kube/config"
chown "$(id -u):$(id -g)" "$HOME/.kube/config"
# SUDO_USER is only set when the script is started through sudo;
# skip this part when it is run directly as root
if [[ -n "$SUDO_USER" ]]; then
  echo "⚙️ Configuring kubectl for $SUDO_USER..."
  CURRENT_USER_HOME=$(getent passwd "$SUDO_USER" | cut -d: -f6)
  mkdir -p "$CURRENT_USER_HOME/.kube"
  cp /etc/kubernetes/admin.conf "$CURRENT_USER_HOME/.kube/config"
  # Hand over the directory too, since mkdir above may have created it as root
  chown -R "$(id -u "$SUDO_USER"):$(id -g "$SUDO_USER")" "$CURRENT_USER_HOME/.kube"
fi

# Step 3: install the Flannel network plugin
echo "🌐 Installing the Flannel network..."
kubectl apply -f "$MANIFESTS_PATH/kube-flannel.yml" || {
  echo "❌ Flannel installation failed"
  exit 1
}
sleep 3

# Step 4: remove the control-plane taint
echo "✨ Removing the control-plane taint..."
kubectl taint nodes --all node-role.kubernetes.io/control-plane- || {
  echo "⚠️ Failed to remove the taint (this may not affect functionality)"
}

# Step 5: install Metrics Server
echo "📈 Installing Metrics Server..."
kubectl apply -f "$MANIFESTS_PATH/metrics-server-components.yaml" || {
  echo "❌ Metrics Server installation failed"
  exit 1
}

# Wait for Metrics Server to become ready
echo "⏳ Waiting for Metrics Server to become ready (up to 5 minutes)..."
kubectl rollout status deployment metrics-server \
  --namespace kube-system \
  --timeout=300s || {
  echo "❌ Metrics Server did not become ready in time"
  exit 1
}

# Verify the installation
echo "✅ Installation complete!"
sleep 5
echo ""
echo "🛠️  Verification commands:"
echo "kubectl top nodes"
kubectl top nodes
echo ""
echo "kubectl top pods -n kube-system"
kubectl top pods -n kube-system

echo ""
echo "Installation log: $LOG_FILE"

Make it executable:

chmod +x install_k8s_prewarm.zsh

Run:

sudo ./install_k8s_prewarm.zsh