First, check the logs on the node. They are in /var/log/messages:
Jun 1 11:32:34 apm-slave03 dockerd-current: time="2018-06-01T11:32:34.830329738+08:00" level=error msg="Handler for GET /containers/b532d65bd2ff380035560a33e435414b66ccfbfbbf6f3c9d51cb2f0add57b2d2/json returned error: No such container: b532d65bd2ff380035560a33e435414b66ccfbfbbf6f3c9d51cb2f0add57b2d2"
Jun 1 11:32:44 apm-slave03 kubelet: I0601 11:32:44.160859 20744 docker_manager.go:2495] checking backoff for container "kubernetes-dashboard" in pod "kubernetes-dashboard-latest-4167338039-95kb9"
Jun 1 11:32:44 apm-slave03 kubelet: I0601 11:32:44.161188 20744 docker_manager.go:2509] Back-off 5m0s restarting failed container=kubernetes-dashboard pod=kubernetes-dashboard-latest-4167338039-95kb9_kube-system(c2097a18-654a-11e8-8d29-005056bc2ad1)
Jun 1 11:32:44 apm-slave03 kubelet: E0601 11:32:44.161302 20744 pod_workers.go:184] Error syncing pod c2097a18-654a-11e8-8d29-005056bc2ad1, skipping: failed to "StartContainer" for "kubernetes-dashboard" with CrashLoopBackOff: "Back-off 5m0s restarting failed container=kubernetes-dashboard pod=kubernetes-dashboard-latest-4167338039-95kb9_kube-system(c2097a18-654a-11e8-8d29-005056bc2ad1)"
The log does not really show what is wrong, so the next step is to look at the pods:
kubectl get pods -n kube-system
NAME                                           READY     STATUS              RESTARTS   AGE
kubernetes-dashboard-latest-4167338039-bbwp4   0/1       ContainerCreating   0          47s
Delete it:
kubectl delete pod kubernetes-dashboard-latest-4167338039-bbwp4 -n kube-system
Then check again:
kubectl get pods -n kube-system | grep -v Running
NAME                                           READY     STATUS    RESTARTS   AGE
kubernetes-dashboard-latest-4167338039-95kb9   0/1       Error     0          4s
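Another pod had been started in its place. I deleted it five times in a row and a new one came back every time, which was baffling.
Here is the description of one of those pods: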
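A side note on the filter used above: `grep -v Running` keeps the header row and would also hide any row that merely contains the word "Running" somewhere. A minimal sketch of a stricter filter that compares only the STATUS column, assuming the five-column table layout shown in the output above (`not_running_pods` is a hypothetical helper name, not part of kubectl):

```shell
# List pods whose STATUS column is not "Running".
# Reads `kubectl get pods` table output on stdin and skips the header row.
# Assumes STATUS is the third whitespace-separated column, as in the output above.
not_running_pods() {
  awk 'NR > 1 && $3 != "Running" { print $1, $3 }'
}

# Usage against a live cluster (requires kubectl access):
#   kubectl get pods -n kube-system | not_running_pods
```

With `--all-namespaces` an extra NAMESPACE column is added in front, so the column index would have to change to 4.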
kubectl describe pod kubernetes-dashboard-latest-4167338039-95kb9 -n kube-system
Name:           kubernetes-dashboard-latest-4167338039-95kb9
Namespace:      kube-system
Node:           apm-slave03/10.10.202.159
Start Time:     Fri, 01 Jun 2018 11:20:39 +0800
Labels:         k8s-app=kubernetes-dashboard
                kubernetes.io/cluster-service=true
                pod-template-hash=4167338039
                version=latest
Status:         Running
IP:             10.0.48.2
Controllers:    ReplicaSet/kubernetes-dashboard-latest-4167338039
Containers:
  kubernetes-dashboard:
    Container ID:       docker://fe222c62c496d1348b9da4d17da474721d941279c7bd476596a0e041353ccd55
    Image:              registry.cn-hangzhou.aliyuncs.com/google-containers/kubernetes-dashboard-amd64:v1.4.2
    Image ID:           docker-pullable://registry.cn-hangzhou.aliyuncs.com/google-containers/kubernetes-dashboard-amd64@sha256:8c9cafe41e0846c589a28ee337270d4e97d486058c17982314354556492f2c69
    Port:               9090/TCP
    Args:
      --apiserver-host=http://apm-slave02:8080
    Limits:
      cpu:      100m
      memory:   50Mi
    Requests:
      cpu:      100m
      memory:   50Mi
    State:              Terminated
      Reason:           Error
      Exit Code:        1
      Started:          Fri, 01 Jun 2018 11:21:04 +0800
      Finished:         Fri, 01 Jun 2018 11:21:06 +0800
    Last State:         Terminated
      Reason:           Error
      Exit Code:        1
      Started:          Fri, 01 Jun 2018 11:20:43 +0800
      Finished:         Fri, 01 Jun 2018 11:20:45 +0800
    Ready:              False
    Restart Count:      2
    Liveness:           http-get http://:9090/ delay=30s timeout=30s period=10s #success=1 #failure=3
    Volume Mounts:
    Environment Variables:
Conditions:
  Type          Status
  Initialized   True
  Ready         False
  PodScheduled  True
No volumes.
QoS Class:      Guaranteed
Tolerations:
Events:
  FirstSeen  LastSeen  Count  From                   SubObjectPath                          Type      Reason             Message
  ---------  --------  -----  ----                   -------------                          --------  ------             -------
  28s        28s       1      {default-scheduler }                                          Normal    Scheduled          Successfully assigned kubernetes-dashboard-latest-4167338039-95kb9 to apm-slave03
  28s        28s       1      {kubelet apm-slave03}  spec.containers{kubernetes-dashboard}  Normal    Created            Created container with docker id f880c9e76de7; Security:[seccomp=unconfined]
  27s        27s       1      {kubelet apm-slave03}  spec.containers{kubernetes-dashboard}  Normal    Started            Started container with docker id f880c9e76de7
  24s        24s       1      {kubelet apm-slave03}  spec.containers{kubernetes-dashboard}  Normal    Created            Created container with docker id 16f258977612; Security:[seccomp=unconfined]
  24s        24s       1      {kubelet apm-slave03}  spec.containers{kubernetes-dashboard}  Normal    Started            Started container with docker id 16f258977612
  22s        18s       2      {kubelet apm-slave03}                                         Warning   FailedSync         Error syncing pod, skipping: failed to "StartContainer" for "kubernetes-dashboard" with CrashLoopBackOff: "Back-off 10s restarting failed container=kubernetes-dashboard pod=kubernetes-dashboard-latest-4167338039-95kb9_kube-system(c2097a18-654a-11e8-8d29-005056bc2ad1)"
  28s        3s        4      {kubelet apm-slave03}                                         Warning   MissingClusterDNS  kubelet does not have ClusterDNS IP configured and cannot create Pod using "ClusterFirst" policy. Falling back to DNSDefault policy.
  28s        3s        3      {kubelet apm-slave03}  spec.containers{kubernetes-dashboard}  Normal    Pulled             Container image "registry.cn-hangzhou.aliyuncs.com/google-containers/kubernetes-dashboard-amd64:v1.4.2" already present on machine
  3s         3s        1      {kubelet apm-slave03}  spec.containers{kubernetes-dashboard}  Normal    Created            Created container with docker id fe222c62c496; Security:[seccomp=unconfined]
  3s         3s        1      {kubelet apm-slave03}  spec.containers{kubernetes-dashboard}  Normal    Started            Started container with docker id fe222c62c496
  22s        0s        3      {kubelet apm-slave03}  spec.containers{kubernetes-dashboard}  Warning   BackOff            Back-off restarting failed docker container
  0s         0s        1      {kubelet apm-slave03}                                         Warning   FailedSync         Error syncing pod, skipping: failed to "StartContainer" for "kubernetes-dashboard" with CrashLoopBackOff: "Back-off 20s restarting failed container=kubernetes-dashboard pod=kubernetes-dashboard-latest-4167338039-95kb9_kube-system(c2097a18-654a-11e8-8d29-005056bc2ad1)"
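The Back-off values in these events (10s, then 20s) and the 5m0s in the node log fit the kubelet's restart back-off, which as far as I know starts at 10 seconds, doubles after each failed restart, and is capped at five minutes. A rough sketch of that sequence, assuming those defaults (`backoff_delays` is a hypothetical helper for illustration, not a kubectl or kubelet command):

```shell
# Print the first N restart back-off delays in seconds, one per line.
# Assumed defaults: start at 10s, double after every failure, cap at 300s (5m).
backoff_delays() {
  count=${1:-7}
  delay=10
  i=0
  while [ "$i" -lt "$count" ]; do
    printf '%s\n' "$delay"
    delay=$((delay * 2))
    if [ "$delay" -gt 300 ]; then
      delay=300
    fi
    i=$((i + 1))
  done
}

backoff_delays 7
```

So a container that keeps exiting with code 1 reaches the 5m0s delay seen in /var/log/messages after only a handful of restarts.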
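That did not seem to show what I was looking for either, so I turned to Google. The conclusion: the pod is only gone for good once its Deployment is deleted, because the Deployment's ReplicaSet (the `Controllers: ReplicaSet/kubernetes-dashboard-latest-4167338039` line in the describe output) recreates the pod every time it is deleted. So first, a look at the Deployments: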
kubectl get deployments --all-namespaces
NAMESPACE     NAME                          DESIRED   CURRENT   UP-TO-DATE   AVAILABLE   AGE
kube-system   kubernetes-dashboard-latest   1         1         1            0           2d
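The owning Deployment can also be read off the pod name itself, since pods created through a Deployment are named after the pattern deployment name, pod-template-hash, random suffix. A minimal sketch, assuming that naming pattern holds (it matches the pod and ReplicaSet names in the describe output; `deployment_from_pod` is a hypothetical helper name):

```shell
# Derive the Deployment name from a pod name by stripping the trailing
# "-<pod-template-hash>-<random suffix>" part.
# Assumption: the pod was created by a Deployment's ReplicaSet.
deployment_from_pod() {
  # ${1%-*-*} removes the shortest suffix that contains two dashes.
  printf '%s\n' "${1%-*-*}"
}

deployment_from_pod kubernetes-dashboard-latest-4167338039-95kb9
```

For the pod in this post that gives kubernetes-dashboard-latest, the same name that shows up in the Deployment list.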
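The name turned out to be the Deployment I had created the first time around, not the one I had just created, so (through gritted teeth) I deleted it: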
kubectl delete deployments kubernetes-dashboard-latest -n kube-system
deployment "kubernetes-dashboard-latest" deleted
kubectl get deployments --all-namespaces
No resources found.
kubectl get pods -n kube-system | grep -v Running
No resources found.
The endlessly restarting pod is finally gone!
After recreating the Deployment:
kubectl get deployments --all-namespaces
NAMESPACE     NAME                   DESIRED   CURRENT   UP-TO-DATE   AVAILABLE   AGE
kube-system   kubernetes-dashboard   1         1         1            1           5s