coredns pods have CrashLoopBackOff or Error state
I'm trying to set up the Kubernetes master by issuing:
kubeadm init --pod-network-cidr=192.168.0.0/16
followed by:
- Installing a pod network add-on (Calico)
- Master Isolation
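For context, those two follow-up steps boil down to roughly the commands below; the Calico manifest URL here is illustrative and version-dependent, so treat it as an assumption rather than the exact one used on this cluster:
# install the Calico pod network add-on (manifest URL varies by Calico version)
kubectl apply -f https://docs.projectcalico.org/manifests/calico.yaml
# "Master Isolation": allow pods to be scheduled on the control-plane (master) node
kubectl taint nodes --all node-role.kubernetes.io/master-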
Issue: the coredns pods are in CrashLoopBackOff or Error state:
# kubectl get pods -n kube-system
NAME READY STATUS RESTARTS AGE
calico-node-lflwx 2/2 Running 0 2d
coredns-576cbf47c7-nm7gc 0/1 CrashLoopBackOff 69 2d
coredns-576cbf47c7-nwcnx 0/1 CrashLoopBackOff 69 2d
etcd-suey.nknwn.local 1/1 Running 0 2d
kube-apiserver-suey.nknwn.local 1/1 Running 0 2d
kube-controller-manager-suey.nknwn.local 1/1 Running 0 2d
kube-proxy-xkgdr 1/1 Running 0 2d
kube-scheduler-suey.nknwn.local 1/1 Running 0 2d
#
I tried the steps from Troubleshooting kubeadm - Kubernetes; however, my node isn't running SELinux and my Docker is up to date.
# docker --version
Docker version 18.06.1-ce, build e68fc7a
#
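(For the SELinux part of that check, a quick way to confirm the state on the node is sketched here:)
# report the current SELinux enforcement mode (Enforcing / Permissive / Disabled)
getenforce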
kubectl describe output:
# kubectl -n kube-system describe pod coredns-576cbf47c7-nwcnx
Name: coredns-576cbf47c7-nwcnx
Namespace: kube-system
Priority: 0
PriorityClassName: <none>
Node: suey.nknwn.local/192.168.86.81
Start Time: Sun, 28 Oct 2018 22:39:46 -0400
Labels: k8s-app=kube-dns
pod-template-hash=576cbf47c7
Annotations: cni.projectcalico.org/podIP: 192.168.0.30/32
Status: Running
IP: 192.168.0.30
Controlled By: ReplicaSet/coredns-576cbf47c7
Containers:
coredns:
Container ID: docker://ec65b8f40c38987961e9ed099dfa2e8bb35699a7f370a2cda0e0d522a0b05e79
Image: k8s.gcr.io/coredns:1.2.2
Image ID: docker-pullable://k8s.gcr.io/coredns@sha256:3e2be1cec87aca0b74b7668bbe8c02964a95a402e45ceb51b2252629d608d03a
Ports: 53/UDP, 53/TCP, 9153/TCP
Host Ports: 0/UDP, 0/TCP, 0/TCP
Args:
-conf
/etc/coredns/Corefile
State: Running
Started: Wed, 31 Oct 2018 23:28:58 -0400
Last State: Terminated
Reason: Error
Exit Code: 137
Started: Wed, 31 Oct 2018 23:21:35 -0400
Finished: Wed, 31 Oct 2018 23:23:54 -0400
Ready: True
Restart Count: 103
Limits:
memory: 170Mi
Requests:
cpu: 100m
memory: 70Mi
Liveness: http-get http://:8080/health delay=60s timeout=5s period=10s #success=1 #failure=5
Environment: <none>
Mounts:
/etc/coredns from config-volume (ro)
/var/run/secrets/kubernetes.io/serviceaccount from coredns-token-xvq8b (ro)
Conditions:
Type Status
Initialized True
Ready True
ContainersReady True
PodScheduled True
Volumes:
config-volume:
Type: ConfigMap (a volume populated by a ConfigMap)
Name: coredns
Optional: false
coredns-token-xvq8b:
Type: Secret (a volume populated by a Secret)
SecretName: coredns-token-xvq8b
Optional: false
QoS Class: Burstable
Node-Selectors: <none>
Tolerations: CriticalAddonsOnly
node-role.kubernetes.io/master:NoSchedule
node.kubernetes.io/not-ready:NoExecute for 300s
node.kubernetes.io/unreachable:NoExecute for 300s
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal Killing 54m (x10 over 4h19m) kubelet, suey.nknwn.local Killing container with id docker://coredns:Container failed liveness probe.. Container will be killed and recreated.
Warning Unhealthy 9m56s (x92 over 4h20m) kubelet, suey.nknwn.local Liveness probe failed: HTTP probe failed with statuscode: 503
Warning BackOff 5m4s (x173 over 4h10m) kubelet, suey.nknwn.local Back-off restarting failed container
# kubectl -n kube-system describe pod coredns-576cbf47c7-nm7gc
Name: coredns-576cbf47c7-nm7gc
Namespace: kube-system
Priority: 0
PriorityClassName: <none>
Node: suey.nknwn.local/192.168.86.81
Start Time: Sun, 28 Oct 2018 22:39:46 -0400
Labels: k8s-app=kube-dns
pod-template-hash=576cbf47c7
Annotations: cni.projectcalico.org/podIP: 192.168.0.31/32
Status: Running
IP: 192.168.0.31
Controlled By: ReplicaSet/coredns-576cbf47c7
Containers:
coredns:
Container ID: docker://0f2db8d89a4c439763e7293698d6a027a109bf556b806d232093300952a84359
Image: k8s.gcr.io/coredns:1.2.2
Image ID: docker-pullable://k8s.gcr.io/coredns@sha256:3e2be1cec87aca0b74b7668bbe8c02964a95a402e45ceb51b2252629d608d03a
Ports: 53/UDP, 53/TCP, 9153/TCP
Host Ports: 0/UDP, 0/TCP, 0/TCP
Args:
-conf
/etc/coredns/Corefile
State: Running
Started: Wed, 31 Oct 2018 23:29:11 -0400
Last State: Terminated
Reason: Error
Exit Code: 137
Started: Wed, 31 Oct 2018 23:21:58 -0400
Finished: Wed, 31 Oct 2018 23:24:08 -0400
Ready: True
Restart Count: 102
Limits:
memory: 170Mi
Requests:
cpu: 100m
memory: 70Mi
Liveness: http-get http://:8080/health delay=60s timeout=5s period=10s #success=1 #failure=5
Environment: <none>
Mounts:
/etc/coredns from config-volume (ro)
/var/run/secrets/kubernetes.io/serviceaccount from coredns-token-xvq8b (ro)
Conditions:
Type Status
Initialized True
Ready True
ContainersReady True
PodScheduled True
Volumes:
config-volume:
Type: ConfigMap (a volume populated by a ConfigMap)
Name: coredns
Optional: false
coredns-token-xvq8b:
Type: Secret (a volume populated by a Secret)
SecretName: coredns-token-xvq8b
Optional: false
QoS Class: Burstable
Node-Selectors: <none>
Tolerations: CriticalAddonsOnly
node-role.kubernetes.io/master:NoSchedule
node.kubernetes.io/not-ready:NoExecute for 300s
node.kubernetes.io/unreachable:NoExecute for 300s
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal Killing 44m (x12 over 4h18m) kubelet, suey.nknwn.local Killing container with id docker://coredns:Container failed liveness probe.. Container will be killed and recreated.
Warning BackOff 4m58s (x170 over 4h9m) kubelet, suey.nknwn.local Back-off restarting failed container
Warning Unhealthy 8s (x102 over 4h19m) kubelet, suey.nknwn.local Liveness probe failed: HTTP probe failed with statuscode: 503
#
kubectl logs output:
# kubectl -n kube-system logs -f coredns-576cbf47c7-nm7gc
E1101 03:31:58.974836 1 reflector.go:205] github.com/coredns/coredns/plugin/kubernetes/controller.go:348: Failed to list *v1.Service: Get https://10.96.0.1:443/api/v1/services?limit=500&resourceVersion=0: dial tcp 10.96.0.1:443: i/o timeout
E1101 03:31:58.974836 1 reflector.go:205] github.com/coredns/coredns/plugin/kubernetes/controller.go:355: Failed to list *v1.Namespace: Get https://10.96.0.1:443/api/v1/namespaces?limit=500&resourceVersion=0: dial tcp 10.96.0.1:443: i/o timeout
E1101 03:31:58.974857 1 reflector.go:205] github.com/coredns/coredns/plugin/kubernetes/controller.go:350: Failed to list *v1.Endpoints: Get https://10.96.0.1:443/api/v1/endpoints?limit=500&resourceVersion=0: dial tcp 10.96.0.1:443: i/o timeout
E1101 03:32:29.975493 1 reflector.go:205] github.com/coredns/coredns/plugin/kubernetes/controller.go:348: Failed to list *v1.Service: Get https://10.96.0.1:443/api/v1/services?limit=500&resourceVersion=0: dial tcp 10.96.0.1:443: i/o timeout
E1101 03:32:29.976732 1 reflector.go:205] github.com/coredns/coredns/plugin/kubernetes/controller.go:355: Failed to list *v1.Namespace: Get https://10.96.0.1:443/api/v1/namespaces?limit=500&resourceVersion=0: dial tcp 10.96.0.1:443: i/o timeout
E1101 03:32:29.977788 1 reflector.go:205] github.com/coredns/coredns/plugin/kubernetes/controller.go:350: Failed to list *v1.Endpoints: Get https://10.96.0.1:443/api/v1/endpoints?limit=500&resourceVersion=0: dial tcp 10.96.0.1:443: i/o timeout
E1101 03:33:00.976164 1 reflector.go:205] github.com/coredns/coredns/plugin/kubernetes/controller.go:348: Failed to list *v1.Service: Get https://10.96.0.1:443/api/v1/services?limit=500&resourceVersion=0: dial tcp 10.96.0.1:443: i/o timeout
E1101 03:33:00.977415 1 reflector.go:205] github.com/coredns/coredns/plugin/kubernetes/controller.go:355: Failed to list *v1.Namespace: Get https://10.96.0.1:443/api/v1/namespaces?limit=500&resourceVersion=0: dial tcp 10.96.0.1:443: i/o timeout
E1101 03:33:00.978332 1 reflector.go:205] github.com/coredns/coredns/plugin/kubernetes/controller.go:350: Failed to list *v1.Endpoints: Get https://10.96.0.1:443/api/v1/endpoints?limit=500&resourceVersion=0: dial tcp 10.96.0.1:443: i/o timeout
2018/11/01 03:33:08 [INFO] SIGTERM: Shutting down servers then terminating
E1101 03:33:31.976864 1 reflector.go:205] github.com/coredns/coredns/plugin/kubernetes/controller.go:348: Failed to list *v1.Service: Get https://10.96.0.1:443/api/v1/services?limit=500&resourceVersion=0: dial tcp 10.96.0.1:443: i/o timeout
E1101 03:33:31.978080 1 reflector.go:205] github.com/coredns/coredns/plugin/kubernetes/controller.go:355: Failed to list *v1.Namespace: Get https://10.96.0.1:443/api/v1/namespaces?limit=500&resourceVersion=0: dial tcp 10.96.0.1:443: i/o timeout
E1101 03:33:31.979156 1 reflector.go:205] github.com/coredns/coredns/plugin/kubernetes/controller.go:350: Failed to list *v1.Endpoints: Get https://10.96.0.1:443/api/v1/endpoints?limit=500&resourceVersion=0: dial tcp 10.96.0.1:443: i/o timeout
#
# kubectl -n kube-system log -f coredns-576cbf47c7-gqdgd
.:53
2018/11/05 04:04:13 [INFO] CoreDNS-1.2.2
2018/11/05 04:04:13 [INFO] linux/amd64, go1.11, eb51e8b
CoreDNS-1.2.2
linux/amd64, go1.11, eb51e8b
2018/11/05 04:04:13 [INFO] plugin/reload: Running configuration MD5 = f65c4821c8a9b7b5eb30fa4fbc167769
2018/11/05 04:04:19 [FATAL] plugin/loop: Seen "HINFO IN 3597544515206064936.6415437575707023337." more than twice, loop detected
# kubectl -n kube-system log -f coredns-576cbf47c7-hhmws
.:53
2018/11/05 04:04:18 [INFO] CoreDNS-1.2.2
2018/11/05 04:04:18 [INFO] linux/amd64, go1.11, eb51e8b
CoreDNS-1.2.2
linux/amd64, go1.11, eb51e8b
2018/11/05 04:04:18 [INFO] plugin/reload: Running configuration MD5 = f65c4821c8a9b7b5eb30fa4fbc167769
2018/11/05 04:04:24 [FATAL] plugin/loop: Seen "HINFO IN 6900627972087569316.7905576541070882081." more than twice, loop detected
#
kubectl describe output (apiserver):
# kubectl -n kube-system describe pod kube-apiserver-suey.nknwn.local
Name: kube-apiserver-suey.nknwn.local
Namespace: kube-system
Priority: 2000000000
PriorityClassName: system-cluster-critical
Node: suey.nknwn.local/192.168.87.20
Start Time: Fri, 02 Nov 2018 00:28:44 -0400
Labels: component=kube-apiserver
tier=control-plane
Annotations: kubernetes.io/config.hash: 2433a531afe72165364aace3b746ea4c
kubernetes.io/config.mirror: 2433a531afe72165364aace3b746ea4c
kubernetes.io/config.seen: 2018-11-02T00:28:43.795663261-04:00
kubernetes.io/config.source: file
scheduler.alpha.kubernetes.io/critical-pod:
Status: Running
IP: 192.168.87.20
Containers:
kube-apiserver:
Container ID: docker://659456385a1a859f078d36f4d1b91db9143d228b3bc5b3947a09460a39ce41fc
Image: k8s.gcr.io/kube-apiserver:v1.12.2
Image ID: docker-pullable://k8s.gcr.io/kube-apiserver@sha256:094929baf3a7681945d83a7654b3248e586b20506e28526121f50eb359cee44f
Port: <none>
Host Port: <none>
Command:
kube-apiserver
--authorization-mode=Node,RBAC
--advertise-address=192.168.87.20
--allow-privileged=true
--client-ca-file=/etc/kubernetes/pki/ca.crt
--enable-admission-plugins=NodeRestriction
--enable-bootstrap-token-auth=true
--etcd-cafile=/etc/kubernetes/pki/etcd/ca.crt
--etcd-certfile=/etc/kubernetes/pki/apiserver-etcd-client.crt
--etcd-keyfile=/etc/kubernetes/pki/apiserver-etcd-client.key
--etcd-servers=https://127.0.0.1:2379
--insecure-port=0
--kubelet-client-certificate=/etc/kubernetes/pki/apiserver-kubelet-client.crt
--kubelet-client-key=/etc/kubernetes/pki/apiserver-kubelet-client.key
--kubelet-preferred-address-types=InternalIP,ExternalIP,Hostname
--proxy-client-cert-file=/etc/kubernetes/pki/front-proxy-client.crt
--proxy-client-key-file=/etc/kubernetes/pki/front-proxy-client.key
--requestheader-allowed-names=front-proxy-client
--requestheader-client-ca-file=/etc/kubernetes/pki/front-proxy-ca.crt
--requestheader-extra-headers-prefix=X-Remote-Extra-
--requestheader-group-headers=X-Remote-Group
--requestheader-username-headers=X-Remote-User
--secure-port=6443
--service-account-key-file=/etc/kubernetes/pki/sa.pub
--service-cluster-ip-range=10.96.0.0/12
--tls-cert-file=/etc/kubernetes/pki/apiserver.crt
--tls-private-key-file=/etc/kubernetes/pki/apiserver.key
State: Running
Started: Sun, 04 Nov 2018 22:57:27 -0500
Last State: Terminated
Reason: Completed
Exit Code: 0
Started: Sun, 04 Nov 2018 20:12:06 -0500
Finished: Sun, 04 Nov 2018 22:55:24 -0500
Ready: True
Restart Count: 2
Requests:
cpu: 250m
Liveness: http-get https://192.168.87.20:6443/healthz delay=15s timeout=15s period=10s #success=1 #failure=8
Environment: <none>
Mounts:
/etc/ca-certificates from etc-ca-certificates (ro)
/etc/kubernetes/pki from k8s-certs (ro)
/etc/ssl/certs from ca-certs (ro)
/usr/local/share/ca-certificates from usr-local-share-ca-certificates (ro)
/usr/share/ca-certificates from usr-share-ca-certificates (ro)
Conditions:
Type Status
Initialized True
Ready True
ContainersReady True
PodScheduled True
Volumes:
etc-ca-certificates:
Type: HostPath (bare host directory volume)
Path: /etc/ca-certificates
HostPathType: DirectoryOrCreate
k8s-certs:
Type: HostPath (bare host directory volume)
Path: /etc/kubernetes/pki
HostPathType: DirectoryOrCreate
ca-certs:
Type: HostPath (bare host directory volume)
Path: /etc/ssl/certs
HostPathType: DirectoryOrCreate
usr-share-ca-certificates:
Type: HostPath (bare host directory volume)
Path: /usr/share/ca-certificates
HostPathType: DirectoryOrCreate
usr-local-share-ca-certificates:
Type: HostPath (bare host directory volume)
Path: /usr/local/share/ca-certificates
HostPathType: DirectoryOrCreate
QoS Class: Burstable
Node-Selectors: <none>
Tolerations: :NoExecute
Events: <none>
#
syslog (host):
Nov 4 22:59:36 suey kubelet[1234]: E1104 22:59:36.139538 1234 pod_workers.go:186] Error syncing pod d8146b7e-de57-11e8-a1e2-ec8eb57434c8 ("coredns-576cbf47c7-hhmws_kube-system(d8146b7e-de57-11e8-a1e2-ec8eb57434c8)"), skipping: failed to "StartContainer" for "coredns" with CrashLoopBackOff: "Back-off 40s restarting failed container=coredns pod=coredns-576cbf47c7-hhmws_kube-system(d8146b7e-de57-11e8-a1e2-ec8eb57434c8)"
Please advise.
docker kubernetes kubectl kubeadm coredns
attach the full output pls
– Konstantin Vustin
Oct 31 '18 at 3:49
Also try kubectl logs -f coredns-576cbf47c7-nm7gc
– Andre Helberg
Oct 31 '18 at 5:50
@AndreHelberg I updated my question with output from the kubectl logs command. I'm not sure what this 10.96.0.1:443 is ...
– alexus
Nov 1 '18 at 3:34
@KonstantinVustin I updated my question with full output as well.
– alexus
Nov 1 '18 at 3:38
@alexus It looks like you are trying to set up a cluster from scratch? I haven't done this before, so my input might not be of much help, but from the log you pasted, it seems your pod is trying to connect to 10.96.0.1:443. I'd suggest verifying that whatever should be there is up. I'm guessing it is kube-apiserver-suey.nknwn.local. I would look at: whether these nodes are on the same network; whether the IP is correct; whether something is listening on 443; the service (logs) listening on 443, since it might be a cert/auth issue or a timeout; and whether ports are being blocked.
– Andre Helberg
Nov 1 '18 at 8:46
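To follow up on that suggestion: 10.96.0.1 is the cluster-internal kubernetes Service (the ClusterIP that fronts the API server on port 443). A rough sketch of how to sanity-check it:
# the default "kubernetes" Service should show CLUSTER-IP 10.96.0.1 and port 443
kubectl get svc kubernetes -n default
# its endpoints should point at the API server's real node IP on port 6443
kubectl get endpoints kubernetes -n default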
2 Answers
This error:
[FATAL] plugin/loop: Seen "HINFO IN 6900627972087569316.7905576541070882081." more than twice, loop detected
occurs when CoreDNS detects a loop in the resolver configuration; the crash is intended behavior. You are hitting this known issue:
https://github.com/kubernetes/kubeadm/issues/1162
https://github.com/coredns/coredns/issues/2087
Hacky solution: Disable the CoreDNS loop detection
Edit the CoreDNS configmap:
kubectl -n kube-system edit configmap coredns
Remove or comment out the line containing loop, then save and exit.
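For orientation, a kubeadm-generated Corefile looks roughly like the sketch below (an illustration only; your exact defaults may differ). The loop line is the one to comment out:
.:53 {
    errors
    health
    kubernetes cluster.local in-addr.arpa ip6.arpa {
       pods insecure
       upstream
       fallthrough in-addr.arpa ip6.arpa
    }
    prometheus :9153
    proxy . /etc/resolv.conf
    cache 30
    # loop   <- commented out to disable loop detection
    reload
    loadbalance
}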
Then remove the CoreDNS pods so new ones can be created with the new config:
kubectl -n kube-system delete pod -l k8s-app=kube-dns
All should be fine after that.
Preferred Solution: Remove the loop in the DNS configuration
First, check whether you are using systemd-resolved. If you are running Ubuntu 18.04, this is probably the case.
systemctl list-unit-files | grep enabled | grep systemd-resolved
If it is, check which resolv.conf
file your cluster is using as reference:
ps auxww | grep kubelet
You might see a line like:
/usr/bin/kubelet ... --resolv-conf=/run/systemd/resolve/resolv.conf
The important part is the --resolv-conf flag: it tells you whether the systemd-managed resolv.conf is being used.
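If the flag does not show up in the process list, the kubelet may be picking it up from a config file instead; a hedged way to check on kubeadm installs (paths vary by distro and version):
# look for a resolv-conf / resolvConf setting in the usual kubelet config locations
grep -R resolv /var/lib/kubelet/kubeadm-flags.env /var/lib/kubelet/config.yaml /etc/default/kubelet 2>/dev/null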
If the kubelet is pointed at systemd's resolv.conf, do the following:
Check the content of /run/systemd/resolve/resolv.conf
to see if there is a record like:
nameserver 127.0.0.1
If there is a 127.0.0.1 entry, it is the one causing the loop.
To get rid of it, do not edit that file directly; instead, fix the other places from which it is generated.
Check all files under /etc/systemd/network and, if you find a record like
DNS=127.0.0.1
delete that record. Also check /etc/systemd/resolved.conf and do the same if needed. Make sure you have at least one or two DNS servers configured, such as
DNS=1.1.1.1 1.0.0.1
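For example, the relevant part of /etc/systemd/resolved.conf could end up looking like this sketch (the addresses here are just Cloudflare's public resolvers; substitute your own if you prefer):
[Resolve]
DNS=1.1.1.1 1.0.0.1
# any loopback entry such as DNS=127.0.0.1 should be removed or stay commented out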
After doing all that, restart the systemd services to put your changes into effect:
systemctl restart systemd-networkd systemd-resolved
After that, verify that the 127.0.0.1 nameserver entry is no longer present in the resolv.conf file:
cat /run/systemd/resolve/resolv.conf
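If the cleanup worked, the output should list only real upstream servers, for example (illustrative):
nameserver 1.1.1.1
nameserver 1.0.0.1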
Finally, trigger re-creation of the DNS pods:
kubectl -n kube-system delete pod -l k8s-app=kube-dns
Summary: The solution involves getting rid of what looks like a DNS lookup loop from the host DNS configuration. Steps vary between different resolv.conf managers/implementations.
The "hacky solution" did the trick, yet as for the "proper solution" (preferred over "hacky") - systemd-resolved.service is inactive (dead) and also disabled
– alexus
Nov 23 '18 at 19:59
@alexus Glad that it worked out. The preferred solution assumes that your system uses systemd plus its DNS resolver service. If that's not the case, you can investigate and find out where the nameservers are read from and how they are populated/added, as I mentioned in the summary. The reason the 1st solution is hacky is the following: loop detection in CoreDNS is necessary, so the crash is the expected behavior. Ideally, the loop itself needs to be removed/fixed.
– Utku Özdemir
Nov 25 '18 at 1:56
@UtkuÖzdemir I am using ubuntu 16.04 and I don't have the systemd resolv.conf. The contents of /etc/resolv.conf are nameserver 127.0.1.1 search APSDC.local. I am getting the same CoreDNS CrashLoopBackOff. Is it because of the 127.0.1.1 IP? Can you please suggest a good solution? stackoverflow.com/questions/54466359/…
– S Andrew
Feb 1 at 3:17
Here's some shell hackery that automates Utku's answer:
# remove loop from DNS config files
sudo find /etc/systemd/network /etc/systemd/resolved.conf -type f \
  -exec sed -i '/^DNS=127.0.0.1/d' {} +
# if necessary, configure some DNS servers (use Cloudflare public resolvers)
if ! grep -q '^DNS=.*' /etc/systemd/resolved.conf; then
  sudo sed -i '$aDNS=1.1.1.1 1.0.0.1' /etc/systemd/resolved.conf
fi
# restart systemd services
sudo systemctl restart systemd-networkd systemd-resolved
# force (re-) creation of the dns pods
kubectl -n kube-system delete pod -l k8s-app=kube-dns
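A quick follow-up to confirm the script had the intended effect (a sketch; systemd-resolve is the older tool name, resolvectl the newer one):
# check which upstream DNS servers the resolver now uses
systemd-resolve --status 2>/dev/null || resolvectl status
# watch the recreated CoreDNS pods come back up
kubectl -n kube-system get pods -l k8s-app=kube-dns -w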
I am still getting crashloopbackoff after executing ./shellhackery.sh on Ubuntu 16.04
– Aravind Murthy
Dec 2 '18 at 8:02
@VixZeke, what do the container logs say?
– rubicks
Dec 2 '18 at 15:55