coredns pods have CrashLoopBackOff or Error state
I'm trying to set up the Kubernetes master by issuing:
kubeadm init --pod-network-cidr=192.168.0.0/16
followed by:
- Installing a pod network add-on (Calico)
- Master Isolation
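For context, those two follow-up steps boil down to roughly the commands below; the Calico manifest URL here is illustrative and version-dependent, so treat it as an assumption rather than the exact one used on this cluster:
# install the Calico pod network add-on (manifest URL varies by Calico version)
kubectl apply -f https://docs.projectcalico.org/manifests/calico.yaml
# "Master Isolation": allow pods to be scheduled on the control-plane (master) node
kubectl taint nodes --all node-role.kubernetes.io/master-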
Issue: the coredns pods are in CrashLoopBackOff or Error state:
# kubectl get pods -n kube-system
NAME READY STATUS RESTARTS AGE
calico-node-lflwx 2/2 Running 0 2d
coredns-576cbf47c7-nm7gc 0/1 CrashLoopBackOff 69 2d
coredns-576cbf47c7-nwcnx 0/1 CrashLoopBackOff 69 2d
etcd-suey.nknwn.local 1/1 Running 0 2d
kube-apiserver-suey.nknwn.local 1/1 Running 0 2d
kube-controller-manager-suey.nknwn.local 1/1 Running 0 2d
kube-proxy-xkgdr 1/1 Running 0 2d
kube-scheduler-suey.nknwn.local 1/1 Running 0 2d
#
I tried the steps from Troubleshooting kubeadm - Kubernetes; however, my node isn't running SELinux and my Docker is up to date.
# docker --version
Docker version 18.06.1-ce, build e68fc7a
#
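(For the SELinux part of that check, a quick way to confirm the state on the node is sketched here:)
# report the current SELinux enforcement mode (Enforcing / Permissive / Disabled)
getenforce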
kubectl describe output:
# kubectl -n kube-system describe pod coredns-576cbf47c7-nwcnx
Name: coredns-576cbf47c7-nwcnx
Namespace: kube-system
Priority: 0
PriorityClassName: <none>
Node: suey.nknwn.local/192.168.86.81
Start Time: Sun, 28 Oct 2018 22:39:46 -0400
Labels: k8s-app=kube-dns
pod-template-hash=576cbf47c7
Annotations: cni.projectcalico.org/podIP: 192.168.0.30/32
Status: Running
IP: 192.168.0.30
Controlled By: ReplicaSet/coredns-576cbf47c7
Containers:
coredns:
Container ID: docker://ec65b8f40c38987961e9ed099dfa2e8bb35699a7f370a2cda0e0d522a0b05e79
Image: k8s.gcr.io/coredns:1.2.2
Image ID: docker-pullable://k8s.gcr.io/coredns@sha256:3e2be1cec87aca0b74b7668bbe8c02964a95a402e45ceb51b2252629d608d03a
Ports: 53/UDP, 53/TCP, 9153/TCP
Host Ports: 0/UDP, 0/TCP, 0/TCP
Args:
-conf
/etc/coredns/Corefile
State: Running
Started: Wed, 31 Oct 2018 23:28:58 -0400
Last State: Terminated
Reason: Error
Exit Code: 137
Started: Wed, 31 Oct 2018 23:21:35 -0400
Finished: Wed, 31 Oct 2018 23:23:54 -0400
Ready: True
Restart Count: 103
Limits:
memory: 170Mi
Requests:
cpu: 100m
memory: 70Mi
Liveness: http-get http://:8080/health delay=60s timeout=5s period=10s #success=1 #failure=5
Environment: <none>
Mounts:
/etc/coredns from config-volume (ro)
/var/run/secrets/kubernetes.io/serviceaccount from coredns-token-xvq8b (ro)
Conditions:
Type Status
Initialized True
Ready True
ContainersReady True
PodScheduled True
Volumes:
config-volume:
Type: ConfigMap (a volume populated by a ConfigMap)
Name: coredns
Optional: false
coredns-token-xvq8b:
Type: Secret (a volume populated by a Secret)
SecretName: coredns-token-xvq8b
Optional: false
QoS Class: Burstable
Node-Selectors: <none>
Tolerations: CriticalAddonsOnly
node-role.kubernetes.io/master:NoSchedule
node.kubernetes.io/not-ready:NoExecute for 300s
node.kubernetes.io/unreachable:NoExecute for 300s
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal Killing 54m (x10 over 4h19m) kubelet, suey.nknwn.local Killing container with id docker://coredns:Container failed liveness probe.. Container will be killed and recreated.
Warning Unhealthy 9m56s (x92 over 4h20m) kubelet, suey.nknwn.local Liveness probe failed: HTTP probe failed with statuscode: 503
Warning BackOff 5m4s (x173 over 4h10m) kubelet, suey.nknwn.local Back-off restarting failed container
# kubectl -n kube-system describe pod coredns-576cbf47c7-nm7gc
Name: coredns-576cbf47c7-nm7gc
Namespace: kube-system
Priority: 0
PriorityClassName: <none>
Node: suey.nknwn.local/192.168.86.81
Start Time: Sun, 28 Oct 2018 22:39:46 -0400
Labels: k8s-app=kube-dns
pod-template-hash=576cbf47c7
Annotations: cni.projectcalico.org/podIP: 192.168.0.31/32
Status: Running
IP: 192.168.0.31
Controlled By: ReplicaSet/coredns-576cbf47c7
Containers:
coredns:
Container ID: docker://0f2db8d89a4c439763e7293698d6a027a109bf556b806d232093300952a84359
Image: k8s.gcr.io/coredns:1.2.2
Image ID: docker-pullable://k8s.gcr.io/coredns@sha256:3e2be1cec87aca0b74b7668bbe8c02964a95a402e45ceb51b2252629d608d03a
Ports: 53/UDP, 53/TCP, 9153/TCP
Host Ports: 0/UDP, 0/TCP, 0/TCP
Args:
-conf
/etc/coredns/Corefile
State: Running
Started: Wed, 31 Oct 2018 23:29:11 -0400
Last State: Terminated
Reason: Error
Exit Code: 137
Started: Wed, 31 Oct 2018 23:21:58 -0400
Finished: Wed, 31 Oct 2018 23:24:08 -0400
Ready: True
Restart Count: 102
Limits:
memory: 170Mi
Requests:
cpu: 100m
memory: 70Mi
Liveness: http-get http://:8080/health delay=60s timeout=5s period=10s #success=1 #failure=5
Environment: <none>
Mounts:
/etc/coredns from config-volume (ro)
/var/run/secrets/kubernetes.io/serviceaccount from coredns-token-xvq8b (ro)
Conditions:
Type Status
Initialized True
Ready True
ContainersReady True
PodScheduled True
Volumes:
config-volume:
Type: ConfigMap (a volume populated by a ConfigMap)
Name: coredns
Optional: false
coredns-token-xvq8b:
Type: Secret (a volume populated by a Secret)
SecretName: coredns-token-xvq8b
Optional: false
QoS Class: Burstable
Node-Selectors: <none>
Tolerations: CriticalAddonsOnly
node-role.kubernetes.io/master:NoSchedule
node.kubernetes.io/not-ready:NoExecute for 300s
node.kubernetes.io/unreachable:NoExecute for 300s
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal Killing 44m (x12 over 4h18m) kubelet, suey.nknwn.local Killing container with id docker://coredns:Container failed liveness probe.. Container will be killed and recreated.
Warning BackOff 4m58s (x170 over 4h9m) kubelet, suey.nknwn.local Back-off restarting failed container
Warning Unhealthy 8s (x102 over 4h19m) kubelet, suey.nknwn.local Liveness probe failed: HTTP probe failed with statuscode: 503
#
kubectl logs output:
# kubectl -n kube-system logs -f coredns-576cbf47c7-nm7gc
E1101 03:31:58.974836 1 reflector.go:205] github.com/coredns/coredns/plugin/kubernetes/controller.go:348: Failed to list *v1.Service: Get https://10.96.0.1:443/api/v1/services?limit=500&resourceVersion=0: dial tcp 10.96.0.1:443: i/o timeout
E1101 03:31:58.974836 1 reflector.go:205] github.com/coredns/coredns/plugin/kubernetes/controller.go:355: Failed to list *v1.Namespace: Get https://10.96.0.1:443/api/v1/namespaces?limit=500&resourceVersion=0: dial tcp 10.96.0.1:443: i/o timeout
E1101 03:31:58.974857 1 reflector.go:205] github.com/coredns/coredns/plugin/kubernetes/controller.go:350: Failed to list *v1.Endpoints: Get https://10.96.0.1:443/api/v1/endpoints?limit=500&resourceVersion=0: dial tcp 10.96.0.1:443: i/o timeout
E1101 03:32:29.975493 1 reflector.go:205] github.com/coredns/coredns/plugin/kubernetes/controller.go:348: Failed to list *v1.Service: Get https://10.96.0.1:443/api/v1/services?limit=500&resourceVersion=0: dial tcp 10.96.0.1:443: i/o timeout
E1101 03:32:29.976732 1 reflector.go:205] github.com/coredns/coredns/plugin/kubernetes/controller.go:355: Failed to list *v1.Namespace: Get https://10.96.0.1:443/api/v1/namespaces?limit=500&resourceVersion=0: dial tcp 10.96.0.1:443: i/o timeout
E1101 03:32:29.977788 1 reflector.go:205] github.com/coredns/coredns/plugin/kubernetes/controller.go:350: Failed to list *v1.Endpoints: Get https://10.96.0.1:443/api/v1/endpoints?limit=500&resourceVersion=0: dial tcp 10.96.0.1:443: i/o timeout
E1101 03:33:00.976164 1 reflector.go:205] github.com/coredns/coredns/plugin/kubernetes/controller.go:348: Failed to list *v1.Service: Get https://10.96.0.1:443/api/v1/services?limit=500&resourceVersion=0: dial tcp 10.96.0.1:443: i/o timeout
E1101 03:33:00.977415 1 reflector.go:205] github.com/coredns/coredns/plugin/kubernetes/controller.go:355: Failed to list *v1.Namespace: Get https://10.96.0.1:443/api/v1/namespaces?limit=500&resourceVersion=0: dial tcp 10.96.0.1:443: i/o timeout
E1101 03:33:00.978332 1 reflector.go:205] github.com/coredns/coredns/plugin/kubernetes/controller.go:350: Failed to list *v1.Endpoints: Get https://10.96.0.1:443/api/v1/endpoints?limit=500&resourceVersion=0: dial tcp 10.96.0.1:443: i/o timeout
2018/11/01 03:33:08 [INFO] SIGTERM: Shutting down servers then terminating
E1101 03:33:31.976864 1 reflector.go:205] github.com/coredns/coredns/plugin/kubernetes/controller.go:348: Failed to list *v1.Service: Get https://10.96.0.1:443/api/v1/services?limit=500&resourceVersion=0: dial tcp 10.96.0.1:443: i/o timeout
E1101 03:33:31.978080 1 reflector.go:205] github.com/coredns/coredns/plugin/kubernetes/controller.go:355: Failed to list *v1.Namespace: Get https://10.96.0.1:443/api/v1/namespaces?limit=500&resourceVersion=0: dial tcp 10.96.0.1:443: i/o timeout
E1101 03:33:31.979156 1 reflector.go:205] github.com/coredns/coredns/plugin/kubernetes/controller.go:350: Failed to list *v1.Endpoints: Get https://10.96.0.1:443/api/v1/endpoints?limit=500&resourceVersion=0: dial tcp 10.96.0.1:443: i/o timeout
#
# kubectl -n kube-system log -f coredns-576cbf47c7-gqdgd
.:53
2018/11/05 04:04:13 [INFO] CoreDNS-1.2.2
2018/11/05 04:04:13 [INFO] linux/amd64, go1.11, eb51e8b
CoreDNS-1.2.2
linux/amd64, go1.11, eb51e8b
2018/11/05 04:04:13 [INFO] plugin/reload: Running configuration MD5 = f65c4821c8a9b7b5eb30fa4fbc167769
2018/11/05 04:04:19 [FATAL] plugin/loop: Seen "HINFO IN 3597544515206064936.6415437575707023337." more than twice, loop detected
# kubectl -n kube-system log -f coredns-576cbf47c7-hhmws
.:53
2018/11/05 04:04:18 [INFO] CoreDNS-1.2.2
2018/11/05 04:04:18 [INFO] linux/amd64, go1.11, eb51e8b
CoreDNS-1.2.2
linux/amd64, go1.11, eb51e8b
2018/11/05 04:04:18 [INFO] plugin/reload: Running configuration MD5 = f65c4821c8a9b7b5eb30fa4fbc167769
2018/11/05 04:04:24 [FATAL] plugin/loop: Seen "HINFO IN 6900627972087569316.7905576541070882081." more than twice, loop detected
#
kubectl describe output (apiserver):
# kubectl -n kube-system describe pod kube-apiserver-suey.nknwn.local
Name: kube-apiserver-suey.nknwn.local
Namespace: kube-system
Priority: 2000000000
PriorityClassName: system-cluster-critical
Node: suey.nknwn.local/192.168.87.20
Start Time: Fri, 02 Nov 2018 00:28:44 -0400
Labels: component=kube-apiserver
tier=control-plane
Annotations: kubernetes.io/config.hash: 2433a531afe72165364aace3b746ea4c
kubernetes.io/config.mirror: 2433a531afe72165364aace3b746ea4c
kubernetes.io/config.seen: 2018-11-02T00:28:43.795663261-04:00
kubernetes.io/config.source: file
scheduler.alpha.kubernetes.io/critical-pod:
Status: Running
IP: 192.168.87.20
Containers:
kube-apiserver:
Container ID: docker://659456385a1a859f078d36f4d1b91db9143d228b3bc5b3947a09460a39ce41fc
Image: k8s.gcr.io/kube-apiserver:v1.12.2
Image ID: docker-pullable://k8s.gcr.io/kube-apiserver@sha256:094929baf3a7681945d83a7654b3248e586b20506e28526121f50eb359cee44f
Port: <none>
Host Port: <none>
Command:
kube-apiserver
--authorization-mode=Node,RBAC
--advertise-address=192.168.87.20
--allow-privileged=true
--client-ca-file=/etc/kubernetes/pki/ca.crt
--enable-admission-plugins=NodeRestriction
--enable-bootstrap-token-auth=true
--etcd-cafile=/etc/kubernetes/pki/etcd/ca.crt
--etcd-certfile=/etc/kubernetes/pki/apiserver-etcd-client.crt
--etcd-keyfile=/etc/kubernetes/pki/apiserver-etcd-client.key
--etcd-servers=https://127.0.0.1:2379
--insecure-port=0
--kubelet-client-certificate=/etc/kubernetes/pki/apiserver-kubelet-client.crt
--kubelet-client-key=/etc/kubernetes/pki/apiserver-kubelet-client.key
--kubelet-preferred-address-types=InternalIP,ExternalIP,Hostname
--proxy-client-cert-file=/etc/kubernetes/pki/front-proxy-client.crt
--proxy-client-key-file=/etc/kubernetes/pki/front-proxy-client.key
--requestheader-allowed-names=front-proxy-client
--requestheader-client-ca-file=/etc/kubernetes/pki/front-proxy-ca.crt
--requestheader-extra-headers-prefix=X-Remote-Extra-
--requestheader-group-headers=X-Remote-Group
--requestheader-username-headers=X-Remote-User
--secure-port=6443
--service-account-key-file=/etc/kubernetes/pki/sa.pub
--service-cluster-ip-range=10.96.0.0/12
--tls-cert-file=/etc/kubernetes/pki/apiserver.crt
--tls-private-key-file=/etc/kubernetes/pki/apiserver.key
State: Running
Started: Sun, 04 Nov 2018 22:57:27 -0500
Last State: Terminated
Reason: Completed
Exit Code: 0
Started: Sun, 04 Nov 2018 20:12:06 -0500
Finished: Sun, 04 Nov 2018 22:55:24 -0500
Ready: True
Restart Count: 2
Requests:
cpu: 250m
Liveness: http-get https://192.168.87.20:6443/healthz delay=15s timeout=15s period=10s #success=1 #failure=8
Environment: <none>
Mounts:
/etc/ca-certificates from etc-ca-certificates (ro)
/etc/kubernetes/pki from k8s-certs (ro)
/etc/ssl/certs from ca-certs (ro)
/usr/local/share/ca-certificates from usr-local-share-ca-certificates (ro)
/usr/share/ca-certificates from usr-share-ca-certificates (ro)
Conditions:
Type Status
Initialized True
Ready True
ContainersReady True
PodScheduled True
Volumes:
etc-ca-certificates:
Type: HostPath (bare host directory volume)
Path: /etc/ca-certificates
HostPathType: DirectoryOrCreate
k8s-certs:
Type: HostPath (bare host directory volume)
Path: /etc/kubernetes/pki
HostPathType: DirectoryOrCreate
ca-certs:
Type: HostPath (bare host directory volume)
Path: /etc/ssl/certs
HostPathType: DirectoryOrCreate
usr-share-ca-certificates:
Type: HostPath (bare host directory volume)
Path: /usr/share/ca-certificates
HostPathType: DirectoryOrCreate
usr-local-share-ca-certificates:
Type: HostPath (bare host directory volume)
Path: /usr/local/share/ca-certificates
HostPathType: DirectoryOrCreate
QoS Class: Burstable
Node-Selectors: <none>
Tolerations: :NoExecute
Events: <none>
#
syslog (host):
Nov 4 22:59:36 suey kubelet[1234]: E1104 22:59:36.139538 1234 pod_workers.go:186] Error syncing pod d8146b7e-de57-11e8-a1e2-ec8eb57434c8 ("coredns-576cbf47c7-hhmws_kube-system(d8146b7e-de57-11e8-a1e2-ec8eb57434c8)"), skipping: failed to "StartContainer" for "coredns" with CrashLoopBackOff: "Back-off 40s restarting failed container=coredns pod=coredns-576cbf47c7-hhmws_kube-system(d8146b7e-de57-11e8-a1e2-ec8eb57434c8)"
Please advise.
docker kubernetes kubectl kubeadm coredns
attach the full output pls
– Konstantin Vustin
Oct 31 '18 at 3:49
Also try kubectl logs -f coredns-576cbf47c7-nm7gc
– Andre Helberg
Oct 31 '18 at 5:50
@AndreHelberg I updated my question with output from the kubectl logs command. I'm not sure what this 10.96.0.1:443 is ...
– alexus
Nov 1 '18 at 3:34
@KonstantinVustin I updated my question with full output as well.
– alexus
Nov 1 '18 at 3:38
@alexus It looks like you are trying to set up a cluster from scratch? I haven't done this before, so my input might not be of much help, but from the log you pasted, it seems your pod is trying to connect to 10.96.0.1:443. I'd suggest verifying that whatever should be there is up. I'm guessing it is kube-apiserver-suey.nknwn.local. I would look at: whether these nodes are on the same network; whether the IP is correct; whether something is listening on 443; the service (logs) listening on 443, since it might be a cert/auth issue or a timeout; and whether ports are being blocked.
– Andre Helberg
Nov 1 '18 at 8:46
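To follow up on that suggestion: 10.96.0.1 is the cluster-internal kubernetes Service (the ClusterIP that fronts the API server on port 443). A rough sketch of how to sanity-check it:
# the default "kubernetes" Service should show CLUSTER-IP 10.96.0.1 and port 443
kubectl get svc kubernetes -n default
# its endpoints should point at the API server's real node IP on port 6443
kubectl get endpoints kubernetes -n default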
2 Answers
This error:
[FATAL] plugin/loop: Seen "HINFO IN 6900627972087569316.7905576541070882081." more than twice, loop detected
occurs when CoreDNS detects a loop in the resolver configuration; the crash is intended behavior. You are hitting this known issue:
https://github.com/kubernetes/kubeadm/issues/1162
https://github.com/coredns/coredns/issues/2087
Hacky solution: Disable the CoreDNS loop detection
Edit the CoreDNS configmap:
kubectl -n kube-system edit configmap coredns
Remove or comment out the line containing loop, then save and exit.
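For orientation, a kubeadm-generated Corefile looks roughly like the sketch below (an illustration only; your exact defaults may differ). The loop line is the one to comment out:
.:53 {
    errors
    health
    kubernetes cluster.local in-addr.arpa ip6.arpa {
       pods insecure
       upstream
       fallthrough in-addr.arpa ip6.arpa
    }
    prometheus :9153
    proxy . /etc/resolv.conf
    cache 30
    # loop   <- commented out to disable loop detection
    reload
    loadbalance
}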
Then remove the CoreDNS pods so new ones can be created with the new config:
kubectl -n kube-system delete pod -l k8s-app=kube-dns
All should be fine after that.
Preferred Solution: Remove the loop in the DNS configuration
First, check whether you are using systemd-resolved. If you are running Ubuntu 18.04, this is probably the case.
systemctl list-unit-files | grep enabled | grep systemd-resolved
If it is, check which resolv.conf
file your cluster is using as reference:
ps auxww | grep kubelet
You might see a line like:
/usr/bin/kubelet ... --resolv-conf=/run/systemd/resolve/resolv.conf
The important part is the --resolv-conf flag: it tells you whether the systemd-managed resolv.conf is being used.
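If the flag does not show up in the process list, the kubelet may be picking it up from a config file instead; a hedged way to check on kubeadm installs (paths vary by distro and version):
# look for a resolv-conf / resolvConf setting in the usual kubelet config locations
grep -R resolv /var/lib/kubelet/kubeadm-flags.env /var/lib/kubelet/config.yaml /etc/default/kubelet 2>/dev/null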
If the kubelet is pointed at systemd's resolv.conf, do the following:
Check the content of /run/systemd/resolve/resolv.conf
to see if there is a record like:
nameserver 127.0.0.1
If there is a 127.0.0.1 entry, it is the one causing the loop.
To get rid of it, do not edit that file directly; instead, fix the other places from which it is generated.
Check all files under /etc/systemd/network and, if you find a record like
DNS=127.0.0.1
delete that record. Also check /etc/systemd/resolved.conf and do the same if needed. Make sure you have at least one or two DNS servers configured, such as
DNS=1.1.1.1 1.0.0.1
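For example, the relevant part of /etc/systemd/resolved.conf could end up looking like this sketch (the addresses here are just Cloudflare's public resolvers; substitute your own if you prefer):
[Resolve]
DNS=1.1.1.1 1.0.0.1
# any loopback entry such as DNS=127.0.0.1 should be removed or stay commented out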
After doing all that, restart the systemd services to put your changes into effect:
systemctl restart systemd-networkd systemd-resolved
After that, verify that the 127.0.0.1 nameserver entry is no longer present in the resolv.conf file:
cat /run/systemd/resolve/resolv.conf
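If the cleanup worked, the output should list only real upstream servers, for example (illustrative):
nameserver 1.1.1.1
nameserver 1.0.0.1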
Finally, trigger re-creation of the DNS pods:
kubectl -n kube-system delete pod -l k8s-app=kube-dns
Summary: The solution involves getting rid of what looks like a DNS lookup loop from the host DNS configuration. Steps vary between different resolv.conf managers/implementations.
The "hacky solution" did the trick, yet as for the "proper solution" (preferred over "hacky") - systemd-resolved.service is inactive (dead) and also disabled
– alexus
Nov 23 '18 at 19:59
@alexus Glad that it worked out. The preferred solution assumes that your system uses systemd plus its DNS resolver service. If that's not the case, you can investigate and find out where the nameservers are read from and how they are populated/added, as I mentioned in the summary. The reason the 1st solution is hacky is the following: loop detection in CoreDNS is necessary, so the crash is the expected behavior. Ideally, the loop itself needs to be removed/fixed.
– Utku Özdemir
Nov 25 '18 at 1:56
@UtkuÖzdemir I am using ubuntu 16.04 and I don't have the systemd resolv.conf. The contents of /etc/resolv.conf are nameserver 127.0.1.1 search APSDC.local. I am getting the same CoreDNS CrashLoopBackOff. Is it because of the 127.0.1.1 IP? Can you please suggest a good solution? stackoverflow.com/questions/54466359/…
– S Andrew
Feb 1 at 3:17
Here's some shell hackery that automates Utku's answer:
# remove loop from DNS config files
sudo find /etc/systemd/network /etc/systemd/resolved.conf -type f \
  -exec sed -i '/^DNS=127.0.0.1/d' {} +
# if necessary, configure some DNS servers (use Cloudflare public resolvers)
if ! grep -q '^DNS=.*' /etc/systemd/resolved.conf; then
  sudo sed -i '$aDNS=1.1.1.1 1.0.0.1' /etc/systemd/resolved.conf
fi
# restart systemd services
sudo systemctl restart systemd-networkd systemd-resolved
# force (re-) creation of the dns pods
kubectl -n kube-system delete pod -l k8s-app=kube-dns
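A quick follow-up to confirm the script had the intended effect (a sketch; systemd-resolve is the older tool name, resolvectl the newer one):
# check which upstream DNS servers the resolver now uses
systemd-resolve --status 2>/dev/null || resolvectl status
# watch the recreated CoreDNS pods come back up
kubectl -n kube-system get pods -l k8s-app=kube-dns -w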
I am still getting crashloopbackoff after executing ./shellhackery.sh on Ubuntu 16.04
– Aravind Murthy
Dec 2 '18 at 8:02
@VixZeke, what do the container logs say?
– rubicks
Dec 2 '18 at 15:55