kube-dns stopped coming up and was stuck in CrashLoopBackOff


Suddenly, name resolution for various services stopped working and Pods started throwing errors. Watching in the GKE Console, the errors cleared after a while, but kube-dns alone stayed in CrashLoopBackOff and its status showed "Does not have minimum availability".

What I investigated

kube-dns is defined to run as a Deployment with replicas=2 in the kube-system namespace.
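The Deployment itself can be checked with something like the following (a sketch; on GKE the actual replica count is adjusted by kube-dns-autoscaler):

# kubectl -n kube-system get deployment kube-dns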

# kubectl -n kube-system get pod -o wide | grep kube-dns

4/4 Running 0 45m 10.4.0.132 gke-pn-cluster-2-pn-pool-1-efba661f-pz54 <none>
kube-dns-76dbb796c5-9gljt 3/4 CrashLoopBackOff 13 45m 10.4.0.134 gke-pn-cluster-2-pn-pool-1-efba661f-pz54 <none>
kube-dns-autoscaler-67c97c87fb-rj4fq 1/1 Running 0 45m 10.4.0.135 gke-pn-cluster-2-pn-pool-1-efba661f-pz54 <none>

Looking at this Pod's definition, it contains four containers: kubedns, dnsmasq, sidecar, and prometheus-to-sd.
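The container names can also be listed directly (a sketch):

# kubectl -n kube-system get pod kube-dns-76dbb796c5-9gljt -o jsonpath='{.spec.containers[*].name}'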

# kubectl -n kube-system describe pod kube-dns-76dbb796c5-9gljt

Name: kube-dns-76dbb796c5-9gljt
Namespace: kube-system
Priority: 2000000000
PriorityClassName: system-cluster-critical
Node: gke-pn-cluster-2-pn-pool-1-efba661f-pz54/10.146.0.2
Start Time: Sun, 05 May 2019 14:46:19 +0900
Labels: k8s-app=kube-dns
pod-template-hash=3286635271
Annotations: scheduler.alpha.kubernetes.io/critical-pod=
seccomp.security.alpha.kubernetes.io/pod=docker/default
Status: Running
IP: 10.4.0.134
Controlled By: ReplicaSet/kube-dns-76dbb796c5
Containers:
kubedns:
Container ID: docker://2e341ab157aee24b63d95eefb4da434c79306229055d135abf6b730708589d68
Image: k8s.gcr.io/k8s-dns-kube-dns-amd64:1.14.13
Image ID: docker-pullable://k8s.gcr.io/k8s-dns-kube-dns-amd64@sha256:618a82fa66cf0c75e4753369a6999032372be7308866fc9afb381789b1e5ad52
Ports: 10053/UDP, 10053/TCP, 10055/TCP
Host Ports: 0/UDP, 0/TCP, 0/TCP
Args:
--domain=cluster.local.
--dns-port=10053
--config-dir=/kube-dns-config
--v=2
State: Running
Started: Sun, 05 May 2019 14:46:42 +0900
Ready: True
Restart Count: 0
Limits:
memory: 170Mi
Requests:
cpu: 100m
memory: 70Mi
Liveness: http-get http://:10054/healthcheck/kubedns delay=60s timeout=5s period=10s #success=1 #failure=5
Readiness: http-get http://:8081/readiness delay=3s timeout=5s period=10s #success=1 #failure=3
Environment:
PROMETHEUS_PORT: 10055
Mounts:
/kube-dns-config from kube-dns-config (rw)
/var/run/secrets/kubernetes.io/serviceaccount from kube-dns-token-jdksn (ro)
dnsmasq:
Container ID: docker://5cc16055a401b91bd15ba6507c9f2b7b4e4b20647496746d978cb211e1a0555d
Image: k8s.gcr.io/k8s-dns-dnsmasq-nanny-amd64:1.14.13
Image ID: docker-pullable://k8s.gcr.io/k8s-dns-dnsmasq-nanny-amd64@sha256:45df3e8e0c551bd0c79cdba48ae6677f817971dcbd1eeed7fd1f9a35118410e4
Ports: 53/UDP, 53/TCP
Host Ports: 0/UDP, 0/TCP
Args:
-v=2
-logtostderr
-configDir=/etc/k8s/dns/dnsmasq-nanny
-restartDnsmasq=true
--
-k
--cache-size=1000
--no-negcache
--log-facility=-
--server=/cluster.local/127.0.0.1#10053
--server=/in-addr.arpa/127.0.0.1#10053
--server=/ip6.arpa/127.0.0.1#10053
State: Waiting
Reason: CrashLoopBackOff
Last State: Terminated
Reason: Error
Exit Code: 255
Started: Sun, 05 May 2019 15:28:15 +0900
Finished: Sun, 05 May 2019 15:28:16 +0900
Ready: False
Restart Count: 13
Requests:
cpu: 150m
memory: 20Mi
Liveness: http-get http://:10054/healthcheck/dnsmasq delay=60s timeout=5s period=10s #success=1 #failure=5
Environment: <none>
Mounts:
/etc/k8s/dns/dnsmasq-nanny from kube-dns-config (rw)
/var/run/secrets/kubernetes.io/serviceaccount from kube-dns-token-jdksn (ro)
sidecar:
Container ID: docker://f8c87600c704dd709d27c518d0b3ce20d944608be16f1436442970454715977a
Image: k8s.gcr.io/k8s-dns-sidecar-amd64:1.14.13
Image ID: docker-pullable://k8s.gcr.io/k8s-dns-sidecar-amd64@sha256:cedc8fe2098dffc26d17f64061296b7aa54258a31513b6c52df271a98bb522b3
Port: 10054/TCP
Host Port: 0/TCP
Args:
--v=2
--logtostderr
--probe=kubedns,127.0.0.1:10053,kubernetes.default.svc.cluster.local,5,SRV
--probe=dnsmasq,127.0.0.1:53,kubernetes.default.svc.cluster.local,5,SRV
State: Running
Started: Sun, 05 May 2019 14:46:50 +0900
Ready: True
Restart Count: 0
Requests:
cpu: 10m
memory: 20Mi
Liveness: http-get http://:10054/metrics delay=60s timeout=5s period=10s #success=1 #failure=5
Environment: <none>
Mounts:
/var/run/secrets/kubernetes.io/serviceaccount from kube-dns-token-jdksn (ro)
prometheus-to-sd:
Container ID: docker://82d84415443253693be30368ed14555ff24ba21844ae607abf990c369008f70e
Image: k8s.gcr.io/prometheus-to-sd:v0.4.2
Image ID: docker-pullable://gcr.io/google-containers/prometheus-to-sd@sha256:aca8ef83a7fae83f1f8583e978dd4d1ff655b9f2ca0a76bda5edce6d8965bdf2
Port: <none>
Host Port: <none>
Command:
/monitor
--source=kubedns:http://localhost:10054?whitelisted=probe_kubedns_latency_ms,probe_kubedns_errors,dnsmasq_misses,dnsmasq_hits
--stackdriver-prefix=container.googleapis.com/internal/addons
--api-override=https://monitoring.googleapis.com/
--pod-id=$(POD_NAME)
--namespace-id=$(POD_NAMESPACE)
--v=2
State: Running
Started: Sun, 05 May 2019 14:46:51 +0900
Ready: True
Restart Count: 0
Environment:
POD_NAME: kube-dns-76dbb796c5-9gljt (v1:metadata.name)
POD_NAMESPACE: kube-system (v1:metadata.namespace)
Mounts:
/var/run/secrets/kubernetes.io/serviceaccount from kube-dns-token-jdksn (ro)
Conditions:
Type Status
Initialized True
Ready False
PodScheduled True
Volumes:
kube-dns-config:
Type: ConfigMap (a volume populated by a ConfigMap)
Name: kube-dns
Optional: true
kube-dns-token-jdksn:
Type: Secret (a volume populated by a Secret)
SecretName: kube-dns-token-jdksn
Optional: false
QoS Class: Burstable
Node-Selectors: <none>
Tolerations: CriticalAddonsOnly
node.kubernetes.io/not-ready:NoExecute for 300s
node.kubernetes.io/unreachable:NoExecute for 300s
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal Scheduled 46m default-scheduler Successfully assigned kube-system/kube-dns-76dbb796c5-9gljt to gke-pn-cluster-2-pn-pool-1-efba661f-pz54
Normal SuccessfulMountVolume 46m kubelet, gke-pn-cluster-2-pn-pool-1-efba661f-pz54 MountVolume.SetUp succeeded for volume "kube-dns-config"
Normal SuccessfulMountVolume 46m kubelet, gke-pn-cluster-2-pn-pool-1-efba661f-pz54 MountVolume.SetUp succeeded for volume "kube-dns-token-jdksn"
Normal Pulling 46m kubelet, gke-pn-cluster-2-pn-pool-1-efba661f-pz54 pulling image "k8s.gcr.io/k8s-dns-kube-dns-amd64:1.14.13"
Normal Pulled 46m kubelet, gke-pn-cluster-2-pn-pool-1-efba661f-pz54 Successfully pulled image "k8s.gcr.io/k8s-dns-kube-dns-amd64:1.14.13"
Normal Created 46m kubelet, gke-pn-cluster-2-pn-pool-1-efba661f-pz54 Created container
Normal Started 46m kubelet, gke-pn-cluster-2-pn-pool-1-efba661f-pz54 Started container
Normal Pulling 46m kubelet, gke-pn-cluster-2-pn-pool-1-efba661f-pz54 pulling image "k8s.gcr.io/k8s-dns-dnsmasq-nanny-amd64:1.14.13"
Normal Pulled 46m kubelet, gke-pn-cluster-2-pn-pool-1-efba661f-pz54 Successfully pulled image "k8s.gcr.io/k8s-dns-dnsmasq-nanny-amd64:1.14.13"
Normal Pulling 46m kubelet, gke-pn-cluster-2-pn-pool-1-efba661f-pz54 pulling image "k8s.gcr.io/k8s-dns-sidecar-amd64:1.14.13"
Normal Created 45m kubelet, gke-pn-cluster-2-pn-pool-1-efba661f-pz54 Created container
Normal Pulled 45m kubelet, gke-pn-cluster-2-pn-pool-1-efba661f-pz54 Successfully pulled image "k8s.gcr.io/k8s-dns-sidecar-amd64:1.14.13"
Normal Pulling 45m kubelet, gke-pn-cluster-2-pn-pool-1-efba661f-pz54 pulling image "k8s.gcr.io/prometheus-to-sd:v0.4.2"
Normal Started 45m kubelet, gke-pn-cluster-2-pn-pool-1-efba661f-pz54 Started container
Normal Created 45m kubelet, gke-pn-cluster-2-pn-pool-1-efba661f-pz54 Created container
Normal Started 45m kubelet, gke-pn-cluster-2-pn-pool-1-efba661f-pz54 Started container
Normal Pulled 45m kubelet, gke-pn-cluster-2-pn-pool-1-efba661f-pz54 Successfully pulled image "k8s.gcr.io/prometheus-to-sd:v0.4.2"
Normal Created 45m (x2 over 46m) kubelet, gke-pn-cluster-2-pn-pool-1-efba661f-pz54 Created container
Normal Started 45m (x2 over 46m) kubelet, gke-pn-cluster-2-pn-pool-1-efba661f-pz54 Started container
Normal Pulled 45m (x2 over 45m) kubelet, gke-pn-cluster-2-pn-pool-1-efba661f-pz54 Container image "k8s.gcr.io/k8s-dns-dnsmasq-nanny-amd64:1.14.13" already present on machine
Warning BackOff 1m (x211 over 45m) kubelet, gke-pn-cluster-2-pn-pool-1-efba661f-pz54 Back-off restarting failed container

Checking the logs, it appeared to be starting normally (or so it looked at the time).

# kubectl -n kube-system logs kube-dns-76dbb796c5-9gljt -c kubedns

I0505 05:46:42.318712 1 dns.go:48] version: 1.14.13
I0505 05:46:42.322157 1 server.go:69] Using configuration read from directory: /kube-dns-config with period 10s
I0505 05:46:42.322453 1 server.go:121] FLAG: --alsologtostderr="false"
I0505 05:46:42.322508 1 server.go:121] FLAG: --config-dir="/kube-dns-config"
I0505 05:46:42.322517 1 server.go:121] FLAG: --config-map=""
I0505 05:46:42.322636 1 server.go:121] FLAG: --config-map-namespace="kube-system"
I0505 05:46:42.322718 1 server.go:121] FLAG: --config-period="10s"
I0505 05:46:42.322739 1 server.go:121] FLAG: --dns-bind-address="0.0.0.0"
I0505 05:46:42.322746 1 server.go:121] FLAG: --dns-port="10053"
I0505 05:46:42.322756 1 server.go:121] FLAG: --domain="cluster.local."
I0505 05:46:42.322856 1 server.go:121] FLAG: --federations=""
I0505 05:46:42.322878 1 server.go:121] FLAG: --healthz-port="8081"
I0505 05:46:42.322886 1 server.go:121] FLAG: --initial-sync-timeout="1m0s"
I0505 05:46:42.322895 1 server.go:121] FLAG: --kube-master-url=""
I0505 05:46:42.322983 1 server.go:121] FLAG: --kubecfg-file=""
I0505 05:46:42.323003 1 server.go:121] FLAG: --log-backtrace-at=":0"
I0505 05:46:42.323016 1 server.go:121] FLAG: --log-dir=""
I0505 05:46:42.323099 1 server.go:121] FLAG: --log-flush-frequency="5s"
I0505 05:46:42.323117 1 server.go:121] FLAG: --logtostderr="true"
I0505 05:46:42.323123 1 server.go:121] FLAG: --nameservers=""
I0505 05:46:42.323129 1 server.go:121] FLAG: --stderrthreshold="2"
I0505 05:46:42.323137 1 server.go:121] FLAG: --v="2"
I0505 05:46:42.323144 1 server.go:121] FLAG: --version="false"
I0505 05:46:42.323253 1 server.go:121] FLAG: --vmodule=""
I0505 05:46:42.323511 1 server.go:169] Starting SkyDNS server (0.0.0.0:10053)
I0505 05:46:42.331087 1 server.go:179] Skydns metrics enabled (/metrics:10055)
I0505 05:46:42.331168 1 dns.go:188] Starting endpointsController
I0505 05:46:42.331375 1 dns.go:191] Starting serviceController
I0505 05:46:42.331754 1 dns.go:184] Configuration updated: {TypeMeta:{Kind: APIVersion:} Federations:map[] StubDomains:map[] UpstreamNameservers:[]}
I0505 05:46:42.335926 1 logs.go:41] skydns: ready for queries on cluster.local. for tcp://0.0.0.0:10053 [rcache 0]
I0505 05:46:42.337525 1 logs.go:41] skydns: ready for queries on cluster.local. for udp://0.0.0.0:10053 [rcache 0]
I0505 05:46:42.835211 1 dns.go:222] Initialized services and endpoints from apiserver
I0505 05:46:42.835268 1 server.go:137] Setting up Healthz Handler (/readiness)
I0505 05:46:42.835298 1 server.go:142] Setting up cache handler (/cache)
I0505 05:46:42.835317 1 server.go:128] Status HTTP port 8081

I exec into the container and check a few things.
I try hitting the liveness endpoint by hand; curl is not installed in this image, so I use wget. This is where the dnsmasq error showed up.

# kubectl exec -n kube-system -c kubedns -it kube-dns-76dbb796c5-9gljt sh

wget -O - http://localhost:10054/healthcheck/dnsmasq
Connecting to localhost:10054 (127.0.0.1:10054)
wget: server returned error: HTTP/1.1 503 Service Unavailable
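For comparison, the kubedns health check served on the same port (see the Liveness probes above) can be queried the same way; this extra check is illustrative and was not part of the original session:

wget -O - http://localhost:10054/healthcheck/kubedns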

The dnsmasq container cannot be exec'd into, so this looks like the one that is falling over.

kubectl exec -n kube-system -c dnsmasq -it kube-dns-76dbb796c5-9gljt sh
# kubectl -n kube-system logs kube-dns-76dbb796c5-9gljt -c dnsmasq

I0505 06:38:30.432185 1 main.go:74] opts: {{/usr/sbin/dnsmasq [-k --cache-size=1000 --no-negcache --log-facility=- --server=/cluster.local/127.0.0.1#10053 --server=/in-addr.arpa/127.0.0.1#10053 --server=/ip6.arpa/127.0.0.1#10053] true} /etc/k8s/dns/dnsmasq-nanny 10000000000}
I0505 06:38:30.432498 1 nanny.go:94] Starting dnsmasq [-k --cache-size=1000 --no-negcache --log-facility=- --server=/cluster.local/127.0.0.1#10053 --server=/in-addr.arpa/127.0.0.1#10053 --server=/ip6.arpa/127.0.0.1#10053]
I0505 06:38:30.992206 1 nanny.go:119]
W0505 06:38:30.992278 1 nanny.go:120] Got EOF from stdout
I0505 06:38:30.992242 1 nanny.go:116]
I0505 06:38:30.992337 1 nanny.go:116] dnsmasq: failed to create inotify: No file descriptors available
I0505 06:38:30.992385 1 nanny.go:123]
E0505 06:38:30.992407 1 nanny.go:124] Error reading from stderr: read |0: file already closed
F0505 06:38:30.992583 1 nanny.go:190] dnsmasq exited: exit status 5

The line dnsmasq: failed to create inotify: No file descriptors available looks very suspicious.

# cat /proc/sys/fs/inotify/max_user_instances
128
# cat /proc/sys/fs/inotify/max_user_watches
8192
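To see how many inotify instances are actually in use on the node (each symlink under /proc/*/fd pointing at anon_inode:inotify is one instance), and to raise the limit temporarily, something like the following can be run on the VM. This is only a sketch; the value 512 is illustrative, and raising the limit was not the fix used here.

find /proc/*/fd -lname anon_inode:inotify 2>/dev/null | wc -l
sudo sysctl fs.inotify.max_user_instances=512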

There are two nodes (two GCE VMs); looking at the VM logs, I found the following on one of them:

sudo journalctl -f
Failed to get journal fd: Too many open files

I pulled the container arguments out of the YAML definition and ran it directly on the VMs: it started on one VM but not on the other, and on the failing VM the journal showed the log above.

docker run -it --rm k8s.gcr.io/k8s-dns-dnsmasq-nanny-amd64:1.14.13 \
-v=2 \
-logtostderr \
-configDir=/etc/k8s/dns/dnsmasq-nanny \
-restartDnsmasq=true \
-- \
-k \
--cache-size=1000 \
--no-negcache \
--log-facility=- \
--server=/cluster.local/127.0.0.1#10053 \
--server=/in-addr.arpa/127.0.0.1#10053 \
--server=/ip6.arpa/127.0.0.1#10053

Fix

I considered options such as creating a new node pool and moving everything over, or increasing the node count in the pool and draining the affected node. This time, I decided to look at the file descriptors being consumed on the VM and kill or restart the suspicious processes instead:
deleting Pods that were no longer used, restarting Pods that appeared to be leaking resources, and so on.
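A rough way to spot FD-hungry processes on the VM, and to recreate a Pod by deleting it, looks like this (a sketch; <namespace> and <pod-name> are placeholders):

for p in /proc/[0-9]*; do echo "$(ls "$p"/fd 2>/dev/null | wc -l) $(cat "$p"/comm 2>/dev/null)"; done | sort -rn | head
kubectl -n <namespace> delete pod <pod-name>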

After doing this, things more or less recovered.

Other notes

When creating the node pool on GKE I had chosen COS rather than Ubuntu, and lsof was not available. After entering the toolbox and running apt-get update && apt-get install lsof, it became usable.
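The steps look roughly like this (a sketch; the first command is run on the COS node, the rest inside the toolbox shell):

toolbox
apt-get update && apt-get install lsof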

ref: https://cloud.google.com/container-optimized-os/docs/how-to/toolbox

References