
Kubernetes - How to Debug CrashLoopBackOff in a Container

David Giffin
January 26th, 2021 · 3 min read

If you’ve used Kubernetes (k8s), you’ve probably bumped into the dreaded CrashLoopBackOff. A CrashLoopBackOff can result from several kinds of k8s misconfiguration (a pod that can’t connect to its persistent volumes, a misconfigured init container, and so on). We aren’t going to cover how to configure k8s properly in this article; instead, we’ll focus on the harder problem of debugging your code or, even worse, someone else’s code 😱

Here is the output from kubectl describe pod for a CrashLoopBackOff:

Name:           frontend-5c49b595fc-sjzkg
Namespace:      tedbf02-ac-david-nginx-golang-tmcclung-nginx-golang
Priority:       0
Start Time:     Wed, 23 Dec 2020 14:55:49 -0500
Labels:         app=frontend
                pod-template-hash=5c49b595fc
                tier=frontend
Status:         Running
IPs:            <none>
Controlled By:  ReplicaSet/frontend-5c49b595fc
Containers:
  frontend:
    Container ID:   docker://a4ed7efcaaa87fe36342cf7532ff1de5cd51b62d3d681dfb9857999300f6c587
    Image:          .amazonaws.com/tommyrelease/awesome-compose/frontend@sha256:dfd762c
    Image ID:       docker-pullable://.amazonaws.com/tommyrelease/awesome-compose/frontend@sha256:dfd762c
    Port:           80/TCP
    Host Port:      0/TCP
    State:          Waiting
      Reason:       CrashLoopBackOff
    Last State:     Terminated
      Reason:       Error
      Exit Code:    1
      Started:      Sun, 24 Jan 2021 20:25:26 -0500
      Finished:     Sun, 24 Jan 2021 20:25:26 -0500
    Ready:          False
    Restart Count:  9043

Two common problems when starting a container are “OCI runtime create failed”, which typically means you are referencing a binary or script that doesn’t exist in the container, and a container status of “Completed” or “Error”, both of which mean that the process in the container exited instead of starting a service and staying running.

Here’s an example of an OCI runtime error, trying to execute: “hello crashloop”:

    Port:           80/TCP
    Host Port:      0/TCP
    Command:
      hello
      crashloop
    State:          Waiting
      Reason:       CrashLoopBackOff
    Last State:     Terminated
      Reason:       ContainerCannotRun
      Message:      OCI runtime create failed: container_linux.go:370: starting container process caused: exec: "hello": executable file not found in $PATH: unknown
      Exit Code:    127
      Started:      Mon, 25 Jan 2021 22:20:04 -0500
      Finished:     Mon, 25 Jan 2021 22:20:04 -0500

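K8s gives you the exit status of the process in the container when you look at a pod using kubectl or k9s. Exit statuses chosen by the application itself usually fall in the range 1-125, and a Unix command’s man page typically documents what its exit codes mean. Codes 126 and 127 conventionally mean that the command could not be executed or could not be found, as in the example above. Codes above 128 mean that the process was terminated by a signal: the code is 128 plus the signal number. Exit code 137 (128 + 9, SIGKILL) most often means that the container hit its memory limit and was killed for you, in which case the Reason is usually reported as OOMKilled.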
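The convention above can be sketched as a small shell helper. This is only an illustration: `explain_exit_code` is a hypothetical name, not part of kubectl or k9s, and the messages describe common shell conventions rather than anything Kubernetes guarantees.

```shell
# Hypothetical helper (not part of kubectl): describe a container exit code
# using the usual Unix shell conventions.
explain_exit_code() {
  code="$1"
  if [ "$code" -eq 0 ]; then
    echo "exit 0: process completed successfully"
  elif [ "$code" -eq 126 ]; then
    echo "exit 126: command found but not executable"
  elif [ "$code" -eq 127 ]; then
    echo "exit 127: command not found"
  elif [ "$code" -gt 128 ]; then
    sig=$((code - 128))
    # kill -l maps a signal number to its name; fall back if it is unknown.
    name=$(kill -l "$sig" 2>/dev/null || echo "unknown")
    echo "exit $code: terminated by signal $sig ($name)"
  elif [ "$code" -ge 1 ] && [ "$code" -le 125 ]; then
    echo "exit $code: application-defined error"
  else
    echo "exit $code: no conventional meaning"
  fi
}

explain_exit_code 1
explain_exit_code 137
```

For 137 the helper reports signal 9 (KILL); whether that kill came from the memory limit or from something else still has to be confirmed from the Reason field in kubectl describe pod.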

Here is the output from kubectl describe pod, showing the container exit code:

    Last State:     Terminated
      Reason:       Error
      Exit Code:    1
      Started:      Sun, 24 Jan 2021 20:25:26 -0500
      Finished:     Sun, 24 Jan 2021 20:25:26 -0500
    Ready:          False
    Restart Count:  9043
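If you only want the termination details rather than the full describe output, something like the following sketch may help. The function names are hypothetical, and the sketch assumes the crashing container is the first one listed in the pod; adjust the `[0]` index otherwise.

```shell
# Hypothetical helpers; the caller passes the pod name and namespace.
# Assumption: the crashing container is the first container in the pod.

# Print the reason and exit code of the previous container run.
last_termination() {
  kubectl get pod "$1" -n "$2" -o jsonpath='{.status.containerStatuses[0].lastState.terminated.reason}{" "}{.status.containerStatuses[0].lastState.terminated.exitCode}{"\n"}'
}

# Print the logs written by the previous (crashed) container run.
previous_logs() {
  kubectl logs "$1" -n "$2" --previous
}

# Usage (placeholders, replace with your own pod and namespace):
#   last_termination <pod-name> <namespace>
#   previous_logs <pod-name> <namespace>
```

The `--previous` flag matters here because the current container instance in a crash loop has often not written anything yet.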

Not All Containers Are Created Equal

Docker allows you to define an Entrypoint and a Cmd, which you can mix and match in a Dockerfile. Entrypoint is the executable, and Cmd holds the arguments passed to the Entrypoint. The Dockerfile format is quite lenient and allows users to set Cmd without Entrypoint, in which case the first element of Cmd is the executable to run.

Note: k8s uses different names for Docker’s Entrypoint and Cmd. The Kubernetes command field corresponds to the Docker Entrypoint, and the Kubernetes args field corresponds to the Docker Cmd.

Description                         Docker field name    Kubernetes field name
The command run by the container    Entrypoint           command
Arguments passed to the command     Cmd                  args
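To make the interaction between the two sets of fields concrete, here is a rough model of how Kubernetes combines them: setting command replaces both the image Entrypoint and Cmd, while setting only args keeps the image Entrypoint. The function names and the sample values (`/docker-entrypoint.sh`, `server`) are hypothetical, and words are treated as plain space-separated strings for simplicity.

```shell
# Print the non-empty arguments separated by single spaces.
join_words() {
  out=""
  for part in "$@"; do
    if [ -n "$part" ]; then
      out="${out:+$out }$part"
    fi
  done
  printf '%s\n' "$out"
}

# Hypothetical model of the process a container ends up running.
# Arguments: image Entrypoint, image Cmd, k8s command, k8s args.
effective_command() {
  if [ -n "$3" ]; then
    # command set: image Entrypoint and Cmd are both ignored.
    join_words "$3" "$4"
  elif [ -n "$4" ]; then
    # only args set: image Entrypoint runs with the new args.
    join_words "$1" "$4"
  else
    # neither set: image defaults apply.
    join_words "$1" "$2"
  fi
}

effective_command "/docker-entrypoint.sh" "server" "" ""
# prints: /docker-entrypoint.sh server
effective_command "/docker-entrypoint.sh" "server" "tail -f /dev/null" ""
# prints: tail -f /dev/null
```

The second call shows why overriding only the command is enough to keep a crashing container alive: the image’s own Cmd is no longer appended.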

There are a few tricks to understanding how the container you’re working with starts up. To get the startup command when you’re dealing with someone else’s container, you need to know the intended Entrypoint and Cmd of the Docker image. If you have the Dockerfile that created the image, then you likely already know the Entrypoint and Cmd, unless the Dockerfile doesn’t define them and inherits them from a base image instead.

When you’re dealing with an off-the-shelf container, someone else’s container whose Dockerfile you don’t have, or a base image whose Dockerfile you don’t have, you can use the following steps to get the values you need. First, pull the image locally using docker pull, then inspect the image to get the Entrypoint and Cmd:

  • docker pull <image id>
  • docker inspect <image id>

Here we use jq to filter the JSON response from docker inspect:

david@sega:~: docker pull docker.elastic.co/elasticsearch/elasticsearch:7.10.2
7.10.2: Pulling from elasticsearch/elasticsearch
ddf49b9115d7: Pull complete
e736878e27ad: Pull complete
7487c9dcefbe: Pull complete
9ccb7e6e1f0c: Pull complete
dcec6dec98db: Pull complete
8a10b4854661: Pull complete
1e595aee1b7d: Pull complete
06cc198dbf22: Pull complete
55b9b1b50ed8: Pull complete
Digest: sha256:d528cec81720266974fdfe7a0f12fee928dc02e5a2c754b45b9a84c84695bfd9
Status: Downloaded newer image for docker.elastic.co/elasticsearch/elasticsearch:7.10.2

david@sega:~: docker inspect docker.elastic.co/elasticsearch/elasticsearch:7.10.2 | jq '.[0] .ContainerConfig .Entrypoint'
[
  "/tini",
  "--",
  "/usr/local/bin/docker-entrypoint.sh"
]
david@sega:~: docker inspect docker.elastic.co/elasticsearch/elasticsearch:7.10.2 | jq '.[0] .ContainerConfig .Cmd'
[
  "/bin/sh",
  "-c",
  "#(nop) ",
  "CMD [\"eswrapper\"]"
]
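Note that ContainerConfig records the build step that produced the image, which is why the Cmd above shows a `#(nop) CMD [...]` line rather than a plain argument list, and newer Docker releases may omit that field altogether. As an alternative sketch, the Config section of the same inspect output holds the values a container actually starts with. The function name below is hypothetical; `--format` with the `json` template function is a standard docker inspect option, so jq is not needed here.

```shell
# Hypothetical helper: print the Entrypoint and Cmd that a container
# created from the given image would start with.
# Assumption: the image has already been pulled locally.
image_start_command() {
  docker image inspect \
    --format 'Entrypoint={{json .Config.Entrypoint}} Cmd={{json .Config.Cmd}}' \
    "$1"
}

# Usage:
#   image_start_command docker.elastic.co/elasticsearch/elasticsearch:7.10.2
```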

The Dreaded CrashLoopBackOff

Now that you have all that background, let’s get to debugging the CrashLoopBackOff.

In order to understand what’s happening, it’s important to be able to inspect the container inside of k8s so the application has all the environment variables and dependent services. Updating the deployment and setting the container Entrypoint or k8s command temporarily to tail -f /dev/null or sleep infinity will give you an opportunity to debug why the service doesn’t stay running.

Here’s how to configure k8s to override the container Entrypoint:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: elasticsearch
  namespace: elasticsearch
spec:
  progressDeadlineSeconds: 600
  replicas: 1
  revisionHistoryLimit: 3
  selector:
    matchLabels:
      app: backend
      tier: backend
  strategy:
    rollingUpdate:
      maxSurge: 25%
      maxUnavailable: 25%
    type: RollingUpdate
  template:
    metadata:
      creationTimestamp: null
      labels:
        app: backend
        tier: backend
    spec:
      containers:
      # Temporary override of the image Entrypoint so the container stays up.
      # The rest of the container spec (name, image, ...) is unchanged.
      - command:
        - tail
        - "-f"
        - /dev/null
Here’s the configuration in Release:

- name: elasticsearch
  image: docker.elastic.co/elasticsearch/elasticsearch:7.10.2
  command:
  - tail
  - "-f"
  - /dev/null

You can now use kubectl or k9s to exec into the container and take a look around. Using the Entrypoint and Cmd you discovered earlier, you can execute the intended startup command and see how the application is failing.
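With kubectl, opening a shell in the paused container might look like the sketch below. The function name is hypothetical, and it assumes the image ships /bin/sh, which is true of most but not all images.

```shell
# Hypothetical helper: open an interactive shell in a running pod.
# Assumption: the image contains /bin/sh.
shell_into() {
  kubectl exec -it "$1" -n "$2" -- /bin/sh
}

# Usage:
#   shell_into <pod-name> <namespace>
#
# Once inside, run the startup command you found with docker inspect by
# hand and watch how it fails. For the Elasticsearch image inspected
# earlier, that would be along the lines of:
#   /usr/local/bin/docker-entrypoint.sh eswrapper
```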

Depending on the container you’re running, it may be missing many of the tools you need to debug your problem, such as curl, lsof, and vim. And if it’s someone else’s code, you probably don’t know which Linux distribution was used to create the image. We typically try all of the common package managers until we find the right one. Many containers these days use Alpine Linux (apk package manager) or a Debian- or Ubuntu-based image (apt-get package manager). In some cases we’ve seen CentOS and Fedora, which both use the yum package manager.

One of the following commands should work depending on the operating system:

  • apk
  • apt-get
  • yum

Dockerfile maintainers often remove the cache from the package manager to shrink the size of the image, so you may also need to run one of the following:

  • apk update
  • apt-get update
  • yum makecache

Now you need to add the necessary tools to help with debugging. Depending on the package manager you found, use one of the following commands to add useful debugging tools:

  • apt-get install -y curl vim procps inetutils-tools net-tools lsof
  • apk add curl vim procps net-tools lsof
  • yum install curl vim procps lsof
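The trial-and-error above can be sketched as a small script to paste into the container shell. The function names are hypothetical; the script only prints the refresh and install commands listed above for whichever package manager it finds, so you can review the line before running it.

```shell
# Print the first package manager found on PATH, or "none".
detect_pkg_manager() {
  for pm in apk apt-get yum; do
    if command -v "$pm" >/dev/null 2>&1; then
      echo "$pm"
      return 0
    fi
  done
  echo "none"
}

# Print the refresh-and-install command line for a given package manager.
debug_tools_cmd() {
  case "$1" in
    apk)     echo "apk update && apk add curl vim procps net-tools lsof" ;;
    apt-get) echo "apt-get update && apt-get install -y curl vim procps inetutils-tools net-tools lsof" ;;
    yum)     echo "yum makecache && yum install curl vim procps lsof" ;;
    *)       echo "# no supported package manager found" ;;
  esac
}

# Show the command that applies to the current container.
debug_tools_cmd "$(detect_pkg_manager)"
```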

At this point, it’s up to you to figure out the problem. You can edit files using vim to tweak the container until you understand what’s going on. If you lose track of the files you’ve changed in the container, you can always kill the pod, and the container will restart without your changes. Always remember to write down the steps taken to get the container working. You’ll want to use your notes to alter the Dockerfile or add commands to the container startup scripts.

Debugging Your Containers

We have created a simple script to get all of the debugging tools, as long as you are working with a container that has curl pre-installed:

# install debugging tools on a container with curl pre-installed
/bin/sh -c "$(curl -fsSL https://raw.githubusercontent.com/releaseapp-io/container-debug/main/install.sh)"


In this article, we’ve learnt how to spot and investigate CrashLoopBackOff errors in containers. We walked you through how to inspect the container image itself, and we’ve listed and shown some of the tools that we use to spot problems and investigate issues. We also installed several basic but useful tools in the container, whichever base image it was built from. With these steps in mind and all the tools at your disposal, go forth and fix all the things!
