---
title: "kubectl Debug: Deep Dive"
description: Ephemeral containers, process namespace sharing, debug profiles, the --copy-to mechanism, node-level debugging, and troubleshooting workflows
category: observability
---
This document explains ephemeral containers, process namespace sharing, debug profiles, the --copy-to mechanism, node-level debugging, and systematic workflows for troubleshooting common Kubernetes failure modes.
Ephemeral containers are a special container type that can be added to a running pod. They are declared in the pod's `spec.ephemeralContainers` field (API type `EphemeralContainer`) and differ from regular containers in important ways.
| Feature | Regular Container | Ephemeral Container |
|---|---|---|
| Defined at creation | Yes | No, added to running pods |
| Resource limits | Required (with quota) | Not allowed |
| Port mappings | Yes | No |
| Liveness/readiness probes | Yes | No |
| Restart policy | Applies | Never restarts |
| Lifecycle hooks | Yes | No |
| Shows in `kubectl get pods` | Yes | Only with `-o yaml` |
Ephemeral containers cannot be removed once added. They run until they exit. If you exit the shell, the container stops. The pod is not modified in any other way.
When you run `kubectl debug -it <pod> --image=busybox`, kubectl sends a PATCH request to the pod's `ephemeralcontainers` subresource:

```
PATCH /api/v1/namespaces/{ns}/pods/{pod}/ephemeralcontainers
```
The payload adds a new entry to spec.ephemeralContainers:
```yaml
ephemeralContainers:
- name: debugger-abc123
  image: busybox:1.36
  stdin: true
  tty: true
  targetContainerName: app  # Share process namespace with this container
```

The kubelet pulls the debug image and starts the container in the pod's existing network and IPC namespaces.
An ephemeral container shares the pod's:
- Network namespace: Same IP, same ports, same DNS
- IPC namespace: Same shared memory segments
- Volume mounts: Only if explicitly configured (not automatic)
It does NOT share by default:

- PID namespace: Processes are isolated unless `shareProcessNamespace` is true or `targetContainerName` is set
- Filesystem: The debug container has its own root filesystem from its image
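Because volumes are not shared automatically, they must be listed explicitly in the ephemeral container's spec. A sketch of an ephemeral container entry that mounts an existing pod volume (the volume name `config` is illustrative and must match a volume already declared in the pod):

```yaml
ephemeralContainers:
- name: debugger-vol
  image: busybox:1.36
  stdin: true
  tty: true
  volumeMounts:
  - name: config        # must match a volume declared in spec.volumes
    mountPath: /config  # mount point inside the debug container
```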
Process namespace sharing is the key feature that makes ephemeral containers useful for debugging.
When you specify --target=<container> in kubectl debug, the ephemeral container joins the target container's process namespace:
```shell
kubectl debug -it deploy/distroless-app -n debug-demo \
  --image=busybox:1.36 \
  --target=app
```

With process namespace sharing:

- `ps aux` in the debug container shows processes from the target container
- You can inspect `/proc/<pid>/root/` to see the target container's filesystem
- You can send signals to the target container's processes
- You can read `/proc/<pid>/environ` to see environment variables
- You can use `strace -p <pid>` to trace system calls (if capabilities allow)
Without `--target`, the debug container only sees its own processes.
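One wrinkle worth knowing: `/proc/<pid>/environ` is NUL-separated, so `cat` prints it as one run-together line. A small, locally runnable sketch (it reads `/proc/self/environ` so it works in any Linux shell; substitute the target PID when inside the debug container):

```shell
# /proc/<pid>/environ separates variables with NUL bytes, not newlines.
# tr converts the NULs so each variable prints on its own line.
tr '\0' '\n' < /proc/self/environ
```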
The pod spec also supports process namespace sharing at the pod level:
```yaml
spec:
  shareProcessNamespace: true
```

When this is set, all containers in the pod share a single PID namespace. Every container can see every other container's processes. PID 1 is the pause container, not your application.
This is useful for sidecar patterns but changes application behavior. Some applications expect to be PID 1 and behave differently when they are not.
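A minimal pod illustrating pod-level PID sharing (the names are illustrative). After it starts, `kubectl exec` into either container and run `ps`: both `sleep` processes are visible, and PID 1 is the pause process:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: shared-pid-demo
spec:
  shareProcessNamespace: true   # one PID namespace for all containers
  containers:
  - name: app
    image: busybox:1.36
    command: ["sleep", "3600"]
  - name: sidecar
    image: busybox:1.36
    command: ["sleep", "3600"]
```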
Kubernetes v1.30+ supports debug profiles that configure the security context of debug containers.
`general`: No restrictions. The debug container runs with whatever security context the runtime provides. This is the most permissive profile and the default.
`baseline`: Applies baseline Pod Security Standards to the debug container:
- No privileged mode
- No host namespaces
- Default capabilities only
`restricted`: Applies restricted Pod Security Standards:
- Non-root user
- No privilege escalation
- Capabilities dropped
- Seccomp RuntimeDefault
This may limit what you can do in the debug container. Tools that need root access (tcpdump, strace) will not work.
`netadmin`: Adds NET_ADMIN and NET_RAW capabilities. Useful for network debugging:

```shell
kubectl debug -it <pod> --image=nicolaka/netshoot --profile=netadmin
```

With netadmin, you can run:

- `tcpdump` to capture packets
- `iptables` to inspect firewall rules
- `ss`/`netstat` to inspect connections
- `ip` to inspect routing
```shell
# Default (general) profile
kubectl debug -it <pod> --image=busybox:1.36

# Restricted profile (for security-hardened namespaces)
kubectl debug -it <pod> --image=busybox:1.36 --profile=restricted

# Network admin profile
kubectl debug -it <pod> --image=nicolaka/netshoot --profile=netadmin
```

The `--copy-to` flag creates a copy of the target pod instead of adding an ephemeral container. This is essential for debugging crashed containers.
Adding an ephemeral container to a pod in CrashLoopBackOff rarely helps. The target container keeps crashing and restarting, so even though the ephemeral container starts, there is no stable process for you to attach to and investigate.
The demo's crashing pod:
apiVersion: v1
kind: Pod
metadata:
name: crash-loop
namespace: debug-demo
spec:
containers:
- name: app
image: busybox:1.36
command:
- /bin/sh
- -c
- |
echo "Starting..."
echo "Loading config from /config/app.yaml"
if [ ! -f /config/app.yaml ]; then
echo "ERROR: Config file not found!"
exit 1
fiThis pod crashes because /config/app.yaml does not exist. You need to get inside to investigate.
```shell
kubectl debug crash-loop -n debug-demo -it \
  --copy-to=crash-debug \
  --container=app \
  -- /bin/sh
```

This:

- Creates a new pod called `crash-debug` with the same spec as `crash-loop`.
- Overrides the command of the `app` container with `/bin/sh`.
- Attaches an interactive terminal.
The copy has the same volumes, environment variables, and image as the original. But because the command is overridden to /bin/sh, the container starts a shell instead of crashing.
Inside the copy, you can investigate:
```shell
ls /config/            # See what files exist
cat /config/app.yaml   # Check if config was supposed to be mounted
env                    # Check environment variables
mount                  # Check volume mounts
```

You can also change the image in the copy:
```shell
kubectl debug crash-loop -n debug-demo -it \
  --copy-to=crash-debug \
  --image=ubuntu:22.04 \
  --container=app \
  -- bash
```

This replaces the container's image with Ubuntu, giving you access to tools like apt, curl, dig, etc. The volumes and env vars from the original pod are preserved.
The copied pod is a new pod. It gets a new IP address. It is not behind the same Service. It is a diagnostic tool, not a live replacement.
Remember to clean up copies:
```shell
kubectl delete pod crash-debug -n debug-demo
```

`kubectl debug node/<name>` creates a privileged pod on a specific node with access to the host filesystem:
```shell
kubectl debug node/minikube -it --image=busybox:1.36
```

The debug pod mounts the host root filesystem at `/host`:
```shell
# Inside the debug pod
chroot /host ps aux                  # Host processes
chroot /host df -h                   # Host disk usage
chroot /host journalctl -u kubelet   # Kubelet logs
```

The `chroot /host` command changes the root directory to the host filesystem. After chroot, you are effectively running commands on the host.
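For reference, the pod that `kubectl debug node/` generates looks roughly like the sketch below (field values vary by kubectl version and profile; this is an approximation, not exact output):

```yaml
spec:
  nodeName: minikube        # pinned to the target node
  hostNetwork: true         # share the node's network namespace
  hostPID: true             # see host processes
  restartPolicy: Never
  containers:
  - name: debugger
    image: busybox:1.36
    stdin: true
    tty: true
    volumeMounts:
    - name: host-root
      mountPath: /host      # host filesystem appears here
  volumes:
  - name: host-root
    hostPath:
      path: /
```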
Check kubelet status:

```shell
chroot /host systemctl status kubelet
chroot /host journalctl -u kubelet --no-pager | tail -50
```

Check disk pressure:

```shell
chroot /host df -h
chroot /host du -sh /var/lib/kubelet
chroot /host du -sh /var/lib/containers
```

Check container runtime:

```shell
chroot /host crictl ps
chroot /host crictl images
chroot /host crictl logs <container-id>
```

Check network:

```shell
chroot /host ip addr
chroot /host ip route
chroot /host iptables -t nat -L -n
```

CrashLoopBackOff means the container starts, crashes, and keeps restarting with increasing backoff delays (10s, 20s, 40s, up to 5 minutes).
1. Check logs:
```shell
kubectl logs crash-loop -n debug-demo
kubectl logs crash-loop -n debug-demo --previous
```

`--previous` shows logs from the last crash. Without it, you might see logs from the current (still-starting) instance.
2. Check events:
```shell
kubectl describe pod crash-loop -n debug-demo
```

Look at the Events section. Common messages:

- `Back-off restarting failed container`: The container exited with a non-zero code
- `Error: ImagePullBackOff`: Cannot pull the image (wrong name, auth failure)
- `OOMKilled`: Container exceeded its memory limit
3. Check exit code:
```shell
kubectl get pod crash-loop -n debug-demo -o jsonpath='{.status.containerStatuses[0].lastState.terminated.exitCode}'
```

Common exit codes:
| Code | Meaning |
|---|---|
| 0 | Success (but containers should not exit in a Deployment) |
| 1 | General error (application error) |
| 2 | Misuse of shell command |
| 126 | Command not executable |
| 127 | Command not found |
| 128+N | Killed by signal N (e.g., 137 = SIGKILL = 128+9) |
| 137 | OOMKilled or SIGKILL |
| 139 | Segfault (SIGSEGV = 128+11) |
| 143 | SIGTERM (128+15, graceful shutdown) |
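The 128+N rule is easy to decode mechanically. A small helper function (hypothetical, for illustration; runnable in any POSIX shell):

```shell
# Decode a container exit code: values above 128 mean "killed by signal N".
decode_exit() {
  code=$1
  if [ "$code" -gt 128 ] && [ "$code" -le 192 ]; then
    sig=$((code - 128))
    echo "exit $code: killed by signal $sig ($(kill -l "$sig" 2>/dev/null))"
  else
    echo "exit $code: application exit status"
  fi
}

decode_exit 137   # signal 9, SIGKILL -- often the OOM killer
decode_exit 143   # signal 15, SIGTERM -- graceful shutdown
decode_exit 1     # plain application error
```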
4. Use kubectl debug --copy-to:
```shell
kubectl debug crash-loop -n debug-demo -it \
  --copy-to=debug-crash \
  --container=app \
  -- /bin/sh
```

ImagePullBackOff means the kubelet cannot pull the container image.
Wrong image name or tag:

```shell
kubectl describe pod <pod> | grep "Image:"
kubectl describe pod <pod> | grep "Failed"
```

Authentication failure:

```shell
kubectl get pod <pod> -o jsonpath='{.spec.imagePullSecrets}'
kubectl get secret <pull-secret> -o jsonpath='{.data.\.dockerconfigjson}' | base64 -d
```

Registry not reachable:

```shell
kubectl debug node/minikube -it --image=busybox:1.36
# Inside: wget -O- https://registry.example.com/v2/
```

Rate limiting (Docker Hub): Docker Hub limits pulls to 100 per 6 hours for anonymous users. Use authenticated pulls or mirror images locally.
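To see what the `base64 -d` step yields, here is the shape of a `.dockerconfigjson` payload with sample (illustrative) data, runnable locally without a cluster:

```shell
# A .dockerconfigjson secret is base64-encoded JSON keyed by registry host.
payload=$(printf '{"auths":{"registry.example.com":{"auth":"dXNlcjpwYXNz"}}}' | base64 | tr -d '\n')

# Decode it the same way you would decode the real secret:
echo "$payload" | base64 -d
echo

# The inner "auth" field is itself base64-encoded "user:password" --
# here it decodes to "user:pass":
printf 'dXNlcjpwYXNz' | base64 -d
echo
```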
A pod stays in Pending when the scheduler cannot find a node.
```shell
kubectl describe pod <pod> -n <namespace>
```

Check the Events section for:
Insufficient resources:
```
0/3 nodes are available: 3 Insufficient cpu
```
The cluster does not have enough free CPU. Either add nodes, reduce requests, or delete other workloads.
Node affinity/selector mismatch:
```
0/3 nodes are available: 3 node(s) didn't match Pod's node affinity/selector
```
The pod requires a node label that no node has. Check nodeSelector or nodeAffinity in the pod spec.
Taints and tolerations:
```
0/3 nodes are available: 3 node(s) had taints that the pod didn't tolerate
```
All nodes have taints that the pod does not tolerate. Add tolerations or remove taints.
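The toleration side of the fix looks like the sketch below, matching a hypothetical `dedicated=infra:NoSchedule` taint (key and value are illustrative):

```yaml
spec:
  tolerations:
  - key: dedicated        # must match the taint's key
    operator: Equal
    value: infra          # must match the taint's value
    effect: NoSchedule
```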
PVC not bound:
```
0/3 nodes are available: 3 pod has unbound immediate PersistentVolumeClaims
```
The PVC is waiting for a PersistentVolume. Check PVC status and StorageClass provisioner.
ResourceQuota exceeded:
```
exceeded quota: compute-quota
```
The namespace has hit its resource quota. Free up resources or increase the quota.
The container was killed by the kernel OOM killer because it exceeded its memory limit.
```shell
kubectl describe pod <pod> | grep -A 5 "Last State"
```

Look for:

```
Last State:  Terminated
  Reason:    OOMKilled
  Exit Code: 137
```
Fix options:
- Increase the container's memory limit
- Fix the memory leak in the application
- Use a memory profiler inside a debug container
```shell
kubectl debug -it <pod> --image=alpine --target=app -- sh
# Inside: watch grep -i vm /proc/1/status
```

This shows the target container's virtual memory stats in real time. (Note that `watch cmd | grep` would filter watch's full-screen output rather than the command's, so pass `grep` as the command for `watch` to run.)
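The field that matters most for OOM analysis is resident memory. A locally runnable sketch that pulls `VmRSS` out of a status file (using `/proc/self` so it works anywhere; use the target PID inside the debug container):

```shell
# VmRSS is the resident set size -- the number compared against the
# container's memory limit when the OOM killer fires.
awk '/^VmRSS/ {print $2, $3}' /proc/self/status
```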
Use nicolaka/netshoot with --profile=netadmin for network debugging. Inside: nslookup for DNS, curl -v for HTTP, nc -zv for TCP, tcpdump for packet capture, ip route for routing.
Key kubectl logs flags: -f for streaming, --previous for last crash, --since=1h for time-based, --tail=100 for line limits, --timestamps for timing, --all-containers for multi-container pods.
Use kubectl top pods and kubectl top nodes (requires metrics-server) for current consumption.
For images like registry.k8s.io/pause:3.9 (no shell), ephemeral containers are the only option. Use --target=app to share the process namespace, then inspect via /proc/1/root/ (filesystem), /proc/1/environ (env vars), and /proc/1/cmdline (command).
| Image | Size | Use Case |
|---|---|---|
| `busybox:1.36` | ~4 MB | Basic shell, file operations |
| `alpine:3.19` | ~7 MB | Shell + package manager (apk) |
| `nicolaka/netshoot` | ~350 MB | Full network debugging toolkit |
| `curlimages/curl` | ~15 MB | HTTP debugging |
| `ubuntu:22.04` | ~75 MB | General purpose with apt |
| `registry.k8s.io/e2e-test-images/agnhost` | ~30 MB | Kubernetes-aware debugging |
Choose the smallest image that has the tools you need. In production clusters with image pull restrictions, pre-pull debug images or use an internal registry.
Ephemeral containers bypass the pod's original security posture. A pod running with restricted PSS can have a debug container injected under the `general` profile that runs as root with full capabilities.
This is by design. Debugging requires elevated access. But it means:
- RBAC on `pods/ephemeralcontainers` controls who can debug.
- Audit logs capture debug container creation.
- Debug containers in production should be time-limited and reviewed.
Lock down ephemeral container creation with RBAC:
```yaml
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: pod-debugger
rules:
- apiGroups: [""]
  resources: ["pods/ephemeralcontainers"]
  verbs: ["patch"]
- apiGroups: [""]
  resources: ["pods"]
  verbs: ["get", "list"]
```

Only users bound to this role can create debug containers. Others can only view pods.
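A Role grants nothing until it is bound to someone. A matching RoleBinding sketch (the namespace and subject name are illustrative):

```yaml
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: pod-debugger-binding
  namespace: debug-demo      # illustrative namespace
subjects:
- kind: User
  name: oncall-engineer      # illustrative user
  apiGroup: rbac.authorization.k8s.io
roleRef:
  kind: Role
  name: pod-debugger
  apiGroup: rbac.authorization.k8s.io
```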