---
title: "kubectl Debug: Deep Dive"
description: Ephemeral containers, process namespace sharing, debug profiles, the copy-to mechanism, node-level debugging, and troubleshooting workflows
sidebar:
  label: kubectl Debug (Deep Dive)
  order: 25
category: observability
---
Deep Dive: kubectl Debug and Troubleshooting

This document explains ephemeral containers, process namespace sharing, debug profiles, the --copy-to mechanism, node-level debugging, and systematic workflows for troubleshooting common Kubernetes failure modes.

Ephemeral Containers

Ephemeral containers are a special container type that can be added to a running pod. They are declared in the pod's spec.ephemeralContainers list (the EphemeralContainer API type) and differ from regular containers in important ways.

How They Differ from Regular Containers

| Feature | Regular Container | Ephemeral Container |
| --- | --- | --- |
| Defined at creation | Yes | No, added to running pods |
| Resource limits | Required (with quota) | Optional |
| Port mappings | Yes | No |
| Liveness/readiness probes | Yes | No |
| Restart policy | Applies | Never restarts |
| Lifecycle hooks | Yes | No |
| Shows in kubectl get pods | Yes | Only with -o yaml |
Ephemeral containers cannot be removed once added. They run until they exit. If you exit the shell, the container stops. The pod is not modified in any other way.

The API Behind kubectl debug

When you run kubectl debug -it <pod> --image=busybox, kubectl sends a PATCH request to the pod's ephemeralContainers subresource:

PATCH /api/v1/namespaces/{ns}/pods/{pod}/ephemeralcontainers

The payload adds a new entry to spec.ephemeralContainers:

ephemeralContainers:
  - name: debugger-abc123
    image: busybox:1.36
    stdin: true
    tty: true
    targetContainerName: app    # Share process namespace with this container

The kubelet pulls the debug image and starts the container in the pod's existing network and IPC namespaces.
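You can confirm what was added by reading the pod object back; ephemeral containers appear in the spec and their run state under status (the pod name here is a placeholder):

```shell
# List ephemeral containers added to a pod
kubectl get pod my-pod -o jsonpath='{.spec.ephemeralContainers[*].name}'

# Their run state lives under status, not spec
kubectl get pod my-pod -o jsonpath='{.status.ephemeralContainerStatuses[*].state}'
```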

What Ephemeral Containers Can Access

An ephemeral container shares the pod's:

  • Network namespace: Same IP, same ports, same DNS
  • IPC namespace: Same shared memory segments
  • Volume mounts: Only if explicitly configured (not automatic)

It does NOT share by default:

  • PID namespace: Processes are isolated unless shareProcessNamespace is true or targetContainerName is set
  • Filesystem: The debug container has its own root filesystem from its image

Process Namespace Sharing

Process namespace sharing is the key feature that makes ephemeral containers useful for debugging.

targetContainerName

When you specify --target=<container> in kubectl debug, the ephemeral container joins the target container's process namespace:

kubectl debug -it deploy/distroless-app -n debug-demo \
  --image=busybox:1.36 \
  --target=app

With process namespace sharing:

  • ps aux in the debug container shows processes from the target container
  • You can inspect /proc/<pid>/root/ to see the target container's filesystem
  • You can send signals to the target container's processes
  • You can read /proc/<pid>/environ to see environment variables
  • You can use strace -p <pid> to trace system calls (if capabilities allow)

Without --target, the debug container only sees its own processes.
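A typical inspection session with --target might look like this (a sketch; it assumes the target's main process appears as PID 1 in the shared namespace, which is the common case):

```shell
# Inside the ephemeral container started with --target=app
ps aux                              # shows the target container's processes
ls /proc/1/root/                    # target container's root filesystem
tr '\0' '\n' < /proc/1/environ      # environment variables, one per line
tr '\0' ' '  < /proc/1/cmdline      # the target's command line
```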

Pod-Level Process Sharing

The pod spec also supports process namespace sharing at the pod level:

spec:
  shareProcessNamespace: true

When this is set, all containers in the pod share a single PID namespace. Every container can see every other container's processes. PID 1 is the pause container, not your application.

This is useful for sidecar patterns but changes application behavior. Some applications expect to be PID 1 and behave differently when they are not.
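A minimal sketch of a pod with a shared PID namespace (names and images are illustrative):

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: shared-pid-demo
spec:
  shareProcessNamespace: true
  containers:
    - name: app
      image: busybox:1.36
      command: ["sleep", "3600"]
    - name: sidecar
      image: busybox:1.36
      command: ["sleep", "3600"]
```

With this spec, `kubectl exec shared-pid-demo -c sidecar -- ps aux` lists the app container's processes as well as the sidecar's.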

Debug Profiles

Kubernetes v1.30+ supports debug profiles that configure the security context of debug containers.

general (default)

No restrictions. The debug container runs with whatever security context the runtime provides. This is the most permissive and the default.

baseline

Applies baseline Pod Security Standards to the debug container:

  • No privileged mode
  • No host namespaces
  • Default capabilities only

restricted

Applies restricted Pod Security Standards:

  • Non-root user
  • No privilege escalation
  • Capabilities dropped
  • Seccomp RuntimeDefault

This may limit what you can do in the debug container. Tools that need root access (tcpdump, strace) will not work.

netadmin

Adds NET_ADMIN and NET_RAW capabilities. Useful for network debugging:

kubectl debug -it <pod> --image=nicolaka/netshoot --profile=netadmin

With netadmin, you can run:

  • tcpdump to capture packets
  • iptables to inspect firewall rules
  • ss / netstat to inspect connections
  • ip to inspect routing

Using Profiles

# Default (general) profile
kubectl debug -it <pod> --image=busybox:1.36

# Restricted profile (for security-hardened namespaces)
kubectl debug -it <pod> --image=busybox:1.36 --profile=restricted

# Network admin profile
kubectl debug -it <pod> --image=nicolaka/netshoot --profile=netadmin

The --copy-to Mechanism

The --copy-to flag creates a copy of the target pod instead of adding an ephemeral container. This is essential for debugging crashed containers.

Why Copies Are Needed

Adding an ephemeral container to a pod in CrashLoopBackOff rarely helps: the application container keeps crashing and restarting, so there is no stable process to target, and anything you attach to disappears on the next restart.

The demo's crashing pod:

apiVersion: v1
kind: Pod
metadata:
  name: crash-loop
  namespace: debug-demo
spec:
  containers:
    - name: app
      image: busybox:1.36
      command:
        - /bin/sh
        - -c
        - |
          echo "Starting..."
          echo "Loading config from /config/app.yaml"
          if [ ! -f /config/app.yaml ]; then
            echo "ERROR: Config file not found!"
            exit 1
          fi

This pod crashes because /config/app.yaml does not exist. You need to get inside to investigate.

How --copy-to Works

kubectl debug crash-loop -n debug-demo -it \
  --copy-to=crash-debug \
  --container=app \
  -- /bin/sh

This:

  1. Creates a new pod called crash-debug with the same spec as crash-loop.
  2. Overrides the command of the app container with /bin/sh.
  3. Attaches an interactive terminal.

The copy has the same volumes, environment variables, and image as the original. But because the command is overridden to /bin/sh, the container starts a shell instead of crashing.

Inside the copy, you can investigate:

ls /config/          # See what files exist
cat /config/app.yaml # Check if config was supposed to be mounted
env                  # Check environment variables
mount                # Check volume mounts

Copy with Image Override

You can also change the image in the copy:

kubectl debug crash-loop -n debug-demo -it \
  --copy-to=crash-debug \
  --image=ubuntu:22.04 \
  --container=app \
  -- bash

This replaces the container's image with Ubuntu, giving you access to tools like apt, curl, dig, etc. The volumes and env vars from the original pod are preserved.

Copy Limitations

The copied pod is a new pod. It gets a new IP address. It is not behind the same Service. It is a diagnostic tool, not a live replacement.

Remember to clean up copies:

kubectl delete pod crash-debug -n debug-demo

Node-Level Debugging

kubectl debug node/<name> creates a privileged pod on that node with access to the host filesystem:

kubectl debug node/minikube -it --image=busybox:1.36

What the Node Debug Pod Gets

The debug pod mounts the host root filesystem at /host:

# Inside the debug pod
chroot /host ps aux        # Host processes
chroot /host df -h         # Host disk usage
chroot /host journalctl -u kubelet  # Kubelet logs

The chroot /host command changes the root directory to the host filesystem. After chroot, you are effectively running commands on the host.

Common Node Debugging Tasks

Check kubelet status:

chroot /host systemctl status kubelet
chroot /host journalctl -u kubelet --no-pager | tail -n 50

Check disk pressure:

chroot /host df -h
chroot /host du -sh /var/lib/kubelet
chroot /host du -sh /var/lib/containerd   # or /var/lib/containers for CRI-O

Check container runtime:

chroot /host crictl ps
chroot /host crictl images
chroot /host crictl logs <container-id>

Check network:

chroot /host ip addr
chroot /host ip route
chroot /host iptables -t nat -L -n

Troubleshooting CrashLoopBackOff

CrashLoopBackOff means the container starts, crashes, and keeps restarting with increasing backoff delays (10s, 20s, 40s, up to 5 minutes).

Diagnostic Steps

1. Check logs:

kubectl logs crash-loop -n debug-demo
kubectl logs crash-loop -n debug-demo --previous

--previous shows logs from the last crash. Without it, you might see logs from the current (still-starting) instance.

2. Check events:

kubectl describe pod crash-loop -n debug-demo

Look at the Events section. Common messages:

  • Back-off restarting failed container: The container exited with non-zero
  • Error: ImagePullBackOff: Cannot pull the image (wrong name, auth failure)
  • OOMKilled: Container exceeded memory limit

3. Check exit code:

kubectl get pod crash-loop -n debug-demo -o jsonpath='{.status.containerStatuses[0].lastState.terminated.exitCode}'

Common exit codes:

| Code | Meaning |
| --- | --- |
| 0 | Success (but containers should not exit in a Deployment) |
| 1 | General error (application error) |
| 2 | Misuse of shell command |
| 126 | Command not executable |
| 127 | Command not found |
| 128+N | Killed by signal N (e.g., 137 = SIGKILL = 128+9) |
| 137 | OOMKilled or SIGKILL |
| 139 | Segfault (SIGSEGV = 128+11) |
| 143 | SIGTERM (128+15, graceful shutdown) |

4. Use kubectl debug --copy-to:

kubectl debug crash-loop -n debug-demo -it \
  --copy-to=debug-crash \
  --container=app \
  -- /bin/sh

Troubleshooting ImagePullBackOff

The kubelet cannot pull the container image.

Common Causes

Wrong image name or tag:

kubectl describe pod <pod> | grep "Image:"
kubectl describe pod <pod> | grep "Failed"

Authentication failure:

kubectl get pod <pod> -o jsonpath='{.spec.imagePullSecrets}'
kubectl get secret <pull-secret> -o jsonpath='{.data.\.dockerconfigjson}' | base64 -d
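If the secret is missing or stale, it can be recreated (registry server and credentials here are placeholders) and referenced from the pod's spec.imagePullSecrets:

```shell
# Recreate a registry pull secret
kubectl create secret docker-registry regcred \
  --docker-server=registry.example.com \
  --docker-username=<user> \
  --docker-password=<password> \
  -n <namespace>
```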

Registry not reachable:

kubectl debug node/minikube -it --image=busybox:1.36
# Inside: wget -O- https://registry.example.com/v2/

Rate limiting (Docker Hub): Docker Hub limits pulls to 100 per 6 hours for anonymous users. Use authenticated pulls or mirror images locally.

Troubleshooting Pending Pods

A pod stays in Pending when the scheduler cannot find a node.

Diagnostic Steps

kubectl describe pod <pod> -n <namespace>

Check the Events section for:

Insufficient resources:

0/3 nodes are available: 3 Insufficient cpu

The cluster does not have enough free CPU. Either add nodes, reduce requests, or delete other workloads.

Node affinity/selector mismatch:

0/3 nodes are available: 3 node(s) didn't match Pod's node affinity/selector

The pod requires a node label that no node has. Check nodeSelector or nodeAffinity in the pod spec.
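To find the mismatch, compare the pod's selector with the labels actually present on the nodes (the pod name is a placeholder):

```shell
kubectl get pod <pod> -o jsonpath='{.spec.nodeSelector}'
kubectl get nodes --show-labels
```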

Taints and tolerations:

0/3 nodes are available: 3 node(s) had taints that the pod didn't tolerate

All nodes have taints that the pod does not tolerate. Add tolerations or remove taints.
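To see which taints are blocking scheduling, compare node taints against the pod's tolerations (a sketch; the pod name is a placeholder):

```shell
kubectl get nodes -o custom-columns='NAME:.metadata.name,TAINTS:.spec.taints'
kubectl get pod <pod> -o jsonpath='{.spec.tolerations}'
```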

PVC not bound:

0/3 nodes are available: 3 pod has unbound immediate PersistentVolumeClaims

The PVC is waiting for a PersistentVolume. Check PVC status and StorageClass provisioner.
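A quick way to narrow this down (names are placeholders):

```shell
kubectl get pvc -n <namespace>                 # look for claims stuck in Pending
kubectl describe pvc <claim> -n <namespace>    # events explain why binding failed
kubectl get storageclass                       # confirm a provisioner (and a default) exists
```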

ResourceQuota exceeded:

exceeded quota: compute-quota

The namespace has hit its resource quota. Free up resources or increase the quota.

Troubleshooting OOMKilled

The container was killed by the kernel OOM killer because it exceeded its memory limit.

Diagnostic Steps

kubectl describe pod <pod> | grep -A 5 "Last State"

Look for:

Last State:  Terminated
  Reason:    OOMKilled
  Exit Code: 137

Fix options:

  1. Increase the container's memory limit
  2. Fix the memory leak in the application
  3. Use a memory profiler inside a debug container
kubectl debug -it <pod> --image=alpine:3.19 --target=app -- sh
# Inside: watch 'grep -i vm /proc/1/status'

This shows the target container's virtual memory stats in real time.

Common Debugging Workflows

Network Connectivity

Use nicolaka/netshoot with --profile=netadmin for network debugging. Inside: nslookup for DNS, curl -v for HTTP, nc -zv for TCP, tcpdump for packet capture, ip route for routing.
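A typical session might look like this (service name, namespace, and port are placeholders):

```shell
kubectl debug -it <pod> -n <namespace> --image=nicolaka/netshoot --profile=netadmin

# Inside the debug container:
nslookup my-service.my-namespace.svc.cluster.local   # DNS resolution
curl -v http://my-service:8080/healthz               # HTTP reachability
nc -zv my-service 8080                               # raw TCP connect
ip route                                             # routing table
```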

Application Logs

Key kubectl logs flags: -f for streaming, --previous for last crash, --since=1h for time-based, --tail=100 for line limits, --timestamps for timing, --all-containers for multi-container pods.
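The flags combine, for example (pod and deployment names are placeholders):

```shell
kubectl logs <pod> --previous --tail=100 --timestamps
kubectl logs -f deploy/<name> --all-containers --since=1h
```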

Resource Usage

Use kubectl top pods and kubectl top nodes (requires metrics-server) for current consumption.

Distroless Containers

For images like registry.k8s.io/pause:3.9 (no shell), ephemeral containers are the only option. Use --target=app to share the process namespace, then inspect via /proc/1/root/ (filesystem), /proc/1/environ (env vars), and /proc/1/cmdline (command).

Useful Debug Images

| Image | Size | Use Case |
| --- | --- | --- |
| busybox:1.36 | ~4 MB | Basic shell, file operations |
| alpine:3.19 | ~7 MB | Shell + package manager (apk) |
| nicolaka/netshoot | ~350 MB | Full network debugging toolkit |
| curlimages/curl | ~15 MB | HTTP debugging |
| ubuntu:22.04 | ~75 MB | General purpose with apt |
| registry.k8s.io/e2e-test-images/agnhost | ~30 MB | Kubernetes-aware debugging |

Choose the smallest image that has the tools you need. In production clusters with image pull restrictions, pre-pull debug images or use an internal registry.

Security Considerations

Ephemeral containers bypass the pod's original security posture. A pod running with restricted PSS can have a debug container injected with general profile that runs as root with all capabilities.

This is by design. Debugging requires elevated access. But it means:

  1. RBAC on pods/ephemeralcontainers controls who can debug.
  2. Audit logs capture debug container creation.
  3. Debug containers in production should be time-limited and reviewed.

Lock down ephemeral container creation with RBAC:

apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: pod-debugger
rules:
  - apiGroups: [""]
    resources: ["pods/ephemeralcontainers"]
    verbs: ["patch"]
  - apiGroups: [""]
    resources: ["pods"]
    verbs: ["get", "list"]

Only users bound to this role can create debug containers. Others can only view pods.
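To grant the role, bind it to a user or group (the binding name, namespace, and subject here are illustrative):

```yaml
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: pod-debugger-binding
  namespace: debug-demo
subjects:
  - kind: User
    name: oncall-engineer
    apiGroup: rbac.authorization.k8s.io
roleRef:
  kind: Role
  name: pod-debugger
  apiGroup: rbac.authorization.k8s.io
```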

Related Resources