Flame Graphs on Java-based Kubernetes workloads

Generating flame graphs for containerized workloads takes a bit of fiddling, for reasons (and with solutions) discussed thoroughly here and here. As the title suggests, we’ll apply those methods, modified slightly, to workloads (scoped to Java-based applications) running inside a Kubernetes cluster.

The “Workaround”

The entire idea is illustrated below:

Kubernetes Node
+-----------------------------------------------------------+
|                  Docker Runtime                           |
|                  +--------------------------------------+ |
|                  |                      PID Namespace   | |
|                  |                     +-------------+  | |
|    Host Path     |                     | +---------+ |  | |
| +-------------+  |                     | |   app   | |  | |
| |+-----------+|  |  +--------+         | +---------+ |  | |
| ||docker.sock||---->|profiler|-\       |      ^      |  | |
| |+-----------+|  |  +--------+  -\     |      |      |  | |
| +-------------+  |      ^         -\   |      |      |  | |
|                  |      |           -\ | +---------+ |  | |
|                  |      |             -->| fg-perf | |  | |
|                  |      |              | +---------+ |  | |
|                  |      |              +-------------+  | |
|                  |      |                     ^         | |
|                  +------|---------------------|---------+ |
|                         |                     |           |
|                         |                     |           |
|                         v                     v           |
|                   +------------+          +-------+       |
|                   |profiler pod|          |app pod|       |
|                   +------------+          +-------+       |
+-----------------------------------------------------------+

The way it works is as follows:

  1. Spawn a “profiler” pod on the same node where the pod we want to profile is scheduled. We also need to mount the host’s /var/run/docker.sock [*] via hostPath.

  2. Via the “profiler” pod, run a container (using the docker.sock we mounted earlier) and have it join the target container’s PID namespace via the --pid flag, e.g.

docker run -it --pid=container:<container_id> <perf-container-image>
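
Steps 1 and 2 can be sketched in shell roughly as follows. This is a minimal sketch under stated assumptions, not the exact setup used for this post: the pod name `profiler`, the `docker:cli` image, and the helper names `profiler_pod_manifest` and `strip_runtime_prefix` are placeholders introduced for illustration. It also assumes the node uses Docker as its container runtime, so the container ID reported in the pod status carries a `docker://` prefix that has to be removed before it is handed to `--pid`.

```shell
#!/bin/sh
# Hypothetical helper: print a pod manifest pinned to the given node,
# with the host's Docker socket mounted via hostPath (see [*]).
profiler_pod_manifest() {
  node="$1"
  image="${2:-docker:cli}"   # assumption: any image that ships the docker CLI
  cat <<EOF
apiVersion: v1
kind: Pod
metadata:
  name: profiler
spec:
  nodeName: ${node}
  containers:
    - name: profiler
      image: ${image}
      command: ["sleep", "86400"]
      volumeMounts:
        - name: docker-sock
          mountPath: /var/run/docker.sock
  volumes:
    - name: docker-sock
      hostPath:
        path: /var/run/docker.sock
EOF
}

# Hypothetical helper: strip the runtime prefix from a containerID as
# reported in the pod status, e.g. "docker://abc123" becomes "abc123".
strip_runtime_prefix() {
  printf '%s\n' "${1#*://}"
}

# Usage (commented out; needs cluster access):
# node=$(kubectl get pod <app-pod> -o jsonpath='{.spec.nodeName}')
# profiler_pod_manifest "$node" | kubectl apply -f -
# raw=$(kubectl get pod <app-pod> -o jsonpath='{.status.containerStatuses[0].containerID}')
# cid=$(strip_runtime_prefix "$raw")
# kubectl exec -it profiler -- docker run -it --pid=container:"$cid" <perf-container-image>
```

Pinning with `nodeName` bypasses the scheduler, which is convenient for a throwaway pod like this one; a node selector would work as well.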

It is also possible to let Kubernetes handle the PID namespace gymnastics, but that requires modifying the deployment resource.
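
One way this could look, offered as an assumption rather than something tested for this post, is the pod-level `shareProcessNamespace` field, which makes all containers of a pod share a single PID namespace. The sketch below prints a patch that turns the field on and adds a perf sidecar to the pod template; the helper name `share_pid_patch` and the sidecar name `fg-perf` are placeholders. Note that with a shared namespace the application is generally no longer PID 1, so the PID passed to perf would have to be looked up first, and the sidecar may need extra privileges to run perf.

```shell
#!/bin/sh
# Hypothetical helper: print a patch (YAML) that enables a shared PID
# namespace and adds a perf sidecar container to the pod template.
share_pid_patch() {
  image="$1"   # the perf container image, i.e. <perf-container-image>
  cat <<EOF
spec:
  template:
    spec:
      shareProcessNamespace: true
      containers:
        - name: fg-perf
          image: ${image}
          command: ["sleep", "86400"]
EOF
}

# Usage (commented out; needs cluster access and a real deployment name):
# share_pid_patch <perf-container-image> > patch.yaml
# kubectl patch deployment <deployment> --patch-file patch.yaml
```

Applying the patch changes the pod template, so the deployment's pods are restarted; that is the trade-off against the docker.sock approach, which leaves the running pod untouched.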

  3. Exec into the container you just created and perform the actual profiling:
    3.1 Run perf on PID 1 [**].

    perf record -F 99 -g -p 1 -- sleep 60

    3.2 Generate mappings using perf-map-agent. More details here.

    java -cp attach-main.jar:tools.jar net.virtualvoid.perf.AttachOnce 1

    3.3 Generate the Flame Graph.

    perf script -f | FlameGraph/stackcollapse-perf.pl | FlameGraph/flamegraph.pl --color=java --hash > flamegraph.svg

    3.4 You may need a combination of docker cp and kubectl cp to get the generated flame graph out of the container and onto your machine.
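
The two-hop copy in step 3.4 can be sketched as below. The helper name `copy_out_commands`, the container name `fg-perf`, the pod name `profiler`, and the `/flamegraph.svg` path are assumptions made for illustration; the real path depends on the working directory used in step 3.3. The helper only prints the commands, since the first has to run inside the profiler pod and the second on your own machine.

```shell
#!/bin/sh
# Hypothetical helper: print the two copy commands for step 3.4.
#   $1 = perf container name or ID
#   $2 = profiler pod name
#   $3 = path of the flame graph inside the perf container
copy_out_commands() {
  perf_container="$1"
  profiler_pod="$2"
  src="$3"
  name="$(basename "$src")"
  # Hop 1, run inside the profiler pod: perf container -> profiler pod.
  echo "docker cp ${perf_container}:${src} /tmp/${name}"
  # Hop 2, run on your machine: profiler pod -> current directory.
  echo "kubectl cp ${profiler_pod}:/tmp/${name} ./${name}"
}

copy_out_commands fg-perf profiler /flamegraph.svg
```

Keep in mind that kubectl cp relies on a tar binary being present in the profiler pod's image.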

Results

To make sure everything works as expected, I created a sample Spring Boot application and used Christopher’s sample code for comparison. Sure enough, I got the following graph (interactive version):

Happy profiling!


[*] A big security risk: access to the Docker socket effectively grants root on the node. Tread carefully.
[**] Assuming that the workload is using Google’s distroless as the base image.