Monitoring is crucial for any application, especially one that serves users. Any disruption in its working state may cost money and reputation. Thus, we need to strike the ever-elusive balance of allocating enough resources so our SLAs aren’t affected, without over-allocating and paying a fortune for servers we don’t need. 

Thread dumps can help us check the health of our application from the threads’ perspective. In most cases, even a tiny service uses lots of threads. They can have different origins: explicit threading in the application code, threads spawned by a framework for request processing, garbage collection threads, or JVM threads for internal housekeeping.

Thus, the interaction between threads and the changes in their states over time can help us identify any issues and spot opportunities for improvement. In this article, we’ll learn several methods for capturing thread dumps from a running application.

Console Tools

First, let’s consider simple console tools we can use wherever we want: directly on a host, on remote servers, and inside containers. While analyzing the thread dumps in a console isn’t very convenient, it’s crucial to understand this set of tools.

kill -3

The kill -3 command is a simple way to get a thread dump from our application. Running it prints the dump to the application’s output. First, though, we need to identify the application’s PID. The simplest way to do this is with jps:

$ jps
[PID] GcThreadsRunner

After that, we can use the kill -3 command with the found PID:

$ kill -3 [PID]

We won’t get the thread dump in the console where we execute the command; by default, it appears in the application’s output. However, it’s rarely convenient to have a thread dump simply printed there. 

Luckily, we can change this behavior and get the result in a file. This way, we can read, analyze, and store it for future reference. To achieve this, we can start our application with the following flags:

-XX:+UnlockDiagnosticVMOptions
-XX:+LogVMOutput
-XX:LogFile=./thread_dump_%t.log
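For example, the startup could look like this (the jar name here is just a placeholder for our application):

$ java -XX:+UnlockDiagnosticVMOptions -XX:+LogVMOutput -XX:LogFile=./thread_dump_%t.log -jar our-app.jar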

Also, note that despite its name, kill -3 won’t kill the process, so we can use it repeatedly to take dumps.

Another benefit of this approach is that it works well with the -XX:OnOutOfMemoryError flag, so we can take a thread dump of a dying application. Getting a thread dump at the last moment helps us see the thread states and gain more insight into the cause of the OutOfMemoryError:

-XX:OnOutOfMemoryError="kill -3 %p"

It’s a great way to ensure we have all the required information to investigate the crashed system.

Control + Break

An even more straightforward way to get a thread dump is to use Control + Break (Control + \ on Unix-like systems). A dedicated thread listens for this combination and triggers the thread dump:


"Monitor Ctrl-Break" #19 [24835] daemon prio=5 os_prio=31 cpu=14.32ms elapsed=1.37s tid=0x0000000141812a00 nid=24835 runnable [0x000000016f9ae000]
java.lang.Thread.State: RUNNABLE

However, this is possible only when we’re attached to the application directly. We can also redirect the output to a file, but it won’t distinguish between the usual output and the thread dump. While unsuitable for production environments, it can be a nice shortcut for development and debugging. 
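For instance (again with a placeholder jar name), we can redirect everything the application writes, including any dumps triggered this way, into a single file:

$ java -jar our-app.jar > app-output.log 2>&1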

jstack

jstack is another simple console tool at our disposal. It provides functionality similar to kill -3:

$ jstack [PID]

To save the thread dump to a file, we can simply redirect the output. This is more straightforward, as we don’t need to add any VM options at startup:

$ jstack [PID] > FILENAME

This tool helps us examine thread states and identify hotspots and performance issues.
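jstack also accepts a few options; for example, the -l flag adds information about owned locks and java.util.concurrent synchronizers:

$ jstack -l [PID]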

jcmd

Let’s check another tool. jcmd is included in the JDK and lets us interact with a running Java application. While it’s not an analysis tool per se, we can use it to execute diagnostic commands. 

If we run it in a console without any arguments, or with the -l flag, it shows the JVM processes running on our machine. It’s similar to jps but provides more information:

$ jcmd -l
12568 jdk.jcmd/sun.tools.jcmd.JCmd
12506 com.kovko.benchmark.gcthreads.GcThreadsInfiniteLoop
7707 com.intellij.idea.Main

After identifying the process, we can trigger a thread dump by sending the following command:

$ jcmd [PID] Thread.print

Also, we can use it to output the dump to a file:

$ jcmd [PID] Thread.print > thread_dump.txt

However, we have a dedicated command to handle this case:

$ jcmd [PID] Thread.dump_to_file thread_dump.txt
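On recent JDK versions, this command can also produce the dump in JSON format, which is easier to process programmatically (the option’s availability depends on the JDK version):

$ jcmd [PID] Thread.dump_to_file -format=json thread_dump.json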

jcmd is a powerful tool, and thread dumps are only a small part of its functionality. To check all the available commands, we can use help with a specific PID:

$ jcmd [PID] help
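We can also ask for help on a specific command, for example:

$ jcmd [PID] help Thread.print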

JMX and jmxterm

Another option that isn’t a diagnostics tool itself is Java Management Extensions (JMX). However, like jcmd, we can use it to interact with a Java application from the outside. 

While it’s possible to create custom MBeans for different purposes, such as turning on more comprehensive logging at runtime, we’ll concentrate on the features it provides out of the box.

Java exposes the DiagnosticCommand MBean, which we can use to trigger diagnostic operations. We could invoke such operations from inside our application, but that’s not a very useful approach.

It’s better to use an external client that allows us to connect to the MBean server. Most desktop tools have this feature. However, if we want to stick to a console solution, we can use jmxterm, but first, we need to download its jar file.
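A session might start roughly like this (the jar name is a placeholder for whichever jmxterm version we downloaded): we launch the tool and attach to a local process by its PID:

$ java -jar jmxterm-uber.jar
$ open [PID]
#Connection to [PID] is opened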

We won’t explore all of jmxterm’s features here but will concentrate on the DiagnosticCommand MBean instead:

$ beans
#domain = com.sun.management:
com.sun.management:type=DiagnosticCommand
com.sun.management:type=HotSpotDiagnostic


$ bean com.sun.management:type=DiagnosticCommand 
#bean is set to com.sun.management:type=DiagnosticCommand


$ info
#mbean = com.sun.management:type=DiagnosticCommand
#class name = com.sun.management.internal.DiagnosticCommandImpl
#there is no attribute
# operations
...
  %21  - java.lang.String threadDumpToFile([Ljava.lang.String; arguments)
  %22  - java.lang.String threadPrint([Ljava.lang.String; arguments)
...
  %30  - java.lang.String vmEvents([Ljava.lang.String; arguments)
  %31  - java.lang.String vmFlags([Ljava.lang.String; arguments)
  %32  - java.lang.String vmInfo()
...
  %43  - java.lang.String vmVersion()
# notifications
  %0   - javax.management.Notification(jmx.mbean.info.changed)


$ run threadDumpToFile 'FILE_PATH/thread_dump.log'
#calling operation threadDumpToFile of mbean com.sun.management:type=DiagnosticCommand with params [[Ljava.lang.String;@c667f46]
#operation returns: 
Created FILE_PATH/thread_dump.log

There are many other diagnostic commands available, so they’re worth exploring. Later in the article, we’ll see how to access the same functionality via desktop tools; jmxterm, however, can help us automate the process and keep the commands under version control.

Monitoring Containers

We can use the same methods to get thread dumps from containers as we used directly on the host. To do so, we can attach a terminal to a running container:

$ docker exec -it CONTAINER_ID sh

After that, we can follow the same steps as described previously. When thread dumps go to the application’s output, we can check them using the Docker logs:

$ docker logs CONTAINER_ID

However, writing the dump to a file and storing it on the host is more convenient. We can do this in two steps: create the thread dump file inside the container and then use the docker cp command to copy it to the host:

$ docker cp CONTAINER_ID:<PATH_IN_CONTAINER> <PATH_ON_HOST>

Alternatively, we can mount a host volume and store the files there directly. It’s a simple approach for getting diagnostic information out of containers. 
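For example (the image name and paths here are placeholders), we could start the container with a bind mount and write our dumps into the mounted directory; anything created under /dumps inside the container then appears in ./dumps on the host:

$ docker run -v "$(pwd)"/dumps:/dumps IMAGE_NAME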

Desktop Tools

After checking console tools, let’s review tools that simplify the experience. In most cases, these tools use the same underlying mechanisms to interact with Java processes but add an intuitive interface and visualizations.

jconsole

jconsole is a simple tool bundled with the JDK, so we can run it directly from the console:

$ jconsole

To begin, we just need to pick the process we’re interested in:

Fig: jconsole initial screen

It allows us to check the threads and their stacks:

Fig: jconsole thread view

However, this tool doesn’t let us take a thread dump, at least not directly. Instead, we can use JMX to trigger one:

Fig: jconsole JMX

Most desktop tools provide access to JMX, so even when they don’t expose this functionality directly, we can use JMX for it.

VisualVM

VisualVM is a powerful tool for monitoring the performance of a JVM. Its initial screen is similar to jconsole’s:

Fig: VisualVM initial screen

It presents the data nicely using visualizations. For example, this is how the information about application threads looks in VisualVM:

Fig: VisualVM thread screen

Also, we can take the thread dump with a single click:

Fig: VisualVM thread dump

All the thread dumps will be attached to a process on the right:

Fig: VisualVM thread dumps

Technically, we can compare them, but there is no comprehensive comparison tool, so checking how thread states change over time is cumbersome. 

JDK Mission Control

JDK Mission Control is another comprehensive tool for application analysis. Overall, the UI looks similar to the previous ones we discussed:

Fig: JDK Mission Control

We can get access to diagnostic commands:

Fig: JDK Mission Control diagnostics commands

We can also use the MBean browser:

Fig: JDK Mission Control MBean Browser

Thus, all these tools provide a nice wrapper over the functionality available through the console. 

IDEs

If we run our application from an IDE and want more information about it, we can use the profilers built into the IDE itself. We’ll use IntelliJ IDEA as an example; however, other IDEs offer similar functionality.

IntelliJ IDEA has a built-in profiler that can help us monitor the application in real time, profile it, or get thread and heap dumps:

Fig: IntelliJ IDEA profiler view

We can create a thread dump directly from this profiler view:

Fig: IntelliJ IDEA taking a thread dump from a profiler view

The visualization is similar to VisualVM’s:

Fig: IntelliJ IDEA thread visualization

It provides some nice features for filtering and searching. We can also export the thread dump to a file. 

Additionally, we can get information about the threads while debugging. This way, we can have more control over the state we want to check:

Fig: IntelliJ IDEA thread visualization during debugging

This way, we don’t get an actual thread dump file, but it’s often more convenient during development.

yCrash 

Although thread dumps are readable and can be analyzed directly, this approach isn’t very convenient. We used a simple application with around twenty threads, but for applications of significant size, it’s impossible to get a comprehensive picture just by reading raw files. 

Also, while we can collect thread dumps manually, the results might not be consistent. It’s recommended to create several thread dumps with pauses between them. This way, we can see the application in motion, which helps us reason about any possible issues.
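As a minimal sketch (the count and interval are arbitrary, and jcmd is just one of the tools discussed above), we could capture such a series of dumps like this:

$ for i in 1 2 3; do jcmd [PID] Thread.print > thread_dump_$i.txt; sleep 10; done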

fastThread

To analyze thread dumps, we can use fastThread, a tool provided by yCrash. Its purpose is to help with thread dump analysis, and it aims to compare and analyze the application state over time. To start with, we can upload multiple files at the same time:

Fig: fastThread multiple upload

The report contains an overview of the thread dump. The visualizations help us to reason about our application:

Fig: fastThread summary view

For example, the report highlights identical thread stacks:

Fig: fastThread highlighting identical thread stacks

Additionally, we can compare the reports and see trends and changes in the thread states. This is quite a nice feature, as it gives us a more dynamic view of our application:

Fig: fastThread comparative summary

A comparative summary helps us analyze changes in our application and ensure we don’t run into problems under high load. As shown previously, an idle application won’t utilize all of its threads. 

We can also access the reports historically. This way, we won’t lose reference reports and can always compare them with the current ones:

Fig: yCrash incident dashboard

Apart from comparing the reports within a single upload, we can compare the reports across all the uploads using the yCrash dashboard:

Fig: yCrash report comparison

This way, we can track the performance and compare it to the previous weeks and months.

Micro-Metrics Monitoring

Using the yCrash agent can simplify the process further. The agent captures a thread dump along with additional information needed for a comprehensive analysis. 

The yCrash agent provides the m3 mode, which collects all the necessary metrics from an application and sends them to the yCrash server. The script can run in the background, allowing us to monitor our application continuously.

The yCrash script is an excellent way to gather information about running applications. Automating this process helps avoid human error and reduces the time we need to spend on it.

Conclusion

Monitoring is an integral part of software development. It helps us identify problems and find ways to improve our application. 

Taking a thread dump doesn’t create much overhead, yet it can provide valuable insight into the inner workings of an application. Doing it regularly is a good practice that helps avoid unexpected issues.

yCrash helps automate this process and collects and stores all the reports in the same place. These reports can be easily accessed and compared for further analysis or used as references for the future.