Writing Advanced Applications, Chapter 7: Analyzing Stack Traces

Stack traces have often been considered a mystery to developers. There is little or no documentation available, and when you get one or need to generate one, time is always at a premium. The next sections uncover the secrets to debugging stack traces, and by the end, you might consider a stack trace to be a helpful tool for analyzing other programs--not just broken ones!

What is a stack trace produced by the Java™ platform? It is a user-friendly snapshot of the threads and monitors in a Java¹ VM. Depending on how complex your application or applet is, a stack trace can range from fifty lines to thousands of lines of diagnostics.

Regardless of the size of the stack trace, there are a few key things that anyone can find to help diagnose most software problems, whether you are an expert or very new to the Java platform.

There are three popular ways to generate a stack trace: sending a signal to the Java VM, having the Java VM generate one for you, or using debugging tools or API calls.

Sending a Signal to the Java VM
The Java VM Generates a Stack Trace
Using Debugging Tools or API Calls
What to Look For First
Which Release Generated the Stack Trace?
Which Platform Generated the Stack Trace?
Which Thread Package Was Used?
What are the Thread States?
Examining Monitors
Putting the Steps Into Practice
Expert's Checklist

Sending a Signal to the Java VM

On UNIX platforms, you can send a signal to a program with the kill command. The signal to send is the quit signal, which is handled by the Java Virtual Machine (VM).

Unix Systems:

For example, on the Solaris™ platform, you can use the kill -QUIT process_id command, where process_id is the process number of your program.

Alternatively, you can enter the key sequence <ctrl>\ in the window where the program started.

Sending this signal instructs a signal handler in the Java VM to recursively print out all the information on the threads and monitors inside the Java VM.
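A comparable thread listing can also be produced from inside a program. The following is a minimal sketch, assuming a J2SE 5.0 or later runtime, because Thread.getAllStackTraces and Thread.getState were added after the releases this chapter describes. The class name ThreadSnapshot is hypothetical, and the output covers threads only; it does not include the monitor dump that the signal handler prints.

```java
import java.util.Map;

class ThreadSnapshot {

    /** Builds a text listing of every live thread and its current call stack. */
    static String snapshot() {
        StringBuilder out = new StringBuilder();
        for (Map.Entry<Thread, StackTraceElement[]> entry
                : Thread.getAllStackTraces().entrySet()) {
            Thread t = entry.getKey();
            // Header line loosely modeled on the thread headers in a VM stack trace.
            out.append('"').append(t.getName()).append('"')
               .append(" state:").append(t.getState())
               .append(" prio=").append(t.getPriority()).append('\n');
            for (StackTraceElement frame : entry.getValue()) {
                out.append("    at ").append(frame).append('\n');
            }
            out.append('\n');
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.print(snapshot());
    }
}
```

The state names printed here (for example RUNNABLE or BLOCKED) are those of the later Thread.State enum, not the R, CW, and MW keys discussed in this chapter.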

Windows 95/NT:

To generate a stack trace on the Windows 95 or Windows NT platforms, enter the key sequence <ctrl><break> in the window where the program is running.

The Java VM Generates a Stack Trace

If the Java VM experiences an internal error such as a segmentation violation or an illegal page fault, it calls its own signal handler to print out the thread and monitor information.

Using Debugging Tools or API Calls

You can generate a partial stack trace (which in this case contains only the thread information) by using the Thread.dumpStack method or the printStackTrace method of the Throwable class.
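As a short illustration of these two API calls, the sketch below prints the current thread's stack with Thread.dumpStack and captures the same kind of information as a string through Throwable.printStackTrace. The class and method names (PartialTraceDemo, currentStackAsText, worker) are made up for this example.

```java
import java.io.PrintWriter;
import java.io.StringWriter;

class PartialTraceDemo {

    /** Returns the current thread's call stack as text, without throwing anything. */
    static String currentStackAsText() {
        StringWriter buffer = new StringWriter();
        // A Throwable records the stack of the thread that creates it.
        new Throwable("stack snapshot").printStackTrace(new PrintWriter(buffer, true));
        return buffer.toString();
    }

    static void worker() {
        // Writes the current thread's stack to System.err.
        Thread.dumpStack();
    }

    public static void main(String[] args) {
        worker();
        System.out.print(currentStackAsText());
    }
}
```

Both calls report only the calling thread; neither shows other threads or any monitor information.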

You can also obtain similar information by entering where inside the Java debugger.

If you are successful at generating a stack trace, you should see something similar to this stack trace.

strings core | grep JAVA_HOME

In the Java 2 software release, threads that called methods resulting in a call to native code are indicated in the stack trace.

Which Release Generated The Stack Trace?

In the Java 2 release, the stack trace contains the Java Virtual Machine version string, the same information you see when using the -version parameter.
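The version information is also available to a running program through standard system properties. This sketch simply reads java.vm.name, java.vm.version, and java.version; the class name VmVersion is hypothetical, and the exact strings depend on the VM in use.

```java
class VmVersion {

    /** Describes the running VM using standard system properties. */
    static String describe() {
        return System.getProperty("java.vm.name") + " "
             + System.getProperty("java.vm.version")
             + " (java.version " + System.getProperty("java.version") + ")";
    }

    public static void main(String[] args) {
        System.out.println(describe());
    }
}
```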

However, if there is no version string, you can still take a pretty good guess at which release the stack trace came from. Obviously, if you generated the stack trace yourself this should not be much of an issue, but you may see a stack trace posted on a newsgroup or in an email article.

First identify where the Registered Monitor Dump section is in the stack trace:

If you see a utf8 hash table lock in the Registered Monitor Dump, this is a Java 2 platform stack trace. The final release of the Java 2 platform also contains a version string, so if a version string is missing, the stack trace may be from a Java 2 beta release.
If you see a JNI pinning lock and no utf8 hash table lock, this is a JDK 1.1+ release.

If neither of these appears in the Registered Monitor Dump, it is probably a JDK 1.0.2 release.
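The three rules above can be sketched as a small text check. This is only an illustration: the class ReleaseGuess and its method are hypothetical, and the code assumes the Registered Monitor Dump has already been extracted from the trace as a string.

```java
class ReleaseGuess {

    /**
     * Guesses the release from the text of a Registered Monitor Dump,
     * applying the rules in the order given in the text.
     */
    static String guessRelease(String registeredMonitorDump) {
        if (registeredMonitorDump.contains("utf8 hash table")) {
            return "Java 2";
        }
        if (registeredMonitorDump.contains("JNI pinning lock")) {
            return "JDK 1.1+";
        }
        return "JDK 1.0.2 (probably)";
    }

    public static void main(String[] args) {
        // Assumed sample line; the "<unowned>" suffix follows the samples in this chapter.
        System.out.println(guessRelease("JNI pinning lock: <unowned>"));
    }
}
```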

Which Platform Generated the Stack Trace?

You can also find out whether the stack trace came from a Windows 95, Windows NT, or UNIX machine by looking for any waiting threads. On a UNIX machine, the waiting threads are named explicitly. On a Windows 95 or Windows NT machine, only a count of the waiting threads is displayed:

Windows 95/NT: Finalize me queue lock: <unowned> Writer: 1
UNIX: Finalize me queue lock: <unowned>
waiting to be notified "Finalizer Thread"

Which Thread Package was Used?

Windows 95 and Windows NT Java VMs are native thread Java VMs by default. UNIX Java VMs are green thread Java VMs by default; they use a pseudo-thread implementation. To make your Java VM use native threads, supply the -native parameter, for example, java -native MyClass.

By verifying the existence of an Alarm monitor in the stack trace output you can identify that this stack trace came from a green threads Java VM.
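The platform and thread package hints can be expressed as simple text checks. The sketch below is an interpretation of the two sample lines shown earlier and of the Alarm monitor rule, not a definitive parser: the class TraceOrigin, its method names, and the exact patterns matched are assumptions.

```java
import java.util.regex.Pattern;

class TraceOrigin {

    // Windows traces show only a count of waiters, as in "Writer: 1" (assumed pattern).
    private static final Pattern WAITER_COUNT = Pattern.compile("Writer:\\s*\\d+");

    /** Guesses the platform from how waiting threads are reported. */
    static String guessPlatform(String dump) {
        if (dump.contains("waiting to be notified")) {
            return "UNIX";              // waiting threads are named explicitly
        }
        if (WAITER_COUNT.matcher(dump).find()) {
            return "Windows 95/NT";     // only a count is displayed
        }
        return "unknown";
    }

    /** Guesses the thread package from the presence of an Alarm monitor. */
    static String guessThreadPackage(String dump) {
        return dump.toLowerCase().contains("alarm monitor")
                ? "green threads"
                : "native threads (probably)";
    }

    public static void main(String[] args) {
        String sample = "Finalize me queue lock: <unowned> Writer: 1";
        System.out.println(guessPlatform(sample) + ", " + guessThreadPackage(sample));
    }
}
```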

What are the Thread States?

You will see many different threads in many different states in a snapshot from a Java VM stack trace. This table describes the various keys and their meanings.

Key    Meaning

R      Running or runnable thread
S      Suspended thread
CW     Thread waiting on a condition variable
MW     Thread waiting on a monitor lock
MS     Thread suspended waiting on a monitor lock

Normally, only threads in states R, S, CW, or MW should appear in the stack trace. If you see a thread in state MS, report it to Sun Microsystems through the Java Developer Connection℠ (JDC) Bug Parade feature, because there is a good chance it is a bug: most of the time, a thread in the monitor wait (MW) state appears in the S state when it is suspended.
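When scanning a long trace, it can help to pull the state key out of each thread header and translate it. The following sketch assumes a header of the form used by the traces in this chapter, with a "state:XX" field; the class ThreadStateKey and its methods are hypothetical helpers.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class ThreadStateKey {

    private static final Pattern STATE = Pattern.compile("state:\\s*(\\w+)");

    /** Translates a state key from the table above into its meaning. */
    static String meaning(String key) {
        switch (key) {
            case "R":  return "Running or runnable thread";
            case "S":  return "Suspended thread";
            case "CW": return "Thread waiting on a condition variable";
            case "MW": return "Thread waiting on a monitor lock";
            case "MS": return "Thread suspended waiting on a monitor lock";
            default:   return "Unknown state key";
        }
    }

    /** Extracts the state key from a thread header line, or null if there is none. */
    static String stateOf(String headerLine) {
        Matcher m = STATE.matcher(headerLine);
        return m.find() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        String header = "\"main\" (TID:0xebc981e0, sys_thread_t:0x26bb0, state:CW) prio=5";
        System.out.println(meaning(stateOf(header)));
    }
}
```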

Monitors are used to manage access to code that should only be run by a single thread at a time. Monitors are covered in more detail in the next section. The other two common thread states you may see are R, runnable threads, and CW, threads in a condition wait state. Runnable threads are, by definition, threads that could be running or are running at that instant. On a multiprocessor machine running a true multiprocessing operating system, it is possible for all the runnable threads to be running at one time. However, it is more likely that the other runnable threads are waiting on the thread scheduler for their turn to run.

Threads in a condition wait state can be thought of as waiting for an event to occur. Often a thread appears in state CW if it is in a Thread.sleep call or in a synchronized wait. In our earlier stack trace, our main method was waiting for a thread to complete and to be notified of its completion. In the stack trace this appears as:

"main" (TID:0xebc981e0, sys_thread_t:0x26bb0, 
				state:CW) prio=5
 at java.lang.Object.wait(Native Method)
 at java.lang.Object.wait(Object.java:424)
 at HangingProgram.main(HangingProgram.java:33)

The code that created this stack trace is as follows:

  synchronized(t1) {
    try {
      t1.wait();    //line 33
    }catch (InterruptedException e){}
  }

In the Java 2 release, monitor operations, including the wait here, are handled by the Java Virtual Machine through a JNI call to sysMonitor. The thread in condition wait is kept on a special monitor wait queue on the object it is waiting on. This explains why code that only waits on an object must still be synchronized on that object: the wait is in fact using that object's monitor.
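The requirement to hold the monitor is easy to observe. In this sketch, calling wait on an object without being synchronized on it fails with IllegalMonitorStateException, while the same call inside a synchronized block succeeds. The class and method names are made up for the example, and a 10 millisecond timeout is used so that the demonstration does not hang.

```java
class WaitNeedsMonitor {

    /** Returns true if calling wait() without holding the monitor is rejected. */
    static boolean waitWithoutLockFails() {
        Object lock = new Object();
        try {
            lock.wait(10);              // not inside synchronized (lock)
            return false;
        } catch (IllegalMonitorStateException expected) {
            return true;                // this thread does not own lock's monitor
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }

    /** Returns true if wait() completes normally when the monitor is held. */
    static boolean waitWithLockSucceeds() {
        Object lock = new Object();
        synchronized (lock) {
            try {
                lock.wait(10);          // gives up the monitor while waiting
                return true;
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return false;
            }
        }
    }

    public static void main(String[] args) {
        System.out.println("wait without lock fails: " + waitWithoutLockFails());
        System.out.println("wait with lock succeeds: " + waitWithLockSucceeds());
    }
}
```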

Examining Monitors

This brings us to the other part of the stack trace: the monitor dump. If you consider that the threads section of a stack trace identifies the multithreaded part of your application, then the monitors section represents the parts of your application that are single threaded.

It may be easier to imagine a monitor as a car wash. In most car washes, only one car can be in the wash at a time. In your Java code only one thread at a time can have the lock to a synchronized piece of code. All the other threads queue up to enter the synchronized code just as cars queue up to enter the car wash.
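The car wash analogy translates directly into code. In the sketch below, the hypothetical CarWash class has a synchronized wash method, several threads call it repeatedly, and the class records the largest number of threads ever inside the method at once. Because the method is synchronized, that number should never exceed one.

```java
class CarWash {

    private int carsInside = 0;
    private int maxCarsInside = 0;
    private int carsWashed = 0;

    /** Only one thread at a time can hold this object's monitor and run this method. */
    synchronized void wash() {
        carsInside++;
        maxCarsInside = Math.max(maxCarsInside, carsInside);
        carsWashed++;
        carsInside--;
    }

    /** Runs the given number of threads and returns {carsWashed, maxCarsInside}. */
    static int[] run(int threads, int washesPerThread) {
        CarWash wash = new CarWash();
        Thread[] cars = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            cars[i] = new Thread(() -> {
                for (int j = 0; j < washesPerThread; j++) {
                    wash.wash();
                }
            });
            cars[i].start();
        }
        for (Thread car : cars) {
            try {
                car.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }
        synchronized (wash) {
            return new int[] { wash.carsWashed, wash.maxCarsInside };
        }
    }

    public static void main(String[] args) {
        int[] result = run(4, 1000);
        System.out.println("washed=" + result[0] + " maxInside=" + result[1]);
    }
}
```

In a stack trace taken while this program runs, the threads queued at the wash method are the ones that would appear in the monitor wait state.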

A monitor can be thought of as a lock on an object, and every object has a monitor. When you generate a stack trace, monitors are either listed as being registered or not. In the majority of cases these registered monitors, or system monitors, should not be the cause of your software problems, but it helps to be able to understand and recognize them. The following table describes the common registered monitors:

Monitor                           Description

utf8 hash table                   Locks the hash table of defined i18N
                                  strings that were loaded from the class
                                  constant pool.

JNI pinning lock                  Protects block copies of arrays to native
                                  method code.

JNI global reference lock         Locks the global reference table, which
                                  holds values that need to be explicitly
                                  freed and that outlive the lifetime of
                                  the native method call.

BinClass lock                     Locks access to the list of loaded and
                                  resolved classes (the global table of
                                  classes).

Class linking lock                Protects a class's data when loading
                                  native libraries to resolve symbolic
                                  references.

System class loader lock          Ensures that only one thread is loading a
                                  system class at a time.

Code rewrite lock                 Protects code when an optimization is
                                  attempted.

Heap lock                         Protects the Java heap during heap memory
                                  management.

Monitor cache lock                Ensures the integrity of the monitor
                                  cache by allowing only one thread at a
                                  time to access it.

Dynamic loading lock              Protects UNIX green threads JVMs from
                                  loading the shared library stub libdl.so
                                  more than once at a time.

Monitor IO lock                   Protects physical I/O, for example, open
                                  and read.

User signal monitor               Controls access to the signal handler for
                                  a user signal (USRSIG) in green threads
                                  JVMs.

Child death monitor               Controls access to the process wait
                                  information when using the runtime system
                                  calls to run local commands in a green
                                  threads JVM.

I/O Monitor                       Controls access to the threads' file
                                  descriptors for poll/select events.

Alarm Monitor                     Controls access to a clock handler used
                                  in green threads JVMs to handle timeouts.

Thread queue lock                 Protects the queue of active threads.

Monitor registry                  Ensures the integrity of the monitor
                                  registry by allowing only one thread at a
                                  time to access it.

Has finalization queue lock *     Protects the list of queue lock objects
                                  that have been garbage-collected and
                                  deemed to need finalization. They are
                                  copied to the Finalize me queue.

Finalize me queue lock *          Protects a list of objects that can be
                                  finalized at leisure.

Name and type hash table lock *   Protects the JVM hash tables of constants
                                  and their types.

String intern lock *              Locks the hash table of defined strings
                                  that were loaded from the class constant
                                  pool.

Class loading lock *              Ensures that only one thread loads a
                                  class at a time.

Java stack lock *                 Protects the free stack segments list.

Note: * Lock appeared only in pre-Java 2 stack traces.

The monitor registry itself is protected by a monitor. This means the thread that owns the lock is the last thread to use a monitor. It is very likely this thread is also the current thread. Because only one thread can enter a synchronized block at a time, other threads queue up at the start of the synchronized code and appear as thread state MW. In the monitor cache dump, they are denoted as "waiting to enter" threads. In user code a monitor is called into action wherever a synchronized block or method is used.

Any code waiting on an object or event (a wait method) also has to be inside a synchronized block. However, once the wait method is called, the lock on the synchronized object is given up.

When the thread in the wait state is notified of an event to the object, it has to compete for exclusive access to that object, and it has to obtain the monitor. Even when a thread has sent a "notify event" to the waiting threads, none of the waiting threads can actually gain control of the monitor lock until the notifying thread has left its synchronized code block.

You will see "Waiting to be notified" for threads at the wait method.
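The ordering described above can be observed with a small experiment. In this sketch, a waiter thread waits on a lock object, and the main thread sends a notify and then records two more events before leaving its synchronized block. The waiter can only record its own event after the notifying thread has released the monitor. The class NotifyOrder, the ready flag, and the event strings are all made up for the illustration.

```java
import java.util.ArrayList;
import java.util.List;

class NotifyOrder {

    /** Returns the events in the order they were recorded. */
    static List<String> run() {
        final Object lock = new Object();
        final List<String> events = new ArrayList<>();   // touched only while holding lock
        final boolean[] ready = { false };

        Thread waiter = new Thread(() -> {
            synchronized (lock) {
                while (!ready[0]) {                       // guard against early or spurious wakeups
                    try {
                        lock.wait();                      // gives up the monitor while waiting
                    } catch (InterruptedException e) {
                        return;
                    }
                }
                events.add("waiter resumed");
            }
        });
        waiter.start();

        synchronized (lock) {
            ready[0] = true;
            lock.notify();
            // The waiter cannot proceed yet: this thread still owns the monitor.
            events.add("notify sent");
            events.add("notifier leaving synchronized block");
        }

        try {
            waiter.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return events;
    }

    public static void main(String[] args) {
        System.out.println(run());
    }
}
```

The waiter's event always comes last, whether or not the waiter had already reached its wait call when the notify was sent.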

Putting the Steps Into Practice

Example 1

Consider a real-life problem such as Bug ID 4098756, for example. You can find details on this bug in JDC Bug Parade. This bug documents a problem that occurs when using a Choice Component on Windows 95.

When the user selects one of the choices from the Choice Component using the mouse, everything is fine. However, when the user tries to use an Arrow key to move up or down the list of choices, the Java application freezes.

Fortunately, this problem is reproducible and there was a Java stack trace to help track down the problem. The full stack trace is in the bug report page, but you only need to focus on the following two key threads:


"AWT-Windows" (TID:0xf54b70, 
sys_thread_t:0x875a80,Win32ID:0x67, 
state:MW) prio=5
java.awt.Choice.select(Choice.java:293)
sun.awt.windows.WChoicePeer.handleAction(
                              WChoicePeer.java:86)

"AWT-EventQueue-0" (TID:0xf54a98,sys_thread_t:0x875c20,
Win32ID:0x8f, state:R) prio=5
java.awt.Choice.remove(Choice.java:228)
java.awt.Choice.removeAll(Choice.java:246)

The AWT-EventQueue-0 thread is in a runnable state inside the remove method. The remove method is synchronized, which explains why the AWT-Windows thread cannot enter the select method. The AWT-Windows thread is in MW state (monitor wait); however, if you keep taking stack traces, this situation does not change and the graphical user interface (GUI) appears to have frozen.

This indicates that the remove call never returned. By following the code path to the ChoicePeer class, you can see this is making a native MFC call that does not return. That is where the real problem lies and is a bug in the Java core classes. The user's code was okay.
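The same pattern, one thread stuck inside a synchronized region while another waits to enter it, can be reproduced in miniature. The sketch below is a simulation, not the AWT code: a plain object stands in for the Choice component, a latch simulates the call that does not return, and Thread.getState (J2SE 5.0 and later) reports BLOCKED for the thread that this chapter's traces would show in state MW. All names are hypothetical.

```java
import java.util.concurrent.CountDownLatch;

class StuckMonitorDemo {

    /** Returns the state observed for a thread waiting to enter a held monitor. */
    static Thread.State observeWaitingThread() {
        final Object choice = new Object();               // stands in for the Choice component
        final CountDownLatch holderHasLock = new CountDownLatch(1);
        final CountDownLatch release = new CountDownLatch(1);

        Thread holder = new Thread(() -> {
            synchronized (choice) {
                holderHasLock.countDown();
                awaitQuietly(release);                    // simulates a call that does not return
            }
        }, "holder");
        Thread waiter = new Thread(() -> {
            synchronized (choice) {
                // Reached only after the holder releases the monitor.
            }
        }, "waiter");
        holder.setDaemon(true);
        waiter.setDaemon(true);

        holder.start();
        awaitQuietly(holderHasLock);
        waiter.start();

        // Poll until the waiter is blocked on the monitor, or give up after 5 seconds.
        Thread.State seen = waiter.getState();
        long deadline = System.currentTimeMillis() + 5000;
        while (seen != Thread.State.BLOCKED && System.currentTimeMillis() < deadline) {
            Thread.yield();
            seen = waiter.getState();
        }
        release.countDown();                              // let both threads finish
        return seen;
    }

    private static void awaitQuietly(CountDownLatch latch) {
        try {
            latch.await();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    public static void main(String[] args) {
        System.out.println("waiter state: " + observeWaitingThread());
    }
}
```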

Example 2

In this second example, you will investigate a bug that at first appears to be a fault in Swing but, as you will discover, is due to the fact that Swing is not thread safe.

Again, the bug report is available to view on the JDC site; the bug number this time is 4098525.

Here is a cut-down sample of the code used to reproduce this problem. The modal dialog is created from within the JPanel paint method.


import java.awt.event.*;
import java.awt.*;
import java.util.*;
import javax.swing.*;

class MyDialog extends Dialog 
                         implements ActionListener {

    MyDialog(Frame parent) {
        super(parent, "My Dialog", true); 
        Button okButton = new Button("OK");
        okButton.addActionListener(this);
        add(okButton);
        pack();
    }

    public void actionPerformed(ActionEvent event) {
         dispose();
    }
}

public class Tester extends JPanel {

    MyDialog myDialog;
    boolean firstTime = true;

    public Tester (JFrame frame) throws Exception {
        super();
        myDialog = new MyDialog(frame);
    }

    void showDialogs() {
        myDialog.show();
    }

    public void paint(Graphics g) {
        super.paint(g);
        if (firstTime) {
           firstTime = false;
           showDialogs();
        }
    }

    public static void main(String args[]) 
                              throws Exception {

       JFrame frame = new JFrame ("Test");
       Tester gui = new Tester(frame);
       frame.getContentPane().add(gui);
       frame.setSize(800, 600);
       frame.pack();
       frame.setVisible(true);
    }
}

When you run this program, you find that it deadlocks straight away. By taking a stack trace, you can see the key threads.

The stack trace shown here is slightly different from the one that appears in the bug report, but it is caused by the same effect. This trace was generated with the Java 2 release, and the program was run with the option -Djava.compiler=NONE so that the source line numbers are visible. The thread to look for is the one in state MW (monitor wait), which in this case is thread AWT-EventQueue-1:


"AWT-EventQueue-1" (
       TID:0xebca8c20, sys_thread_t:0x376660,
       state:MW) prio=6
 at java.awt.Component.invalidate(Component.java:1664)
 at java.awt.Container.invalidate(Container.java:507)
 at java.awt.Window.dispatchEventImpl(Window.java:696)
 at java.awt.Component.dispatchEvent(
                          Component.java:2289)
 at java.awt.EventQueue.dispatchEvent(
                          EventQueue.java:258)
 at java.awt.EventDispatchThread.run(
                          EventDispatchThread.java:68)

If you look for that line in file java/awt/Component.java which is contained in the src.jar archive, you see the following:

    public void invalidate() {
        synchronized (getTreeLock()) { //line 1664

This is where our application is stuck: it is waiting for the getTreeLock monitor lock to become free. The next task is to find out which thread holds this getTreeLock monitor lock.

To see who is holding this monitor lock you look at the Monitor cache dump and in this example you can see the following:


Monitor Cache Dump:
  java.awt.Component$AWTTreeLock@EBC9C228/EBCF2408: 
	owner "AWT-EventQueue-0" ( 0x263850) 3 entries
  Waiting to enter:
    "AWT-EventQueue-1" (0x376660)

The getTreeLock monitor is actually a lock on a specially created object of the inner class AWTTreeLock. This is the code used to create that lock in file Component.java.

    static final Object LOCK = new AWTTreeLock();
    static class AWTTreeLock {}

The current owner is AWT-EventQueue-0. This thread called our paint method to create our modal Dialog via a call to paintComponent. paintComponent itself was called from an update call of JFrame.
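Reading the owner and the waiting threads out of a Monitor cache dump entry can also be done mechanically. The sketch below assumes a single entry laid out like the one shown above, with an owner field followed by a "Waiting to enter:" list. The class MonitorCacheEntry and its methods are hypothetical, and other dump layouts may need different patterns.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class MonitorCacheEntry {

    private static final Pattern OWNER = Pattern.compile("owner\\s+\"([^\"]+)\"");
    private static final Pattern QUOTED = Pattern.compile("\"([^\"]+)\"");

    /** Returns the name of the owning thread, or null if the entry has no owner. */
    static String owner(String entry) {
        Matcher m = OWNER.matcher(entry);
        return m.find() ? m.group(1) : null;
    }

    /** Returns the names listed after "Waiting to enter:" in a single entry. */
    static List<String> waitingToEnter(String entry) {
        List<String> names = new ArrayList<>();
        int start = entry.indexOf("Waiting to enter:");
        if (start < 0) {
            return names;
        }
        Matcher m = QUOTED.matcher(entry.substring(start));
        while (m.find()) {
            names.add(m.group(1));
        }
        return names;
    }

    public static void main(String[] args) {
        String entry =
            "  java.awt.Component$AWTTreeLock@EBC9C228/EBCF2408:\n"
          + "        owner \"AWT-EventQueue-0\" ( 0x263850) 3 entries\n"
          + "  Waiting to enter:\n"
          + "    \"AWT-EventQueue-1\" (0x376660)\n";
        System.out.println(owner(entry) + " blocks " + waitingToEnter(entry));
    }
}
```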

So where was the lock set? There is no simple way to find out which stack frame actually held the lock, but a simple search of javax.swing.JComponent shows that getTreeLock is called inside the method paintChildren, which you left at line 388.

at Tester.paint(Tester.java:39)
at javax.swing.JComponent.paintChildren(
                            JComponent.java:388)

The rest of the puzzle is pieced together by analyzing the MDialogPeer show method. The Dialog code creates a new ModalThread, which is why you see an AWT-Modal thread in the stack trace output; this thread is used to post the Dialog. When this event is dispatched using AWT-EventQueue-1 (which used to be the AWT Dispatch proxy), access to the getTreeLock monitor is required, and so you have a deadlock.

Unfortunately, Swing code is not designed to be thread safe, so the workaround in this example is to not create modal dialogs inside Swing paint methods. Because Swing has to do a lot of locking and calculation to determine which parts of a lightweight component need to be painted, it is strongly advised not to include synchronized code, or code that results in a synchronized call (such as a modal dialog), inside a paint method.

This completes Java stack traces theory, and you should now know what to look for the next time you see a stack trace. To save time, you should make full use of the JDC bug search to see if the problem you are having has already been reported by someone else.

Expert's Checklist

To summarize, these are the steps to take the next time you come across a problem in a Java program.

Hanging, deadlocked or frozen programs: If you think your program is hanging, generate a stack trace. Examine the threads in states MW or CW. If the program is deadlocked, some of the system threads will probably show up as the current thread because there is nothing else for the Java VM to do.
Crashed or aborted programs: On UNIX look for a core file. You can analyze this file in a native debugging tool such as gdb or dbx. Look for threads that have called native methods. Because Java technology uses a safe memory model, any corruption probably occurred in the native code. Remember that the Java VM also uses native code so it might not be a bug in your application.
Busy programs: The best course of action you can take for busy programs is to generate frequent stack traces. This will narrow down the code path that is causing the errors, and you can start your investigation from there.
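The advice for busy programs, taking frequent stack traces, can be sketched in code as a simple sampler. This example assumes J2SE 5.0 or later for Thread.getStackTrace and Java 8 or later for the lambda syntax. It takes several snapshots of one thread and counts which method is on top of the stack each time. The class HotSpotSampler, the spin workload, and the sampling interval are all hypothetical choices for the demonstration.

```java
import java.util.Map;
import java.util.TreeMap;

class HotSpotSampler {

    private static volatile boolean spinning;

    /** Counts, over several snapshots, which method is on top of the target's stack. */
    static Map<String, Integer> sample(Thread target, int samples, long intervalMillis) {
        Map<String, Integer> counts = new TreeMap<>();
        for (int i = 0; i < samples; i++) {
            StackTraceElement[] stack = target.getStackTrace();
            if (stack.length > 0) {
                String top = stack[0].getClassName() + "." + stack[0].getMethodName();
                counts.merge(top, 1, Integer::sum);
            }
            try {
                Thread.sleep(intervalMillis);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                break;
            }
        }
        return counts;
    }

    /** Hypothetical busy workload used only for the demonstration. */
    static void spin() {
        double x = 0;
        while (spinning) {
            x += Math.sqrt(x + 1);
        }
    }

    /** Starts a busy daemon thread, samples it 20 times, then stops it. */
    static Map<String, Integer> demo() {
        spinning = true;
        Thread busy = new Thread(HotSpotSampler::spin, "busy");
        busy.setDaemon(true);
        busy.start();
        try {
            return sample(busy, 20, 5);
        } finally {
            spinning = false;
        }
    }

    public static void main(String[] args) {
        demo().forEach((method, hits) -> System.out.println(hits + "  " + method));
    }
}
```

Methods that show up in most of the samples are the likely place to start investigating.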


_______
¹ As used on this web site, the terms "Java virtual machine" or "JVM" mean a virtual machine for the Java platform.

[ This page was updated: 13-Oct-99 ]
