Stack traces have often been considered a mystery to developers.
There is little or no documentation available, and when you get one
or need to generate one, time is always at a premium. The next
sections uncover the secrets to debugging stack traces, and by the end,
you might consider a stack trace to be a helpful tool for analyzing other
programs--not just broken ones!
What is a stack trace produced by the Java platform? It is a
user-friendly snapshot of the threads and monitors in a Java VM [1].
Depending on how complex your application or applet is, a stack trace
can range from fifty lines to thousands of lines of diagnostics.
Regardless of the size of the stack trace, there are a few key
things that anyone can find to help diagnose most software problems,
whether you are an expert or very new to the Java platform.
There are three popular ways to generate a stack trace: sending
a signal to the Java VM, letting the Java VM generate a stack
trace for you after an internal error, or using debugging tools or API calls.
Sending a signal to the Java VM
On UNIX platforms you can send a signal to a program with the kill
command. The signal to send is the quit signal, which is handled by the
Java Virtual Machine (VM).
Unix Systems:
For example, on the Solaris
platform, you can use the kill -QUIT process_id command,
where process_id is the process number of your program.
Alternatively, you can enter the key sequence <ctrl>\ in
the window where the program started.
Sending this signal instructs a signal handler in the Java VM to recursively
print out all the information on the threads and monitors inside the Java
VM.
Windows 95/NT:
To generate a stack trace on the Windows 95 or Windows NT platforms, enter
the key sequence <ctrl><break> in the window where the
program is running.
The Java VM Generates a Stack Trace
If the Java VM experiences an internal error such as a segmentation violation
or an illegal page fault, it calls its own signal handler to print out the
threads and monitors information.
Using Debugging Tools or API Calls
You can generate a partial stack trace (which in this case is only the
threads information) by using the Thread.dumpStack method,
or the printStackTrace method of the Throwable class.
You can also obtain similar information by entering the
where command inside the Java debugger.
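As a minimal sketch of the two API calls just mentioned, the class below prints the current thread's stack with Thread.dumpStack and also captures the output of printStackTrace as a string. The class and method names (PartialTraceDemo, currentStackAsString) are illustrative and not part of any library.

```java
import java.io.PrintWriter;
import java.io.StringWriter;

class PartialTraceDemo {
    // Returns the calling thread's stack as text by asking a new
    // Throwable to print its stack trace into an in-memory buffer.
    static String currentStackAsString() {
        StringWriter buffer = new StringWriter();
        new Throwable("stack snapshot").printStackTrace(new PrintWriter(buffer, true));
        return buffer.toString();
    }

    public static void main(String[] args) {
        // Writes the current thread's stack to standard error.
        Thread.dumpStack();
        // The same kind of information as a string, for example for a log file.
        System.out.println(currentStackAsString());
    }
}
```

Both calls report only the thread that makes the call, which is why the result is a partial trace rather than the full thread and monitor dump.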
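If you are successful at generating a stack trace, you should see
output that lists the threads and monitors in the Java VM.
If the Java VM left a core file behind on a UNIX system, you can search
the strings in the core file for JAVA_HOME to see which Java installation
the program was using:
strings core | grep JAVA_HOME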
In the Java 2 software release, threads that called methods resulting
in a call to native code are indicated in the stack trace.
Which Release Generated The Stack Trace?
In the Java 2 release the stack trace contains the Java Virtual Machine
version string, the same information you see when using the -version parameter.
However, if there is no version string, you can still take a pretty good guess at
which release this stack trace came from. Obviously, if you generated
the stack trace yourself this should not be much of an issue, but you may
see a stack trace posted on a newsgroup or in an email message.
First identify where the Registered Monitor Dump section is in the stack
trace:
- If you see a utf8 hash table lock in the Registered Monitor Dump, this
is a Java 2 platform stack trace. The final release of the Java 2 platform
also contains a version string, so if a version string is missing this stack
trace may be from a Java 2 beta release.
- If you see a JNI pinning lock and no utf8 hash table lock, this is a
JDK 1.1+ release.
If neither of these appears in the Registered Monitor Dump, it
is probably a JDK 1.0.2 release.
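The decision rules above can be sketched as a small text check. This is only an illustration: the class name TraceReleaseGuesser is hypothetical, and the code assumes the monitor section is introduced by the literal text "Registered Monitor Dump". If that heading is not found, the whole trace is searched instead.

```java
class TraceReleaseGuesser {
    // Applies the rules from the text: utf8 hash table lock means Java 2,
    // JNI pinning lock without it means JDK 1.1+, neither means JDK 1.0.2.
    static String guessRelease(String trace) {
        // Assumption: the section heading appears as this literal text.
        int start = trace.indexOf("Registered Monitor Dump");
        String monitors = start >= 0 ? trace.substring(start) : trace;

        if (monitors.contains("utf8 hash table")) {
            return "Java 2";
        }
        if (monitors.contains("JNI pinning lock")) {
            return "JDK 1.1+";
        }
        return "probably JDK 1.0.2";
    }

    public static void main(String[] args) {
        String sample = "Registered Monitor Dump:\n    JNI pinning lock: <unowned>\n";
        System.out.println(guessRelease(sample));
    }
}
```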
Which Platform Generated the Stack Trace?
You can also find out whether the stack trace came from a Windows 95,
Windows NT, or UNIX machine by looking for any waiting threads. On a UNIX
machine the waiting threads are named explicitly. On a Windows 95 or NT
machine only a count of the waiting threads is displayed:
- Windows 95/NT: Finalize me queue lock: <unowned> Writer: 1
- UNIX: Finalize me queue lock: <unowned> waiting to be notified "Finalizer Thread"
Which Thread Package was Used?
Windows 95 and Windows NT Java VMs are by default native
thread Java VMs. UNIX Java VMs are by default green thread Java
VMs, which use a pseudo-thread implementation. To make your Java VM use
native threads you need to supply the -native parameter, for example,
java -native MyClass.
If you find an Alarm monitor in the stack trace output, you can tell
that the stack trace came from a green threads Java VM.
What are the Thread States?
You will see many different threads in many different states in a snapshot
from a Java VM stack trace. This table describes the various keys
and their meanings.
Key | Meaning |
R | Running or runnable thread |
S | Suspended thread |
CW | Thread waiting on a condition variable |
MW | Thread waiting on a monitor lock |
MS | Thread suspended waiting on a monitor lock |
Normally, only threads in R, S, CW or MW should appear in the
stack trace. If you see a thread in state MS, report it to Sun
Microsystems through the Java Developer Connection (JDC) Bug Parade
feature, because there is a good chance it is a bug: most of the time,
a thread in Monitor Wait (MW) state appears in the S state when it is
suspended.
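When a trace runs to thousands of lines, it can help to pull out just the thread names and state keys. The sketch below assumes thread header lines of the form "name" (TID:..., state:XX), as in the traces quoted in this article. The class ThreadStateScanner and its methods are illustrative names, not an existing tool.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class ThreadStateScanner {
    // Matches: "thread name" (TID:... state:KEY)
    // The header may be wrapped over several lines; [^)] also matches newlines.
    private static final Pattern HEADER =
        Pattern.compile("\"([^\"]+)\"\\s*\\(TID:[^)]*?state:(\\w+)\\)");

    // Returns thread name -> state key, in the order the threads appear.
    static Map<String, String> statesByThread(String trace) {
        Map<String, String> states = new LinkedHashMap<>();
        Matcher m = HEADER.matcher(trace);
        while (m.find()) {
            states.put(m.group(1), m.group(2));
        }
        return states;
    }

    // Translates a state key using the table of thread states.
    static String describe(String key) {
        switch (key) {
            case "R":  return "Running or runnable thread";
            case "S":  return "Suspended thread";
            case "CW": return "Thread waiting on a condition variable";
            case "MW": return "Thread waiting on a monitor lock";
            case "MS": return "Thread suspended waiting on a monitor lock";
            default:   return "Unknown state key";
        }
    }
}
```

Filtering the resulting map for MW and CW entries gives a quick list of the threads worth reading first.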
Monitors are used to manage access to code that should only be run
by a single thread at a time. Monitors are covered in more detail in the
next section. The other two common thread states you may see are R,
runnable threads, and CW, threads in a condition wait state. Runnable
threads are, by definition, threads that could be running or are running
at that instant. On a multi-processor machine running a true
multi-processing operating system, it is possible for all the runnable
threads to be running at one time. However, it is more likely that the
other runnable threads are waiting on the thread scheduler for their
turn to run.
Threads in a condition wait state can be thought of as waiting for an event
to occur. Often a thread will appear in state CW if it is in a Thread.sleep
call or in a synchronized wait. In our earlier stack trace, our main
method was waiting for a thread to complete and to be notified of its
completion. In the stack trace this appears as:
"main" (TID:0xebc981e0, sys_thread_t:0x26bb0,
state:CW) prio=5
at java.lang.Object.wait(Native Method)
at java.lang.Object.wait(Object.java:424)
at HangingProgram.main(HangingProgram.java:33)
The code that created this stack trace is as follows:
synchronized(t1) {
    try {
        t1.wait(); //line 33
    } catch (InterruptedException e) {}
}
In the Java 2 release, monitor operations, including our wait here, are
handled by the Java Virtual Machine through a JNI call to sysMonitor. The
thread in condition wait is kept on a special monitor wait queue on the
object it is waiting on. This explains why, even though you are only
waiting on an object, the code still needs to be synchronized on that
object: it is in fact using the monitor for that object.
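The requirement that wait be called while holding the object's monitor can be checked directly. In this small sketch (the class and method names are illustrative), the first method calls wait without synchronizing and is rejected with IllegalMonitorStateException. The second waits correctly for a worker thread to signal completion.

```java
class WaitNeedsMonitorDemo {
    // Returns true if wait() outside a synchronized block is rejected.
    static boolean waitWithoutLockFails() {
        Object lock = new Object();
        try {
            lock.wait(10); // the calling thread does not own lock's monitor
            return false;
        } catch (IllegalMonitorStateException expected) {
            return true;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }

    // Waits, while holding the monitor, until a worker reports completion.
    static boolean waitForWorker() {
        final Object lock = new Object();
        final boolean[] done = {false};
        Thread worker = new Thread(() -> {
            synchronized (lock) {
                done[0] = true;
                lock.notifyAll();
            }
        });
        try {
            synchronized (lock) {
                worker.start();
                // wait() releases the monitor, which lets the worker enter.
                while (!done[0]) {
                    lock.wait();
                }
            }
            worker.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return done[0];
    }
}
```

The loop around wait guards against waking up before the condition is actually true.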
Examining Monitors
This brings us to the other part of the stack trace: the monitor dump.
If you consider that the threads section of a stack trace identifies the
multithreaded part of your application, then the monitors section
represents the parts of your application that are single threaded.
It may be easier to imagine a monitor as a car wash. In most car washes, only
one car can be in the wash at a time. In your Java code only one thread
at a time can have the lock to a synchronized piece of code.
All the other threads queue up to enter the synchronized code just
as cars queue up to enter the car wash.
A monitor can be thought of as a lock on an object, and every object
has a monitor. When you generate a stack trace, monitors are either
listed as being registered or not. In the majority of cases these
registered monitors, or system monitors, should not be the cause of
your software problems, but it helps to be able to understand and recognize
them. The following table describes the common registered monitors:
Monitor | Description |
utf8 hash table | Locks the hashtable of defined i18N Strings that were loaded from the class constant pool. |
JNI pinning lock | Protects block copies of arrays to native method code. |
JNI global reference lock | Locks the global reference table, which holds values that need to be explicitly freed and will outlive the lifetime of the native method call. |
BinClass lock | Locks access to the loaded and resolved classes list (the global table of classes). |
Class linking lock | Protects a class's data when loading native libraries to resolve symbolic references. |
System class loader lock | Ensures that only one thread is loading a system class at a time. |
Code rewrite lock | Protects code when an optimization is attempted. |
Heap lock | Protects the Java heap during heap memory management. |
Monitor cache lock | Ensures the integrity of the monitor cache by allowing only one thread to access it at a time. |
Dynamic loading lock | Protects UNIX green threads JVMs from loading the shared library stub libdl.so more than once at a time. |
Monitor IO lock | Protects physical I/O, for example, open and read. |
User signal monitor | Controls access to the signal handler when a user signal (USRSIG) is used in green threads JVMs. |
Child death monitor | Controls access to the process wait information when using the runtime system calls to run local commands in a green threads JVM. |
I/O Monitor | Controls access to the threads' file descriptors for poll/select events. |
Alarm Monitor | Controls access to a clock handler used in green threads JVMs to handle timeouts. |
Thread queue lock | Protects the queue of active threads. |
Monitor registry | Ensures the integrity of the monitor registry by allowing only one thread to access it at a time. |
Has finalization queue lock * | Protects the list of queue lock objects that have been garbage-collected and deemed to need finalization. They are copied to the Finalize me queue. |
Finalize me queue lock * | Protects a list of objects that can be finalized at leisure. |
Name and type hash table lock * | Protects the JVM hash tables of constants and their types. |
String intern lock * | Locks the hashtable of defined Strings that were loaded from the class constant pool. |
Class loading lock * | Ensures only one thread loads a class at a time. |
Java stack lock * | Protects the free stack segments list. |
Note:
* Lock only appeared in pre-Java 2 stack traces
The monitor registry itself is protected by a monitor. This
means the thread that owns the lock is the last thread to use a
monitor. It is very likely this thread is also the current
thread.
Because only one thread can enter a synchronized block at a time, other
threads queue up at the start of the synchronized code and appear as
thread state MW . In the monitor cache dump, they are denoted
as "waiting to enter" threads. In user code a monitor is called into action
wherever a synchronized block or method is used.
Any code waiting on an object or event (a wait method) also
has to be inside a synchronized block. However, once the wait method is called,
the lock on the synchronized object is given up.
When the thread in the wait state is notified of an event to the object,
it has to compete for exclusive access to that object, and it has to
obtain the monitor. Even when a thread has sent a "notify event"
to the waiting threads, none of the waiting threads can actually gain
control of the monitor lock until the notifying thread has left its
synchronized code block.
You will see "Waiting to be notified" for threads at the wait method.
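The "waiting to enter" situation can be reproduced in a few lines. The sketch below uses Thread.getState, an API from later Java releases that the traces in this article do not use. Treating its BLOCKED value as the counterpart of the MW key is an assumption made for illustration, and the class name MonitorContentionDemo is hypothetical.

```java
class MonitorContentionDemo {
    // Holds a monitor, starts a second thread that needs the same monitor,
    // and reports the state that thread is in while it queues up.
    static Thread.State stateWhileLockHeld() {
        final Object lock = new Object();
        Thread contender = new Thread(() -> {
            synchronized (lock) {
                // Entered only after the caller has released the lock.
            }
        }, "contender");

        Thread.State observed;
        synchronized (lock) {
            contender.start();
            // The contender cannot get past the monitor entry while we hold
            // the lock, so it settles in the BLOCKED state.
            while (contender.getState() != Thread.State.BLOCKED) {
                Thread.yield();
            }
            observed = contender.getState();
        }
        try {
            contender.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return observed;
    }

    public static void main(String[] args) {
        System.out.println(stateWhileLockHeld());
    }
}
```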
Putting the Steps Into Practice
Example 1
Consider a real-life problem such as Bug ID 4098756.
You can find details on this bug in JDC Bug Parade.
This bug documents a problem that occurs when using a Choice
Component on Windows 95.
When the user selects one of the choices from the Choice Component
using the mouse, everything is fine. However, when the user tries to use
an Arrow key to move up or down the list of choices, the Java application
freezes.
Fortunately, this problem is reproducible and there was a Java stack
trace to help track down the problem.
The full stack trace is in the bug report page, but you only need
to focus on the following two key threads:
"AWT-Windows" (TID:0xf54b70,
sys_thread_t:0x875a80,Win32ID:0x67,
state:MW) prio=5
java.awt.Choice.select(Choice.java:293)
sun.awt.windows.WChoicePeer.handleAction(
WChoicePeer.java:86)
"AWT-EventQueue-0" (TID:0xf54a98,sys_thread_t:0x875c20,
Win32ID:0x8f, state:R) prio=5
java.awt.Choice.remove(Choice.java:228)
java.awt.Choice.removeAll(Choice.java:246)
The AWT-EventQueue-0 thread is in a runnable state inside the
remove method. The remove method is synchronized, which
explains why the AWT-Windows thread cannot enter
the select method. The AWT-Windows thread is in
MW state (monitor wait); if you keep taking stack
traces, this situation does not change, and the graphical user interface
(GUI) appears to have frozen.
This indicates that the remove call never returned. By following
the code path to the ChoicePeer class, you can see that it makes
a native MFC call that does not return. That is where the real problem
lies: it is a bug in the Java core classes. The user's code was okay.
Example 2
In this second example you will investigate a bug that at first
appears to be a fault in Swing but, as you will discover, is due to
the fact that Swing is not thread safe.
Again, the bug report is available to view on the JDC site. The bug number
this time is 4098525.
Here is a cut-down sample of the code used to reproduce this problem. The
modal dialog is being created from within the JPanel paint method.
import java.awt.event.*;
import java.awt.*;
import java.util.*;
import javax.swing.*;

class MyDialog extends Dialog
        implements ActionListener {
    MyDialog(Frame parent) {
        super(parent, "My Dialog", true);
        Button okButton = new Button("OK");
        okButton.addActionListener(this);
        add(okButton);
        pack();
    }

    public void actionPerformed(ActionEvent event) {
        dispose();
    }
}

public class Tester extends JPanel {
    MyDialog myDialog;
    boolean firstTime = true;

    public Tester (JFrame frame) throws Exception {
        super();
        myDialog = new MyDialog(frame);
    }

    void showDialogs() {
        myDialog.show();
    }

    public void paint(Graphics g) {
        super.paint(g);
        if (firstTime) {
            firstTime = false;
            showDialogs();
        }
    }

    public static void main(String args[])
            throws Exception {
        JFrame frame = new JFrame ("Test");
        Tester gui = new Tester(frame);
        frame.getContentPane().add(gui);
        frame.setSize(800, 600);
        frame.pack();
        frame.setVisible(true);
    }
}
When you run this program you find that it deadlocks straight away. By taking
a stack trace you see these key threads.
The stack trace you have here is slightly different from the stack trace that
appears in the bug report, but is caused by the same effect. This trace was
generated with the Java 2 release, and the option -Djava.compiler=NONE was
supplied when the program was run so that you can see the source line
numbers. The thread to look for is the thread in MW (monitor wait), which
in this case is thread AWT-EventQueue-1:
"AWT-EventQueue-1" (
TID:0xebca8c20, sys_thread_t:0x376660,
state:MW) prio=6
at java.awt.Component.invalidate(Component.java:1664)
at java.awt.Container.invalidate(Container.java:507)
at java.awt.Window.dispatchEventImpl(Window.java:696)
at java.awt.Component.dispatchEvent(
Component.java:2289)
at java.awt.EventQueue.dispatchEvent(
EventQueue.java:258)
at java.awt.EventDispatchThread.run(
EventDispatchThread.java:68)
If you look for that line in file java/awt/Component.java which is contained in the src.jar archive, you see the following:
public void invalidate() {
synchronized (getTreeLock()) { //line 1664
This is where our application is stuck: it is waiting for the getTreeLock
monitor lock to become free. The next task is to find out which thread
holds this getTreeLock monitor lock.
To see who is holding this monitor lock you look at the Monitor cache dump and
in this example you can see the following:
Monitor Cache Dump:
java.awt.Component$AWTTreeLock@EBC9C228/EBCF2408:
owner "AWT-EventQueue-0" ( 0x263850) 3 entries
Waiting to enter:
"AWT-EventQueue-1" (0x376660)
The getTreeLock monitor is actually a lock on a specially created
object of the inner class AWTTreeLock. This is
the code used to create that lock in file Component.java.
static final Object LOCK = new AWTTreeLock();
static class AWTTreeLock {}
The current owner is AWT-EventQueue-0. This thread called our paint method
to create our modal Dialog via a call to paintComponent. paintComponent
itself was called from an update call of JFrame.
So where was the lock set? There is no simple way to find out which
stack frame actually held the lock, but a simple search of
javax.swing.JComponent shows that getTreeLock is called inside
the method paintChildren, which you left at line 388.
at Tester.paint(Tester.java:39)
at javax.swing.JComponent.paintChildren(
JComponent.java:388)
The rest of the puzzle is pieced together by analyzing the MDialogPeer show
method. The Dialog code creates a new ModalThread, which is why you see an
AWT-Modal thread in the stack trace output; this thread is used to post the
Dialog. When this event is dispatched using AWT-EventQueue-1 (which used to
be the AWT Dispatch proxy), access to the getTreeLock monitor is required,
and so you have a deadlock.
Unfortunately, Swing code is not designed to be thread safe, so the
workaround in this example is to not create modal dialogs inside Swing
paint methods. Because Swing has to do a lot of locking and calculation to
work out which parts of a lightweight component need to be painted, it is
strongly advised not to include synchronized code, or code that results
in a synchronized call such as showing a modal dialog, inside a paint method.
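One common way to keep such work out of paint is to queue it with SwingUtilities.invokeLater, so that it runs as a separate event after the current paint has finished. The bug report does not prescribe this fix. The sketch below is one possible arrangement, and the DeferredDialogPanel class and its showDialog callback are hypothetical names.

```java
import java.awt.Graphics;
import javax.swing.JPanel;
import javax.swing.SwingUtilities;

class DeferredDialogPanel extends JPanel {
    // Hypothetical callback that opens the dialog, for example
    // () -> myDialog.setVisible(true)
    private final Runnable showDialog;
    private boolean firstTime = true;

    DeferredDialogPanel(Runnable showDialog) {
        this.showDialog = showDialog;
    }

    @Override
    public void paint(Graphics g) {
        super.paint(g);
        if (firstTime) {
            firstTime = false;
            // Do not open the dialog here. Queue it so that it runs on the
            // event dispatch thread after this paint call has returned.
            SwingUtilities.invokeLater(showDialog);
        }
    }
}
```

In most programs it is simpler still to show the dialog from an event handler or from startup code, and to leave paint responsible for drawing only.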
This completes Java stack traces theory, and you should now know
what to look for the next time you see a stack trace. To save time,
you should make full use of the JDC bug search to see if the problem you
are having has already been reported by someone else.
Expert's Checklist
To summarize, these are the steps to take the next time you
come across a problem in a Java program.
- Hanging, deadlocked or frozen programs: If you think
your program is hanging, generate a stack trace. Examine the threads in states
MW or CW. If the program is deadlocked, some of
the system threads will probably show up as the current thread because there
is nothing else for the Java VM to do.
- Crashed or aborted programs: On UNIX look for a core file. You
can analyze this file in a native debugging tool such as
gdb or dbx. Look for threads that have called
native methods. Because Java technology uses a safe memory model, any
corruption probably occurred
in the native code. Remember that the Java VM also uses native code so it
might not be a bug in your application.
- Busy programs: The best course of action you can take for busy
programs is to generate frequent stack traces. This will narrow down the
code path that is causing the errors, and you can start your investigation
from there.
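For the first item on the checklist, later Java releases can also report monitor deadlocks from inside the program through the java.lang.management API. That API is not covered in this article. The sketch below, with the illustrative class name DeadlockProbe, deliberately creates a two-thread deadlock on daemon threads and then asks the VM to find it.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadMXBean;
import java.util.concurrent.CountDownLatch;

class DeadlockProbe {
    // Returns the IDs of threads deadlocked on object monitors (empty if none).
    static long[] findDeadlocked() {
        ThreadMXBean threads = ManagementFactory.getThreadMXBean();
        long[] ids = threads.findMonitorDeadlockedThreads(); // null when none
        return ids == null ? new long[0] : ids;
    }

    // Starts two daemon threads that take two locks in opposite order.
    static void startDeadlock() {
        Object first = new Object();
        Object second = new Object();
        CountDownLatch bothHoldOneLock = new CountDownLatch(2);
        startDaemon("deadlock-A", first, second, bothHoldOneLock);
        startDaemon("deadlock-B", second, first, bothHoldOneLock);
    }

    private static void startDaemon(String name, Object held, Object wanted,
                                    CountDownLatch latch) {
        Thread t = new Thread(() -> {
            synchronized (held) {
                latch.countDown();
                try {
                    latch.await(); // wait until the other thread holds its lock
                } catch (InterruptedException e) {
                    return;
                }
                synchronized (wanted) {
                    // Never reached: the other thread holds this monitor.
                }
            }
        }, name);
        t.setDaemon(true); // lets the JVM exit despite the deadlock
        t.start();
    }

    // Polls until a deadlock is reported or the timeout expires.
    static int waitForDeadlock(long timeoutMillis) {
        long deadline = System.currentTimeMillis() + timeoutMillis;
        long[] ids = findDeadlocked();
        while (ids.length == 0 && System.currentTimeMillis() < deadline) {
            try {
                Thread.sleep(20);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                break;
            }
            ids = findDeadlocked();
        }
        return ids.length;
    }
}
```

A stack trace taken at the same moment would show both threads waiting to enter a monitor that the other one owns.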
_______
1 As used on this web site,
the terms "Java virtual
machine" or "JVM" mean a virtual machine
for the Java platform.