The new Java™ virtual machines (VMs)
have features that increase performance, and you can use a number
of tools to increase application performance or reduce the size of
generated class files. Such features and tools improve the
performance of your application with little or no change required to
your code.
Java VM Features
The Java® 2 Platform release has
introduced many performance improvements over previous
releases, including faster memory allocation, reduced class sizes,
improved garbage collection, streamlined monitors, and a built-in JIT
compiler as standard.
The new Java 2 VM delivers an improvement straight out of the box; by
understanding how the speed-ups work, you can also tune
your application to squeeze out every last bit of performance.
Method Inlining
The Java 2 release of the Java VM automatically inlines simple methods at
runtime.
In an unoptimized Java VM, every method call creates a new
stack frame. Creating a stack frame requires additional
resources as well as some re-mapping of the stack, so every new
stack frame incurs a small overhead.
Method inlining increases performance by reducing the number
of method calls your program makes.
The Java VM inlining code inlines methods that return constants or only access
internal fields.
To take advantage of method inlining you can do one of two things. You
can either make a method look attractive to the VM to inline or
manually inline a method if it doesn't break your object model. Manual inlining
in this context means simply moving the code from a method into the method
that is calling it.
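As a sketch of manual inlining (the class and method names here are illustrative, not part of the original example), a counter increment that lives in its own method can be moved directly into the loop that calls it:

```java
// Before: the increment lives in its own method; each loop
// iteration pays a method-call overhead.
class CounterBefore {
    int counter = 0;

    void run() {
        for (int i = 0; i < 1000; i++) {
            addCount();
        }
    }

    void addCount() {
        counter = counter + 1;
    }
}

// After: the method body has been moved into the caller, so the
// call overhead disappears (at the cost of some encapsulation).
class CounterAfter {
    int counter = 0;

    void run() {
        for (int i = 0; i < 1000; i++) {
            counter = counter + 1;
        }
    }
}

public class ManualInline {
    public static void main(String[] args) {
        CounterBefore before = new CounterBefore();
        CounterAfter after = new CounterAfter();
        before.run();
        after.run();
        System.out.println(before.counter + " " + after.counter);
    }
}
```

Both versions produce the same result; the difference is only in how many method calls the loop makes.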
Automatic VM inlining is illustrated using the following small example:
public class InlineMe {

    int counter = 0;

    public void method1() {
        for (int i = 0; i < 1000; i++) {
            addCount();
        }
        System.out.println("counter=" + counter);
    }

    public int addCount() {
        counter = counter + 1;
        return counter;
    }

    public static void main(String args[]) {
        InlineMe im = new InlineMe();
        im.method1();
    }
}
In the current state the addCount method doesn't look very
attractive to the inline detector in the VM because the addCount
method returns a value. To find out if this method is inlined, run the
compiled example with profiling enabled:
java -Xrunhprof:cpu=times InlineMe
This generates a java.hprof.txt output file. The top ten methods
will look similar to this:
CPU TIME (ms) BEGIN (total = 510)
Thu Jan 28 16:56:15 1999
rank self    accum   count trace method
   1 5.88%   5.88%      1    25 java/lang/Character.<clinit>
   2 3.92%   9.80%   5808    13 java/lang/String.charAt
   3 3.92%  13.73%      1    33 sun/misc/Launcher$AppClassLoader.getPermissions
   4 3.92%  17.65%      3    31 sun/misc/URLClassPath.getLoader
   5 1.96%  19.61%      1    39 java/net/URLClassLoader.access$1
   6 1.96%  21.57%   1000    46 InlineMe.addCount
   7 1.96%  23.53%      1    21 sun/io/Converters.newConverter
   8 1.96%  25.49%      1    17 sun/misc/Launcher$ExtClassLoader.getExtDirs
   9 1.96%  27.45%      1    49 java/util/Stack.peek
  10 1.96%  29.41%      1    24 sun/misc/Launcher.<init>
If you change the addCount method to no longer return a value,
the VM will inline it for you at runtime. To make the code inline friendly
replace the addCount method with the following:
public void addCount() {
    counter = counter + 1;
}
And run the profiler again:
java -Xrunhprof:cpu=times InlineMe
This time the java.hprof.txt output should look different.
The addCount method has gone. It has been inlined!
CPU TIME (ms) BEGIN (total = 560)
Thu Jan 28 16:57:02 1999
rank self    accum   count trace method
   1 5.36%   5.36%      1    27 java/lang/Character.<clinit>
   2 3.57%   8.93%      1    23 java/lang/System.initializeSystemClass
   3 3.57%  12.50%      2    47 java/io/PrintStream.<init>
   4 3.57%  16.07%   5808    15 java/lang/String.charAt
   5 3.57%  19.64%      1    42 sun/net/www/protocol/file/Handler.openConnection
   6 1.79%  21.43%      2    21 java/io/InputStreamReader.fill
   7 1.79%  23.21%      1    54 java/lang/Thread.<init>
   8 1.79%  25.00%      1    39 java/io/PrintStream.write
   9 1.79%  26.79%      1    40 java/util/jar/JarFile.getJarEntry
  10 1.79%  28.57%      1    38 java/lang/Class.forName0
Streamlined synchronization
Before Java 2, synchronized methods and objects always incurred
an additional performance hit, because the locking mechanism was
implemented with a global monitor registry that was single-threaded
in some areas, such as searching for existing monitors. In the Java 2
release each thread has its own monitor registry, so many of the earlier
bottlenecks have been removed.
If you have previously used other locking mechanisms because of the
performance hit with synchronized methods, it is now worthwhile revisiting
that code and taking advantage of the new Java 2 streamlined locks.
The following example, which creates monitors for the synchronized
block, shows a speed-up of roughly 40%: the loop took 14 ms under JDK 1.1.7
but only 10 ms with Java 2 on a Sun Ultra 1.
class MyLock {

    static Integer count = new Integer(5);
    int test = 0;

    public void letslock() {
        synchronized (count) {
            test++;
        }
    }
}

public class LockTest {

    public static void main(String args[]) {
        MyLock ml = new MyLock();
        long time = System.currentTimeMillis();
        for (int i = 0; i < 5000; i++) {
            ml.letslock();
        }
        System.out.println("Time taken=" +
            (System.currentTimeMillis() - time));
    }
}
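With per-thread monitor registries in the Java 2 release, plain synchronized methods are a reasonable option again. As a sketch (the class names MyLock2 and LockTest2 are illustrative), the same critical section can be written as a synchronized method, which locks on the instance itself rather than on a shared Integer:

```java
// A sketch of the same critical section written as a synchronized
// method: the monitor is taken on the MyLock2 instance itself
// instead of on a separate lock object.
class MyLock2 {

    int test = 0;

    public synchronized void letslock() {
        test++;
    }
}

public class LockTest2 {

    public static void main(String args[]) {
        MyLock2 ml = new MyLock2();
        long time = System.currentTimeMillis();
        for (int i = 0; i < 5000; i++) {
            ml.letslock();
        }
        System.out.println("Time taken=" +
            (System.currentTimeMillis() - time));
    }
}
```

The timing comparison works the same way as in the example above; only the locking style differs.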
Java HotSpot
The Java HotSpot™ VM is
Sun Microsystems' next-generation virtual machine implementation.
The Java HotSpot VM adheres to the same specification as
the Java 2 VM, and runs the same byte codes, but it has been re-engineered
to leverage new technologies like adaptive optimization and improved garbage
collection models to dramatically improve the speed of the Java VM.
Adaptive optimization
The Java HotSpot VM does not include a plug-in JIT compiler. Instead, it
compiles and inlines the methods that it has determined are the
most heavily used in the application. This means that on the first pass
the Java bytecodes are interpreted as if no JIT compiler were
present. If the code then turns out to be a hot spot in your application, the
HotSpot compiler compiles the bytecodes into native code, which is
stored in a cache, and inlines methods at the same time. See the inlining section
for details on the advantages of inlining code.
One advantage of selective compilation over a JIT compiler is that the
compiler can spend more time generating highly optimized code for the areas
that would benefit most from the optimization. The compiler can also
avoid compiling code that is best run in interpreted mode.
Earlier versions of the Java HotSpot VM were not able to optimize code
while it was in use. The downside was that if the application was inside a
huge busy loop, the optimizer could not compile the
code for that area until the loop had finished. Later Java HotSpot VM releases
use on-stack replacement, meaning that code can be compiled into native
code even while it is in use by the interpreter.
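As a hypothetical illustration of the kind of code affected (the class and method names are invented for this sketch), the long-running loop below is exactly what on-stack replacement addresses: the method does not return until the loop finishes, yet the loop itself is the hot spot the VM wants to compile.

```java
public class BusyLoop {

    // A single long-running loop: without on-stack replacement the
    // VM cannot swap in compiled code until this method returns;
    // with it, the loop can be compiled while still executing.
    public static long sum(int n) {
        long total = 0;
        for (int i = 0; i < n; i++) {
            total += i;
        }
        return total;
    }

    public static void main(String[] args) {
        System.out.println(sum(1000000));
    }
}
```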
Improved Garbage Collection
The garbage collector used in the Java HotSpot VM introduces several
improvements over existing garbage collectors. The first is that the
garbage collector is a fully accurate collector: it knows exactly
what is an object reference and what is just data. The Java HotSpot VM
also uses direct references to objects on the heap instead of
object handles. This increased knowledge means that memory fragmentation
can be reduced, which results in a more compact memory footprint.
The second improvement is the use of generational copying. Java creates
a large number of objects on the heap, and often these objects are
short-lived. By placing newly created objects in a memory bucket, waiting for
the bucket to fill up, and then copying only the remaining live objects to a
new area, the block of memory that the bucket used can be freed in one step.
This means that the VM does not have to search for a hole to fit each
new object into the heap, and smaller sections of memory need to
be manipulated at a time.
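A minimal sketch of the allocation pattern that generational copying is designed for (the class and method names are illustrative): each loop iteration allocates an object that becomes garbage almost immediately, so nearly the whole "bucket" of young objects can be reclaimed in one step.

```java
public class ShortLived {

    // Each iteration allocates a StringBuffer that dies as soon as
    // the iteration ends; a generational collector frees the whole
    // region of such short-lived objects cheaply instead of hunting
    // for individual holes in the heap.
    public static int churn(int n) {
        int length = 0;
        for (int i = 0; i < n; i++) {
            StringBuffer sb = new StringBuffer("item-");
            sb.append(i);
            length = sb.length();   // sb is garbage after this line
        }
        return length;
    }

    public static void main(String[] args) {
        System.out.println(churn(10000));
    }
}
```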
For older objects the garbage collector makes a sweep through the heap
and compacts holes from dead objects directly, removing the need for a
free list used in earlier garbage collection algorithms.
The third area of improvement is to remove the perception of garbage
collection pauses by staggering the compaction of large free object spaces
into smaller groups and compacting them incrementally.
Fast Thread Synchronization
The Java HotSpot VM also improves existing synchronized code. Synchronized
methods and code blocks have always had a performance overhead when run in a
Java VM. The Java HotSpot VM implements the monitor entry and exit
synchronization points itself and does not depend on the local operating
system to provide this synchronization.
This results in a large speed improvement, especially for heavily
synchronized GUI applications.
Just-In-Time Compilers
The simplest tool used to increase the performance of your application is
the Just-In-Time (JIT) compiler. A JIT is a code generator that converts
Java bytecode into native machine code. Java programs invoked with
a JIT generally run much faster than when the bytecode is executed by the
interpreter. The Java HotSpot VM removes the need for a JIT compiler in most
cases; however, you may still find the JIT compiler being used in earlier releases.
The JIT compiler was first made available
as a performance update in the Java Development Kit
(JDK™) 1.1.6 software release and is
now a standard tool invoked whenever you use the
java interpreter command in the Java 2 platform release. You
can disable the JIT compiler using the -Djava.compiler=NONE
option to the Java VM. This is covered in more detail at the end of the JIT
section.
How do JIT Compilers work?
JIT compilers are supplied as standalone platform-dependent native libraries.
If the JIT Compiler library exists, the Java VM initializes Java
Native Interface (JNI) native code hooks to call JIT functions available in
that library instead of the equivalent function in the interpreter.
The java.lang.Compiler class is used to load the
native library and start the initialization inside the JIT compiler.
When the Java VM invokes a Java method, it uses an invoker method as
specified in the method block of the loaded class object. The Java VM
has several invoker methods, for example, a different invoker is used if
the method is synchronized or if it is a native method.
The JIT compiler uses its own invoker. Sun production releases check the method
access bit for value ACC_MACHINE_COMPILED to notify the
interpreter that the code for this method has already been compiled and stored
in the loaded class.
When does the code become JIT compiled code?
The first time a method is called, the JIT compiler compiles the method
block into native code and stores the native code in the code block for that
method. Once the code has been compiled, the ACC_MACHINE_COMPILED bit,
which is used on the Sun platform, is set.
How can I see what the JIT compiler is doing?
The JIT_ARGS environment variable allows simple control of the
Sun Solaris JIT compiler. Two useful values are trace and
exclude(list). To exclude the methods from the InlineMe example
and show a trace, set JIT_ARGS as follows:
Unix:
$ export JIT_ARGS="trace exclude(InlineMe.addCount InlineMe.method1)"
$ java InlineMe
Initializing the JIT library ...
DYNAMICALLY COMPILING java/lang/System.getProperty mb=0x63e74
DYNAMICALLY COMPILING java/util/Properties.getProperty mb=0x6de74
DYNAMICALLY COMPILING java/util/Hashtable.get mb=0x714ec
DYNAMICALLY COMPILING java/lang/String.hashCode mb=0x44aec
DYNAMICALLY COMPILING java/lang/String.equals mb=0x447f8
DYNAMICALLY COMPILING java/lang/String.valueOf mb=0x454c4
DYNAMICALLY COMPILING java/lang/String.toString mb=0x451d0
DYNAMICALLY COMPILING java/lang/StringBuffer.<init> mb=0x7d690
<<<< Inlined java/lang/String.length (4)
Notice that inlined methods such as String.length are exempt from JIT
compilation. String.length is also a special method: it is normally
compiled into an internal shortcut bytecode by the Java interpreter. When
the JIT compiler is used, these interpreter optimizations are disabled
so that the JIT compiler can tell which method is being called.
How to use the JIT to your advantage
The first thing to remember is that the JIT compiler achieves
most of its speed improvement the second time it calls a method. The JIT
compiler compiles the whole method instead of interpreting it line by
line, which can itself be a performance gain when running an application
with the JIT enabled. However, if code is called only once, you will not see a
significant performance gain. The JIT compiler also ignores class
constructors, so if possible keep constructor code to a minimum.
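One way to act on this is to keep the constructor trivial and move heavy initialization into an ordinary method, which the JIT can compile. A minimal sketch under that assumption (the class and method names are illustrative):

```java
public class LazySetup {

    private int[] table;

    // The constructor does the bare minimum: the JIT ignores
    // constructors, so heavy work placed here would always run
    // interpreted.
    public LazySetup() {
    }

    // The expensive work lives in a normal method, which the JIT
    // can compile once it has been called.
    public int[] buildTable(int size) {
        table = new int[size];
        for (int i = 0; i < size; i++) {
            table[i] = i * i;
        }
        return table;
    }

    public static void main(String[] args) {
        int[] t = new LazySetup().buildTable(100);
        System.out.println(t[10]);
    }
}
```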
The JIT compiler also achieves a minor performance gain by not pre-checking
certain Java boundary conditions, such as null pointer
or array-out-of-bounds exceptions. The only way the JIT compiler knows it
has a null pointer exception is by a signal raised by the operating system.
Because the signal comes from the operating system and not the Java
VM, your program takes a performance hit. To ensure the best performance
when running an application with the JIT, make sure your code is very
clean, with no errors like null pointer or array-out-of-bounds
exceptions.
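A sketch of the kind of defensive check that keeps these exceptions from being raised at all (the class name, method name, and -1 sentinel are illustrative choices, not a prescribed API):

```java
public class SafeAccess {

    // Checking the reference and index up front avoids ever
    // triggering a NullPointerException or an
    // ArrayIndexOutOfBoundsException, which under the JIT is only
    // detected via an operating-system signal.
    public static int valueAt(int[] data, int index) {
        if (data == null || index < 0 || index >= data.length) {
            return -1;   // sentinel meaning "not available"
        }
        return data[index];
    }

    public static void main(String[] args) {
        System.out.println(valueAt(new int[] {1, 2, 3}, 1));
        System.out.println(valueAt(null, 0));
    }
}
```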
You might want to disable the JIT compiler if you are running the Java VM in
remote debug mode, or if you want to see source line numbers instead of the
label (Compiled Code) in your Java stack traces. To disable the
JIT compiler, supply a blank or invalid name for the name of the JIT compiler
when you invoke the interpreter command. The following examples
show the javac command to compile the source code
into bytecodes, and two forms of the java command to invoke the
interpreter without the JIT compiler.
javac MyClass.java
java -Djava.compiler=NONE MyClass
or
javac MyClass.java
java -Djava.compiler="" MyClass
Third-Party Tools
Some of the other tools available include those that reduce the size of
the generated Java class files. The Java class file contains an area called
a constant pool. The constant pool keeps a list of strings and
other information for the class file in one place for reference. Among the
pieces of information stored in the constant pool are the method and field
names.
The class file refers to a field in the class as a reference to an entry in the
constant pool. This means that as long as the references stay the same,
it does not matter what the values stored in the constant pool are. This
knowledge is exploited by several tools that rewrite the names of the fields
and methods in the constant pool into shortened names. This technique can
reduce the class file size by a significant percentage, with the benefit that
a smaller class file means a shorter network download.