Java Performance Tuning
Atthakorn Chanthong
What is software tuning?
The software has poor response time.
I need it to run faster
User Experience
Software tuning means making an application run faster
Many people think Java applications are slow. Why?
There are two major reasons
The first is the Bottleneck
The Bottleneck
Lots of Casts
Increased Memory Use
Automatic memory management by the garbage collector
All objects are allocated on the heap.
A Java application is not native
The second is bad coding practice
How to make it run faster?
The bottleneck is unavoidable
But a developer can follow good coding practices
A good design
A good coding practice
A Java application normally runs fast enough
So the tuning game comes into play
Knowing the strategy
Tuning Strategy
1. Identify the main causes
2. Choose the quickest and easiest one to fix
3. Fix it, and repeat for the other root causes
Inside the strategy
Tuning Strategy
Profile, measure, and prioritize the problems
Identify the location of the bottleneck
Form a hypothesis
Alter the code
Test and compare before and after the alteration
Create a test scenario
Yes, it’s better
Still bad? The result isn’t good enough
Need it even faster? Repeat again
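The "test and compare before/after" step can be sketched with a small timing helper. This is only a rough sketch, not a replacement for a profiler: the class name TimingSketch and the method timeMillis are hypothetical, and a single System.nanoTime() measurement ignores JIT warm-up and GC pauses, so the numbers should be read as approximate.

```java
// Hypothetical helper for before/after comparisons; the names are illustrative only.
final class TimingSketch {
    // Runs the task the given number of rounds and returns the elapsed time in milliseconds.
    static long timeMillis(Runnable task, int rounds) {
        long start = System.nanoTime();
        for (int i = 0; i < rounds; i++) {
            task.run();
        }
        return (System.nanoTime() - start) / 1_000_000L;
    }

    public static void main(String[] args) {
        int rounds = 1_000_000;
        // "Before": creates a new String object on every round.
        long before = timeMillis(() -> new String("Hello").length(), rounds);
        // "After": reuses the literal.
        long after = timeMillis(() -> "Hello".length(), rounds);
        System.out.println("before: " + before + " ms, after: " + after + " ms");
    }
}
```

Running each variant several times and discarding the first runs usually gives steadier numbers.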
How to measure the software performance?
We use the profiler
A profiler is a programming tool that can track the performance of another computer program
Profiler
Profile application performance
Monitor application memory usage
These are the two common uses of a profiler for analyzing a software problem
How do we get a profiler?
Don’t pay for it!
Open-source profilers are all around
Some interesting open-source profilers
Open-Source Profilers
JConsole
http://java.sun.com/developer/technicalArticles/J2SE/jconsole.html
Eclipse TPTP
http://www.eclipse.org/tptp/index.php
NetBeans Built-in Profiler
http://profiler.netbeans.org/
VisualVM
https://visualvm.dev.java.net/
This was pulled out of NetBeans to act as a standalone profiler
And many more …
DrMem
InfraRED
Jmeasurement
Profiler4j
Cougaar
TIJMP
JRat
We love open source
Make the brain smart with good coding practices
1st Rule: Avoid Object Creation
Why does object creation cause problems?
Lots of objects in memory means the GC does lots of work
The program slows down when the GC starts
Creating an object costs the application time and CPU effort
Reuse objects where possible
Pool Management
Most container objects (e.g. Vector) can be reused rather than created and thrown away
Pool Management
[Diagram: a VectorPoolManager holding vectors V1 to V5, with getVector() and returnVector()]
Pool Management
With a pool:
public static VectorPoolManager vectorPoolManager = new VectorPoolManager(25);

public void doSome() {
    for (int i = 0; i < 10; i++) {
        Vector v = vectorPoolManager.getVector();
        // ... do vector manipulation stuff
        vectorPoolManager.returnVector(v);
    }
}

Without a pool:
public void doSome() {
    for (int i = 0; i < 10; i++) {
        Vector v = new Vector();
        // ... do vector manipulation stuff
    }
}
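The slides do not show how the pool itself works, so the following is a minimal sketch under stated assumptions. The class VectorPoolSketch and all of its method bodies are hypothetical: it keeps idle vectors on a stack, creates a new one when the pool runs dry, and clears a vector when it is returned.

```java
import java.util.ArrayDeque;
import java.util.Vector;

// Hypothetical pool; the slides only name getVector() and returnVector().
final class VectorPoolSketch {
    private final ArrayDeque<Vector<Object>> idle = new ArrayDeque<>();

    VectorPoolSketch(int size) {
        for (int i = 0; i < size; i++) {
            idle.push(new Vector<>());
        }
    }

    // Hands out an idle Vector, or creates a new one if the pool is empty.
    synchronized Vector<Object> getVector() {
        return idle.isEmpty() ? new Vector<>() : idle.pop();
    }

    // Clears the Vector so old elements do not leak to the next user, then pools it again.
    synchronized void returnVector(Vector<Object> v) {
        v.clear();
        idle.push(v);
    }

    // Number of vectors currently waiting in the pool.
    synchronized int idleCount() {
        return idle.size();
    }
}
```

Clearing on return is a design choice: it keeps callers from seeing stale data, at the cost of one clear() per cycle.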
Canonicalizing Objects
Replace multiple objects with a single object or just a few
Singleton Pattern
public class VectorPoolManager {
    // Not final: the single instance is created lazily in getVector()
    private static VectorPoolManager poolManager;
    private Vector[] pool;

    private VectorPoolManager(int size) {
        ....
    }

    public static Vector getVector() {
        if (poolManager == null)
            poolManager = new VectorPoolManager(20);
        ...
        return poolManager.pool[poolManager.pool.length - 1];
    }
}
Canonicalizing Objects
Boolean b1 = new Boolean(true);
Boolean b2 = new Boolean(false);
Boolean b3 = new Boolean(false);
Boolean b4 = new Boolean(false);
4 objects in memory

Boolean b1 = Boolean.TRUE;
Boolean b2 = Boolean.FALSE;
Boolean b3 = Boolean.FALSE;
Boolean b4 = Boolean.FALSE;
2 objects in memory

Canonicalizing Objects
String string = "55";
Integer theInt = new Integer(string);
No Cache

String string = "55";
Integer theInt = Integer.valueOf(string);
Object Cached
Canonicalizing Objects
Caching inside Integer.valueOf(…)
private static class IntegerCache {
    private IntegerCache() {}

    static final Integer cache[] = new Integer[-(-128) + 127 + 1];

    static {
        for (int i = 0; i < cache.length; i++)
            cache[i] = new Integer(i - 128);
    }
}

public static Integer valueOf(int i) {
    final int offset = 128;
    if (i >= -128 && i <= 127) { // must cache
        return IntegerCache.cache[i + offset];
    }
    return new Integer(i);
}
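The effect of these shared instances can be checked with a few identity comparisons. This is a small sketch with hypothetical names (CanonicalSketch and its methods). It relies only on Integer.valueOf(int), which is specified to cache values from -128 to 127, and on Boolean.valueOf(boolean); values outside that range may or may not be cached, so they are not tested here.

```java
// Illustrative only: shows which factory calls hand back shared instances.
final class CanonicalSketch {
    // Integer.valueOf(int) is specified to cache values from -128 to 127.
    static boolean sharesSmallInteger(int value) {
        return Integer.valueOf(value) == Integer.valueOf(value);
    }

    // Boolean.valueOf(boolean) always returns Boolean.TRUE or Boolean.FALSE.
    static boolean sharesBoolean(boolean value) {
        return Boolean.valueOf(value) == Boolean.valueOf(value);
    }

    public static void main(String[] args) {
        System.out.println(sharesSmallInteger(55));
        System.out.println(sharesBoolean(false));
    }
}
```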
Keyword, ‘final’
Use the final modifier on a variable to create an immutable, internally accessible object (the reference cannot be re-assigned)
Keyword, ‘final’
public void doSome(Dimension width, Dimension height) {
    // Re-assignment is allowed
    width = new Dimension(5, 5);
    ...
}

public void doSome(final Dimension width, final Dimension height) {
    // Re-assignment is disallowed: this line does not compile
    width = new Dimension(5, 5);
    ...
}
Auto-Boxing/Unboxing
Use auto-boxing only when needed, not all the time
Auto-Boxing/Unboxing
Integer i = 0;
// Counting to 100M
while (i < 100000000) {
    i++;
}
Takes 2313 ms

int p = 0;
// Counting to 100M
while (p < 100000000) {
    p++;
}
Takes 125 ms

Why does it take 2313/125 ≈ 18.5 times longer?
Auto-Boxing/Unboxing
An object is created every time we wrap a primitive by boxing
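The same contrast shows up with an accumulator. In this sketch (class and method names are hypothetical) both methods return the same sum, but the boxed version unboxes and re-boxes the running total on every iteration, which is where the extra object creation comes from.

```java
// Illustrative comparison of a boxed and a primitive accumulator.
final class BoxingSketch {
    // Boxed accumulator: each += unboxes the total, adds, and boxes the result again.
    static long sumBoxed(int n) {
        Long total = 0L;
        for (int i = 0; i < n; i++) {
            total += i;
        }
        return total;
    }

    // Primitive accumulator: no wrapper objects are involved.
    static long sumPrimitive(int n) {
        long total = 0L;
        for (int i = 0; i < n; i++) {
            total += i;
        }
        return total;
    }

    public static void main(String[] args) {
        System.out.println(sumBoxed(1000) == sumPrimitive(1000));
    }
}
```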
2nd Rule: Know Strings Better
String is the most-used object in an application
Overlook String
and the software may perform poorly
Compile-Time String Initialization
Use the string concatenation (+) operator to create Strings at compile time.
Compile-Time Initialization
for (int i = 0; i < loop; i++) {
    // Looping 10M rounds
    String x = "Hello" + "," + " " + "World";
}
Takes 16 ms

for (int i = 0; i < loop; i++) {
    // Looping 10M rounds
    String x = new String("Hello" + "," + " " + "World");
}
Takes 672 ms
Runtime String Initialization
Use StringBuffer/StringBuilder to create Strings at runtime.
Runtime String Initialization
String name = "Smith";
for (int i = 0; i < loop; i++) {
    // Looping 1M rounds
    String x = "Hello";
    x += ",";
    x += " Mr.";
    x += name;
}
Takes 10298 ms

String name = "Smith";
for (int i = 0; i < loop; i++) {
    // Looping 1M rounds
    String x = (new StringBuffer()).append("Hello")
            .append(",").append(" ")
            .append(name).toString();
}
Takes 6187 ms
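The gain is usually larger when the pieces are appended inside a loop. The sketch below is an illustration with hypothetical names (BuilderSketch, joinWithCommas); it uses a single StringBuilder, the unsynchronized counterpart of StringBuffer, instead of repeated +=.

```java
import java.util.Arrays;
import java.util.List;

// Illustrative only: one StringBuilder for the whole loop.
final class BuilderSketch {
    // Joins the items with commas without creating an intermediate String per step.
    static String joinWithCommas(List<String> items) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < items.size(); i++) {
            if (i > 0) {
                sb.append(',');
            }
            sb.append(items.get(i));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(joinWithCommas(Arrays.asList("Hello", "Mr.", "Smith")));
    }
}
```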
String comparison
Use the appropriate method to compare Strings
To test whether a String is empty
for (int i = 0; i < loop; i++) {
    // 10M loops
    if (a != null && a.equals("")) {
    }
}
Takes 125 ms

for (int i = 0; i < loop; i++) {
    // 10M loops
    if (a != null && a.length() == 0) {
    }
}
Takes 31 ms

If two Strings have the same length
String a = "abc";
String b = "cdf";
for (int i = 0; i < loop; i++) {
    if (a.equalsIgnoreCase(b)) {
    }
}
Takes 750 ms

String a = "abc";
String b = "cdf";
for (int i = 0; i < loop; i++) {
    if (a.equals(b)) {
    }
}
Takes 125 ms

If two Strings have different lengths
String a = "abc";
String b = "cdfg";
for (int i = 0; i < loop; i++) {
    if (a.equalsIgnoreCase(b)) {
    }
}
Takes 780 ms

String a = "abc";
String b = "cdfg";
for (int i = 0; i < loop; i++) {
    if (a.equals(b)) {
    }
}
Takes 858 ms

String.equalsIgnoreCase() does only 2 steps:
it checks for identity and then whether the Strings are the same size
Intern String
To compare Strings by identity
Normally, a String can be created in two ways
By String literals
String s = "This is a string literal.";
By new String(…)
String s = new String("This is a string literal.");
Intern String
Create Strings by new String(…)
The JVM always allocates a new memory address for each new String created, even if they have the same value.
String a = new String("This is a string literal.");
String b = new String("This is a string literal.");
[Diagram: a and b each point to their own copy of "This is a string literal." at a different memory address]
Intern String
Create Strings by literals
Strings will be stored in the pool
Create the same String twice by literals
They will share a unique instance
String a = "This is a string literal.";
String b = "This is a string literal.";
[Diagram: a and b point to a single "This is a string literal." at the same memory address]
Intern String
We can point two String variables to the same address if they have the same value,
by using the String.intern() method
String a = new String("This is a string literal.").intern();
String b = new String("This is a string literal.").intern();
[Diagram: a and b point to a single "This is a string literal." at the same memory address]
The idea is …
Interned Strings can be compared by identity
What does “compare by identity” mean?
if (a == b)          Identity comparison (by reference)
if (a.equals(b))     Value comparison
Intern String
Because it only compares references, identity comparison is fast
Traditionally, Strings must be compared with equals() to avoid a false negative result
But with interned Strings…
If Strings have different values, they also have different addresses.
If Strings have the same value, they also have the same address.
So we can say that
(a == b) is equivalent to (a.equals(b))
Intern String
String a = "abc";
String b = "abc";
String c = new String("abc").intern();
These String variables point to the same address with the same value
Intern String
for (int i = 0; i < loop; i++) {
    if (a == b) {
    }
}

for (int i = 0; i < loop; i++) {
    if (a.equals(b)) {
    }
}
Takes 312 ms
Takes 32 ms
Intern String
Wow, Intern String is good
Unfortunately, it makes code hard to understand, so use it carefully
String.intern() comes with overhead, as there is a caching step
Use interned Strings if they are going to be compared two or more times
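The difference between a literal, a new String(…) copy, and an interned String can be seen in a few lines. This is a minimal sketch; the class InternSketch and the helper sameObject are hypothetical names used only for illustration.

```java
// Illustrative only: identity versus value comparison for Strings.
final class InternSketch {
    // True only when both variables refer to the very same String object.
    static boolean sameObject(String a, String b) {
        return a == b;
    }

    public static void main(String[] args) {
        String literal = "abc";
        String copy = new String("abc");   // a distinct object with equal content
        String interned = copy.intern();   // the pooled instance for "abc"

        System.out.println(sameObject(literal, copy));     // false
        System.out.println(sameObject(literal, interned)); // true
        System.out.println(literal.equals(copy));          // true
    }
}
```

Identity comparison is only safe when every String involved is known to be interned; otherwise equals() remains the correct choice.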
char array instead of String
For optimal performance, avoid doing some operations through the String object itself
char array
String x = "abcdefghijklmn";
for (int i = 0; i < loop; i++) {
    if (x.charAt(5) == 'x') {
    }
}
Takes 281 ms

String x = "abcdefghijklmn";
char y[] = x.toCharArray();
for (int i = 0; i < loop; i++) {
    if ((20 < y.length && 20 >= 0) && y[20] == 'x') {
    }
}
Takes 156 ms
3rd Rule: Exception and Cast
Avoid Exception
Prevent exceptions from being thrown whenever possible
Exceptions are really expensive to execute
Object obj = null;
for (int i = 0; i < loop; i++) {
    try {
        obj.hashCode();
    } catch (Exception e) {
    }
}
Takes 18563 ms

Object obj = null;
for (int i = 0; i < loop; i++) {
    if (obj != null) {
        obj.hashCode();
    }
}
Takes 16 ms
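The same idea can be packaged as a guard method. This is a sketch with hypothetical names (AvoidExceptionSketch, safeHash, hashViaException); returning 0 for null is an arbitrary choice made for the example. Both methods give the same answers, but only the second pays for creating and catching an exception on the null path.

```java
// Illustrative only: a guard check versus exception-driven control flow.
final class AvoidExceptionSketch {
    // Checks for null up front, so no exception is ever created.
    static int safeHash(Object obj) {
        return obj != null ? obj.hashCode() : 0;
    }

    // Same result through exception handling; shown only for contrast.
    static int hashViaException(Object obj) {
        try {
            return obj.hashCode();
        } catch (NullPointerException e) {
            return 0;
        }
    }

    public static void main(String[] args) {
        System.out.println(safeHash(null) == hashViaException(null));
    }
}
```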
Cast Less
We can reduce runtime cost by casting once and reusing the result when an object is used several times
Integer io = new Integer(0);
Object obj = (Object) io;
for (int i = 0; i < loop; i++) {
    if (obj instanceof Integer) {
        byte x = ((Integer) obj).byteValue();
        double d = ((Integer) obj).doubleValue();
        float f = ((Integer) obj).floatValue();
    }
}
Takes 31 ms

for (int i = 0; i < loop; i++) {
    if (obj instanceof Integer) {
        Integer icast = (Integer) obj;
        byte x = icast.byteValue();
        double d = icast.doubleValue();
        float f = icast.floatValue();
    }
}
Takes 16 ms
4th Rule: The Rhythm of Motion
Loop Optimization
There are several ways to make a loop faster
Don’t terminate a loop with method calls
Eliminate Method Call
byte x[] = new byte[loop];
for (int i = 0; i < x.length; i++) {
    for (int j = 0; j < x.length; j++) {
    }
}
Takes 109 ms

byte x[] = new byte[loop];
int length = x.length;
for (int i = 0; i < length; i++) {
    for (int j = 0; j < length; j++) {
    }
}
Takes 62 ms
A method call generates some overhead in the object-oriented paradigm
Use int to iterate over a loop
Iterate over loop by int
for (int i = 0; i < length; i++) {
    for (int j = 0; j < length; j++) {
    }
}
Takes 62 ms

for (short i = 0; i < length; i++) {
    for (short j = 0; j < length; j++) {
    }
}
Takes 125 ms
The VM is optimized to use int for loop iteration, not byte, short, or char
Use System.arraycopy(…) for copying arrays instead of running over a loop
for (int i = 0; i < length; i++) {
    x[i] = y[i];
}
Takes 62 ms

// Same direction as the loop above: copy y into x
System.arraycopy(y, 0, x, 0, y.length);
Takes 16 ms
System.arraycopy(…)
System.arraycopy() is a native function, so it is efficient to use
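A complete version of both copies makes the argument order easier to see: source array, source position, destination array, destination position, length. The class and method names below are hypothetical and exist only for this sketch.

```java
import java.util.Arrays;

// Illustrative only: manual loop copy versus System.arraycopy.
final class ArrayCopySketch {
    // Element-by-element copy, as in the loop version.
    static int[] copyWithLoop(int[] source) {
        int[] target = new int[source.length];
        for (int i = 0; i < source.length; i++) {
            target[i] = source[i];
        }
        return target;
    }

    // Bulk copy: arraycopy(src, srcPos, dest, destPos, length).
    static int[] copyWithArraycopy(int[] source) {
        int[] target = new int[source.length];
        System.arraycopy(source, 0, target, 0, source.length);
        return target;
    }

    public static void main(String[] args) {
        int[] data = {1, 2, 3, 4};
        System.out.println(Arrays.equals(copyWithLoop(data), copyWithArraycopy(data)));
    }
}
```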
Terminate the loop with a primitive comparison, not with a function or variable
Terminate Loop by Primitive
for (int i = 0; i < countArr.length; i++) {
    for (int j = 0; j < countArr.length; j++) {
    }
}
Takes 424 ms

for (int i = countArr.length - 1; i >= 0; i--) {
    for (int j = countArr.length - 1; j >= 0; j--) {
    }
}
Takes 298 ms
Comparison against a primitive constant is more efficient than comparison against a function or variable
Switch vs. If-else
The average time of switch vs. if-else is about equal in the random case
for (int i = 0; i < loop; i++) {
    if (i % 10 == 0) {
    } else if (i % 10 == 1) {
    ...
    } else if (i % 10 == 8) {
    } else if (i % 10 == 9) {
    }
}
Takes 2623 ms

for (int i = 0; i < loop; i++) {
    switch (i % 10) {
        case 0: break;
        case 1: break;
        ...
        case 7: break;
        case 8: break;
        default: break;
    }
}
Takes 2608 ms
Switch is quite fast if the case falls in the middle,
but slower than if-else if it falls at the beginning or in the default case
** Tested against a contiguous range of case values, e.g. 1, 2, 3, 4, ...
Recursive Algorithm
A recursive function is easy to read, but each recursion has a cost
Tail Recursion
A recursive function for which each recursive call to itself is a reduction of the original call.
Recursive vs. Tail-Recursive
public static long factorial1(int n) {
    if (n < 2) return 1L;
    else return n * factorial1(n - 1);
}
Takes 172 ms

public static long factorial1a(int n) {
    if (n < 2) return 1L;
    else return factorial1b(n, 1L);
}

public static long factorial1b(int n, long result) {
    if (n == 2) return 2L * result;
    else return factorial1b(n - 1, result * n);
}
Takes 125 ms
Dynamic-Cached Recursive
Cache results to gain more speed
Dynamic-Cached Recursive
public static long factorial1(int n) {
    if (n < 2) return 1L;
    else return n * factorial1(n - 1);
}
Takes 172 ms

public static final int CACHE_SIZE = 15;
public static final long[] factorial3Cache = new long[CACHE_SIZE];

public static long factorial3(int n) {
    if (n < 2) return 1L;
    else if (n < CACHE_SIZE) {
        if (factorial3Cache[n] == 0)
            factorial3Cache[n] = n * factorial3(n - 1);
        return factorial3Cache[n];
    }
    else return n * factorial3(n - 1);
}
Takes 94 ms

Recursion Summary
Dynamic-Cached Tail Recursive
is better than
Tail Recursive
is better than
Recursive
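The caching technique pays off even more when a recursive function calls itself more than once per step. The sketch below applies the same array-cache idea to Fibonacci numbers; this function is not in the slides and is used here only as an illustration, and the names MemoSketch and fib are hypothetical.

```java
// Illustrative only: the array-cache technique applied to a doubly recursive function.
final class MemoSketch {
    // fib(92) is the largest Fibonacci number that fits in a long.
    private static final int CACHE_SIZE = 93;
    private static final long[] cache = new long[CACHE_SIZE];

    static long fib(int n) {
        if (n < 0 || n >= CACHE_SIZE) {
            throw new IllegalArgumentException("n out of range: " + n);
        }
        if (n < 2) {
            return n;
        }
        // 0 marks "not computed yet"; every fib(n) with n >= 2 is positive.
        if (cache[n] == 0) {
            cache[n] = fib(n - 1) + fib(n - 2);
        }
        return cache[n];
    }

    public static void main(String[] args) {
        System.out.println(fib(50));
    }
}
```

Without the cache the number of calls grows exponentially with n; with it, each value is computed once.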
5th Rule: Use an Appropriate Collection
ArrayList vs. LinkedList
Access
Random Access
ArrayList al = new ArrayList();
// assumed: al is filled with loop elements beforehand (not shown on the slide)
for (int i = 0; i < loop; i++) {
    al.get(i);
}
Takes 281 ms

LinkedList ll = new LinkedList();
// assumed: ll is filled with loop elements beforehand (not shown on the slide)
for (int i = 0; i < loop; i++) {
    ll.get(i);
}
Takes 5828 ms

Sequential Access
ArrayList al = new ArrayList();
for (Iterator i = al.iterator(); i.hasNext();) {
    i.next();
}
Takes 1375 ms

LinkedList ll = new LinkedList();
for (Iterator i = ll.iterator(); i.hasNext();) {
    i.next();
}
Takes 1047 ms

ArrayList is good for random access
LinkedList is good for sequential access

Random vs. Sequential Access
ArrayList al = new ArrayList();
for (int i = 0; i < loop; i++) {
    al.get(i);
}
Takes 281 ms

LinkedList ll = new LinkedList();
for (Iterator i = ll.iterator(); i.hasNext();) {
    i.next();
}
Takes 1047 ms
Random access is better than sequential access
ArrayList vs. LinkedList
Insertion
Insertion at index zero
ArrayList al = new ArrayList();
for (int i = 0; i < loop; i++) {
    al.add(0, Integer.valueOf(i));
}
Takes 328 ms

LinkedList ll = new LinkedList();
for (int i = 0; i < loop; i++) {
    ll.add(0, Integer.valueOf(i));
}
Takes 109 ms
LinkedList does insertion better than ArrayList
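Because both classes implement java.util.List, the insertion test can be written once against the interface and run with either implementation. This is a small sketch with hypothetical names (InsertionSketch, fillAtFront); it shows that the two lists end up with the same content, so only the cost of getting there differs.

```java
import java.util.ArrayList;
import java.util.LinkedList;
import java.util.List;

// Illustrative only: the same front-insertion loop for any List implementation.
final class InsertionSketch {
    // Inserts 0..count-1 at index zero, so the list ends up in descending order.
    static List<Integer> fillAtFront(List<Integer> list, int count) {
        for (int i = 0; i < count; i++) {
            list.add(0, Integer.valueOf(i));
        }
        return list;
    }

    public static void main(String[] args) {
        List<Integer> arrayList = fillAtFront(new ArrayList<>(), 5);
        List<Integer> linkedList = fillAtFront(new LinkedList<>(), 5);
        System.out.println(arrayList.equals(linkedList));
    }
}
```

ArrayList has to shift every existing element on each insertion at index zero, while LinkedList only relinks the head node.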
Vector is similar to ArrayList, but it is the synchronized version
Vector vs. ArrayList
Access and Insertion
Random Access
ArrayList al = new ArrayList();
for (int i = 0; i < loop; i++) {
    al.get(i);
}
Takes 281 ms

Vector vt = new Vector();
for (int i = 0; i < loop; i++) {
    vt.get(i);
}
Takes 422 ms

Sequential Access
ArrayList al = new ArrayList();
for (Iterator i = al.iterator(); i.hasNext();) {
    i.next();
}
Takes 1375 ms

Vector vt = new Vector();
for (Iterator i = vt.iterator(); i.hasNext();) {
    i.next();
}
Takes 1890 ms

Insertion
ArrayList al = new ArrayList();
for (int i = 0; i < loop; i++) {
    al.add(0, Integer.valueOf(i));
}
Takes 328 ms

Vector vt = new Vector();
for (int i = 0; i < loop; i++) {
    vt.add(0, Integer.valueOf(i));
}
Takes 360 ms

Vector is slower than ArrayList in every method
Use Vector only if synchronization is needed
Summary (times in ms)
Type         Random (get)   Sequential (Iterator)   Insertion
ArrayList    281            1375                    328
LinkedList   5828           1047                    109
Vector       422            1890                    360
Hashtable vs. HashMap
Addition and Access
Addition
Hashtable ht = new Hashtable();
for (int i = 0; i < loop; i++) {
    ht.put(Integer.valueOf(i), Integer.valueOf(i));
}
Takes 453 ms

HashMap hm = new HashMap();
for (int i = 0; i < loop; i++) {
    hm.put(Integer.valueOf(i), Integer.valueOf(i));
}
Takes 328 ms

Access
Hashtable ht = new Hashtable();
for (int i = 0; i < loop; i++) {
    ht.get(Integer.valueOf(i));
}
Takes 94 ms

HashMap hm = new HashMap();
for (int i = 0; i < loop; i++) {
    hm.get(Integer.valueOf(i));
}
Takes 47 ms
Hashtable is synchronized, so it is slower than HashMap
Q & A
Reference
• O'Reilly, Java Performance Tuning, 2nd Edition
• http://www.javaperformancetuning.com
• http://www.glenmccl.com/jperf/
Future Topic
• I/O, Logging, and Console Output
• Sorting
• Threading
• Tweak JVM and GC Strategy
The End