java performance tuning

Post on 10-May-2015

Category:

Technology

Java Performance Tuning

Atthakorn Chanthong

What is software tuning?

The software has poor response time.

I need it to run faster

User Experience

Software tuning means making an application run faster

Many people think Java applications are slow. Why?

There are two major reasons

The first is the Bottleneck

The Bottleneck

Lots of Casts

Increased Memory Use

Automatic memory management by the garbage collector

All objects are allocated on the heap.

Java applications are not native

The second is bad coding practice

How to make it run faster?

The bottleneck is unavoidable

But the developer can follow good coding practice

A good design

A good coding practice

Java applications normally run fast enough

So the tuning game comes into play

Knowing the strategy

Tuning Strategy

1. Identify the main causes

2. Choose the quickest and easiest one to fix

3. Fix it, then repeat for the other root causes

Inside the strategy

Tuning Strategy

Profile and measure; prioritize the problems

Identify the location of the bottleneck

Form a hypothesis

Alter the code

Test and compare before and after the alteration

Create a test scenario

Yes, it’s better

Still bad? The result isn’t good enough

Need it faster? Repeat
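The test-and-compare step can be sketched as a tiny timing harness. This is only a minimal sketch, not the author's tooling: `TuningTimer` and `timeMillis` are hypothetical names, and the "before" and "after" tasks are made-up examples. Naive timing like this is sensitive to JIT warm-up and GC pauses, so treat the numbers as rough hints and confirm them with a profiler.

```java
// Hypothetical helper (not from the slides): times a task before and after a change.
final class TuningTimer {
    private TuningTimer() {}

    /** Runs the task `rounds` times as warm-up, then times another `rounds` runs. */
    static long timeMillis(Runnable task, int rounds) {
        for (int i = 0; i < rounds; i++) {
            task.run(); // warm-up pass so the JIT has a chance to compile the code
        }
        long start = System.nanoTime();
        for (int i = 0; i < rounds; i++) {
            task.run();
        }
        return (System.nanoTime() - start) / 1_000_000L;
    }

    public static void main(String[] args) {
        int rounds = 100_000;
        // "Before": repeated String concatenation
        long before = timeMillis(() -> {
            String s = "";
            for (int i = 0; i < 10; i++) s += i;
        }, rounds);
        // "After": one StringBuilder per task
        long after = timeMillis(() -> {
            StringBuilder sb = new StringBuilder();
            for (int i = 0; i < 10; i++) sb.append(i);
            sb.toString();
        }, rounds);
        System.out.println("before=" + before + " ms, after=" + after + " ms");
    }
}
```

Running the same scenario before and after each code alteration keeps the comparison fair.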

How to measure the software performance?

We use the profiler

A profiler is a programming tool that can track the performance of another computer program

Profiler

Profile application performance

Monitor application memory usage

These are the two common uses of a profiler when analyzing a software problem

How do we get a profiler?

Don’t pay for it!

Open-source profilers are all around

Some interesting open-source profilers

Open-Source Profilers

JConsole
http://java.sun.com/developer/technicalArticles/J2SE/jconsole.html

Eclipse TPTP
http://www.eclipse.org/tptp/index.php

NetBeans Built-in Profiler
http://profiler.netbeans.org/

VisualVM
https://visualvm.dev.java.net/
This was pulled out of NetBeans to act as a standalone profiler

And many more …

DrMem
InfraRED
Jmeasurement
Profiler4j
Cougaar
TIJMP
JRat

We love opensource

Make the brain smart with good coding practice

1st Rule: Avoid Object Creation

Object creation causes problems. Why?

Lots of objects in memory means the GC does lots of work

The program slows down when the GC starts

Creating objects costs the application time and CPU effort

Reuse objects where possible

Pool Management

Most container objects (e.g. Vector) can be reused rather than created and thrown away

[Figure: a VectorPoolManager holding vectors V1 to V5, with getVector() and returnVector() operations]

// With a pool: vectors are borrowed and returned
public static VectorPoolManager vectorPoolManager = new VectorPoolManager(25);

public void doSome() {
    for (int i = 0; i < 10; i++) {
        Vector v = vectorPoolManager.getVector();
        // … do vector manipulation stuff
        vectorPoolManager.returnVector(v);
    }
}

// Without a pool: a new Vector is created on every iteration
public void doSome() {
    for (int i = 0; i < 10; i++) {
        Vector v = new Vector();
        // … do vector manipulation stuff
    }
}
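The slides only show the pool being used, so here is one way such a pool could look. This is a minimal sketch under assumptions: `VectorPoolSketch` and `idleCount()` are hypothetical names, and handing out the most recently returned vector first is an illustrative choice, not something the slides specify.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.Vector;

// Hypothetical sketch of a vector pool (the slides do not show the real implementation).
final class VectorPoolSketch {
    private final Deque<Vector<Object>> idle = new ArrayDeque<>();

    VectorPoolSketch(int size) {
        for (int i = 0; i < size; i++) {
            idle.push(new Vector<>());
        }
    }

    /** Hands out an idle vector, or creates a new one if the pool is exhausted. */
    synchronized Vector<Object> getVector() {
        return idle.isEmpty() ? new Vector<>() : idle.pop();
    }

    /** Clears the vector so stale elements do not leak, then puts it back. */
    synchronized void returnVector(Vector<Object> v) {
        v.clear();
        idle.push(v);
    }

    /** Number of vectors currently waiting in the pool. */
    synchronized int idleCount() {
        return idle.size();
    }
}
```

Clearing on return matters: a pooled container that keeps its old elements both leaks memory and hands stale data to the next caller.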

Canonicalizing Objects

Replace multiple objects with a single object or just a few

public class VectorPoolManager {
    // Not final: the single instance is created lazily on first use
    private static VectorPoolManager poolManager;
    private Vector[] pool;

    private VectorPoolManager(int size) {
        ....
    }

    public static Vector getVector() {
        if (poolManager == null)
            poolManager = new VectorPoolManager(20);
        ...
        return poolManager.pool[poolManager.pool.length - 1];
    }
}

Singleton Pattern

// 4 objects in memory
Boolean b1 = new Boolean(true);
Boolean b2 = new Boolean(false);
Boolean b3 = new Boolean(false);
Boolean b4 = new Boolean(false);

// 2 objects in memory
Boolean b1 = Boolean.TRUE;
Boolean b2 = Boolean.FALSE;
Boolean b3 = Boolean.FALSE;
Boolean b4 = Boolean.FALSE;

// No cache: always allocates a new object
String string = "55";
Integer theInt = new Integer(string);

// Object cached: small values can come from a shared cache
String string = "55";
Integer theInt = Integer.valueOf(string);

Caching inside Integer.valueOf(…)

private static class IntegerCache {
    private IntegerCache() {}

    static final Integer cache[] = new Integer[-(-128) + 127 + 1];

    static {
        for (int i = 0; i < cache.length; i++)
            cache[i] = new Integer(i - 128);
    }
}

public static Integer valueOf(int i) {
    final int offset = 128;
    if (i >= -128 && i <= 127) { // must cache
        return IntegerCache.cache[i + offset];
    }
    return new Integer(i);
}
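Canonical instances can be observed with identity checks. The sketch below assumes only the documented behaviour of `Integer.valueOf(int)` (values from -128 to 127 are cached) and `Boolean.valueOf(boolean)`; the class and method names are hypothetical. Outside the cached range, compare with equals() rather than ==.

```java
// Sketch: canonical instances can be observed with identity (==) checks.
final class CanonicalDemo {
    private CanonicalDemo() {}

    /** True because Integer.valueOf(int) must cache values from -128 to 127. */
    static boolean smallIntegersAreShared() {
        Integer a = Integer.valueOf(Integer.parseInt("55"));
        Integer b = Integer.valueOf(55);
        return a == b;
    }

    /** True because Boolean.valueOf(boolean) returns Boolean.TRUE or Boolean.FALSE. */
    static boolean booleansAreShared() {
        return Boolean.valueOf(true) == Boolean.TRUE
                && Boolean.valueOf(false) == Boolean.FALSE;
    }

    public static void main(String[] args) {
        System.out.println(smallIntegersAreShared());
        System.out.println(booleansAreShared());
    }
}
```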

Keyword, ‘final’

Use the final modifier on a variable to create an immutable, internally accessible object (the reference cannot be reassigned)

public void doSome(Dimension width, Dimension height) {
    // Reassignment allowed
    width = new Dimension(5, 5);
    ...
}

public void doSome(final Dimension width, final Dimension height) {
    // Reassignment disallowed: this line does not compile
    width = new Dimension(5, 5);
    ...
}

Auto-Boxing/Unboxing

Use auto-boxing only when needed, not by default

// Boxed counter, counting to 100M
Integer i = 0;
while (i < 100000000) {
    i++;
}

// Primitive counter, counting to 100M
int p = 0;
while (p < 100000000) {
    p++;
}

Takes 2313 ms

Takes 125 ms

Why does it take 2313/125 ≈ 18.5 times longer?

An object is created every time a primitive is wrapped by boxing
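The same effect shows up in any accumulator. A minimal sketch, with hypothetical names: both methods compute the same sum, but the boxed version unboxes and re-boxes on every step, which may allocate a new `Long` each time.

```java
// Sketch: the same loop with a boxed and a primitive accumulator.
final class BoxingDemo {
    private BoxingDemo() {}

    /** Boxed accumulator: each += unboxes, adds, and boxes the result again. */
    static long sumBoxed(int n) {
        Long total = 0L;
        for (int i = 0; i < n; i++) {
            total += i;
        }
        return total;
    }

    /** Primitive accumulator: no wrapper objects are involved. */
    static long sumPrimitive(int n) {
        long total = 0L;
        for (int i = 0; i < n; i++) {
            total += i;
        }
        return total;
    }
}
```

Keep wrappers at the edges (collections, APIs that need objects) and use primitives inside hot loops.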

2nd Rule: Know String Better

String is the most-used object in an application

Overlook String and the software may perform poorly

Compile-Time String Initialization

Use the string concatenation (+) operator to create Strings at compile time.

// Looping 10M rounds
for (int i = 0; i < loop; i++) {
    String x = "Hello" + "," + " " + "World";
}

// Looping 10M rounds
for (int i = 0; i < loop; i++) {
    String x = new String("Hello" + "," + " " + "World");
}

Takes 16 ms

Takes 672 ms
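Why is the first loop so cheap? A concatenation of literals is a constant expression, so the compiler folds it into one interned literal, while new String(…) allocates a copy each time. A small sketch with hypothetical names makes this visible through identity checks:

```java
// Sketch: literal concatenation is folded at compile time; new String(...) is not.
final class CompileTimeStringDemo {
    private CompileTimeStringDemo() {}

    // Constant expression: compiled as the single literal "Hello, World"
    static final String FOLDED = "Hello" + "," + " " + "World";

    /** True: the folded constant and the plain literal are the same interned object. */
    static boolean foldedIsLiteral() {
        return FOLDED == "Hello, World";
    }

    /** True: new String(...) always creates a distinct object with equal content. */
    static boolean copyIsDistinct() {
        String copy = new String(FOLDED);
        return copy != FOLDED && copy.equals(FOLDED);
    }
}
```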

Runtime String Initialization

Use StringBuffer/StringBuilder to create Strings at runtime.

// Looping 1M rounds
String name = "Smith";
for (int i = 0; i < loop; i++) {
    String x = "Hello";
    x += ",";
    x += " Mr.";
    x += name;
}

// Looping 1M rounds
String name = "Smith";
for (int i = 0; i < loop; i++) {
    String x = (new StringBuffer()).append("Hello")
            .append(",").append(" ")
            .append(name).toString();
}

Takes 10298 ms

Takes 6187 ms
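The gain is largest when a String grows inside a loop. Below is a minimal sketch using `StringBuilder` (the unsynchronized sibling of `StringBuffer`); `GreetingBuilder`, `greet`, and `joinWithCommas` are hypothetical names chosen for illustration.

```java
// Sketch: building Strings at runtime with a single StringBuilder.
final class GreetingBuilder {
    private GreetingBuilder() {}

    /** Builds the greeting in one buffer instead of several intermediate Strings. */
    static String greet(String name) {
        return new StringBuilder(32)
                .append("Hello")
                .append(",")
                .append(" Mr.")
                .append(name)
                .toString();
    }

    /** Joins parts with commas; one buffer is reused across all iterations. */
    static String joinWithCommas(String[] parts) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < parts.length; i++) {
            if (i > 0) {
                sb.append(',');
            }
            sb.append(parts[i]);
        }
        return sb.toString();
    }
}
```

Use StringBuffer instead only when several threads append to the same buffer.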

String comparison

Use the appropriate method to compare Strings

To test whether a String is empty

// 10M loops
for (int i = 0; i < loop; i++) {
    if (a != null && a.equals("")) {
    }
}

// 10M loops
for (int i = 0; i < loop; i++) {
    if (a != null && a.length() == 0) {
    }
}

Takes 125 ms

Takes 31 ms

If two strings have the same length

String a = "abc";
String b = "cdf";
for (int i = 0; i < loop; i++) {
    if (a.equalsIgnoreCase(b)) {
    }
}

String a = "abc";
String b = "cdf";
for (int i = 0; i < loop; i++) {
    if (a.equals(b)) {
    }
}

Takes 750 ms

Takes 125 ms

If two strings have different lengths

String a = "abc";
String b = "cdfg";
for (int i = 0; i < loop; i++) {
    if (a.equalsIgnoreCase(b)) {
    }
}

String a = "abc";
String b = "cdfg";
for (int i = 0; i < loop; i++) {
    if (a.equals(b)) {
    }
}

Takes 780 ms

Takes 858 ms

String.equalsIgnoreCase() does only two checks before comparing characters: it tests for identity, then whether the Strings are the same size

Intern String

To compare Strings by identity

Normally, a String can be created in two ways

By a String literal

String s = "This is a string literal.";

By new String(…)

String s = new String("This is a string literal.");

Creating Strings with new String(…)

The JVM always allocates a new memory address for each new String created, even if they have the same value.

String a = new String("This is a string literal.");
String b = new String("This is a string literal.");

[Figure: a and b each point to their own "This is a string literal." at different memory addresses]

Creating Strings from literals

The Strings are stored in a pool

Strings created twice from the same literal share a single instance

String a = "This is a string literal.";
String b = "This is a string literal.";

[Figure: a and b point to one "This is a string literal." at the same memory address]

We can point two String variables to the same address if they have the same value, by using the String.intern() method

String a = new String("This is a string literal.").intern();
String b = new String("This is a string literal.").intern();

[Figure: a and b point to one "This is a string literal." at the same memory address]

The idea is …

Interned Strings can be compared by identity

What does "compare by identity" mean?

if (a == b)          // identity comparison (by reference)

if (a.equals(b))     // value comparison

Identity comparison works on references, so it is fast

Traditionally, Strings must be compared with equals() to avoid false negatives

But with interned Strings…

If Strings have different values, they also have different addresses.

If Strings have the same value, they also have the same address.

So we can say that

(a == b) is equivalent to (a.equals(b))

For these String variables

String a = "abc";
String b = "abc";
String c = new String("abc").intern();

They point to the same address with the same value
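These identity rules can be checked directly. A minimal sketch with hypothetical names; it relies only on the language guarantee that literals are interned and that `String.intern()` returns the pooled instance.

```java
// Sketch: identity of Strings created with new, from literals, and via intern().
final class InternDemo {
    private InternDemo() {}

    /** True: two new String(...) objects are equal in value but not identical. */
    static boolean newStringsAreDistinct() {
        String a = new String("abc");
        String b = new String("abc");
        return a != b && a.equals(b);
    }

    /** True: intern() returns the pooled instance that the literal also refers to. */
    static boolean internedStringsAreShared() {
        String a = "abc";
        String c = new String("abc").intern();
        return a == c;
    }
}
```

The == shortcut is only safe when every String on both sides of the comparison has been interned.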

for (int i = 0; i < loop; i++) {
    if (a == b) {
    }
}

for (int i = 0; i < loop; i++) {
    if (a.equals(b)) {
    }
}

Takes 312 ms

Takes 32 ms

Wow, interned Strings are good

Unfortunately, interning makes code hard to understand, so use it carefully

String.intern() comes with overhead, as there is a step to cache the String

Intern Strings only if they are going to be compared two or more times

char array instead of String

For optimal performance, avoid doing some work through the String object itself

String x = "abcdefghijklmn";
for (int i = 0; i < loop; i++) {
    if (x.charAt(5) == 'x') {
    }
}

String x = "abcdefghijklmn";
char y[] = x.toCharArray();
for (int i = 0; i < loop; i++) {
    if ((20 < y.length && 20 >= 0) && y[20] == 'x') {
    }
}

Takes 281 ms

Takes 156 ms

3rd Rule: Exceptions and Casts

Stop exceptions from being thrown where possible

Exceptions are really expensive to execute

Object obj = null;
for (int i = 0; i < loop; i++) {
    try {
        obj.hashCode();
    } catch (Exception e) {
    }
}

Object obj = null;
for (int i = 0; i < loop; i++) {
    if (obj != null) {
        obj.hashCode();
    }
}

Takes 18563 ms

Takes 16 ms
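The two loops above do the same job with very different costs. As a minimal sketch (hypothetical names, and returning 0 for null is an arbitrary choice for the example), the same idea packaged as two helper methods:

```java
// Sketch: guarding with a test instead of relying on a thrown exception.
final class NullSafeHash {
    private NullSafeHash() {}

    /** Relies on catching NullPointerException: costly whenever obj is null. */
    static int hashWithException(Object obj) {
        try {
            return obj.hashCode();
        } catch (NullPointerException e) {
            return 0;
        }
    }

    /** Tests first, so no exception object is ever created for a null argument. */
    static int hashWithCheck(Object obj) {
        return obj != null ? obj.hashCode() : 0;
    }
}
```

Both return the same values; only the cost of the null path differs.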

Avoid Exception

Cast Less

We can reduce runtime cost by casting an object once and reusing the result when it is used several times

Integer io = new Integer(0);
Object obj = (Object) io;
for (int i = 0; i < loop; i++) {
    if (obj instanceof Integer) {
        byte x = ((Integer) obj).byteValue();
        double d = ((Integer) obj).doubleValue();
        float f = ((Integer) obj).floatValue();
    }
}

for (int i = 0; i < loop; i++) {
    if (obj instanceof Integer) {
        Integer icast = (Integer) obj;
        byte x = icast.byteValue();
        double d = icast.doubleValue();
        float f = icast.floatValue();
    }
}

Takes 31 ms

Takes 16 ms

4th Rule: The Rhythm of Motion

Loop Optimization

There are several ways to make a loop faster

Don’t terminate a loop with method calls

Eliminate Method Call

byte x[] = new byte[loop];
for (int i = 0; i < x.length; i++) {
    for (int j = 0; j < x.length; j++) {
    }
}

byte x[] = new byte[loop];
int length = x.length;
for (int i = 0; i < length; i++) {
    for (int j = 0; j < length; j++) {
    }
}

Takes 109 ms

Takes 62 ms

Method calls generate some overhead in the object-oriented paradigm

Use int to iterate over a loop

for (int i = 0; i < length; i++) {
    for (int j = 0; j < length; j++) {
    }
}

for (short i = 0; i < length; i++) {
    for (short j = 0; j < length; j++) {
    }
}

Takes 62 ms

Takes 125 ms

The VM is optimized to use int for loop iteration, not byte, short, or char

Use System.arraycopy(…) for copying arrays instead of looping over the elements

for (int i = 0; i < length; i++) {
    x[i] = y[i];
}

// Arguments are (source, sourcePos, destination, destPos, length)
System.arraycopy(y, 0, x, 0, length);

Takes 62 ms

Takes 16 ms

System.arraycopy(…)

System.arraycopy() is a native function, so it is efficient to use
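The argument order is easy to get backwards: source first, then destination. A minimal sketch with hypothetical helper names, showing both styles producing the same copy:

```java
// Sketch: copying an array element by element and with System.arraycopy.
final class ArrayCopyDemo {
    private ArrayCopyDemo() {}

    /** Manual copy: one assignment per element. */
    static byte[] copyWithLoop(byte[] src) {
        byte[] dest = new byte[src.length];
        for (int i = 0; i < src.length; i++) {
            dest[i] = src[i];
        }
        return dest;
    }

    /** Bulk copy: arguments are (source, sourcePos, destination, destPos, length). */
    static byte[] copyWithArraycopy(byte[] src) {
        byte[] dest = new byte[src.length];
        System.arraycopy(src, 0, dest, 0, src.length);
        return dest;
    }
}
```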

Terminate a loop by comparing with a primitive constant, not with a function or variable

Terminate Loop by Primitive

for (int i = 0; i < countArr.length; i++) {
    for (int j = 0; j < countArr.length; j++) {
    }
}

for (int i = countArr.length - 1; i >= 0; i--) {
    for (int j = countArr.length - 1; j >= 0; j--) {
    }
}

Takes 424 ms

Takes 298 ms

Comparison with a primitive constant (such as 0) is more efficient than comparison with a function or variable

Switch vs. If-else

On average, switch and if-else take about the same time in the random case

for (int i = 0; i < loop; i++) {
    if (i % 10 == 0) {
    } else if (i % 10 == 1) {
    ...
    } else if (i % 10 == 8) {
    } else if (i % 10 == 9) {
    }
}

for (int i = 0; i < loop; i++) {
    switch (i % 10) {
        case 0: break;
        case 1: break;
        ...
        case 7: break;
        case 8: break;
        default: break;
    }
}

Takes 2623 ms

Takes 2608 ms

Switch is quite fast if the case falls in the middle, but slower than if-else when the case falls at the beginning or is the default case

** Tested against a contiguous range of case values, e.g. 1, 2, 3, 4, ...

Recursive Algorithm

A recursive function is easy to read, but each recursive call has a cost

Tail Recursion

A recursive function in which each recursive call to itself is a reduction of the original call.

public static long factorial1(int n) {
    if (n < 2) return 1L;
    else return n * factorial1(n - 1);
}

public static long factorial1a(int n) {
    if (n < 2) return 1L;
    else return factorial1b(n, 1L);
}

public static long factorial1b(int n, long result) {
    if (n == 2) return 2L * result;
    else return factorial1b(n - 1, result * n);
}

Takes 172 ms

Takes 125 ms
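A tail-recursive function carries its partial result in a parameter, so it converts mechanically into a loop. The sketch below is an addition, not from the slides: `FactorialLoop` and its methods are hypothetical names, and the loop form is shown because the JVM is not required to reuse stack frames for tail calls.

```java
// Sketch: an accumulator-style (tail) recursion and its direct loop equivalent.
final class FactorialLoop {
    private FactorialLoop() {}

    /** Tail-recursive form: the recursive call is the last action performed. */
    static long factorialTail(int n, long result) {
        if (n < 2) {
            return result;
        }
        return factorialTail(n - 1, result * n);
    }

    /** Loop form: the parameters of the tail call become loop variables. */
    static long factorialLoop(int n) {
        long result = 1L;
        while (n >= 2) {
            result *= n;
            n--;
        }
        return result;
    }
}
```

Both overflow a long for n greater than 20, just like the versions on the slides.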

Recursive vs. Tail-Recursive

Dynamic-Cached Recursive

Cache results to gain more speed

public static long factorial1(int n) {
    if (n < 2) return 1L;
    else return n * factorial1(n - 1);
}

Takes 172 ms

public static final int CACHE_SIZE = 15;
public static final long[] factorial3Cache = new long[CACHE_SIZE];

public static long factorial3(int n) {
    if (n < 2) return 1L;
    else if (n < CACHE_SIZE) {
        if (factorial3Cache[n] == 0)
            factorial3Cache[n] = n * factorial3(n - 1);
        return factorial3Cache[n];
    }
    else return n * factorial3(n - 1);
}

Takes 172 ms

Takes 94 ms

Recursion Summary

Dynamic-Cached Tail Recursive is better than Tail Recursive, which is better than Recursive

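5th Rule: Use the Appropriate Collection

ArrayList vs. LinkedList

Access

Random Access

// Assumes the list already holds at least `loop` elements (filling is not shown on the slide)
ArrayList al = new ArrayList();
for (int i = 0; i < loop; i++) {
    al.get(i);
}

// Assumes the list already holds at least `loop` elements (filling is not shown on the slide)
LinkedList ll = new LinkedList();
for (int i = 0; i < loop; i++) {
    ll.get(i);
}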

Takes 281 ms

Takes 5828 ms

Sequential Access

ArrayList al = new ArrayList();
for (Iterator i = al.iterator(); i.hasNext();) {
    i.next();
}

LinkedList ll = new LinkedList();
for (Iterator i = ll.iterator(); i.hasNext();) {
    i.next();
}

Takes 1375 ms

Takes 1047 ms

ArrayList is good for random access

LinkedList is good for sequential access
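In practice this means choosing the traversal style to match the list type. A minimal sketch with hypothetical names: both methods return the same sum, but `get(i)` on a LinkedList has to walk the nodes on every call, while an iterator moves one node per step.

```java
import java.util.Iterator;
import java.util.List;

// Sketch: index-based and iterator-based traversal over any List.
final class ListWalk {
    private ListWalk() {}

    /** Index-based walk: cheap on ArrayList, costly on LinkedList. */
    static long sumByIndex(List<Integer> list) {
        long sum = 0;
        for (int i = 0; i < list.size(); i++) {
            sum += list.get(i);
        }
        return sum;
    }

    /** Iterator-based walk: one step per element on both list types. */
    static long sumByIterator(List<Integer> list) {
        long sum = 0;
        for (Iterator<Integer> it = list.iterator(); it.hasNext();) {
            sum += it.next();
        }
        return sum;
    }
}
```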

Random vs. Sequential Access

// Assumes the list already holds at least `loop` elements (filling is not shown on the slide)
ArrayList al = new ArrayList();
for (int i = 0; i < loop; i++) {
    al.get(i);
}

Takes 281 ms

LinkedList ll = new LinkedList();
for (Iterator i = ll.iterator(); i.hasNext();) {
    i.next();
}

Takes 1047 ms

Random access is better than sequential access

Insertion

ArrayList vs. LinkedList

Insertion at index zero

ArrayList al = new ArrayList();
for (int i = 0; i < loop; i++) {
    al.add(0, Integer.valueOf(i));
}

LinkedList ll = new LinkedList();
for (int i = 0; i < loop; i++) {
    ll.add(0, Integer.valueOf(i));
}

Takes 328 ms

Takes 109 ms

LinkedList does insertion better than ArrayList

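Vector is similar to ArrayList, but it is the synchronized version

Vector vs. ArrayList

Access and Insertion

Random Access

// Assumes the list already holds at least `loop` elements (filling is not shown on the slide)
ArrayList al = new ArrayList();
for (int i = 0; i < loop; i++) {
    al.get(i);
}

// Assumes the vector already holds at least `loop` elements (filling is not shown on the slide)
Vector vt = new Vector();
for (int i = 0; i < loop; i++) {
    vt.get(i);
}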

Takes 281 ms

Takes 422 ms

Sequential Access

ArrayList al = new ArrayList();
for (Iterator i = al.iterator(); i.hasNext();) {
    i.next();
}

Vector vt = new Vector();
for (Iterator i = vt.iterator(); i.hasNext();) {
    i.next();
}

Takes 1375 ms

Takes 1890 ms

Insertion

ArrayList al = new ArrayList();
for (int i = 0; i < loop; i++) {
    al.add(0, Integer.valueOf(i));
}

Vector vt = new Vector();
for (int i = 0; i < loop; i++) {
    vt.add(0, Integer.valueOf(i));
}

Takes 328 ms

Takes 360 ms

Vector is slower than ArrayList in every method

Use Vector only if synchronization is needed

Summary

Type         Random (get)   Sequential (Iterator)   Insertion
ArrayList    281 ms         1375 ms                 328 ms
LinkedList   5828 ms        1047 ms                 109 ms
Vector       422 ms         1890 ms                 360 ms

Hashtable vs. HashMap

Addition and Access

Addition

Hashtable ht = new Hashtable();
for (int i = 0; i < loop; i++) {
    ht.put(Integer.valueOf(i), Integer.valueOf(i));
}

HashMap hm = new HashMap();
for (int i = 0; i < loop; i++) {
    hm.put(Integer.valueOf(i), Integer.valueOf(i));
}

Takes 453 ms

Takes 328 ms

Access

Hashtable ht = new Hashtable();
for (int i = 0; i < loop; i++) {
    ht.get(Integer.valueOf(i));
}

HashMap hm = new HashMap();
for (int i = 0; i < loop; i++) {
    hm.get(Integer.valueOf(i));
}

Takes 94 ms

Takes 47 ms

Hashtable is synchronized so it is slower than HashMap

Q & A

Reference

- O'Reilly, Java Performance Tuning, 2nd edition

- http://www.javaperformancetuning.com

- http://www.glenmccl.com/jperf/

Future Topics

- I/O, Logging, and Console Output

- Sorting

- Threading

- Tweak JVM and GC Strategy

The End
