A brief explanation on Serialization

Serialization in Java
February 3, 2013
This document describes serialization in Java.
SIDIBE Ali Broma [email protected]


Introduction
Serialization
    Uses of serialization
    Example of Serialization Use
Serialization in Java
    Interface Serializable
    Example: Serialization of an Object
    Working with ObjectOutputStream and ObjectInputStream
    Default Serialization/Deserialization
    Custom Serialization/Deserialization
    Example: Custom Serialization of an Object Developer
Serialization and IS-A HAS-A relationship
    IS-A Relationship
        Extends
        Implements
    How Inheritance Affects Serialization
        Constructor chaining
    Super Constructor and Serialization
    Relationship HAS-A and Serialization
        Object graphs
Serialization and StackOverflow
Serialization and Security
    Avoid Serialization
    Avoid Deserialization of our object
    Serialization ID
    Computing the serialVersionUID Algorithm
InvalidClassException
    Inheritance
    Let's produce this exception


Introduction

Imagine you want to save the state of one or more objects, for example a User object. Without serialization, you would have to use one of the I/O classes to write out the state of the instance variables of every object you want to save. The worst part would be reconstructing new objects that are virtually identical to the ones you saved. You would need your own protocol for the way you write and restore the state of each object, or you could end up setting variables with the wrong values. For example, imagine you store an object that has instance variables for login and password. When you save the state of the object, you could write the login and the password as two String objects in a file, but the order in which you write them is crucial.

It would be all too easy to re-create the object but mix up the login and password values, using the saved password as the value for the new object's login and vice versa.

Alternatively, you could use a JSON library (an external dependency) to write your object as a list of key/value pairs, and then restore the object from that data, either by hand or automatically.

Serialization lets you simply say "save this object and all of its instance variables."

Actually it is a little more interesting than that, because you can add, "... unless I've explicitly marked a

variable as transient, which means, don't include the transient variable's value as part of the object's

serialized state."

As we know, objects and their instance variables always live on the heap. Local variables live on the stack and are discarded when their method finishes running. Once you have declared and initialized a variable, a natural question is "How long will this variable be around?" This is a question regarding the scope of variables.

For the purposes of discussing the scope of variables, we can say that there are four basic scopes:

Static variables have the longest scope; they are created when the class is loaded, and they

survive as long as the class stays loaded in the Java Virtual Machine (JVM).

Instance variables are the next most long-lived; they are created when a new instance is

created, and they live until the instance is removed.

Local variables are next; they live as long as their method remains on the stack. As we'll soon

see, however, local variables can be alive, and still be "out of scope".

Block variables live only as long as the code block is executing.

As the scopes above show, no object survives once the application exits and the JVM shuts down. So my first question is: "How can an object persist beyond the lifetime of the JVM and be restored at any time?" This is the main purpose of the process called "Serialization".

Serialization

Serialization is the process of converting a set of object instances that contain references to each other into a linear stream of bytes, which can then be sent through a socket, stored in a file, or simply manipulated as a stream of data. The term also covers the reverse process of rebuilding those bytes into live objects at some future time. So when is serialization used? It is used when you want to persist an object. It is also used by RMI to pass objects between JVMs, either as arguments in a method invocation from a client to a server or as return values from a method invocation. In general, serialization is used when we want an object to exist beyond the lifetime of the JVM.

The most obvious use is that you can transmit the serialized object over a network, and the recipient can construct a duplicate of the original instance. Likewise, you can save a serialized structure to a file system. Also note that serialization is recursive, so you can serialize an entire heterogeneous data structure in one step.

Uses of serialization

Serialization is widely used in input/output operations in Java. Here are several reasons:

Communication: If you have two machines that are running the same code and need to communicate, an easy way is for one machine to build an object with the information it would like to transmit, and then serialize that object to the other machine. It's not the best method for communication, but it gets the job done.

Persistence: If you want to store the state of a particular operation in a database, it can easily be serialized to a byte array and stored in the database for later retrieval.

Deep Copy: If you need an exact replica of an object, and don't want to go to the trouble of writing your own specialized clone() method, simply serializing the object to a byte array and then de-serializing it to another object achieves this goal.

Caching: Really just an application of the above, but sometimes an object takes 10 minutes to build and would only take 10 seconds to de-serialize. So, rather than hold onto the giant object in memory, just cache it out to a file via serialization, and read it in later when it's needed.

Cross JVM Synchronization: Serialization works across different JVMs that may be running on different architectures.

In addition to the reasons above, here are some others:

To send data to a remote computer using client/server Java technologies such as RMI or socket programming.

To "flatten" an object into an array of bytes in memory.

To exchange data between applets and servlets.

To store user sessions in Web applications.

To activate and passivate Enterprise JavaBeans.

To send objects between the servers in a cluster.
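The "Deep Copy" use listed above can be sketched as follows. This is only a minimal illustration: the class name DeepCopyDemo and the helper deepCopy are hypothetical names chosen for this sketch and do not appear elsewhere in this document.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class DeepCopyDemo {

    // Serialize to an in-memory byte array, then deserialize it again:
    // the result is an independent copy of the whole object graph.
    @SuppressWarnings("unchecked")
    static <T extends Serializable> T deepCopy(T original)
            throws IOException, ClassNotFoundException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(buffer)) {
            out.writeObject(original);
        }
        try (ObjectInputStream in = new ObjectInputStream(
                new ByteArrayInputStream(buffer.toByteArray()))) {
            return (T) in.readObject();
        }
    }

    public static void main(String[] args) throws Exception {
        ArrayList<List<String>> teams = new ArrayList<>();
        teams.add(new ArrayList<>(Arrays.asList("Ali", "Broma")));

        ArrayList<List<String>> copy = deepCopy(teams);
        copy.get(0).add("Sidibe");   // mutate the nested list of the copy only

        System.out.println(teams);   // the original nested list is unchanged
        System.out.println(copy);
    }
}
```

The copy is usually slower than a hand-written clone, and it only works if every object reachable from the original is serializable.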


Example of Serialization Use

When you want to send money or any other item to someone, you first gather everything, pack it and wrap it (serialization), and then send it, save it or store it, depending on what you need. Later, when you or the receiver want to retrieve the pieces, the package has to be unwrapped (deserialization).

Banking example: When an account holder tries to withdraw money through an ATM, the account holder's information along with the withdrawal details is serialized (marshalled/flattened to bytes) and sent to the server, where the details are deserialized (rebuilt from the bytes) and used to perform the operation. This reduces the number of network calls: the whole object is serialized and sent at once, so the server does not need to ask the client for further information.

Stock example: Let's say a user wants stock updates immediately when he requests them. To achieve this, every time we have an update, we can serialize it and save it in a file. When the user requests the information, we deserialize it from the file and provide it. This way the user does not have to wait while we hit the database, perform computations and get the result.

So far we have seen what serialization is and when it is used. In the following sections, we will look at serialization in the Java context: how Java performs serialization.

Serialization in Java

The Java language provides the interface java.io.Serializable to make a class serializable. The Serializable interface has no methods or fields and serves only to identify the semantics of being serializable. It is a marker interface. The fact that Serializable lives in the java.io package confirms what we have seen so far: the interface is related to input/output operations.

Interface Serializable:

Serializable is just a marker interface. As we said, it has no methods and no fields. It can be extended by another interface or implemented by any class.

The interface java.io.Serializable defines no methods (such interfaces are called "marker" or "tag" interfaces). Implementing Serializable, or extending a class that implements Serializable, identifies the class as one that participates in serialization. Its instances can be used as the argument of ObjectOutputStream.writeObject and as the result of ObjectInputStream.readObject. If an object is encountered that is not serializable (e.g., a collection element), these methods throw NotSerializableException.
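The following sketch illustrates the collection-element case: the list itself is serializable, but one element is not. The class names NotSerializableDemo and Session, and the helper tryToSerialize, are hypothetical and exist only for this illustration.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.NotSerializableException;
import java.io.ObjectOutputStream;
import java.util.ArrayList;
import java.util.List;

class NotSerializableDemo {

    // Hypothetical class that does NOT implement Serializable.
    static class Session {
        final String token = "abc";
    }

    // Returns null on success, or the exception message (the name of the
    // offending class) when an object in the graph is not serializable.
    static String tryToSerialize(Object object) throws IOException {
        try (ObjectOutputStream out = new ObjectOutputStream(new ByteArrayOutputStream())) {
            out.writeObject(object);
            return null;
        } catch (NotSerializableException e) {
            return e.getMessage();
        }
    }

    public static void main(String[] args) throws IOException {
        List<Object> items = new ArrayList<>();      // ArrayList itself is Serializable
        items.add("a String is fine");
        System.out.println(tryToSerialize(items));   // success

        items.add(new Session());                    // one bad element is enough
        System.out.println(tryToSerialize(items));   // reports the Session class
    }
}
```

Note that the code compiles without complaint; the failure only shows up when writeObject reaches the Session element at run time.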

Many library classes are serializable, including String, the collection classes, the wrapper classes, GUI component classes, Date, Color, Point, and URL. Library classes that are not serializable include Thread, the reflection classes (Method, etc.), the stream classes, Socket, Graphics, and Image. Generally, these are the classes that have implementations or "peers" that are system-dependent.

At run time, Java applies the "default serialization" mechanism described in the next section to every class that implements Serializable. (We will see how to customize serialization below.) It stores all non-static, non-transient instance variables whose values are serializable objects or primitive types, including such variables inherited from serializable ancestors. The default implementation handles shared and circular object references and class identity. However, if an object includes variables of class type that refer to objects whose classes are not serializable, the object stream methods will signal NotSerializableException when attempting to write or read an instance. Similarly, if a collection is serializable but contains objects that are not serializable, an exception will be thrown. Note that this is a run-time exception, rather than a compiler error. For example, an object (like all collections) may have a field of type Object, and Object itself is not serializable. If that field refers to an instance of a Serializable class, no exception occurs upon serialization. We will see below that variables marked as transient are not serialized. If the default mechanism is adequate (i.e., all fields are serializable and no special processing is needed), a class need only declare that it implements Serializable to be serializable.
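To make the handling of shared and circular references concrete, here is a small sketch. The Node class and the roundTrip helper are hypothetical names used only for this illustration.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

class SharedReferenceDemo {

    // Hypothetical node type used only for this illustration.
    static class Node implements Serializable {
        private static final long serialVersionUID = 1L;
        final String name;
        Node next;
        Node(String name) { this.name = name; }
    }

    // Serialize to memory and read the object graph back.
    static Object roundTrip(Object object) throws IOException, ClassNotFoundException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(buffer)) {
            out.writeObject(object);
        }
        try (ObjectInputStream in = new ObjectInputStream(
                new ByteArrayInputStream(buffer.toByteArray()))) {
            return in.readObject();
        }
    }

    public static void main(String[] args) throws Exception {
        Node a = new Node("a");
        Node b = new Node("b");
        a.next = b;
        b.next = a;                 // circular reference: a -> b -> a

        Node[] pair = { a, a };     // the same object referenced twice
        Node[] restored = (Node[]) roundTrip(pair);

        System.out.println(restored[0] == restored[1]);            // true: sharing is preserved
        System.out.println(restored[0].next.next == restored[0]);  // true: the cycle is preserved
    }
}
```

Each object is written once per stream; later occurrences are written as back-references, which is why the cycle does not cause endless recursion.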

Example: Serialization of an Object

In this document we will create a utility class, SerializationUtils, to perform the serialization and deserialization operations. The SerializationUtils class has two static generic methods. We use generics to ensure that every object passed in really passes the IS-A Serializable test. A simple way to do that is to use a Serializable-bounded type as the parameter of the first method and as the return type of the second method. We will not focus on generics in this document; we use them only to simplify our work.

    import java.io.FileInputStream;
    import java.io.FileOutputStream;
    import java.io.IOException;
    import java.io.ObjectInputStream;
    import java.io.ObjectOutputStream;
    import java.io.Serializable;

    public final class SerializationUtils {

        private SerializationUtils() { }

        /**
         * Writes the byte stream of the given object to a file.
         *
         * @param serializable   the object to serialize
         * @param outPutfileName the file the stream will be written to
         * @throws IOException if an I/O error occurs during serialization
         */
        public static <T extends Serializable> void serializeObject(T serializable, String outPutfileName)
                throws IOException {
            // try-with-resources flushes and closes the streams, even if writeObject fails
            try (ObjectOutputStream outputStreamWriter =
                    new ObjectOutputStream(new FileOutputStream(outPutfileName))) {
                outputStreamWriter.writeObject(serializable);
            }
        }

        /**
         * Deserializes the stream stored in a file into a serializable object.
         *
         * @param fileName the file where the stream is stored
         * @return the deserialized object
         * @throws IOException            if an I/O error occurs during deserialization
         * @throws ClassNotFoundException if the class of the stored object cannot be found
         */
        @SuppressWarnings("unchecked")
        public static <T extends Serializable> T deserializeObject(String fileName)
                throws IOException, ClassNotFoundException {
            try (ObjectInputStream objectInputStream =
                    new ObjectInputStream(new FileInputStream(fileName))) {
                return (T) objectInputStream.readObject();
            }
        }
    }

With this class in place, all we have to do is create a class MainClassTest that uses our SerializationUtils class to perform serialization and deserialization.

To continue with our serialization example, let us create a class Developer that implements Serializable. As we have seen so far, we can then serialize it.

    import java.io.Serializable;

    public class Developer implements Serializable {

        private String login;
        private String password;

        public Developer(String login, String password) {
            this.login = login;
            this.password = password;
        }
    }

MainClassTest: In our main method, we first create a Developer object. The output file name is serialization.ser. You can use any name you want; the extension (.ser) does not matter. I think the code is easy enough to follow, and I apologize if it is not as clean as it could be!

    import java.io.IOException;

    public class MainClassTest {

        private static final String IO_FILENAME = "serialization.ser";

        public static void main(String[] args) {

            System.out.println("Begin of Serialization");
            try {
                Developer developer = new Developer("teslogin", "tespasswort");
                SerializationUtils.serializeObject(developer, IO_FILENAME);
                System.out.println("Serialization success");
            } catch (IOException e) {
                System.out.println("Serialization failed : " + e.getMessage());
                e.printStackTrace();
            }

            System.out.println("Begin of Deserialization");
            try {
                Developer developer = SerializationUtils.deserializeObject(IO_FILENAME);
                // Developer does not override toString(), so this prints the default Object representation
                System.out.println("Deserialization success : " + developer.toString());
            } catch (IOException e) {
                System.out.println("Deserialization failed : " + e.getMessage());
                e.printStackTrace();
            } catch (ClassNotFoundException e) {
                System.out.println("Deserialization failed : " + e.getMessage());
                e.printStackTrace();
            }
        }
    }

Note that all fields of the class Developer are serializable and that Developer directly extends Object. This matters because, if the class referenced another object (HAS-A relationship) that is not serializable, the serialization process would behave differently. At this stage, we are using serialization in its most basic form.

Working with ObjectOutputStream and ObjectInputStream

The magic of basic serialization happens with just two methods: one to serialize objects and write them

to a stream, and a second to read the stream and deserialize objects.

ObjectOutputStream.writeObject() // serialize and write

ObjectInputStream.readObject() // read and deserialize

A serializable class can also define its own writeObject and readObject methods (see Custom Serialization/Deserialization). In that case, the writeObject method is responsible for writing the state of the object for its particular class so that the corresponding readObject method can restore it. The default mechanism for saving the object's fields can be invoked by calling out.defaultWriteObject. The method does not need to concern itself with the state belonging to its superclasses or subclasses. State is saved by writing the individual fields to the ObjectOutputStream using the writeObject method or by using the methods for primitive data types supported by DataOutput.

The readObject method is responsible for reading from the stream and restoring the class's fields. It may call in.defaultReadObject to invoke the default mechanism for restoring the object's non-static and non-transient fields. The defaultReadObject method uses information in the stream to assign the fields of the object saved in the stream to the correspondingly named fields in the current object. This handles the case when the class has evolved to add new fields. The method does not need to concern itself with the state belonging to its superclasses or subclasses. State is restored by reading the individual fields from the ObjectInputStream using the readObject method or by using the methods for primitive data types supported by DataInput.

The java.io.ObjectOutputStream and java.io.ObjectInputStream classes are considered to be higher-level classes in the java.io package. That means you wrap them around lower-level classes, such as java.io.FileOutputStream and java.io.FileInputStream.
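As a minimal sketch of this wrapping, the example below stacks an object stream on a buffered stream on a file stream, and mixes the primitive methods inherited from DataOutput/DataInput with writeObject/readObject. The class name StreamWrappingDemo and the version/label/when values are hypothetical and serve only as illustration; the important point is that values must be read back in the order they were written.

```java
import java.io.BufferedInputStream;
import java.io.BufferedOutputStream;
import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.util.Date;

class StreamWrappingDemo {

    // Higher-level object stream wrapped around lower-level buffered/file streams.
    static void write(File file, int version, String label, Date when) throws IOException {
        try (ObjectOutputStream out = new ObjectOutputStream(
                new BufferedOutputStream(new FileOutputStream(file)))) {
            out.writeInt(version);   // primitive, from DataOutput
            out.writeUTF(label);     // string in modified UTF-8, from DataOutput
            out.writeObject(when);   // full object serialization
        }
    }

    // Values are read back in exactly the order they were written.
    static String read(File file) throws IOException, ClassNotFoundException {
        try (ObjectInputStream in = new ObjectInputStream(
                new BufferedInputStream(new FileInputStream(file)))) {
            int version = in.readInt();
            String label = in.readUTF();
            Date when = (Date) in.readObject();
            return version + "|" + label + "|" + when.getTime();
        }
    }

    public static void main(String[] args) throws Exception {
        File file = File.createTempFile("demo", ".ser");
        file.deleteOnExit();
        write(file, 1, "release", new Date(0L));
        System.out.println(read(file));   // prints 1|release|0
    }
}
```

Closing the outermost stream (here through try-with-resources) also flushes and closes the streams it wraps.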

Default Serialization/Deserialization:

The default serialization mechanism for an object writes the class of the object, the class signature, and the values of all non-transient and non-static fields. To use default serialization, a class implements Serializable or extends a serializable class. If a class's superclass is not serializable, the class can still implement Serializable, provided the superclass has an accessible no-argument constructor. We will see that a class must be serializable for it to be used as the parameter or return type of a remote method. The example above uses default serialization.

If an instance variable should not be serialized, mark it as transient. For example, we would declare an instance variable transient if its type is not serializable, or if its value depends on run-time conditions or can be computed from other information in the object.

Note that after default deserialization every transient field holds the default value for its type (null for references, 0 for numbers, false for booleans).
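The behavior of transient fields can be checked with a short sketch. The Account class and the roundTrip helper are hypothetical names introduced only for this illustration.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

class TransientDemo {

    // Hypothetical class for illustration only.
    static class Account implements Serializable {
        private static final long serialVersionUID = 1L;
        String login;
        transient String password;   // skipped by default serialization
        transient int attempts = 3;  // this initializer does not run again on deserialization

        Account(String login, String password) {
            this.login = login;
            this.password = password;
        }
    }

    // Serialize to memory and read the object back.
    static Account roundTrip(Account account) throws IOException, ClassNotFoundException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(buffer)) {
            out.writeObject(account);
        }
        try (ObjectInputStream in = new ObjectInputStream(
                new ByteArrayInputStream(buffer.toByteArray()))) {
            return (Account) in.readObject();
        }
    }

    public static void main(String[] args) throws Exception {
        Account restored = roundTrip(new Account("ali", "secret"));
        System.out.println(restored.login);      // ali
        System.out.println(restored.password);   // null
        System.out.println(restored.attempts);   // 0
    }
}
```

The attempts field comes back as 0 rather than 3 because neither the constructor nor the field initializers of a serializable class run during deserialization.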

Custom Serialization/Deserialization

Not every piece of program state can, or should be, serialized. Some things, like FileDescriptor or Thread objects, are inherently platform-specific or virtual-machine-dependent. If a FileDescriptor were serialized, it would have no meaning when deserialized in a different virtual machine. For this reason, and also for important security reasons, not all objects can be serialized.

Even when an object is serializable, it may not make sense for it to serialize all of its state.

The transient modifier keyword has always been a legal part of the Java language, but it was not

assigned any meaning until Java 1.1.

There are situations where a field is not transient, i.e., it does contain an important part of an object's state, but for some reason (security, for example) it cannot be serialized as is. A class can define custom serialization and deserialization behavior for its objects by implementing writeObject() and readObject() methods.

These methods must be declared private, which is surprising if you think about it, as they are called from outside of the class during serialization and deserialization. If a class defines these methods, the appropriate one is invoked by the ObjectOutputStream or ObjectInputStream when an object is serialized or deserialized.

The fact that these methods are private also prevents them from being declared in an interface. Sometimes it is necessary to use custom serialization:

Let us modify our class Developer by marking password as transient. We cannot simply mark the password field transient and keep using default serialization, because the field would be null after deserialization. At the same time, for security reasons, we do not want to store the password or send it over a network without encoding it. The variable password is therefore marked transient so that the default mechanism does not serialize its value. The class defines writeObject to call defaultWriteObject, which serializes the values of all other instance variables and handles the object's class identity, and to use an encode method when serializing the value of the password variable. The readObject method performs the corresponding operations in the same order, using a decode method for the password. Note that readObject and writeObject do not handle the exceptions that can occur, but propagate them to the caller.

Example: Custom Serialization of an Object Developer

To replace the default serialization/deserialization, we define the private writeObject and readObject methods that will be used in the serialization process. In effect we tell the JVM: "Please, if you have to serialize or deserialize me, use my own writeObject and readObject methods rather than yours; I know what I am doing." Never forget to say "please", because the JVM likes good manners.

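    import java.io.IOException;
    import java.io.ObjectInputStream;
    import java.io.ObjectOutputStream;
    import java.io.Serializable;

    public class Developer implements Serializable {

        private String login;
        // transient: skipped by the default mechanism and written in encoded form below
        private transient String password;

        public Developer(String login, String password) {
            super();
            this.login = login;
            this.password = password;
        }

        private void writeObject(ObjectOutputStream os) throws IOException {
            os.defaultWriteObject();                               // writes login
            os.writeObject(SerializationUtils.encode(password));  // writes the encoded password
        }

        private void readObject(ObjectInputStream is) throws IOException, ClassNotFoundException {
            is.defaultReadObject();                                // restores login
            String value = (String) is.readObject();               // reads the encoded password
            System.out.println("Password : to decode :" + value);
            password = SerializationUtils.decode(value);
        }
    }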

We can use our MainClassTest to test the serialization and deserialization process. Do not forget to define the static methods encode and decode in your SerializationUtils class (how you encode and decode is up to you).

In the previous part, we used serialization with a class Developer. As we saw, Developer extends no class other than Object and has only serializable fields (String is serializable). But what happens if Developer extends another class, for example Person, or has an instance of Computer?
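The document leaves the implementation of encode and decode to the reader. As a placeholder only, the sketch below uses Base64 from the standard library. This is an assumption made for illustration, not the author's method: Base64 is a reversible encoding, not encryption, so it does not protect a password. The class name PasswordCodec is hypothetical; the two methods could equally be placed in SerializationUtils.

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;

class PasswordCodec {

    // Placeholder encoding (NOT encryption): plain text -> Base64 text.
    static String encode(String plain) {
        if (plain == null) {
            return null;
        }
        return Base64.getEncoder().encodeToString(plain.getBytes(StandardCharsets.UTF_8));
    }

    // Reverses encode(): Base64 text -> plain text.
    static String decode(String encoded) {
        if (encoded == null) {
            return null;
        }
        return new String(Base64.getDecoder().decode(encoded), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String encoded = encode("tespasswort");
        System.out.println(encoded);
        System.out.println(decode(encoded));   // prints the original text again
    }
}
```

In a real application a proper cryptographic scheme would be needed; better still, passwords are usually not stored in recoverable form at all.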

Serialization and IS-A HAS-A relationship

IS-A Relationship

In OO, the concept of IS-A is based on class inheritance or interface implementation. IS-A is a way of saying, "this thing is a type of that thing." For example, a Developer is a type of Person (it's debatable), so in OO terms we can say, "Developer IS-A Person." You express the IS-A relationship in Java through the keywords extends (for class inheritance) and implements (for interface implementation).

Extends:

Given two classes, SuperClass and B: if B extends SuperClass, then every object of type B passes the IS-A SuperClass test. This holds not only for B but also for every object of a subclass of B. If the expression (Foo instanceof Bar) is true, then class Foo IS-A Bar, even if Foo doesn't directly extend Bar, but instead extends some other class that is a subclass of Bar.

Implements:

Given an interface Interface and a class B: if B implements Interface, then every object of type B passes the IS-A Interface test. This holds not only for B but also for every object of a subclass of B, even if the subclass does not directly implement Interface. In short, if the expression (Foo instanceof IBar) is true, then class Foo IS-A IBar, even if Foo doesn't directly implement IBar, but instead extends some other class that implements IBar.

This is very important for our case. In the next step, we will see that if a class B is Serializable, then all subclasses of B are Serializable too.

How Inheritance Affects Serialization

Serialization is very cool, but in order to apply it effectively you have to understand how your class's superclasses affect serialization. In this step, we will discuss the cases where the superclass is or is not Serializable, but first we will look at object construction step by step.

If a superclass is Serializable, then according to normal Java interface rules, all subclasses of that class automatically implement Serializable implicitly. In other words, a subclass of a class marked Serializable passes the IS-A test for Serializable, and thus can be saved without having to explicitly mark the subclass as Serializable. You simply cannot tell whether a class is or is not Serializable UNLESS you can see the class inheritance tree to check whether any superclass implements Serializable. If the class does not explicitly extend any other class, and does not implement Serializable, then you know for CERTAIN that the class is not Serializable, because class Object does NOT implement Serializable. We will not dwell on the case of a serializable superclass, because it causes no problems.
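The IS-A test for Serializable can be checked directly with instanceof. In this sketch the nested classes Person, Developer and Computer are hypothetical stand-ins, and here (unlike in the example that follows in the text) Person is the serializable class.

```java
import java.io.Serializable;

class IsASerializableDemo {

    // Hypothetical classes for illustration only.
    static class Person implements Serializable {
        private static final long serialVersionUID = 1L;
    }

    static class Developer extends Person { }   // no "implements Serializable" here

    static class Computer { }                   // extends Object only

    static boolean isSerializable(Object object) {
        return object instanceof Serializable;
    }

    public static void main(String[] args) {
        System.out.println(isSerializable(new Developer()));  // true, inherited from Person
        System.out.println(isSerializable(new Computer()));   // false
        System.out.println(isSerializable(new Object()));     // false, Object is not Serializable
    }
}
```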

That brings up another key issue with serialization: what happens if a superclass is not marked Serializable, but the subclass is?

Before going into detail, let us look at the order in which constructors are called.

To explain that, take two classes, Developer and Person (Developer extends Person), and let us see what happens when a Developer is instantiated.

This is the code of the Person class (for Developer, please add extends Person):

    public class Person {

        protected String name;
        private String skill;

        public Person(String name, String skill) {
            super();
            this.name = name;
            this.skill = skill;
        }

        public Person() {
            super();
        }
    }

And the Developer class declaration:

    import java.io.Serializable;

    public class Developer extends Person implements Serializable {

        private String login;
        private String password;

        public Developer(String name, String skill, String login, String password) {
            super(name, skill);
            this.login = login;
            this.password = password;
        }
    }

Now what happens when we want to create a Developer object? We know that constructors are invoked at runtime when you say new on some class type, as follows:

Constructor chaining:

    Developer developer = new Developer("nameTest", "skillTest", "logintest", "passwordtest");

1. The Developer constructor is invoked. Every constructor invokes the constructor of its superclass with an (implicit) call to super(), unless the constructor invokes an overloaded constructor of the same class.

2. The Person constructor is invoked (Person is the superclass of Developer).

3. The Object constructor is invoked (Object is the ultimate superclass of all classes, so class Person extends Object even though you don't actually type "extends Object" into the Person class declaration. It's implicit.) At this point we're on the top of the stack.

4. Object instance variables are given their explicit values. By explicit values, we mean values that are assigned at the time the variables are declared, like "int x = 27", where "27" is the explicit value (as opposed to the default value) of the instance variable.

5. The Object constructor completes.

6. Person instance variables are given their explicit values (if any).

7. The Person constructor completes.

8. Developer instance variables are given their explicit values (if any).

9. The Developer constructor completes.
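The order of these steps can be observed with a small trace. This is only a sketch: the class ConstructorChainDemo, its simplified Person and Developer classes, and the log and trace helpers are hypothetical names used to record each step as it happens.

```java
import java.util.ArrayList;
import java.util.List;

class ConstructorChainDemo {

    // Records each construction step in the order it actually happens.
    static final List<String> STEPS = new ArrayList<>();

    static String log(String step) {
        STEPS.add(step);
        return step;
    }

    static class Person {
        String name = log("Person: field initializer");

        Person() {
            // the implicit super() call to Object() has already completed here
            log("Person: constructor body");
        }
    }

    static class Developer extends Person {
        String login = log("Developer: field initializer");

        Developer() {
            super();
            log("Developer: constructor body");
        }
    }

    // Builds one Developer and returns the recorded steps.
    static List<String> trace() {
        STEPS.clear();
        new Developer();
        return new ArrayList<>(STEPS);
    }

    public static void main(String[] args) {
        for (String step : trace()) {
            System.out.println(step);
        }
    }
}
```

The Person steps are printed before the Developer steps, and within each class the field initializers run before the constructor body.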

Now let us focus on constructor calls in the serialization context!

Super Constructor and Serialization:

But these things do NOT happen when an object is deserialized. When an instance of a serializable class is deserialized, the constructor does not run, and instance variables are NOT given their initially assigned values! Think about it: if the constructor were invoked, and/or instance variables were assigned the values given in their declarations, the object you are trying to restore would revert to its original state, rather than coming back reflecting the changes in its state that happened sometime after it was created.

Because Person is NOT serializable, any state maintained in the Person class, even though the state variable is inherited by Developer, is not going to be restored with the Developer when it is deserialized! The reason is that the (unserialized) Person part of the Developer is going to be reinitialized just as it would be if you were making a new Developer (as opposed to deserializing one). That means all the things that happen to an object during construction will happen, but only to the Person parts of a Developer.

In other words, the instance variables from the Developer class (private String login; private String password) will be serialized and deserialized correctly, but the variables that come from the non-serializable Person superclass (protected String name; private String skill) will come back with their default/initially assigned values rather than the values they had at the time of serialization.

If you are a serializable class, but your superclass is NOT serializable, then any instance variables you INHERIT from that superclass will be reset to the values assigned by the superclass's no-argument constructor, not the values they held when the object was serialized. This is because the non-serializable class constructor WILL run!

In fact, every constructor ABOVE the first non-serializable class constructor will also run, no matter what, because once the first super constructor is invoked (during deserialization), it of course invokes its own super constructor, and so on up the inheritance tree.
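The effect of a non-serializable superclass can be reproduced with a short sketch. The classes below are simplified, hypothetical versions of Person and Developer (the "unknown" initial value and the roundTrip helper are assumptions added for this illustration).

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

class InheritanceDemo {

    // NOT Serializable: its no-argument constructor runs during deserialization.
    static class Person {
        protected String name = "unknown";

        Person() { }

        Person(String name) {
            this.name = name;
        }
    }

    static class Developer extends Person implements Serializable {
        private static final long serialVersionUID = 1L;
        String login;

        Developer(String name, String login) {
            super(name);
            this.login = login;
        }
    }

    // Serialize to memory and read the object back.
    static Developer roundTrip(Developer developer) throws IOException, ClassNotFoundException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(buffer)) {
            out.writeObject(developer);
        }
        try (ObjectInputStream in = new ObjectInputStream(
                new ByteArrayInputStream(buffer.toByteArray()))) {
            return (Developer) in.readObject();
        }
    }

    public static void main(String[] args) throws Exception {
        Developer restored = roundTrip(new Developer("Ali", "abroma"));
        System.out.println(restored.login);   // abroma  (Developer state is restored)
        System.out.println(restored.name);    // unknown (Person state is re-initialized)
    }
}
```

If Person had no accessible no-argument constructor, deserialization would fail with an InvalidClassException instead.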

HAS-A Relationships and Serialization

HAS-A relationships are based on usage, rather than inheritance. In other words, class A HAS-A B if code in class A has a reference to an instance of class B. For example, you can say that a Developer HAS-A Computer.

HAS-A relationships allow you to design classes that follow good OO practices by not having monolithic classes that do a gazillion different things. Classes (and their resulting objects) should be specialists. Specialized classes can actually help reduce bugs. The more specialized the class, the more likely it is that you can reuse the class in other applications. If you put all the Computer-related code directly into the Developer class, you'll end up duplicating code in the Designer class, Manager class, and any other class that might need Computer behavior. By keeping the Computer code in a separate, specialized Computer class, you have the chance to reuse the Computer class in several places, such as the Designer class or the Manager class.


The Developer class has a Computer, because Developer declares an instance variable of type Computer. When code invokes writeCode() on a Developer instance, the Developer invokes writeCode() on the Developer object's Computer instance variable.

As you can see, the Computer instance can be used by a Developer to writeCode(), or by a Designer to drawGraph() or Web. This illustrates a good practice of object-oriented programming: give each class one main purpose, and only one.

This design is much more cohesive. Instead of one class that does everything, we've broken the system into three main classes, each with a very specific, or cohesive, role, and we've built them as specialized, reusable classes. You may already have guessed the concept at work here: cohesion. We will discuss this concept in another document.

In the serialization of the Diaffrin class, we noted that it has only primitive fields. In real development work, however, fields are not limited to primitives: a class instance can also hold object references. The serializability of Diaffrin can then depend on whether those fields are serializable or not. We discuss all these aspects in the following sections.

Object graphs

An object graph is a view of an object system at a particular point in time. Whereas a normal data model such as a UML class diagram details the relationships between classes, the object graph relates their instances. Object diagrams are subsets of the overall object graph. Object-oriented applications contain complex webs of interrelated objects. Objects are linked to each other by one object either owning or containing another object or holding a reference to another object. This web of objects is called an object graph, and it is the more abstract structure that can be used in discussing an application's state.

An object graph is a directed graph, which might be cyclic. When stored in RAM, objects occupy different segments of the memory with their attributes and function table, while relationships are represented by pointers or a different type of global handler in higher-level languages.

What does it really mean to save an object? If the instance variables are all primitive types, it's pretty straightforward. But what if the instance variables are themselves references to objects? What gets saved? Clearly in Java it wouldn't make any sense to save the actual value of a reference variable, because the value of a Java reference has meaning only within the context of a single instance of a JVM. In other words, if you tried to restore the object in another instance of the JVM, even running on the same computer on which the object was originally serialized, the reference would be useless.

Now let's modify our Developer class so that it holds an instance of Computer.

    import java.io.Serializable;

    class Developer implements Serializable {
        private String login;
        private Computer computer;
    }

    class Computer {   // not Serializable (yet)
        private int power;
    }


Now what happens if you save the Developer? If the goal is to save and then restore a Developer, and the restored Developer is an exact duplicate of the Developer that was saved, then the Developer needs a Computer that is an exact duplicate of the Developer's Computer at the time the Developer was saved. That means both the Developer and the Computer should be saved. And what if the Computer itself had references to other objects, like perhaps a Battery object? This gets quite complicated very quickly. If it were up to the programmer to know the internal structure of each object the Developer referred to, so that the programmer could be sure to save all the state of all those objects…whew. That would be a nightmare with even the simplest of objects.

Fortunately, the Java serialization mechanism takes care of all of this. When you serialize an object, Java serialization takes care of saving that object's entire "object graph." That means a deep copy of everything the saved object needs to be restored.

For example, if you serialize a Developer object, the Computer will be serialized automatically. And if the Computer class contained a reference to another object, THAT object would also be serialized, and so on. And the only object you have to worry about saving and restoring is the Developer. The other objects required to fully reconstruct that Developer are saved (and restored) automatically through serialization. Remember, you do have to make a conscious choice to create objects that are serializable, by implementing the Serializable interface.
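To make the idea of saving a whole object graph concrete, here is a minimal sketch in which only the Developer is written, yet its Computer and the Computer's Battery come back too. It assumes every class in the graph implements Serializable; the class ObjectGraphDemo, the Battery fields and the roundTrip helper are hypothetical names used for the illustration.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

class ObjectGraphDemo {
    static class Battery implements Serializable {
        private static final long serialVersionUID = 1L;
        int charge;
        Battery(int charge) { this.charge = charge; }
    }

    static class Computer implements Serializable {
        private static final long serialVersionUID = 1L;
        int power;
        Battery battery;
        Computer(int power, Battery battery) {
            this.power = power;
            this.battery = battery;
        }
    }

    static class Developer implements Serializable {
        private static final long serialVersionUID = 1L;
        String login;
        Computer computer;
        Developer(String login, Computer computer) {
            this.login = login;
            this.computer = computer;
        }
    }

    // Writes only the Developer; the referenced objects follow automatically.
    static Developer roundTrip(Developer developer) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(developer);
            }
            try (ObjectInputStream in = new ObjectInputStream(
                    new ByteArrayInputStream(bytes.toByteArray()))) {
                return (Developer) in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Developer original = new Developer("logintest", new Computer(42, new Battery(80)));
        Developer copy = roundTrip(original);
        System.out.println(copy.computer.battery.charge);       // state of the nested object
        System.out.println(copy.computer != original.computer); // a copy, not the same instance
    }
}
```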

But when we try to serialize the Developer object, we get a runtime exception like this:

    java.io.NotSerializableException: Computer

What did we forget? The Computer class must ALSO be Serializable. If we modify the Computer class and make it serializable, then there's no problem. But what would happen if we didn't have access to the Computer class source code?

In other words, what if making the Computer class serializable was not an option? The Computer class might itself refer to other non-serializable objects, and without knowing the internal structure of Computer, you can't make all these fixes.

We cannot simply mark computer as transient, because we want to get back the Developer instance as we saved it, not a Developer whose Computer is null.

The solution is to mark computer as transient and combine that with custom serialization, as we saw earlier.
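A minimal sketch of that solution could look as follows. It assumes that the state of the Computer can be read through a power field and rebuilt through a constructor; the class TransientComputerDemo, that constructor and the roundTrip helper are assumptions made for the illustration.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

class TransientComputerDemo {
    // Stand-in for a class we cannot modify: it is NOT Serializable.
    static class Computer {
        final int power;
        Computer(int power) { this.power = power; }
    }

    static class Developer implements Serializable {
        private static final long serialVersionUID = 1L;
        String login;
        transient Computer computer;   // skipped by default serialization

        Developer(String login, Computer computer) {
            this.login = login;
            this.computer = computer;
        }

        // Called by ObjectOutputStream instead of the default mechanism.
        private void writeObject(ObjectOutputStream out) throws IOException {
            out.defaultWriteObject();       // writes login
            out.writeInt(computer.power);   // saves the Computer state by hand (computer assumed non-null)
        }

        // Called by ObjectInputStream; must read in the same order as writeObject wrote.
        private void readObject(ObjectInputStream in)
                throws IOException, ClassNotFoundException {
            in.defaultReadObject();                 // restores login
            computer = new Computer(in.readInt());  // rebuilds the Computer
        }
    }

    static Developer roundTrip(Developer developer) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(developer);
            }
            try (ObjectInputStream in = new ObjectInputStream(
                    new ByteArrayInputStream(bytes.toByteArray()))) {
                return (Developer) in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Developer copy = roundTrip(new Developer("logintest", new Computer(42)));
        System.out.println(copy.login + " / " + copy.computer.power);
    }
}
```

The Developer comes back with a Computer that is not null, even though Computer itself never implements Serializable.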

Serialization and StackOverflow

In this part we will see how serialization/deserialization can throw a StackOverflowError. This should never happen in everyday programming, but if you try hard enough, it can.

    import java.io.Serializable;

    public class LinkObject implements Serializable {
        LinkObject linkObject;   // reference to the next link in the chain

        public LinkObject(LinkObject linkObject) {
            super();
            this.linkObject = linkObject;
        }
    }

And in our MainClassTest, we try to build a big number of LinkObject instances:

    public static void main(String... args) {
        // Build a very long chain: each new link points to the previous one.
        LinkObject linkObject = null;
        for (int i = 0; i < 10001000; i++) {
            linkObject = new LinkObject(linkObject);
        }
        try {
            SerializationUtils.serializeObject(linkObject, "fail.ser");
        } catch (IOException e) {
            e.printStackTrace();
        }
        try {
            SerializationUtils.deserializeObject("fail.ser");
        } catch (IOException e) {
            e.printStackTrace();
        } catch (ClassNotFoundException e) {
            e.printStackTrace();
        }
    }

The snippet above throws a StackOverflowError. Why?

The possibility of running out of resources is not generally considered when one writes documentation, only the expected behaviour given the input and sufficient resources. The documentation doesn't know how much stack space you have available, so it can't specify a maximum number of links in your object graph.

With 100000 links, the code above already produces a stack overflow, although the JVM itself does not crash.

To summarize, Java's built-in serialization implementation uses excessive stack space when serializing deeply nested object graphs. So, if you have to serialize/deserialize such graphs as a single entity, you may have to increase the stack size.

Alternatively, provide your own serialization implementation, since you have a clearer understanding of your object model than Java does. The default serialization algorithm traverses the graph structure recursively, which is a bad idea for deep graphs, because the stack is finite and rather small, whereas the heap can grow and is generally larger. In short, the default algorithm is not well suited to this case, and a custom implementation can improve how the object graph is traversed.
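One possible shape for such a custom implementation is sketched below. Instead of letting the stream follow each link recursively, a container writes the chain in a loop and rebuilds it in a loop. The classes IterativeChainDemo, Chain and Node are hypothetical and replace the LinkObject sample only for the sake of the illustration; each node carries an int value so that there is something to save.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

class IterativeChainDemo {
    // Plain node: deliberately not Serializable, the Chain saves it by hand.
    static class Node {
        final int value;
        Node next;
        Node(int value, Node next) {
            this.value = value;
            this.next = next;
        }
    }

    static class Chain implements Serializable {
        private static final long serialVersionUID = 1L;
        transient Node head;   // transient: the stream never follows the links itself
        transient int size;

        void push(int value) {
            head = new Node(value, head);
            size++;
        }

        // Writes the nodes with a loop, so the stack depth stays constant.
        private void writeObject(ObjectOutputStream out) throws IOException {
            out.defaultWriteObject();
            out.writeInt(size);
            for (Node n = head; n != null; n = n.next) {
                out.writeInt(n.value);
            }
        }

        // Rebuilds the chain in the same order, again with a loop.
        private void readObject(ObjectInputStream in)
                throws IOException, ClassNotFoundException {
            in.defaultReadObject();
            int count = in.readInt();
            Node tail = null;
            for (int i = 0; i < count; i++) {
                Node n = new Node(in.readInt(), null);
                if (tail == null) {
                    head = n;
                } else {
                    tail.next = n;
                }
                tail = n;
            }
            size = count;
        }
    }

    static Chain roundTrip(Chain chain) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(chain);
            }
            try (ObjectInputStream in = new ObjectInputStream(
                    new ByteArrayInputStream(bytes.toByteArray()))) {
                return (Chain) in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Chain chain = new Chain();
        for (int i = 0; i < 200_000; i++) {
            chain.push(i);
        }
        System.out.println(roundTrip(chain).size);
    }
}
```

The design choice is the same one used for the transient Computer: the container takes responsibility for the format, so the depth of the chain no longer translates into depth of the call stack.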

Serialization and Security

As we learned above, we convert a Developer object to a byte stream and store it in an external file. Technically, serializing an object means storing all the information necessary to re-create that object by deserialization, so the byte stream contains package information, class information, and the object's properties, even the private ones. Anyone (or any program) that can access the file can read it and modify it.

Let's see what this looks like for our Developer object:


    Developer developer = new Developer("logintest", "passwordtest");

Byte stream:

From these bytes we learn that our package is org.sidibe.learning.serialization and that our class is Developer. We also learn that class Developer has a login field of type java.lang.String and a password field, also a String. And, without any effort, we can directly read the values held by our Developer object. Anyone who holds these bytes has all the information about our Developer class. This is a case where serialization can violate our company's security rules. How can we avoid this security failure?
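The exposure can be checked without any special tool. The sketch below serializes a Developer in memory and searches the raw bytes for readable text; the class PlainTextLeakDemo and the helper serializedAsText are hypothetical names used only for this illustration.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

class PlainTextLeakDemo {
    static class Developer implements Serializable {
        private static final long serialVersionUID = 1L;
        private final String login;
        private final String password;

        Developer(String login, String password) {
            this.login = login;
            this.password = password;
        }
    }

    // Returns the serialized form as text. ISO-8859-1 maps each byte to one
    // character, so ASCII strings stored in the stream remain readable.
    static String serializedAsText(Object object) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(object);
            }
            return new String(bytes.toByteArray(), StandardCharsets.ISO_8859_1);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        String dump = serializedAsText(new Developer("logintest", "passwordtest"));
        System.out.println(dump.contains("Developer"));     // class name
        System.out.println(dump.contains("password"));      // private field name
        System.out.println(dump.contains("passwordtest"));  // private field value
    }
}
```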

Avoid Serialization

Serialization is dangerous because it allows adversaries to get their hands on the internal state of your

objects. An adversary can serialize one of your objects into a byte array that can be read. This allows the adversary to inspect the full internal state of your object, including any fields you marked private as well as the internal state of any objects you reference.

To prevent this, you can make your object impossible to serialize. The way to do this is to declare the writeObject method:

    private final void writeObject(ObjectOutputStream out)
            throws java.io.IOException {
        throw new java.io.IOException("Object cannot be serialized");
    }

This method is declared final so that a subclass defined by the adversary cannot override it.
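As a quick check of this guard, the following sketch tries to serialize an object whose class declares such a writeObject method and reports what happens. The class NoSerializeDemo, the Secret class and the trySerialize helper are hypothetical names introduced for the illustration.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.io.Serializable;

class NoSerializeDemo {
    static class Secret implements Serializable {
        private static final long serialVersionUID = 1L;
        private final String token;

        Secret(String token) { this.token = token; }

        // Guard from the text above: every serialization attempt fails.
        private final void writeObject(ObjectOutputStream out) throws IOException {
            throw new IOException("Object cannot be serialized");
        }
    }

    // Returns "serialized" on success, or the message of the exception raised.
    static String trySerialize(Object object) {
        try (ObjectOutputStream out = new ObjectOutputStream(new ByteArrayOutputStream())) {
            out.writeObject(object);
            return "serialized";
        } catch (IOException e) {
            return e.getMessage();
        }
    }

    public static void main(String[] args) {
        System.out.println(trySerialize(new Secret("s3cr3t")));
    }
}
```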

Avoid Deserialization of our object:

This rule is even more important than the preceding one. Even if your class is not serializable, it may still be deserializable. An adversary can create a sequence of bytes that happens to deserialize to an instance of your class. This is dangerous, since you do not have control over what state the deserialized object is in. You can think of deserialization as another kind of public constructor for your object; unfortunately, it is a kind of constructor that is difficult for you to control.

You can prevent this kind of attack by making it impossible to deserialize a byte stream into an instance of your class. You can do this by declaring the readObject method:

    private final void readObject(ObjectInputStream in)
            throws java.io.IOException {
        throw new java.io.IOException("Class cannot be deserialized");
    }


This method is declared final to prevent the adversary from overriding it. To learn more about code security, follow this link: http://www.securingjava.com/chapter-seven/chapter-seven-1.html
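The readObject guard can be exercised in the same spirit. In this sketch the byte stream is produced honestly by serializing an instance, then read back; an adversary's hand-crafted bytes would hit the same guard. The class NoDeserializeDemo, the Account class and both helpers are hypothetical names chosen for the illustration.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;

class NoDeserializeDemo {
    static class Account implements Serializable {
        private static final long serialVersionUID = 1L;
        private final String owner;

        Account(String owner) { this.owner = owner; }

        // Guard from the text above: no byte stream can become an Account.
        private final void readObject(ObjectInputStream in) throws IOException {
            throw new IOException("Class cannot be deserialized");
        }
    }

    static byte[] serialize(Object object) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(object);
            }
            return bytes.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Returns "deserialized" on success, or the message of the exception raised.
    static String tryDeserialize(byte[] data) {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(data))) {
            in.readObject();
            return "deserialized";
        } catch (IOException | ClassNotFoundException e) {
            return e.getMessage();
        }
    }

    public static void main(String[] args) {
        byte[] data = serialize(new Account("logintest"));
        System.out.println(tryDeserialize(data));
    }
}
```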

Serialization ID:

Last December, we defined our class Developer as follows:

    public class Developer implements Serializable {
        private String login;
        private String password;

        public Developer(String login, String password) {
            super();
            this.login = login;
            this.password = password;
        }
    }

We serialized it to the file developer.ser and saved that file in our storage space. After coming back from holiday in Wassoulou (Mali), we decided to add a field private String skill to the class Developer. Maybe the constructor has changed as well, or something else about the class; in any case, the class Developer now contains additional information that is not in the file developer.ser.

At this point we know that the byte stream in the file lacks some of the information needed to build a current Developer object, so it no longer truly represents our current class Developer. What happens when we want to deserialize the stored byte stream into a Developer object?

The serialVersionUID facilitates versioning of serialized data. Its value is stored with the data when serializing. When de-serializing, the same version is checked to see how the serialized data matches the current code.

If you want to version your data, you normally start with a serialVersionUID of 0, and bump it with every structural change to your class which alters the serialized data (adding or removing non-transient fields).

The built-in de-serialization mechanism (in.defaultReadObject()) will refuse to de-serialize from old versions of the data. But if you want to you can define your own readObject()-function which can read back old data. This custom code can then check the serialVersionUID in order to know which version the data is in and decide how to de-serialize it. This versioning technique is useful if you store serialized data which survives several versions of your code.

But storing serialized data for such a long time span is not very common. It is far more common to use the serialization mechanism to temporarily write data to for instance a cache or send it over the network to another program with the same version of the relevant parts of the codebase.

In this case you are not interested in maintaining backwards compatibility. You are only concerned with making sure that the code bases which are communicating indeed have the same versions of relevant classes. In order to facilitate such a check, you must maintain the serialVersionUID just like before and not forget to update it when making changes to your classes.

If you do forget to update the field, you might end up with two different versions of a class with different structure but with the same serialVersionUID. If this happens, the default mechanism (in.defaultReadObject()) will not detect any difference, and try to de-serialize incompatible data. Now you might end up with a cryptic runtime error or silent failure (null fields). These types of errors might be hard to find.

So to help with this use case, the Java platform offers you the choice of not setting the serialVersionUID manually. Instead, a hash of the class structure is computed by the serialization runtime and used as the id. This mechanism makes sure that you never have different class structures with the same id, so you will not get the hard-to-trace runtime serialization failures mentioned above.

But there is a downside to the auto-generated id strategy: the generated ids for the same class might differ between compilers. So if you exchange serialized data between code compiled with different compilers, it is recommended to maintain the ids manually anyway.

And if you need to stay backwards-compatible with your data, as in the first use case mentioned, you probably also want to maintain the id yourself, in order to get readable ids and have greater control over when and how they change.

serialVersionUID Computation Algorithm:

The serialVersionUID is computed using the signature of a stream of bytes that reflect the class definition. The sequence of items in the stream is as follows:

1. The class name.

2. The class modifiers written as a 32-bit integer.

3. The name of each interface sorted by name.

4. For each field of the class sorted by field name (except private static and private transient fields):

    a. The name of the field.

    b. The modifiers of the field written as a 32-bit integer.

    c. The descriptor of the field.

5. If a class initializer exists, write out the following:

    a. The name of the method, <clinit>.

    b. The modifier of the method, java.lang.reflect.Modifier.STATIC, written as a 32-bit integer.

    c. The descriptor of the method, ()V.

6. For each non-private constructor sorted by method name and signature:

    a. The name of the method, <init>.

    b. The modifiers of the method written as a 32-bit integer.

    c. The descriptor of the method.

7. For each non-private method sorted by method name and signature:

    a. The name of the method.

    b. The modifiers of the method written as a 32-bit integer.

    c. The descriptor of the method.

8. The SHA-1 algorithm is executed on the stream of bytes produced by DataOutputStream and produces five 32-bit values sha[0..4].

9. The hash value is assembled from the first and second 32-bit values of the SHA-1 message digest. If the result of the message digest, the five 32-bit words H0 H1 H2 H3 H4, is in an array of five int values named sha, the hash value would be computed as follows:

    long hash = ((sha[0] >>> 24) & 0xFF) |
                ((sha[0] >>> 16) & 0xFF) << 8 |
                ((sha[0] >>> 8) & 0xFF) << 16 |
                ((sha[0] >>> 0) & 0xFF) << 24 |
                ((sha[1] >>> 24) & 0xFF) << 32 |
                ((sha[1] >>> 16) & 0xFF) << 40 |
                ((sha[1] >>> 8) & 0xFF) << 48 |
                ((sha[1] >>> 0) & 0xFF) << 56;
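In practice there is rarely a need to run these steps by hand: the standard class java.io.ObjectStreamClass exposes the value the runtime will use. The sketch below reads it for a class that declares a serialVersionUID and for one that does not; the classes SerialVersionUidDemo, Explicit and Implicit and the helper uidOf are hypothetical names. The computed value of the second class is not shown here, since it depends on the class structure.

```java
import java.io.ObjectStreamClass;
import java.io.Serializable;

class SerialVersionUidDemo {
    // Declares its own id: the runtime uses this value as is.
    static class Explicit implements Serializable {
        private static final long serialVersionUID = 42L;
        String login;
    }

    // Declares no id: the runtime computes one from the class structure.
    static class Implicit implements Serializable {
        String login;
    }

    // Returns the serialVersionUID the serialization runtime associates with the class.
    static long uidOf(Class<?> type) {
        return ObjectStreamClass.lookup(type).getSerialVersionUID();
    }

    public static void main(String[] args) {
        System.out.println(uidOf(Explicit.class)); // the declared value
        System.out.println(uidOf(Implicit.class)); // the computed hash
    }
}
```

The serialver command shipped with the JDK reports the same value from the command line.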

To learn more about serialization UID follow the link :

InvalidClassException

Inheritance

InvalidClassException is one of the exceptions most commonly encountered by Java programmers who use object serialization. There are three main causes for this exception to be thrown:

1. the serial version of the class,

2. unknown data types in the class,

3. a missing no-arg constructor.

As the name of the exception indicates, the class of the object being serialized or deserialized becomes invalid for one of the reasons listed above, and objects of that class cannot be serialized or deserialized. The InvalidClassException class extends ObjectStreamException.

If we remove the default constructor in the preceding sample, serialization will succeed, but deserialization will throw InvalidClassException, because there is no default constructor and the superclass Person is not Serializable.

Here we discuss the third reason, the no-arg constructor: how the absence of a no-arg constructor causes this exception to be thrown.

This type of exception is thrown when inheritance is involved in the program. When inheritance is involved, the serialization process proceeds by serializing the objects of child classes first and then moves up the hierarchy until the non-serializable parent class is reached.

When the objects are deserialized, the process starts from the non-serializable parent class and moves down the hierarchy. Since the parent class is non-serializable, the state of its members cannot be retrieved from the stream; it can only be obtained by running the default constructor. Because the default constructor is the only source of this state, its absence makes the class invalid. To solve the problem, we must add a default constructor to the superclass.
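The situation can be reproduced with a few lines. In this minimal sketch the non-serializable Person has only a constructor that takes an argument; the class NoArgConstructorDemo and the helper roundTripOutcome are hypothetical names used for the illustration.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InvalidClassException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

class NoArgConstructorDemo {
    // Not Serializable, and no no-arg constructor.
    static class Person {
        protected String name;
        Person(String name) { this.name = name; }
    }

    static class Developer extends Person implements Serializable {
        private static final long serialVersionUID = 1L;
        String login;

        Developer(String name, String login) {
            super(name);
            this.login = login;
        }
    }

    // Serializes a Developer, then tries to read it back and reports the outcome.
    static String roundTripOutcome() {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(new Developer("Ali", "logintest")); // succeeds
            }
            try (ObjectInputStream in = new ObjectInputStream(
                    new ByteArrayInputStream(bytes.toByteArray()))) {
                in.readObject();                                     // fails
                return "deserialized";
            }
        } catch (InvalidClassException e) {
            return "InvalidClassException";
        } catch (IOException | ClassNotFoundException e) {
            return e.getClass().getSimpleName();
        }
    }

    public static void main(String[] args) {
        System.out.println(roundTripOutcome());
    }
}
```

Adding a no-arg constructor to Person makes the same round trip succeed, with name reset to whatever that constructor assigns.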

In the previous part of this document we saw that this exception is thrown when the serialization runtime detects one of the following problems with a class:

The serial version of the class does not match that of the class descriptor read from the stream.

The class contains unknown datatypes.

The class does not have an accessible no-arg constructor.

We have explained the no-arg constructor case; now we will focus on the first point, i.e. when the serial version of the class does not match that of the class descriptor read from the stream.


Whenever object serialization is performed, the objects are saved in a particular file format. This file format contains a class descriptor for each class of the objects that are saved. The class descriptor usually contains the

class name

serial version unique ID

set of flags

description of the data fields

If the value of the serial version unique ID is not explicitly specified, the JVM automatically computes a value for it from the class information. We can also assign the value ourselves, for example 1L or 2L.

It is important to note that the serial version unique ID must be the same during object serialization and deserialization. During serialization, the serial version unique ID is recorded in the class descriptor. During deserialization, the current serial version unique ID is compared with the one in the class descriptor. If the values do not match, this exception is thrown.

Let’s produce this exception:

Given our class

    public class Developer implements Serializable {
        private static final long serialVersionUID = 1L;
        private String login;
        private String password;

        public Developer(String login, String password) {
            this.login = login;
            this.password = password;
        }
    }

Let's serialize our Developer object:

    private static final String IO_FILENAME = "serialization_iud.ser";

    public static void main(String[] args) {
        System.out.println("Begin of Serialization ");
        try {
            Developer developer = new Developer("logintest", "passwordtest");
            // Write the object to the file.
            SerializationUtils.serializeObject(developer, IO_FILENAME);
            System.out.println("Serialization success : ");
        } catch (Exception e) {
            System.out.println("Serialization failed : " + e.getMessage());
            e.printStackTrace();
        }
    }

Our Developer object is now serialized in the file serialization_iud.ser.

Later, we modify just this field:

    private static final long serialVersionUID = 2L;

And we deserialize the object:

    System.out.println("Begin of Deserialization ");
    try {
        Developer developer = SerializationUtils.deserializeObject(IO_FILENAME);
        System.out.println("Deserialization success : " + developer.toString());
    } catch (IOException e) {
        System.out.println("Deserialization failed : " + e.getMessage());
        e.printStackTrace();
    } catch (ClassNotFoundException e) {
        System.out.println("Deserialization failed : " + e.getMessage());
        e.printStackTrace();
    }

We will receive:

    java.io.InvalidClassException: Developer; local class incompatible: stream classdesc serialVersionUID = 1, local class serialVersionUID = 2
        at java.io.ObjectStreamClass.initNonProxy(Unknown Source)
        at java.io.ObjectInputStream.readNonProxyDesc(Unknown Source)
        at java.io.ObjectInputStream.readClassDesc(Unknown Source)
        at java.io.ObjectInputStream.readOrdinaryObject(Unknown Source)
        at java.io.ObjectInputStream.readObject0(Unknown Source)
        at java.io.ObjectInputStream.readObject(Unknown Source)

During deserialization, the runtime checks that the serialVersionUID recorded at serialization time matches that of the local class. If they differ, it throws java.io.InvalidClassException; if they match, deserialization succeeds even if you have added a field to the class.

Only the serialVersionUID matters.
