a brief explanation on serialization
Post on 26-May-2015
Serialization in Java
February 3, 2013
This document describes serialization in Java.
SIDIBE Ali Broma jahbromo@gmail.com
Introduction
Serialization
  Uses of serialization
  Example of Serialization Use
Serialization in Java
  Interface Serializable
  Example: Serialization of an Object
  Working with ObjectOutputStream and ObjectInputStream
  Default Serialization/Deserialization
  Custom Serialization/Deserialization
  Example: Custom Serialization of an Object Developer
Serialization and IS-A HAS-A relationship
  IS-A Relationship
    Extends
    Implements
  How Inheritance Affects Serialization
    Constructor chaining
  Super Constructor and Serialization
  Relationship HAS-A and Serialization
    Object graphs
Serialization and StackOverflow
Serialization and Security
  Avoid Serialization
  Avoid Deserialization of our object
  Serialization ID
  Computing Serialization serialVersionUID Algorithm
InvalidClassException
  Inheritance
  Let's produce this exception
Introduction
Imagine you want to save the state of one or more objects, for example a User object. Without
serialization, you would have to use one of the I/O classes to write out the state of the instance
variables of every object you want to save. The worst part would be reconstructing new objects that
are virtually identical to the objects you saved. You would need your own protocol for the way in
which you wrote and restored the state of each object, or you could end up setting variables to the
wrong values. For example, imagine you stored an object that has instance variables for login and
password. When you save the state of the object, you could write out the login and the password as
two String objects in a file, but the order in which you write them is crucial.
It would be all too easy to re-create the object but mix up the login and password values, using the
saved password as the value for the new object's login and vice versa.
Alternatively, you might use a JSON object (from an external library) to write your object as a list
of key/value pairs, and then restore the object from that data, either manually or automatically.
Serialization lets you simply say "save this object and all of its instance variables."
Actually it is a little more interesting than that, because you can add, "... unless I've explicitly
marked a variable as transient, which means, don't include the transient variable's value as part of
the object's serialized state."
As we know, objects and their instance variables always live on the heap. Local variables live on the
stack, and they are discarded when their method finishes running. Once you've declared and initialized
a variable, a natural question is "How long will this variable be around?" This is a question
regarding the scope of variables.
For the purposes of discussing the scope of variables, we can say that there are four basic scopes:
Static variables have the longest scope; they are created when the class is loaded, and they
survive as long as the class stays loaded in the Java Virtual Machine (JVM).
Instance variables are the next most long-lived; they are created when a new instance is
created, and they live until the instance is removed.
Local variables are next; they live as long as their method remains on the stack. As we'll soon
see, however, local variables can be alive, and still be "out of scope".
Block variables live only as long as the code block is executing.
As the scopes above show, no object survives once the application exits and the JVM shuts down. So my
first question is: "How can an object be persisted so that it exists beyond the lifetime of the JVM
and can be restored at any time?" This is the main purpose of the process called "Serialization".
Serialization
Serialization is the process of converting a set of object instances that contain references to each
other into a linear stream of bytes, which can then be sent through a socket, stored in a file, or
simply manipulated as a stream of data. The term also covers the reverse process of rebuilding those
bytes into live objects at some future time. So when is serialization used? Serialization is used when
you want to persist an object. It is also used by RMI to pass objects between JVMs, either as arguments
in a method invocation from a client to a server or as return values from a method invocation. In
general, serialization is used when we want the object to exist beyond the lifetime of the JVM.
The most obvious use is that you can transmit a serialized object over a network, and the recipient
can construct a duplicate of the original instance. Likewise, you can save a serialized structure to a
file system. Also, note that serialization is recursive, so you can serialize an entire heterogeneous
data structure in one operation.
Uses of serialization
Serialization is widely used in input/output operations in Java. Here are several reasons:
Communication: If you have two machines that are running the same code, and they need to
communicate, an easy way is for one machine to build an object with the information that it would
like to transmit, and then serialize that object to the other machine. It's not the best method for
communication, but it gets the job done.
Persistence: If you want to store the state of a particular operation in a database, it can be easily
serialized to a byte array and stored in the database for later retrieval.
Deep Copy: If you need an exact replica of an object, and don't want to go to the trouble of writing
your own specialized clone() method, simply serializing the object to a byte array and then
de-serializing it into another object achieves this goal.
Caching: Really just an application of the above, but sometimes an object takes 10 minutes to
build and would only take 10 seconds to de-serialize. So, rather than hold onto the giant object in
memory, just cache it out to a file via serialization, and read it in later when it's needed.
Cross JVM Synchronization: Serialization works across different JVMs that may be running on
different architectures.
In addition to the reasons above, here are some other uses:
To send data to a remote computer using client/server Java technologies such as RMI or socket
programming.
To "flatten" an object into an array of bytes in memory.
To exchange data between applets and servlets.
To store user sessions in Web applications.
To activate and passivate Enterprise JavaBeans.
To send objects between the servers in a cluster.
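The "Deep Copy" and "flatten into an array of bytes" uses listed above can be sketched in a few lines. This is only a minimal illustration: the class name DeepCopyDemo and its helper methods are made up for this example and are not part of the original document.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.ArrayList;
import java.util.Arrays;

// Hypothetical helper class, written only for this illustration.
class DeepCopyDemo {

    /** "Flattens" the object into an in-memory byte array. */
    static byte[] toBytes(Serializable object) throws IOException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(buffer)) {
            out.writeObject(object);
        }
        return buffer.toByteArray();
    }

    /** Rebuilds an object from bytes produced by toBytes. */
    static Object fromBytes(byte[] bytes) throws IOException, ClassNotFoundException {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(bytes))) {
            return in.readObject();
        }
    }

    /** Deep copy: serialize, then deserialize into a brand-new object graph. */
    @SuppressWarnings("unchecked")
    static <T extends Serializable> T deepCopy(T original)
            throws IOException, ClassNotFoundException {
        return (T) fromBytes(toBytes(original));
    }

    public static void main(String[] args) throws Exception {
        ArrayList<String> skills = new ArrayList<>(Arrays.asList("java", "sql"));
        ArrayList<String> copy = deepCopy(skills);
        copy.add("rmi"); // changing the copy leaves the original untouched
        System.out.println(skills + " / " + copy);
    }
}
```

This approach is convenient rather than fast: every object in the graph must be serializable, and the copy costs a full serialization round trip.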
Example of Serialization Use
When you want to send something, such as money, you first gather and marshal all the pieces and
wrap them up (serialization), and then send, save, or store the package, depending on what you
want. Later, when you or the receiver want to retrieve the pieces, you have to unwrap the package
(deserialization).
Banking example: When an account holder tries to withdraw money through an ATM, the account
holder's information, along with the withdrawal details, is serialized (marshalled/flattened
to bytes) and sent to the server, where the details are deserialized (rebuilt from the bytes)
and used to perform the operation. This reduces the number of network calls: because the whole
object is serialized and sent to the server, the server does not need to request further
information from the client.
Stock example: Let's say a user wants stock updates immediately when he requests them. To
achieve this, every time we have an update, we can serialize it and save it in a file. When
the user requests the information, we deserialize it from the file and provide it. This way
the user does not have to wait while we hit the database, perform computations, and get the
result.
So far we have seen what serialization is and when it is used. In the following section, we will
look at serialization in the Java context: how Java performs serialization.
Serialization in Java
The Java language provides the interface java.io.Serializable to make a class serializable. The
Serializable interface has no methods or fields and serves only to identify the semantics of being
serializable. It is a marker interface. The fact that Serializable lives in the java.io package
confirms what we have seen so far: the interface is related to input/output operations.
Interface Serializable:
The Serializable interface is just a marker interface. As we said, it has no methods and no fields.
It can be extended by another interface or implemented by any class.
The interface java.io.Serializable defines no messages (such interfaces are called "marker" or "tag"
interfaces). Implementing Serializable, or extending a class that implements Serializable, identifies
the class as one that participates in serialization. Its instances can be used as the argument
of ObjectOutputStream.writeObject and as the result of ObjectInputStream.readObject. If an object is
encountered that is not serializable (e.g., a collection element), these methods
throw NotSerializableException.
Many library classes are serializable, including String, collection classes, wrapper classes, GUI
component classes, Date, Color, Point, and URL. Library classes that are not serializable
include Thread, reflection classes (Method, etc.), stream classes, Socket, Graphics,
and Image. Generally, these are the classes that have implementations or "peers" that are
system-dependent.
The Java runtime (not the compiler) applies the "default serialization" mechanism, described later,
to classes that implement Serializable. (We will see how to customize serialization below.) It stores
all non-static, non-transient instance variables whose values are serializable objects or primitive
types, including such variables inherited from serializable ancestors. The default implementation
handles shared and circular object references and class identity. However, if an object includes
variables of class type that refer to objects whose classes are not serializable, the object stream
methods will signal NotSerializableException when attempting to write an instance. Similarly, if a
collection is serializable but contains objects that are not serializable, an exception will be
thrown. Note that this exception is raised at run time, rather than being reported as a compiler
error. For example, an object (like all collections) may have a field of type Object, which is not
itself serializable. If that field refers to an instance of a Serializable class, no exception occurs
upon serialization. We will see below that variables marked as transient are not serialized. If the
default mechanism is adequate (i.e., all fields are serializable and no special processing is
needed), a class need only declare that it implements Serializable to be serializable.
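To make the run-time nature of NotSerializableException concrete, here is a small sketch. The names NotSerializableDemo, Computer and canSerialize are assumptions made for this illustration; the code compiles without complaint, and the problem only shows up when writeObject meets the non-serializable element.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.NotSerializableException;
import java.io.ObjectOutputStream;
import java.util.ArrayList;
import java.util.List;

// Hypothetical demo class, written only for this illustration.
class NotSerializableDemo {

    /** A class that does NOT implement Serializable. */
    static class Computer {
        String model = "laptop";
    }

    /** Returns true if default serialization can write the given object. */
    static boolean canSerialize(Object candidate) throws IOException {
        try (ObjectOutputStream out = new ObjectOutputStream(new ByteArrayOutputStream())) {
            out.writeObject(candidate);
            return true;
        } catch (NotSerializableException e) {
            // The exception message names the offending class.
            System.out.println("Not serializable: " + e.getMessage());
            return false;
        }
    }

    public static void main(String[] args) throws IOException {
        List<Object> onlyStrings = new ArrayList<>();
        onlyStrings.add("a serializable element");

        List<Object> mixed = new ArrayList<>();
        mixed.add("a serializable element");
        mixed.add(new Computer()); // ArrayList is serializable, this element is not

        System.out.println(canSerialize(onlyStrings));
        System.out.println(canSerialize(mixed));
    }
}
```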
Example: Serialization of an Object
In this document we will create a utility class, SerializationUtils, to perform serialization and
deserialization. The SerializationUtils class has two static generic methods. We use generics to
ensure that every object passed in really passes the IS-A Serializable test. A simple way to do that
is to use a type bounded by Serializable as the parameter of the first method and as the return type
of the second method. This document does not focus on generics; we use them only to simplify our work.
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

public final class SerializationUtils {

    private SerializationUtils() { }

    /**
     * Writes the byte stream of the given object to a file.
     * @param serializable the object to serialize
     * @param outputFileName the file the stream will be written to
     * @throws IOException if an I/O error occurs during serialization
     */
    public static <T extends Serializable> void serializeObject(T serializable, String outputFileName)
            throws IOException {
        // try-with-resources flushes and closes the streams, even if writeObject fails
        try (ObjectOutputStream out = new ObjectOutputStream(new FileOutputStream(outputFileName))) {
            out.writeObject(serializable);
        }
    }

    /**
     * Deserializes the stream stored in a file into a serializable object.
     * @param fileName the file in which the stream is stored
     * @return the deserialized object
     * @throws IOException if an I/O error occurs during deserialization
     * @throws ClassNotFoundException if the class of the stored object cannot be found
     */
    @SuppressWarnings("unchecked")
    public static <T extends Serializable> T deserializeObject(String fileName)
            throws IOException, ClassNotFoundException {
        try (ObjectInputStream in = new ObjectInputStream(new FileInputStream(fileName))) {
            return (T) in.readObject();
        }
    }
}
With this class in place, all we have to do is create a class MainClassTest and use our
SerializationUtils class to perform serialization and deserialization.
To continue with our serialization example, let's create a class Developer that implements
Serializable. As we have seen so far, this means we can serialize it.
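import java.io.Serializable;

public class Developer implements Serializable {
    private String login;
    private String password;

    public Developer(String login, String password) {
        this.login = login;
        this.password = password;
    }

    // Gives the test class something readable to print; the password is left out on purpose.
    @Override
    public String toString() {
        return "Developer[login=" + login + "]";
    }
}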
MainClassTest: In our main method, we first create a Developer object. The output file name is
serialization.ser. You can use any name you want; the extension (.ser) doesn't matter. The code
should be easy to follow. I'm sorry if it is not clean!
import java.io.IOException;

public class MainClassTest {

    private static final String IO_FILENAME = "serialization.ser";

    public static void main(String[] args) {
        System.out.println("Begin of serialization");
        try {
            Developer developer = new Developer("testlogin", "testpassword");
            SerializationUtils.serializeObject(developer, IO_FILENAME);
            System.out.println("Serialization success");
        } catch (IOException e) {
            System.out.println("Serialization failed: " + e.getMessage());
            e.printStackTrace();
        }

        System.out.println("Begin of deserialization");
        try {
            Developer developer = SerializationUtils.deserializeObject(IO_FILENAME);
            System.out.println("Deserialization success: " + developer);
        } catch (IOException | ClassNotFoundException e) {
            System.out.println("Deserialization failed: " + e.getMessage());
            e.printStackTrace();
        }
    }
}
Note that all fields of the class Developer are serializable and that Developer directly extends
Object. This is important because, if the class referenced another object (HAS-A relationship) that
was not serializable, the serialization process would be different. At this stage, we are using
serialization in its most basic form.
Working with ObjectOutputStream and ObjectInputStream
The magic of basic serialization happens with just two methods: one to serialize objects and write them
to a stream, and a second to read the stream and deserialize objects.
ObjectOutputStream.writeObject() // serialize and write
ObjectInputStream.readObject() // read and deserialize
A serializable class can also declare its own private writeObject and readObject methods, which the
streams call for that class. Such a writeObject method is responsible for writing the state of the
object for its particular class so that the corresponding readObject method can restore it. The
default mechanism for saving the object's fields can be invoked by calling out.defaultWriteObject. The
method does not need to concern itself with the state belonging to its superclasses or subclasses.
State is saved by writing the individual fields to the ObjectOutputStream using the writeObject method
or by using the methods for primitive data types supported by DataOutput.
The readObject method is responsible for reading from the stream and restoring the class's fields. It
may call in.defaultReadObject to invoke the default mechanism for restoring the object's non-static and
non-transient fields. The defaultReadObject method uses information in the stream to assign the fields
of the object saved in the stream to the correspondingly named fields in the current object. This
handles the case when the class has evolved to add new fields. The method does not need to concern
itself with the state belonging to its superclasses or subclasses. State is restored by reading data
from the ObjectInputStream for the individual fields and assigning it to the appropriate fields of the
object. Reading primitive data types is supported by DataInput.
The java.io.ObjectOutputStream and java.io.ObjectInputStream classes are considered to be higher-level
classes in the java.io package, which means that you'll wrap them around lower-level classes, such
as java.io.FileOutputStream and java.io.FileInputStream.
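As a rough sketch of this wrapping, the example below layers an ObjectOutputStream over a BufferedOutputStream over a FileOutputStream, and writes one primitive (through the DataOutput methods) followed by one object. The class name StreamWrappingDemo, its methods and the temporary file are assumptions made for the illustration; buffering is optional and is shown only to make the layering visible.

```java
import java.io.BufferedInputStream;
import java.io.BufferedOutputStream;
import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;

// Hypothetical demo class, written only for this illustration.
class StreamWrappingDemo {

    /** Writes a primitive and then an object through the wrapped streams. */
    static void write(File file, int version, String text) throws IOException {
        try (ObjectOutputStream out = new ObjectOutputStream(
                new BufferedOutputStream(new FileOutputStream(file)))) {
            out.writeInt(version);  // DataOutput method for a primitive value
            out.writeObject(text);  // object serialization
        }
    }

    /** Reads the values back in exactly the order in which they were written. */
    static String read(File file) throws IOException, ClassNotFoundException {
        try (ObjectInputStream in = new ObjectInputStream(
                new BufferedInputStream(new FileInputStream(file)))) {
            int version = in.readInt();
            String text = (String) in.readObject();
            return version + ":" + text;
        }
    }

    public static void main(String[] args) throws Exception {
        File file = File.createTempFile("wrapping-demo", ".ser");
        file.deleteOnExit();
        write(file, 1, "hello");
        System.out.println(read(file));
    }
}
```

Closing the outermost stream closes the streams it wraps, which is why a single try-with-resources resource is enough here.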
Default Serialization/Deserialization:
The default serialization mechanism for an object writes the class name of the object, the class
signature, and the values of all non-transient and non-static fields. To use default serialization, a
class implements Serializable or extends a serializable class. If a class's superclass is not
serializable, the class can still implement Serializable, provided the superclass has an accessible
no-argument constructor. We will see that a class must be serializable for it to be used as the
parameter or return type of a remote method. The example above uses default serialization.
If an instance variable should not be serialized, mark it as transient. For example, we would declare
an instance variable transient if its type is not serializable, or if its value depends on run-time
conditions or can be computed from other information in the object.
Note that after default deserialization every transient field holds the default value for its type
(null, 0, or false).
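The behaviour of transient fields can be checked with a short round trip. In this sketch the class Session and its fields are invented for the example; only the user field is expected to survive.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

// Hypothetical demo class, written only for this illustration.
class TransientDemo {

    static class Session implements Serializable {
        String user;              // serialized by the default mechanism
        transient String token;   // skipped: comes back as null
        transient int attempts;   // skipped: comes back as 0

        Session(String user, String token, int attempts) {
            this.user = user;
            this.token = token;
            this.attempts = attempts;
        }
    }

    /** Serializes the session to memory and reads it straight back. */
    static Session roundTrip(Session session) throws IOException, ClassNotFoundException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(buffer)) {
            out.writeObject(session);
        }
        try (ObjectInputStream in = new ObjectInputStream(
                new ByteArrayInputStream(buffer.toByteArray()))) {
            return (Session) in.readObject();
        }
    }

    public static void main(String[] args) throws Exception {
        Session restored = roundTrip(new Session("alice", "secret-token", 3));
        // prints: alice null 0
        System.out.println(restored.user + " " + restored.token + " " + restored.attempts);
    }
}
```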
Custom Serialization/Deserialization
Not every piece of program state can, or should be, serialized. Some things, like FileDescriptor or
Thread objects, are inherently platform-specific or virtual-machine-dependent. If a
FileDescriptor were serialized, it would have no meaning when deserialized in a different virtual
machine. For this reason, and also for important security reasons, not all objects can be serialized.
Even when an object is serializable, it may not make sense for it to serialize all of its state.
The transient modifier keyword has always been a legal part of the Java language, but it was not
assigned any meaning until Java 1.1.
There are situations where a field is not transient (i.e., it does contain an important part of an
object's state) but for some reason, such as security, it cannot be successfully serialized as it is.
A class can define custom serialization and deserialization behavior for its objects by implementing
writeObject() and readObject() methods.
The methods must be declared private, which is surprising if you think about it, as they are called
from outside of the class during serialization and deserialization. If a class defines these methods,
the appropriate one is invoked by the ObjectOutputStream or ObjectInputStream when an object is
serialized or deserialized.
Because these methods are private, they cannot be declared in the Serializable interface itself, which
is why that interface is an empty marker. Sometimes it is necessary to use custom serialization:
Let's modify our class Developer by marking password as transient. We cannot simply mark the password
field transient and rely on default serialization, because the field would be null after
deserialization. At the same time, for security reasons, we do not want to store the password or send
it over a network without encoding it. The variable password is marked transient so that the default
mechanism does not serialize its value. The class defines readObject to call defaultReadObject, which
deserializes the values of all other instance variables and handles the object's class identity, and
then to use the decode method when deserializing the value of the password variable. The writeObject
method performs the corresponding operations in the same order. Note that readObject and writeObject
do not handle the exceptions that can occur, but propagate them to the caller.
Example: Custom Serialization of an Object Developer
To replace the default serialization/deserialization, we define private writeObject and readObject
methods, which will be used in the serialization process. (Strictly speaking these are not overrides:
the serialization mechanism looks them up by their exact signature.) So we tell the JVM: "Please, if
you have to serialize/deserialize me, use my own writeObject and readObject methods rather than
yours; I know what I am doing." Never forget to say please, because the JVM likes good manners.
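import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

public class Developer implements Serializable {
    private String login;
    // transient: skipped by defaultWriteObject, written in encoded form below
    private transient String password;

    public Developer(String login, String password) {
        super();
        this.login = login;
        this.password = password;
    }

    private void writeObject(ObjectOutputStream os) throws IOException {
        os.defaultWriteObject();                              // writes login
        os.writeObject(SerializationUtils.encode(password));  // writes the encoded password
    }

    private void readObject(ObjectInputStream is) throws IOException, ClassNotFoundException {
        is.defaultReadObject();                               // restores login
        String value = (String) is.readObject();              // same order as in writeObject
        System.out.println("Password to decode: " + value);
        password = SerializationUtils.decode(value);
    }
}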
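The Developer class above relies on SerializationUtils.encode and SerializationUtils.decode, which this document leaves to the reader. The following self-contained sketch shows one possible shape. It uses Base64 purely as a placeholder: Base64 is an encoding, not encryption, so real code would need a proper cryptographic scheme. The class name CustomSerializationDemo, the nested Developer copy and the helper methods are all assumptions made for this example.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.nio.charset.StandardCharsets;
import java.util.Base64;

// Hypothetical demo class, written only for this illustration.
class CustomSerializationDemo {

    /** Placeholder for SerializationUtils.encode: Base64 is NOT encryption. */
    static String encode(String plain) {
        return Base64.getEncoder().encodeToString(plain.getBytes(StandardCharsets.UTF_8));
    }

    /** Placeholder for SerializationUtils.decode. */
    static String decode(String encoded) {
        return new String(Base64.getDecoder().decode(encoded), StandardCharsets.UTF_8);
    }

    static class Developer implements Serializable {
        private String login;
        private transient String password; // never written by the default mechanism

        Developer(String login, String password) {
            this.login = login;
            this.password = password;
        }

        String getLogin() { return login; }
        String getPassword() { return password; }

        private void writeObject(ObjectOutputStream os) throws IOException {
            os.defaultWriteObject();          // login
            os.writeObject(encode(password)); // encoded password, written by hand
        }

        private void readObject(ObjectInputStream is) throws IOException, ClassNotFoundException {
            is.defaultReadObject();                      // login
            password = decode((String) is.readObject()); // same order as writeObject
        }
    }

    static byte[] serialize(Developer developer) throws IOException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(buffer)) {
            out.writeObject(developer);
        }
        return buffer.toByteArray();
    }

    static Developer deserialize(byte[] bytes) throws IOException, ClassNotFoundException {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(bytes))) {
            return (Developer) in.readObject();
        }
    }

    public static void main(String[] args) throws Exception {
        Developer restored = deserialize(serialize(new Developer("testlogin", "testpassword")));
        System.out.println(restored.getLogin() + " / " + restored.getPassword());
    }
}
```

This sketch assumes the password is never null; a real implementation would have to decide how to represent a missing value.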
We can then use our MainClassTest to test the serialization and deserialization process. Don't forget
to define the static methods encode and decode in your SerializationUtils class (how you encode and
decode is up to you).
In the preceding part, we used serialization with a class Developer. As we saw, Developer extends no
class other than Object and has only serializable fields (String is serializable). But what happens if
Developer extends another class, for example Person, or has an instance of Computer?
Serialization and IS-A HAS-A relationship
IS-A Relationship
In OO, the concept of IS-A is based on class inheritance or interface implementation. IS-A is a way of
saying, "this thing is a type of that thing." For example, a Developer is a type of Person (it's
debatable), so in OO terms we can say, "Developer IS-A Person." You express the IS-A relationship in
Java through the keywords extends (for class inheritance) and implements (for interface
implementation).
Extends:
Given a class SuperClass and a class B: if B extends SuperClass, then every object of type B passes
the IS-A SuperClass test. This holds not only for B but also for objects of every subclass of B. If
the expression (foo instanceof Bar) is true for an instance foo of class Foo, then Foo IS-A Bar, even
if Foo doesn't directly extend Bar, but instead extends some other class that is a subclass of Bar.
Implements:
Given an interface Interface and a class B: if B implements Interface, then every object of type B
passes the IS-A Interface test. This holds not only for B but also for objects of every subclass of B,
even if the subclass does not directly implement Interface. In short, if the expression
(foo instanceof IBar) is true for an instance foo of class Foo, then Foo IS-A IBar, even if Foo
doesn't directly implement IBar, but instead extends some other class that implements IBar.
This is very important for our case. In the next step, we will see that if a class B is Serializable,
then all subclasses of B are also Serializable.
How Inheritance Affects Serialization
Serialization is very cool, but in order to apply it effectively you're going to have to understand
how your class's superclasses affect serialization. In this step, we will discuss the cases where the
superclass is or is not Serializable, but first we will look at object construction step by step.
If a superclass is Serializable, then according to normal Java interface rules, all subclasses of that
class automatically implement Serializable implicitly. In other words, a subclass of a class marked
Serializable passes the IS-A test for Serializable, and thus can be saved without having to explicitly
mark the subclass as Serializable. You simply cannot tell whether a class is or is not Serializable
UNLESS you can see the class inheritance tree to check whether any of its superclasses implement
Serializable. If the class does not explicitly extend any other class, and does not implement
Serializable, then you know for CERTAIN that the class is not Serializable, because class Object does
NOT implement Serializable. We will not discuss this case further, because it poses no problem.
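The inherited IS-A Serializable rule can be checked directly. In the sketch below, and unlike in the Person example used elsewhere in this document, Person is assumed to implement Serializable, while the subclass Developer does not mention the interface at all. The class names are reused only for familiarity; the demo class and its methods are made up for this example.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

// Hypothetical demo class, written only for this illustration.
class InheritedSerializableDemo {

    /** Assumption for this sketch: the superclass IS serializable. */
    static class Person implements Serializable {
        String name;
        Person(String name) { this.name = name; }
    }

    /** No "implements Serializable" here: it is inherited from Person. */
    static class Developer extends Person {
        String login;
        Developer(String name, String login) {
            super(name);
            this.login = login;
        }
    }

    static Developer roundTrip(Developer developer) throws IOException, ClassNotFoundException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(buffer)) {
            out.writeObject(developer);
        }
        try (ObjectInputStream in = new ObjectInputStream(
                new ByteArrayInputStream(buffer.toByteArray()))) {
            return (Developer) in.readObject();
        }
    }

    public static void main(String[] args) throws Exception {
        Object developer = new Developer("nameTest", "logintest");
        System.out.println(developer instanceof Serializable); // IS-A test
        Developer restored = roundTrip((Developer) developer);
        System.out.println(restored.name + " / " + restored.login);
    }
}
```

Because both classes in the hierarchy are serializable here, the inherited field name is saved and restored together with login.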
That brings up another key issue with serialization: what happens if a superclass is not marked
Serializable, but the subclass is?
Before going into detail, let's look at the order in which constructors are called.
To explain it, take two classes, Developer and Person (Developer extends Person), and let's see what
happens when a Developer is instantiated.
Here is the code of the Person class:
public class Person {
    protected String name;
    private String skill;

    public Person(String name, String skill) {
        super();
        this.name = name;
        this.skill = skill;
    }

    // No-argument constructor
    public Person() {
        super();
    }
}
And the Developer class declaration:
import java.io.Serializable;

public class Developer extends Person implements Serializable {
    private String login;
    private String password;

    public Developer(String name, String skill, String login, String password) {
        super(name, skill);
        this.login = login;
        this.password = password;
    }
}
Constructor chaining:
Now what happens when we create a Developer object? We know that constructors are invoked at runtime
when you say new on some class type, as follows:
Developer developer = new Developer("nameTest", "skillTest", "logintest", "passwordtest");
1. Developer constructor is invoked. Every constructor invokes the constructor of its superclass
with an (implicit) call to super(), unless the constructor invokes an overloaded constructor of
the same class.
2. Person constructor is invoked (Person is the superclass of Developer).
3. Object constructor is invoked (Object is the ultimate superclass of all classes, so class Person
extends Object even though you don't actually type "extends Object" into the Person class
declaration. It's implicit.) At this point we're on the top of the stack.
4. Object instance variables are given their explicit values. By explicit values, we mean values
that are assigned at the time the variables are declared, like "int x = 27", where "27" is the
explicit value (as opposed to the default value) of the instance variable.
5. Object constructor completes.
6. Person instance variables are given their explicit values (if any).
7. Person constructor completes.
8. Developer instance variables are given their explicit values (if any).
9. Developer constructor completes.
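The order described in the steps above can be observed with a small experiment. This is only a sketch: the class ConstructorChainDemo, its LOG list and the simplified Person and Developer classes are invented for the illustration, and the Object steps cannot be logged because Object cannot be modified.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical demo class, written only for this illustration.
class ConstructorChainDemo {

    /** Collects construction events in the order in which they happen. */
    static final List<String> LOG = new ArrayList<>();

    static String log(String event) {
        LOG.add(event);
        return event;
    }

    static class Person {
        // explicit value, assigned before the rest of the Person constructor body runs
        String name = log("Person fields initialized");

        Person() {
            log("Person constructor completes");
        }
    }

    static class Developer extends Person {
        // assigned only after the Person constructor has completed
        String login = log("Developer fields initialized");

        Developer() {
            super(); // would be inserted implicitly if omitted
            log("Developer constructor completes");
        }
    }

    public static void main(String[] args) {
        new Developer();
        for (String event : LOG) {
            System.out.println(event);
        }
    }
}
```

Running main lists the Person events before the Developer events, even though it is the Developer constructor that is invoked first.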
Now we will focus on constructor calls in the serialization context!
Super Constructor and Serialization:
But these things do NOT happen when an object is deserialized. When an instance of a serializable
class is deserialized, the constructor does not run, and instance variables are NOT given their
initially assigned values! Think about it: if the constructor were invoked, and/or instance variables
were assigned the values given in their declarations, the object you're trying to restore would revert
to its original state, rather than coming back reflecting the changes in its state that happened
sometime after it was created.
Because Person is NOT serializable, any state maintained in the Person class, even though the state
variable is inherited by the Developer, isn't going to be restored with the Developer when it's
deserialized! The reason is that the (unserialized) Person part of the Developer is going to be
reinitialized just as it would be if you were making a new Developer (as opposed to deserializing
one). That means all the things that happen to an object during construction will happen, but only to
the Person parts of a Developer.
In other words, the instance variables from the Developer class (private String login; private String
password) will be serialized and deserialized correctly, but the variables inherited from Developer's
non-serializable superclass (protected String name; private String skill) will come back with their
default/initially assigned values rather than the values they had at the time of serialization.
If you are a serializable class, but your superclass is NOT serializable, then any instance variables
you INHERIT from that superclass will be reset to the values assigned by the superclass's field
initializers and no-argument constructor (here, null for name and skill). This is because the
non-serializable class's no-argument constructor WILL run!
In fact, every constructor ABOVE the first non-serializable class constructor will also run, no matter
what, because once the first super constructor is invoked (during deserialization), it of course
invokes its super constructor, and so on up the inheritance tree.
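The effect of a non-serializable superclass can be reproduced with a reduced version of the two classes. In this sketch the classes are simplified, and the no-argument Person constructor sets name to "unknown" (an assumption added so that its execution is visible); the demo class and method names are made up for the example.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

// Hypothetical demo class, written only for this illustration.
class NonSerializableParentDemo {

    /** NOT Serializable: its state is never written to the stream. */
    static class Person {
        protected String name;

        public Person() {
            this.name = "unknown"; // runs again during deserialization
        }

        public Person(String name) {
            this.name = name;
        }
    }

    static class Developer extends Person implements Serializable {
        private String login;

        Developer(String name, String login) {
            super(name);
            this.login = login;
        }

        String getLogin() { return login; }
    }

    static Developer roundTrip(Developer developer) throws IOException, ClassNotFoundException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(buffer)) {
            out.writeObject(developer);
        }
        try (ObjectInputStream in = new ObjectInputStream(
                new ByteArrayInputStream(buffer.toByteArray()))) {
            return (Developer) in.readObject();
        }
    }

    public static void main(String[] args) throws Exception {
        Developer restored = roundTrip(new Developer("nameTest", "logintest"));
        // name comes from Person(), login comes from the stream
        System.out.println(restored.name + " / " + restored.getLogin());
    }
}
```

If the inherited state must survive, one option is to save and restore it by hand in custom writeObject and readObject methods of the subclass.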
Relationship HAS- A and Serialization
HAS-A relationships are based on usage, rather than inheritance. In other words, class A HAS-A B if
code in class A has a reference to an instance of class B. For example, you can say the following HAS-
A relationships allow you to design classes that follow good OO practices by not having monolithic
classes that do a gazillion different things. Classes (andtheir resulting objects) should be specialists.
Specialized classes can actually help reduce bugs. The more specialized the class, the more likely it is
that you can reuse the class in other applications. If you put all the Computer-related code directly into
the Developer class, you'll end up duplicating code in the Designer class, Manager class, and any other
class that might need Computer behavior. By keeping the Computer code in a separate, specialized
Computer class, you have the chance to reuse the Computer class in multiple applications like Design-
er Class or Manager class.
The Developer class has a Computer, because Developer declares an instance variable of type Computer. When code invokes writeCode() on a Developer instance, the Developer invokes writeCode() on the Developer object's Computer instance variable.
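The delegation just described could look like the following sketch. The method bodies and the returned text are invented for illustration; only the class names and writeCode() come from the text above.

```java
class HasADemo {

    // Specialized class: knows how the actual work is done.
    static class Computer {
        String writeCode(String author) {
            return author + " writes code on the computer";
        }
    }

    // Developer HAS-A Computer: it holds a reference and delegates to it.
    static class Developer {
        private final String login;
        private final Computer computer;

        Developer(String login, Computer computer) {
            this.login = login;
            this.computer = computer;
        }

        String writeCode() {
            return computer.writeCode(login); // delegation to the Computer instance
        }
    }

    public static void main(String[] args) {
        Developer developer = new Developer("ali", new Computer());
        System.out.println(developer.writeCode());
    }
}
```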
As you can see, a Computer instance can be used by a Developer to writeCode() or by a Designer to drawGraph(), and so on. This illustrates a good practice of object-oriented programming: give each class one main purpose and only one.
This design is much more cohesive. Instead of one class that does everything, we've broken the system into three main classes, each with a very specific, or cohesive, role, and we have built specialized, reusable classes. You have probably guessed the name of this concept already: cohesion. We will discuss it in another document.
In the serialization of the Diaffrin class, we noted that it has only primitive fields. In real development work, we rarely meet classes with only primitive fields. A class instance can also hold object references. The serializability of Diaffrin could be affected depending on whether those fields are serializable or not. We will discuss all these aspects in the following sections.
Object graphs
An object graph is a view of an object system at a particular point in time. Whereas a normal data model such as a UML class diagram details the relationships between classes, the object graph relates their instances. Object diagrams are subsets of the overall object graph. Object-oriented applications contain complex webs of interrelated objects. Objects are linked to each other by one object either owning or containing another object or holding a reference to another object. This web of objects is called an object graph, and it is the more abstract structure that can be used in discussing an application's state.
An object graph is a directed graph, which might be cyclic. When stored in RAM, objects occupy different segments of the memory with their attributes and function table, while relationships are represented by pointers or a different type of global handler in higher-level languages.
What does it really mean to save an object? If the instance variables are all primitive types, it's pretty
straightforward. But what if the instance variables are themselves references to objects? What gets
saved? Clearly in Java it wouldn't make any sense to save the actual value of a reference variable, because the value of a Java reference has meaning only within the context of a single instance of a JVM.
In other words, if you tried to restore the object in another instance of the JVM, even running on the
same computer on which the object was originally serialized, the reference would be useless.
Now let us modify our Developer class so that it holds an instance of Computer.
class Developer implements Serializable {
    private String login;
    private Computer computer;
}

class Computer {
    private int power;
}
Now what happens if you save the Developer? If the goal is to save and then restore a Developer, and the
restored Developer is an exact duplicate of the Developer that was saved, then the Developer needs a Com-
puter that is an exact duplicate of the Developer's Computer at the time the Developer was saved. That
means both the Developer and the Computer should be saved. And what if the Computer itself had refer-
ences to other objects—like perhaps a Battery object? This gets quite complicated very quickly. If it were
up to the programmer to know the internal structure of each object the Developer referred to, so that the
programmer could be sure to save all the state of all those objects…whew. That would be a nightmare with
even the simplest of objects.
Fortunately, the Java serialization mechanism takes care of all of this. When you serialize an object, Java
serialization takes care of saving that object's entire "object graph." That means a deep copy of everything
the saved object needs to be restored.
For example, if you serialize a Developer object, the Computer will be serialized automatically. And if the
Computer class contained a reference to another object, THAT object would also be serialized, and so on.
And the only object you have to worry about saving and restoring is the Developer. The other objects required to fully reconstruct that Developer are saved (and restored) automatically through serialization. Remember, you do have to make a conscious choice to create objects that are serializable, by implementing the Serializable interface.
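A short sketch of this behaviour, assuming both classes implement Serializable: two Developer objects share one Computer, the whole array is written with a single writeObject call, and the shared reference is still shared after reading it back. The array, the roundTrip helper and the public fields are illustrative choices, not part of the original example.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

class ObjectGraphDemo {

    static class Computer implements Serializable {
        private static final long serialVersionUID = 1L;
        int power;

        Computer(int power) { this.power = power; }
    }

    static class Developer implements Serializable {
        private static final long serialVersionUID = 1L;
        String login;
        Computer computer; // reference: the Computer is saved along with the Developer

        Developer(String login, Computer computer) {
            this.login = login;
            this.computer = computer;
        }
    }

    // Writes the whole graph reachable from the array and reads it back.
    static Developer[] roundTrip(Developer[] team) throws IOException, ClassNotFoundException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(buffer)) {
            out.writeObject(team);
        }
        try (ObjectInputStream in =
                new ObjectInputStream(new ByteArrayInputStream(buffer.toByteArray()))) {
            return (Developer[]) in.readObject();
        }
    }

    public static void main(String[] args) throws Exception {
        Computer shared = new Computer(42);
        Developer[] team = { new Developer("ali", shared), new Developer("broma", shared) };
        Developer[] copy = roundTrip(team);
        System.out.println(copy[0].computer.power);               // state of the nested object
        System.out.println(copy[0].computer == copy[1].computer); // sharing is preserved
        System.out.println(copy[0].computer == shared);           // but it is a new instance
    }
}
```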
But when we try to serialize the Developer object, we get a runtime exception, something like this:
java.io.NotSerializableException: Computer
What did we forget? The Computer class must ALSO be Serializable. If we modify the Computer class and make it serializable, then there's no problem. But what would happen if we didn't have access to the Computer class source code?
In other words, what if making the Computer class serializable was not an option? The Computer class might itself refer to other non-serializable objects, and without knowing the internal structure of Computer, you aren't able to make all these fixes.
Simply marking the computer field transient is not enough, because we want to retrieve our Developer instance as we saved it, not a Developer whose Computer is null.
The solution to the problem is to mark computer as transient and use custom serialization, as we saw earlier.
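A minimal sketch of this solution follows. It assumes a hypothetical Computer class whose only state is an int power exposed through a getPower() accessor, and it assumes the computer field is never null when the Developer is saved; a real third-party class may need more state to be written by hand.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

class TransientComputerDemo {

    // Stand-in for a class we cannot modify: it is NOT Serializable.
    static class Computer {
        private final int power;

        Computer(int power) { this.power = power; }

        int getPower() { return power; }
    }

    static class Developer implements Serializable {
        private static final long serialVersionUID = 1L;
        private String login;
        private transient Computer computer; // skipped by the default mechanism

        Developer(String login, Computer computer) {
            this.login = login;
            this.computer = computer;
        }

        Computer getComputer() { return computer; }

        private void writeObject(ObjectOutputStream out) throws IOException {
            out.defaultWriteObject();          // writes login
            out.writeInt(computer.getPower()); // saves the Computer state by hand
        }

        private void readObject(ObjectInputStream in)
                throws IOException, ClassNotFoundException {
            in.defaultReadObject();                // restores login
            computer = new Computer(in.readInt()); // rebuilds the Computer
        }
    }

    static Developer roundTrip(Developer original) throws IOException, ClassNotFoundException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(buffer)) {
            out.writeObject(original);
        }
        try (ObjectInputStream in =
                new ObjectInputStream(new ByteArrayInputStream(buffer.toByteArray()))) {
            return (Developer) in.readObject();
        }
    }

    public static void main(String[] args) throws Exception {
        Developer copy = roundTrip(new Developer("ali", new Computer(8)));
        System.out.println(copy.getComputer().getPower());
    }
}
```

The values must be read in readObject in the same order in which writeObject wrote them.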
Serialization and StackOverflow
In this part we will see how serialization and deserialization can throw a StackOverflowError. We are describing something that should never happen in everyday programming, but if you try hard enough, it can.
public class LinkObject implements Serializable {
    LinkObject linkObject;

    public LinkObject(LinkObject linkObject) {
        super();
        this.linkObject = linkObject;
    }
}
Then, in our MainClassTest, we build a very long chain of LinkObject instances:
public static void main(String... args) {
    LinkObject linkObject = null;
    for (int i = 0; i < 10001000; i++) {
        linkObject = new LinkObject(linkObject);
    }
    try {
        SerializationUtils.serializeObject(linkObject, "fail.ser");
    } catch (IOException e) {
        e.printStackTrace();
    }
    try {
        SerializationUtils.deserializeObject("fail.ser");
    } catch (IOException e) {
        e.printStackTrace();
    } catch (ClassNotFoundException e) {
        e.printStackTrace();
    }
}
The snippet above throws a StackOverflowError. Why?
The possibility of running out of resources is not generally considered when one writes documentation, only the expected behaviour given the input and sufficient resources. The documentation doesn't know how much stack space you have available, so it can't specify a maximum number of links in your object graph.
With 100000 links I got a stack overflow, although the JVM itself did not crash.
To summarize, Java's built-in serialization implementation uses excessive stack space when serializing deeply nested object graphs. So, if you have to serialize or deserialize such graphs as a single entity, you may have to increase the stack size.
Alternatively, you can provide your own serialization implementation, since you have a clearer understanding of your object model than Java does. The default serialization algorithm traverses the object graph recursively, which is risky because the stack is finite and rather small, whereas the heap can grow and is generally much larger. For such structures the default implementation is therefore a poor fit, and a custom implementation that walks the object graph iteratively works better.
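One possible shape for such a custom implementation is sketched below. It is not the LinkObject class from above but a hypothetical Node class carrying an int value, so that there is something to write; the link is marked transient so the default mechanism never recurses into it, and the rest of the chain is written and rebuilt with plain loops.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

class IterativeChainDemo {

    // Hypothetical singly linked chain with an int payload.
    static class Node implements Serializable {
        private static final long serialVersionUID = 1L;
        int value;
        transient Node next; // transient: default serialization must not follow it

        Node(int value, Node next) {
            this.value = value;
            this.next = next;
        }

        private void writeObject(ObjectOutputStream out) throws IOException {
            out.defaultWriteObject(); // writes this node's value
            int remaining = 0;
            for (Node n = next; n != null; n = n.next) {
                remaining++;
            }
            out.writeInt(remaining); // number of nodes that follow
            for (Node n = next; n != null; n = n.next) {
                out.writeInt(n.value); // flat, loop-based encoding of the chain
            }
        }

        private void readObject(ObjectInputStream in)
                throws IOException, ClassNotFoundException {
            in.defaultReadObject();
            int remaining = in.readInt();
            Node tail = this;
            for (int i = 0; i < remaining; i++) {
                tail.next = new Node(in.readInt(), null); // rebuild without recursion
                tail = tail.next;
            }
        }
    }

    static Node roundTrip(Node head) throws IOException, ClassNotFoundException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(buffer)) {
            out.writeObject(head);
        }
        try (ObjectInputStream in =
                new ObjectInputStream(new ByteArrayInputStream(buffer.toByteArray()))) {
            return (Node) in.readObject();
        }
    }

    static int length(Node head) {
        int count = 0;
        for (Node n = head; n != null; n = n.next) {
            count++;
        }
        return count;
    }

    public static void main(String[] args) throws Exception {
        Node head = null;
        for (int i = 0; i < 200_000; i++) {
            head = new Node(i, head);
        }
        System.out.println(length(roundTrip(head)));
    }
}
```

Because only the head node goes through writeObject, the stack depth stays constant regardless of the length of the chain.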
Serialization and Security
As we learned above, we convert a Developer object into a byte stream and store it in an external file. Technically, serializing an object means storing all the information necessary to re-create it by deserialization, so the byte stream contains the package information, the class information, and the object's properties, even the private ones. As soon as someone (or some other program) can access the file, they can read it and modify it.
Let us see what this looks like for our Developer object:
Developer developer = new Developer("logintest", "passwordtest");
bytestream:
From these bytes we learn that our package is org.sidibe.learning.serialization and that our class is Developer. The Developer class has a java.lang.String field login and a String field password. Worse, we can directly read the values held by our Developer object. Anyone who obtains these bytes has all the information about our Developer class. This is a case where serialization can violate our company's security rules. How can we avoid this security failure?
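To check this claim without a hex editor, the following sketch serializes a Developer into memory and searches the raw bytes for the class name and the field values. The helper serializedAsText is a hypothetical name; decoding the bytes as ISO-8859-1 is only a convenient way to search them, not a proper parser for the stream format.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.nio.charset.StandardCharsets;

class StreamLeakDemo {

    static class Developer implements Serializable {
        private static final long serialVersionUID = 1L;
        private final String login;
        private final String password; // private, yet written to the stream as-is

        Developer(String login, String password) {
            this.login = login;
            this.password = password;
        }
    }

    // Serializes the object and returns the raw bytes decoded one byte per character.
    static String serializedAsText(Object object) throws IOException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(buffer)) {
            out.writeObject(object);
        }
        return new String(buffer.toByteArray(), StandardCharsets.ISO_8859_1);
    }

    public static void main(String[] args) throws IOException {
        String dump = serializedAsText(new Developer("logintest", "passwordtest"));
        System.out.println(dump.contains("Developer"));    // class name is visible
        System.out.println(dump.contains("password"));     // field name is visible
        System.out.println(dump.contains("passwordtest")); // private value is visible
    }
}
```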
Avoid Serialization
Serialization is dangerous because it allows adversaries to get their hands on the internal state of your
objects. An adversary can serialize one of your objects into a byte array that can be read. This allows the adversary to inspect the full internal state of your object, including any fields you marked private as well as the internal state of any objects you reference.
To prevent this, you can make your object impossible to serialize. The way to do this is to declare the writeObject method:
private final void writeObject(ObjectOutputStream out) throws java.io.IOException {
    throw new java.io.IOException("Object cannot be serialized");
}
This method is declared final so that a subclass defined by the adversary cannot override it.
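The effect can be tried with the sketch below. The Secret class and the canSerialize helper are hypothetical; Secret implements Serializable directly only to keep the example short, whereas in practice the class would typically inherit serializability from a superclass or an interface.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.io.Serializable;

class NoSerializeDemo {

    static class Secret implements Serializable {
        private static final long serialVersionUID = 1L;
        private final String token;

        Secret(String token) { this.token = token; }

        // Called by ObjectOutputStream instead of the default mechanism.
        private final void writeObject(ObjectOutputStream out) throws IOException {
            throw new IOException("Object cannot be serialized");
        }
    }

    // Returns true if the object can be written to an ObjectOutputStream.
    static boolean canSerialize(Object object) {
        try (ObjectOutputStream out = new ObjectOutputStream(new ByteArrayOutputStream())) {
            out.writeObject(object);
            return true;
        } catch (IOException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(canSerialize("an ordinary string"));
        System.out.println(canSerialize(new Secret("s3cr3t")));
    }
}
```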
Avoid Deserialization of Our Object:
This rule is even more important than the preceding one. Even if your class is not serializable, it may still be deserializable. An adversary can create a sequence of bytes that happens to deserialize to an instance of your class. This is dangerous, since you do not have control over what state the deserialized object is in. You can think of deserialization as another kind of public constructor for your object; unfortunately, it is a kind of constructor that is difficult for you to control.
You can prevent this kind of attack by making it impossible to deserialize a byte stream into an instance of your class. You can do this by declaring the readObject method:
private final void readObject(ObjectInputStream in) throws java.io.IOException {
    throw new java.io.IOException("Class cannot be deserialized");
}
This method is declared final to prevent the adversary from overriding it.
To learn more about code security, follow this link: http://www.securingjava.com/chapter-seven/chapter-seven-1.html
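As a small illustration of the readObject guard, the sketch below produces a valid byte stream for a hypothetical Account class (its writeObject is deliberately left unguarded so that the bytes can be created) and then shows that reading it back is refused. The helper names serialize and tryDeserialize are assumptions made for this example.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

class NoDeserializeDemo {

    static class Account implements Serializable {
        private static final long serialVersionUID = 1L;
        private final String owner;

        Account(String owner) { this.owner = owner; }

        // Called by ObjectInputStream when an Account is found in the stream.
        private final void readObject(ObjectInputStream in) throws IOException {
            throw new IOException("Class cannot be deserialized");
        }
    }

    static byte[] serialize(Object object) throws IOException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(buffer)) {
            out.writeObject(object);
        }
        return buffer.toByteArray();
    }

    // Returns "ok" on success, otherwise the message of the exception that was thrown.
    static String tryDeserialize(byte[] bytes) {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(bytes))) {
            in.readObject();
            return "ok";
        } catch (IOException | ClassNotFoundException e) {
            return e.getMessage();
        }
    }

    public static void main(String[] args) throws IOException {
        byte[] bytes = serialize(new Account("ali"));
        System.out.println(tryDeserialize(bytes));
    }
}
```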
Serialization ID:
Last December, we defined our Developer class as follows:
public class Developer implements Serializable {
    private String login;
    private String password;

    public Developer(String login, String password) {
        super();
        this.login = login;
        this.password = password;
    }
}
We serialized it into the file developer.ser and kept that file in our storage space. After coming back from holiday in Wassoulou (Mali), we decide to add a field private String skill to the Developer class. Maybe the constructor has changed too, or something else about the class; in any case, the Developer class now contains information that is not in the file developer.ser.
At this point, the byte stream in the file lacks some of the information needed to build our current Developer object, so it no longer really represents our current Developer class. What happens now when we want to deserialize the stored byte stream into a Developer object?
The serialVersionUID facilitates versioning of serialized data. Its value is stored with the data when serializing. When deserializing, the stored value is compared with that of the current class to check whether the serialized data matches the current code.
If you want to version your data, you normally start with a serialVersionUID of 0 and bump it with every structural change to your class that alters the serialized data (adding or removing non-transient fields). The built-in deserialization mechanism will then refuse to deserialize old versions of the data: the check happens when the class descriptor is read, before any readObject() method of the class runs, and a mismatch results in an InvalidClassException. If you want to be able to read old data back, keep the serialVersionUID unchanged and define your own readObject() method that copes with the older layout, for example by checking a version number that your writeObject() stores in the stream. This versioning technique is useful if you store serialized data that survives several versions of your code.
But storing serialized data for such a long time span is not very common. It is far more common to use the serialization mechanism to temporarily write data to, for instance, a cache, or to send it over the network to another program running the same version of the relevant parts of the codebase.
In this case you are not interested in maintaining backwards compatibility. You are only concerned with making sure that the code bases that communicate with each other really have the same versions of the relevant classes. In order to facilitate such a check, you must maintain the serialVersionUID just like before and not forget to update it when making changes to your classes.
If you do forget to update the field, you might end up with two different versions of a class with different structures but the same serialVersionUID. If this happens, the default mechanism (in.defaultReadObject()) will not detect any difference and will try to deserialize incompatible data. You might then end up with a cryptic runtime error or a silent failure (null fields). These types of errors can be hard to find.
To help with this use case, the Java platform offers you the choice of not setting the serialVersionUID manually. Instead, a hash of the class structure is computed by the serialization runtime and used as the id. In practice, this mechanism makes sure that you never have different class structures with the same id, so you will not get the hard-to-trace runtime serialization failures mentioned above.
But there is a downside to the auto-generated id strategy: the generated ids for the same class might differ between compilers, because the hash depends on class details that can vary from one compiler to another. So if you exchange serialized data between code compiled with different compilers, it is recommended to maintain the ids manually anyway.
And if you need to stay backwards-compatible with your data, as in the first use case mentioned, you probably also want to maintain the id yourself, in order to get readable ids and to have greater control over when and how they change.
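The id that the serialization runtime associates with a class, whether declared explicitly or computed, can be inspected with java.io.ObjectStreamClass. The two nested classes and the uidOf helper in this sketch are hypothetical; the computed value for Implicit is not shown because it depends on the class details.

```java
import java.io.ObjectStreamClass;
import java.io.Serializable;

class SerialVersionUidDemo {

    // Declares its id explicitly.
    static class Explicit implements Serializable {
        private static final long serialVersionUID = 42L;
        int x;
    }

    // Declares no id: the runtime computes one from the class structure.
    static class Implicit implements Serializable {
        int x;
    }

    // Returns the id the serialization runtime uses for a serializable class.
    static long uidOf(Class<?> type) {
        return ObjectStreamClass.lookup(type).getSerialVersionUID();
    }

    public static void main(String[] args) {
        System.out.println(uidOf(Explicit.class)); // the declared value
        System.out.println(uidOf(Implicit.class)); // the computed hash
    }
}
```

The serialver tool shipped with the JDK prints the same value for a compiled class.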
Algorithm for Computing the serialVersionUID:
The serialVersionUID is computed using the signature of a stream of bytes that reflect the class definition. The sequence of items in the stream is as follows:
1. The class name.
2. The class modifiers written as a 32-bit integer.
3. The name of each interface sorted by name.
4. For each field of the class sorted by field name (except private static and private transient fields):
   a. The name of the field.
   b. The modifiers of the field written as a 32-bit integer.
   c. The descriptor of the field.
5. If a class initializer exists, write out the following:
   a. The name of the method, <clinit>.
   b. The modifier of the method, java.lang.reflect.Modifier.STATIC, written as a 32-bit integer.
   c. The descriptor of the method, ()V.
6. For each non-private constructor sorted by method name and signature:
   a. The name of the method, <init>.
   b. The modifiers of the method written as a 32-bit integer.
   c. The descriptor of the method.
7. For each non-private method sorted by method name and signature:
   a. The name of the method.
   b. The modifiers of the method written as a 32-bit integer.
   c. The descriptor of the method.
8. The SHA-1 algorithm is executed on the stream of bytes produced by DataOutputStream and produces five 32-bit values sha[0..4].
9. The hash value is assembled from the first and second 32-bit values of the SHA-1 message digest. If the result of the message digest, the five 32-bit words H0 H1 H2 H3 H4, is in an array of five int values named sha, the hash value would be computed as follows:
long hash = ((sha[0] >>> 24) & 0xFF) |
            ((sha[0] >>> 16) & 0xFF) << 8 |
            ((sha[0] >>> 8) & 0xFF) << 16 |
            ((sha[0] >>> 0) & 0xFF) << 24 |
            ((sha[1] >>> 24) & 0xFF) << 32 |
            ((sha[1] >>> 16) & 0xFF) << 40 |
            ((sha[1] >>> 8) & 0xFF) << 48 |
            ((sha[1] >>> 0) & 0xFF) << 56;
To learn more about serialization UID follow the link :
InvalidClassException
Inheritance
InvalidClassException is one of the exceptions most commonly encountered by Java programmers who use object serialization. There are three main causes for this exception to be thrown:
1. the serial version of the class,
2. unknown data types contained in the class,
3. a missing no-arg constructor.
As its name indicates, the class of the object being serialized or deserialized is considered invalid for one of the reasons listed above, and its objects cannot be serialized or deserialized. InvalidClassException extends ObjectStreamException.
If we remove the default constructor from the previous sample, serialization will succeed but deserialization will throw InvalidClassException, because there is no default constructor and the class Developer is not Serializable.
Here we will discuss the third reason, the no-arg constructor, and how its absence causes this exception to be thrown.
This type of exception is thrown when inheritance is involved in the program. When inheritance is involved, the serialization process proceeds by serializing the objects of child classes first and then moves up the hierarchy until the non-serializable parent class is reached.
When the objects are to be deserialized, the process starts from the non-serializable parent class and moves down the hierarchy. Since the parent class is non-serializable, the state of its members cannot be retrieved from the stream and can only be obtained by running its default (no-arg) constructor. The absence of this constructor therefore makes the class invalid. To solve the problem, we must add a default constructor to the superclass.
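The situation can be reproduced with the sketch below, where the non-serializable superclass Developer has only a one-argument constructor. The subclass name SeniorDeveloper and the helper methods are hypothetical names used for illustration.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

class NoArgConstructorDemo {

    // Not Serializable and, on purpose, without a no-arg constructor.
    static class Developer {
        protected String name;

        Developer(String name) { this.name = name; }
    }

    // Serializable subclass of a non-serializable parent.
    static class SeniorDeveloper extends Developer implements Serializable {
        private static final long serialVersionUID = 1L;
        String skill;

        SeniorDeveloper(String name, String skill) {
            super(name);
            this.skill = skill;
        }
    }

    static byte[] serialize(Object object) throws IOException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(buffer)) {
            out.writeObject(object);
        }
        return buffer.toByteArray();
    }

    // Returns "ok" on success, otherwise the simple name of the exception thrown.
    static String deserializeOutcome(byte[] bytes) {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(bytes))) {
            in.readObject();
            return "ok";
        } catch (IOException | ClassNotFoundException e) {
            return e.getClass().getSimpleName();
        }
    }

    public static void main(String[] args) throws IOException {
        byte[] bytes = serialize(new SeniorDeveloper("Ali", "Java")); // writing succeeds
        System.out.println(deserializeOutcome(bytes));               // reading fails
    }
}
```

Adding a no-arg constructor to Developer that is accessible to the subclass makes the deserialization succeed.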
In the previous part of this document we saw that this exception is thrown when the serialization runtime detects one of the following problems with a class:
The serial version of the class does not match that of the class descriptor read from the stream.
The class contains unknown data types.
The class does not have an accessible no-arg constructor.
We have explained the no-arg constructor case. Now we will focus on the first point, that is, when the serial version of the class does not match that of the class descriptor read from the stream.
Whenever object serialization is performed the objects are saved in a particular file format. This file
format contains a class descriptor for each class of the object that is saved. The class descriptor usually
contains the
class name
serial version unique ID
set of flags
description of the data fields
If the value of the serial version unique ID is not explicitly specified, the JVM automatically assigns a value to this variable using the class information. We can also assign a value to this variable ourselves, such as 1L or 2L.
It is important to note that the serial version unique ID must be the same during object serialization and deserialization. During serialization, the serial version unique ID is recorded in the class descriptor. During deserialization, the current serial version unique ID is compared with the one in the class descriptor. If there is any mismatch between the two values, this exception is thrown.
Let's produce this exception.
Given our class:
public class Developer implements Serializable {
    private static final long serialVersionUID = 1L;
    private String login;
    private String password;

    public Developer(String login, String password) {
        this.login = login;
        this.password = password;
    }
}
Let's serialize our Developer object:
private static final String IO_FILENAME = "serialization_iud.ser";

public static void main(String[] args) {
    System.out.println("Begin of Serialization");
    try {
        Developer developer = new Developer("logintest", "passwordtest");
        SerializationUtils.serializeObject(developer, IO_FILENAME);
        System.out.println("Serialization success");
    } catch (Exception e) {
        System.out.println("Serialization failed: " + e.getMessage());
        e.printStackTrace();
    }
}
Our Developer object is now serialized in the file serialization_iud.ser.
Later, we modify just the field private static final long serialVersionUID = 2L;
and deserialize the object:
System.out.println("Begin of Deserialization");
try {
    Developer developper = SerializationUtils.deserializeObject(IO_FILENAME);
    System.out.println("Deserialization success: " + developper.toString());
} catch (IOException e) {
    System.out.println("Deserialization failed: " + e.getMessage());
    e.printStackTrace();
} catch (ClassNotFoundException e) {
    System.out.println("Deserialization failed: " + e.getMessage());
    e.printStackTrace();
}
We will receive:
java.io.InvalidClassException: Developer; local class incompatible: stream classdesc serialVersionUID = 1, local class serialVersionUID = 2
    at java.io.ObjectStreamClass.initNonProxy(Unknown Source)
    at java.io.ObjectInputStream.readNonProxyDesc(Unknown Source)
    at java.io.ObjectInputStream.readClassDesc(Unknown Source)
    at java.io.ObjectInputStream.readOrdinaryObject(Unknown Source)
    at java.io.ObjectInputStream.readObject0(Unknown Source)
    at java.io.ObjectInputStream.readObject(Unknown Source)
    at
During deserialization, the runtime checks that the serialVersionUID stored in the stream matches the serialVersionUID of the local class. If they differ, it throws java.io.InvalidClassException; if they match, deserialization succeeds even if you have added a field to the class (the new field simply receives its default value).
Only the serialVersionUID matters.
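The mismatch can also be simulated inside a single program, without recompiling the class. The sketch below is an assumption-laden shortcut: instead of changing the class, it patches the serialVersionUID stored in the byte stream from 1 to 2, relying on the standard stream layout (4 header bytes, the TC_OBJECT and TC_CLASSDESC markers, the class name written as a length-prefixed string, then the 8-byte id). Here the stream carries 2 and the local class carries 1, the reverse of the example above, but the check that fails is the same.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

class UidMismatchDemo {

    static class Developer implements Serializable {
        private static final long serialVersionUID = 1L;
        String login;

        Developer(String login) { this.login = login; }
    }

    static byte[] serialize(Object object) throws IOException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(buffer)) {
            out.writeObject(object);
        }
        return buffer.toByteArray();
    }

    // Returns a copy of the stream in which the stored serialVersionUID is 2 instead of 1.
    static byte[] withStoredUidTwo(byte[] stream) {
        byte[] patched = stream.clone();
        // Layout: magic(2) version(2) TC_OBJECT(1) TC_CLASSDESC(1) nameLength(2) name serialVersionUID(8)
        int nameLength = ((patched[6] & 0xFF) << 8) | (patched[7] & 0xFF);
        int lastUidByte = 8 + nameLength + 7; // least significant byte of the id
        patched[lastUidByte] = 2;
        return patched;
    }

    // Returns "ok" on success, otherwise the simple name of the exception thrown.
    static String deserializeOutcome(byte[] bytes) {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(bytes))) {
            in.readObject();
            return "ok";
        } catch (IOException | ClassNotFoundException e) {
            return e.getClass().getSimpleName();
        }
    }

    public static void main(String[] args) throws IOException {
        byte[] original = serialize(new Developer("logintest"));
        System.out.println(deserializeOutcome(original));                   // ids match
        System.out.println(deserializeOutcome(withStoredUidTwo(original))); // ids differ
    }
}
```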