Java threading
A basic explanation of Java threads
Guillermo Schwarz
Sun Certified Enterprise Architect
What are threads?
Each thread is like an operating system process, except that all threads in a process share the same memory.
Threads execute code concurrently and therefore must carefully control access to shared resources. Otherwise data can become inconsistent.
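All memory is shared?
No, only globals and the heap.
In Java, classes are global, and static variables declared inside classes are global too.
All variables declared inside a method are locals. They live on the stack of the thread that runs the method, therefore they are not shared.
Heap memory is shared, but an object on the heap can only be reached through a variable that points to it, and that variable is either global or local. An object reachable only from locals is not a problem. An object reachable from a global (directly or through other objects) must be protected like the global itself.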
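The distinction between locals and globals can be illustrated with a minimal sketch. The class and method names below are hypothetical (they are not from the slides), and the code assumes Java 8 or later for the lambda syntax. Each thread counts in a local variable without any protection and synchronizes only when it touches the static variable.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical demo class; the names are illustrative and not from the slides.
class SharedVersusLocal {
    // Static ("global") variable: one copy, visible to every thread.
    static int sharedTotal = 0;
    private static final Object LOCK = new Object();

    // Each call runs on the calling thread's own stack,
    // so localCount is private to that thread.
    static void work(int iterations) {
        int localCount = 0; // local: needs no protection
        for (int i = 0; i < iterations; i++) {
            localCount++;
        }
        synchronized (LOCK) { // protect only the shared update
            sharedTotal += localCount;
        }
    }

    // Starts the given number of threads, waits for them,
    // and returns the value accumulated in the static variable.
    static int run(int threads, int iterations) {
        synchronized (LOCK) {
            sharedTotal = 0;
        }
        List<Thread> started = new ArrayList<>();
        for (int t = 0; t < threads; t++) {
            Thread thread = new Thread(() -> work(iterations));
            started.add(thread);
            thread.start();
        }
        for (Thread thread : started) {
            try {
                thread.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("interrupted while waiting", e);
            }
        }
        synchronized (LOCK) {
            return sharedTotal;
        }
    }

    public static void main(String[] args) {
        System.out.println(run(4, 1000)); // prints 4000
    }
}
```

Without the synchronized block around the update of sharedTotal, two threads could read the same old value and one of the additions could be lost.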
Visually
Thread-1   Thread-2   Thread-3
Stack-1    Stack-2    Stack-3
Globals (shared by all threads)
Heap (shared by all threads)
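Visually
class Example
{
    static String salutation = "hello"; // global: to be protected
    String person = "world"; // instance field: shared only if the instance is shared

    public void concat( char c ) // offending method
    {
        String aux = "$" + c + "$"; // local: never shared
        // Lock on the class rather than on salutation: salutation is reassigned
        // inside the block, so locking on it would not give mutual exclusion.
        synchronized ( Example.class ) // protection
        {
            salutation += aux; // potential offense
        }
        person += aux; // unprotected
    }
}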
Java Threads
Originally Java threads were green threads, meaning the JVM simulated threads using only one operating system thread.
Since Java 2, threads are implemented using operating system threads, meaning that when multiple processors are available, they actually execute in parallel.
Green and OS threads can have the same problems when used carelessly.
J2EE
The J2EE standard recommends not using synchronization:
The rationale is that synchronization is too difficult to get right (deadlocks, race conditions, contention, convoy formations, etc.).
This also means you can’t use static variables unless you make them final (i.e., you assign them only once).
In J2EE, data resides in the database, where it belongs. Databases use transactions, and those protect the integrity of the data.
J2EE
In practice, it is OK to synchronize on static variables as long as:
Changes are consistent once they leave the synchronized block.
They don’t involve database changes (use database transactions for that) or any other non-memory change, for example to files, sockets, etc.
They either change a single variable under a single lock or, if they change several variables under several locks, they acquire the locks in a way that cannot produce deadlocks, livelocks, etc.
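One common way to satisfy the last condition is to always acquire the locks in the same order. The sketch below assumes two static counters with one lock each. All names are hypothetical, and consistent lock ordering is only one of several possible techniques.

```java
// Hypothetical sketch: two static counters, each guarded by its own lock.
class OrderedLocks {
    private static final Object LOCK_A = new Object();
    private static final Object LOCK_B = new Object();
    static int a = 100;
    static int b = 100;

    // Every method that needs both locks takes LOCK_A first, then LOCK_B.
    static void moveAToB(int amount) {
        synchronized (LOCK_A) {
            synchronized (LOCK_B) {
                a -= amount;
                b += amount;
            }
        }
    }

    static void moveBToA(int amount) {
        synchronized (LOCK_A) { // same order, even though b is changed first
            synchronized (LOCK_B) {
                b -= amount;
                a += amount;
            }
        }
    }

    // Runs both directions concurrently and returns the final {a, b}.
    static int[] run(int rounds) {
        synchronized (LOCK_A) {
            synchronized (LOCK_B) {
                a = 100;
                b = 100;
            }
        }
        Thread first = new Thread(() -> {
            for (int i = 0; i < rounds; i++) {
                moveAToB(1);
            }
        });
        Thread second = new Thread(() -> {
            for (int i = 0; i < rounds; i++) {
                moveBToA(1);
            }
        });
        first.start();
        second.start();
        try {
            first.join();
            second.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting", e);
        }
        synchronized (LOCK_A) {
            synchronized (LOCK_B) {
                return new int[] { a, b };
            }
        }
    }

    public static void main(String[] args) {
        int[] result = run(10000);
        System.out.println(result[0] + " " + result[1]); // prints 100 100
    }
}
```

If moveBToA took LOCK_B first, each thread could end up holding one lock while waiting for the other, which is the classic deadlock.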
Problems with Synchronization
Forgetting to synchronize some global memory => race conditions.
Too much synchronization => deadlock and livelock (contention).
Synchronizing little blocks of code instead of meaningful chunks => contention + race conditions.
Synchronizing chunks of code that are too big => contention.
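The "little blocks instead of meaningful chunks" problem can be sketched with a hypothetical account class (the names are illustrative, not from the slides). The two small synchronized methods are each safe on their own, but a caller that checks the balance and then debits it can be interleaved between the two calls. The withdraw method synchronizes the whole check-and-update as one chunk.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical example; Account and WithdrawDemo are illustrative names.
class WithdrawDemo {
    static class Account {
        private int balance;

        Account(int initial) {
            balance = initial;
        }

        // Little blocks: each method is safe alone, but a caller writing
        // "if (getBalance() >= n) debit(n);" can be interleaved between the calls.
        synchronized int getBalance() {
            return balance;
        }

        synchronized void debit(int amount) {
            balance -= amount;
        }

        // Meaningful chunk: the check and the update happen under one lock.
        synchronized boolean withdraw(int amount) {
            if (balance < amount) {
                return false;
            }
            balance -= amount;
            return true;
        }
    }

    // Returns {number of successful withdrawals, final balance}.
    static int[] run(int threads, int initial, int amount) {
        Account account = new Account(initial);
        AtomicInteger successes = new AtomicInteger();
        List<Thread> started = new ArrayList<>();
        for (int t = 0; t < threads; t++) {
            Thread thread = new Thread(() -> {
                if (account.withdraw(amount)) {
                    successes.incrementAndGet();
                }
            });
            started.add(thread);
            thread.start();
        }
        for (Thread thread : started) {
            try {
                thread.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("interrupted while waiting", e);
            }
        }
        return new int[] { successes.get(), account.getBalance() };
    }

    public static void main(String[] args) {
        int[] result = run(8, 100, 30);
        // prints "3 withdrawals, balance 10"
        System.out.println(result[0] + " withdrawals, balance " + result[1]);
    }
}
```

With the check and the debit as separate calls, more than three threads could pass the check before any of them debits, and the balance could go negative.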
Synchronization Mechanisms
The synchronized keyword.
Read/write locks:
    Useful for read-dominated data.
    Part of the standard library since Java 5 (java.util.concurrent.locks). Hand-written versions can be built on top of the synchronized keyword.
It is even better to use non-blocking data structures (no contention).
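As a minimal sketch of the last two mechanisms, the hypothetical class below guards a read-dominated map with java.util.concurrent.locks.ReentrantReadWriteLock and keeps a statistics counter in an AtomicLong, which is updated without blocking. The class name and its methods are assumptions made for illustration.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.atomic.AtomicLong;
import java.util.concurrent.locks.ReadWriteLock;
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Hypothetical example of read-dominated data; names are illustrative.
class ReadMostlySettings {
    private final Map<String, String> values = new HashMap<>();
    private final ReadWriteLock lock = new ReentrantReadWriteLock();
    // Non-blocking counter: incremented atomically without taking a lock.
    private final AtomicLong reads = new AtomicLong();

    // Many readers may hold the read lock at the same time.
    String get(String key) {
        reads.incrementAndGet();
        lock.readLock().lock();
        try {
            return values.get(key);
        } finally {
            lock.readLock().unlock();
        }
    }

    // A writer waits until no reader or other writer holds the lock.
    void put(String key, String value) {
        lock.writeLock().lock();
        try {
            values.put(key, value);
        } finally {
            lock.writeLock().unlock();
        }
    }

    long readCount() {
        return reads.get();
    }

    public static void main(String[] args) {
        ReadMostlySettings settings = new ReadMostlySettings();
        settings.put("mode", "fast");
        System.out.println(settings.get("mode")); // prints fast
        System.out.println(settings.readCount()); // prints 1
    }
}
```

The unlock calls sit in finally blocks so that a lock is released even if the guarded code throws.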
What about other shared resources?
The synchronized keyword is not meant to be used with resources other than memory. It only coordinates threads inside a single JVM.
Using synchronized to guard access to files, for example, gives no guarantee about the file itself. It may work on some technology stacks (JVM + operating system), may fail on others, or may produce unpredictable, hard-to-track side effects on all of them.
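For files, the standard library offers a separate mechanism, java.nio.channels.FileLock, sketched below under the assumption that the goal is to coordinate with other processes. The class and method names are hypothetical. Note that a FileLock is held on behalf of the whole JVM, so it does not replace in-memory coordination between threads of the same JVM, and on some platforms file locks are only advisory.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.channels.FileLock;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical example; LockedAppend is an illustrative name.
class LockedAppend {
    // Appends one line while holding an exclusive lock on the file.
    static void append(Path file, String line) throws IOException {
        try (FileChannel channel = FileChannel.open(file,
                     StandardOpenOption.CREATE,
                     StandardOpenOption.WRITE,
                     StandardOpenOption.APPEND);
             FileLock lock = channel.lock()) { // blocks until the lock is granted
            ByteBuffer buffer =
                    ByteBuffer.wrap((line + "\n").getBytes(StandardCharsets.UTF_8));
            while (buffer.hasRemaining()) {
                channel.write(buffer);
            }
        } // the lock is released first, then the channel is closed
    }

    // Writes two lines to a temporary file and returns the file content.
    static String demo() {
        try {
            Path file = Files.createTempFile("locked-append", ".txt");
            try {
                append(file, "first");
                append(file, "second");
                return new String(Files.readAllBytes(file), StandardCharsets.UTF_8);
            } finally {
                Files.deleteIfExists(file);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.print(demo());
    }
}
```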
J2EE
J2EE recommends not using threads at the application level, but leaving them to the application server.
Spring Framework author Rod Johnson prefers that developers use the thread API. (1)
Nevertheless, Rod suggests that developers should not use JDBC directly, because it is easy to mess up. (2)
Yet JDBC is far easier to get right than the thread API.
(1): See J2EE Development Without EJB, Rod Johnson et al., page 344.
(2): See the same book, page 145, under “JDBC support”.
The Real Problem
Why do we want to use the thread API?
Is it liveliness? The producer-consumer design pattern is more lively, and the thread API can be hidden behind the pattern for application programmers to use.
Is it better throughput? Again, the producer-consumer pattern is better.
Is it handling more concurrent users? That is a good one, but any servlet container already handles it in the best way. Do not store information in the session for best results.
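The producer-consumer pattern mentioned above can be sketched with java.util.concurrent.BlockingQueue, which hides the waiting and signalling from the application code. The class name, the queue capacity and the end-of-stream marker are assumptions made for this illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

// Hypothetical example; names and the POISON marker are illustrative.
class ProducerConsumerDemo {
    private static final int POISON = -1; // tells the consumer to stop

    // Produces the numbers 0..items-1 and returns their squares
    // in the order the consumer processed them.
    static List<Integer> run(int items) {
        BlockingQueue<Integer> queue = new ArrayBlockingQueue<>(4);
        List<Integer> consumed = new ArrayList<>(); // touched only by the consumer

        Thread producer = new Thread(() -> {
            try {
                for (int i = 0; i < items; i++) {
                    queue.put(i); // blocks while the queue is full
                }
                queue.put(POISON);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });

        Thread consumer = new Thread(() -> {
            try {
                // take() blocks while the queue is empty
                for (int value = queue.take(); value != POISON; value = queue.take()) {
                    consumed.add(value * value);
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });

        producer.start();
        consumer.start();
        try {
            producer.join();
            consumer.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting", e);
        }
        return consumed; // safe to read: the consumer has finished
    }

    public static void main(String[] args) {
        System.out.println(run(5)); // prints [0, 1, 4, 9, 16]
    }
}
```

The application code never uses the synchronized keyword. All coordination lives inside the queue, which is the kind of low-level library the slides recommend hiding threads in.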
Summarizing
Threads should be hidden in low-level libraries.
In applications, only globals (static variables) must be synchronized.
The shorter the span they are synchronized, the better (non-blocking being the best).
Using the database is even better, since it already has “serialization” built in under a mechanism called “transaction”, meaning that even though everything occurs simultaneously, the end result is as if every thread executed one after the other.