
Document Classification. MapReduce. Software Transactional Memory

Parallel and Distributed Computing

Department of Computer Science and Engineering (DEI)
Instituto Superior Técnico

December 6, 2010

José Monteiro & José Costa (DEI / IST)

Outline

Document Classification

MapReduce

Software Transactional Memory


Document Classification

1 search directories, subdirectories for documents (look for .html, .txt, .tex, etc.)

2 using a dictionary of key words, create a profile vector for each document (a sketch follows below)

3 store profile vectors

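As an illustration of step 2 above, a minimal sketch in Java, assuming a hypothetical Profiler class and the simplest possible profile (a vector of keyword counts); the position of each keyword in the dictionary fixes the meaning of the corresponding vector component:

import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch: the profile vector of a document is the number of
// occurrences of each dictionary keyword in that document.
public class Profiler {
    public static int[] profile(String documentText, List<String> dictionary) {
        // map each keyword to its position in the profile vector
        Map<String, Integer> position = new HashMap<>();
        for (int i = 0; i < dictionary.size(); i++)
            position.put(dictionary.get(i), i);

        int[] vector = new int[dictionary.size()];
        for (String token : documentText.toLowerCase().split("\\W+")) {
            Integer i = position.get(token);
            if (i != null)
                vector[i]++;   // keyword hit: bump its component
        }
        return vector;
    }
}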

Document Classification

Data dependency graph:


Document Classification: Partitioning

Most time spent reading documents and generating profile vectors ⇒ create two primitive tasks for each document


Document Classification: Agglomeration & Mapping

number of tasks not known at compile time

tasks do not communicate with each other

time needed to perform tasks varies widely

Strategy: dynamic scheduling (work-pool model)


Document Classification: Master & Slaves

Roles of Master and Workers:


MapReduce Paradigm

MapReduce: a simple programming model, developed by Google, motivated by large-scale data processing, applicable to many computing problems

MapReduce provides:

Automatic parallelization and distribution

Fault-tolerance

I/O scheduling

Status and monitoring


MapReduce Usage

Programmer specifies two functions:

map(in_key, in_value) → list(out_key, intermediate_value)

Processes input key/value pair

Produces set of intermediate pairs

reduce(out_key, list(intermediate_value)) → list(out_value)

Combines all intermediate values for a particular key

Produces a set of merged output values (usually just one)

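To make the two signatures concrete, here is a toy, single-process simulation of the model (our own illustration, not a MapReduce library): map over every input pair, group the intermediate values by key (the library's job), then reduce each group.

import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Toy sequential simulation of the MapReduce model (illustration only).
public class MiniMapReduce {
    public static void main(String[] args) {
        Map<String, String> input = Map.of("document1", "to be or not to be");

        // map(in_key, in_value) -> list(out_key, intermediate_value)
        List<Map.Entry<String, Integer>> intermediate = new ArrayList<>();
        input.forEach((url, contents) -> {
            for (String word : contents.split("\\s+"))
                intermediate.add(Map.entry(word, 1));
        });

        // the library gathers all values with the same key...
        Map<String, List<Integer>> grouped = new TreeMap<>();
        for (Map.Entry<String, Integer> pair : intermediate)
            grouped.computeIfAbsent(pair.getKey(), k -> new ArrayList<>())
                   .add(pair.getValue());

        // ...and reduce(out_key, list(intermediate_value)) -> list(out_value)
        grouped.forEach((word, counts) -> System.out.println(
            word + ", " + counts.stream().mapToInt(Integer::intValue).sum()));
    }
}

Running it prints be, 2 / not, 1 / or, 1 / to, 2 — exactly the word-count example developed on the next slides.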

What can you do with it?

Seems like a limited model.

But...

Many string processing problems fit naturally

Can be used iteratively

MapReduce libraries have been written in C++, C#, Erlang, Java, Python, F#, R and other programming languages.


Example: Counting Words in Web Pages

Input: files with one document per record

Specify a map function that takes a key/value pair where

key = document URL

value = document contents

Output of map function is (potentially many) key/value pairs. In our case, output (word, “1”) once per word in the document.

Example:

If we have as input “document1” and “to be or not to be”, we get as output the following key/value pairs:

“to”, “1”
“be”, “1”
“or”, “1”
“not”, “1”
“to”, “1”
“be”, “1”


Counting Words in Web Pages

MapReduce library gathers together all pairs with the same key.

We must specify a reduce function that combines the values for a key.

Example:

Compute the sum of the values for the different keys:

key = “be”, values = “1”, “1”
key = “not”, values = “1”
key = “or”, values = “1”
key = “to”, values = “1”, “1”

Output of reduce paired with key:

“be”, “2”
“not”, “1”
“or”, “1”
“to”, “2”


MapReduce Execution Overview


MapReduce Examples

Distributed Grep: The map function emits a line if it matches a given pattern. The reduce function is an identity function that just copies the supplied intermediate data to the output.

Count of URL Access Frequency: The map function processes logs of web page requests and outputs <URL, 1>. The reduce function adds together all values for the same URL and emits a <URL, total count> pair.

Reverse Web-Link Graph: The map function outputs <target, source> pairs for each link to a target URL found in a page named “source”. The reduce function concatenates the list of all source URLs associated with a given target URL and emits the pair <target, list(source)>.

Inverted Index: The map function parses each document and emits a sequence of <word, document ID> pairs. The reduce function accepts all pairs for a given word, sorts the corresponding document IDs and emits a <word, list(document ID)> pair. The set of all output pairs forms a simple inverted index. It is easy to augment this computation to keep track of word positions. (A sketch of this example follows below.)
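As a sketch of the inverted-index example in the old Hadoop API used by the word-count program later in these slides (the class names are ours, and we assume an input format, e.g. KeyValueTextInputFormat, that presents the document ID as the map key):

import java.io.IOException;
import java.util.Iterator;
import java.util.TreeSet;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapred.*;

public static class IndexMap extends MapReduceBase
    implements Mapper<Text, Text, Text, Text>
{
    public void map(Text docId, Text contents,
                    OutputCollector<Text, Text> output,
                    Reporter reporter) throws IOException
    {
        for (String w : contents.toString().split("\\s+"))
            output.collect(new Text(w), docId);   // emit <word, document ID>
    }
}

public static class IndexReduce extends MapReduceBase
    implements Reducer<Text, Text, Text, Text>
{
    public void reduce(Text word, Iterator<Text> docIds,
                       OutputCollector<Text, Text> output,
                       Reporter reporter) throws IOException
    {
        TreeSet<String> sorted = new TreeSet<>();   // sorts and deduplicates the IDs
        while (docIds.hasNext())
            sorted.add(docIds.next().toString());
        output.collect(word, new Text(String.join(",", sorted)));
    }
}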


MapReduce Fault Tolerance

On worker failure:

Detect failure via periodic heartbeats

Re-execute completed and in-progress map tasks

Re-execute in-progress reduce tasks

Task completion committed through master

Master failure:

Could handle, but don’t yet (master failure unlikely)

Robust: lost 1600 of 1800 machines once, but finished fine


MapReduce Use in Industry

Introduced by Google

Yahoo! is running it on a 10k-node Linux cluster with 5 petabytes of data

Amazon is leasing servers to run MapReduce computations

Microsoft is developing Dryad to supersede MapReduce

Facebook, Twitter and others are also using MapReduce

Hadoop is an open source implementation of MapReduce.


Hadoop

Hadoop is a software platform written in Java that lets one easily write and run applications that process vast amounts of data.

Hadoop is a sub-project of the Apache foundation and receives sponsorship from Google, Yahoo, Microsoft, HP and others.

Scalable: Hadoop can reliably store and process petabytes

Economical: it distributes the data and processing across clusters of commonly available computers

these clusters can number into the thousands of nodes.

Efficient: By distributing the data, Hadoop can process it in parallel on the nodes where the data is located

This makes it extremely efficient.

Reliable: Automatically maintains multiple copies of data and automatically redeploys computing tasks based on failures


MapReduce Example using Hadoop

// Imports shared by the three WordCount fragments on this and the next two slides:
import java.io.IOException;
import java.util.Iterator;
import java.util.StringTokenizer;

import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapred.*;

/**
 * Counts the words in each line.
 * For each line of input, break the line into words and emit them as (word, 1).
 */
public static class Map extends MapReduceBase
    implements Mapper<LongWritable, Text, Text, IntWritable>
{
    private final static IntWritable one = new IntWritable(1);
    private Text word = new Text();

    public void map(LongWritable key, Text value,
                    OutputCollector<Text, IntWritable> output,
                    Reporter reporter) throws IOException
    {
        String line = value.toString();
        StringTokenizer tokenizer = new StringTokenizer(line);
        while (tokenizer.hasMoreTokens()) {
            word.set(tokenizer.nextToken());
            output.collect(word, one);   // emit (word, 1)
        }
    }
}


MapReduce Example using Hadoop

/**
 * A reducer class that just emits the sum of the input values.
 */
public static class Reduce extends MapReduceBase
    implements Reducer<Text, IntWritable, Text, IntWritable>
{
    public void reduce(Text key, Iterator<IntWritable> values,
                       OutputCollector<Text, IntWritable> output, Reporter reporter)
        throws IOException
    {
        int sum = 0;
        while (values.hasNext()) {
            sum += values.next().get();   // add up the 1s emitted for this word
        }
        output.collect(key, new IntWritable(sum));
    }
}


MapReduce Example using Hadoop

public static void main(String[] args) throws Exception
{
    if (args.length != 2) {
        printUsage();
        System.exit(1);
    }

    JobConf conf = new JobConf(WordCount.class);
    conf.setJobName("wordcount");

    conf.setOutputKeyClass(Text.class);
    conf.setOutputValueClass(IntWritable.class);

    conf.setMapperClass(Map.class);
    conf.setCombinerClass(Reduce.class);   // pre-aggregate locally on each map node
    conf.setReducerClass(Reduce.class);

    conf.setInputFormat(TextInputFormat.class);
    conf.setOutputFormat(TextOutputFormat.class);

    // note: both calls take the JobConf as their first argument
    FileInputFormat.setInputPaths(conf, new Path(args[0]));
    FileOutputFormat.setOutputPath(conf, new Path(args[1]));

    JobClient.runJob(conf);
}


Meanwhile in Google...

MapReduce would receive the epic amounts of webpage data collected by Google’s crawlers, and it would crunch this down to the links and metadata needed to actually search these pages.

The whole process would take 8 hours and then it had to be started all over again. In the age of the “real time” web that is too long...

In September 2010, Google switched its search infrastructure to Caffeine

indexes are updated by making direct changes to the web map already stored in the database; Caffeine is completely incremental

With Caffeine, Google moved its back-end indexing system away from MapReduce and onto BigTable, a fast, extremely large-scale, distributed DBMS developed by Google.


Critical Regions

Critical Region

Sections of the code that access a shared resource which must not be accessed concurrently by another thread.

Example: insertion in a doubly linked list:

newNode->prev = node;
newNode->next = node->next;
node->next->prev = newNode;
node->next = newNode;


Mutual Exclusion

We have solved the problem of critical sections with mutexes (locks):

non-critical
entry region
critical region
leave region
non-critical

all threads must check the state of the mutex before entering the critical region.

if the mutex is locked, then there is a thread in the critical section and this thread blocks, waiting on the mutex

if the mutex is unlocked, then no thread is currently in the critical section; this thread is allowed to enter, simultaneously locking the mutex

the thread unlocks the mutex when exiting the critical section, waking up any thread waiting on this mutex

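For concreteness, the same insertion protected by a mutex in Java (our own rendering with java.util.concurrent.locks.ReentrantLock; like the pseudo-code above, it assumes a circular list, so node.next is never null):

import java.util.concurrent.locks.ReentrantLock;

class Node {
    Node prev, next;
}

class DoublyLinkedList {
    private final ReentrantLock mutex = new ReentrantLock();

    // The four pointer updates form the critical region: only one
    // thread at a time may execute them.
    void insertAfter(Node node, Node newNode) {
        mutex.lock();                 // entry region: blocks while another thread holds the mutex
        try {
            newNode.prev = node;      // critical region
            newNode.next = node.next;
            node.next.prev = newNode;
            node.next = newNode;
        } finally {
            mutex.unlock();           // leave region: wakes up a waiting thread
        }
    }
}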

Limitations of Mutual Exclusion

Deadlocks

Priority inversion

Relies on conventions

Conservative

Not Composable


Limitations of Mutual Exclusion

Priority inversion

low priority task may hold a shared resource

high priority tasks get blocked if they request the same resource

intermediate priority tasks preempt the low priority task that holdsthe resource


Limitations of Mutual Exclusion

Relies on conventions

Relationship between lock and shared data is in programmer’s mind.

Actual comment from Linux kernel:

/*
 * When a locked buffer is visible to the I/O layer
 * BH_Launder is set. This means before unlocking
 * we must clear BH_Launder, mb() on alpha and then
 * clear BH_Lock, so no reader can see BH_Launder set
 * on an unlocked buffer and then risk to deadlock.
 */


Limitations of Mutual Exclusion

Conservative

There might be a conflict, be on the safe side.


Limitations of Mutual Exclusion

Not Composable

Operation: move item from Hash Table T1 to Hash Table T2.

Implementation:

delete(T1, item);
add(T2, item);

Both delete and add may have been protected as critical sections; however, externally the situation where item is in neither hash table will be visible and interruptible.

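A Java rendering of the pitfall (our own sketch): each call below is atomic on its own, yet the pair is not, so a concurrent reader may observe a state where item is in neither table.

import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

class MoveExample {
    static final Set<String> t1 = ConcurrentHashMap.newKeySet();
    static final Set<String> t2 = ConcurrentHashMap.newKeySet();

    // t1 and t2 are individually thread-safe, but the composition is not:
    // between remove and add, item is visible in neither set.
    static void move(String item) {
        t1.remove(item);   // the window opens here...
        t2.add(item);      // ...and closes only here
    }
}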

Transactions

Transaction

Operations in a transaction either all occur or none occur.

Atomic operation:

Commit: takes effect

Abort: effects rolled back

Usually retried

Linearizable

Appear to happen in one-at-a-time order

Transactional Memory

A section of code with reads and writes to shared memory which logically occur at a single instant in time.


Software Transactional Memory

Software Transactional Memory (STM) has been proposed as an alternative to lock-based synchronization.

Concurrency Unlocked

no thread control when entering critical regions

if there are no memory access conflicts during thread execution, the operations executed by the thread are accepted

in case of conflict, the program state is rolled back to the state it was in before the thread entered the critical region


Software Transactional Memory

Benefits of STM

Optimistic: increased concurrency

Composable: define atomic set of operations

Conditional Critical Regions


Optimistic

Increased concurrency: threads are not blocked.

Conflicts only arise when more than one thread makes an access to the same memory position.

Conflicts are rare ⇒ small number of roll-backs.


Composable Atomic Operations

Keyword atomic allows the definition of the set of operations that make up the transaction.

atomic {
    delete(T1, item);
    add(T2, item);
}

atomic {
    newNode->prev = node;
    newNode->next = node->next;
    node->next->prev = newNode;
    node->next = newNode;
}

either all happen or none at all

remaining threads never see intermediate values


Conditional Critical Regions

The default action when a transaction fails is to retry.

What if the successful completion of a transaction is dependent on some variable?

Example: consumer accessing an empty queue.

⇒ thread enters a cycle of failures (i.e., active wait!)


Conditional Critical Regions

If the successful completion of a transaction is dependent on some variable

⇒ use a guard condition.

atomic (queueSize > 0) {
    remove item from queue
}

If the condition is not satisfied, the thread is blocked until a commit has been made that affects the condition.


Conditional Critical Regions

Keyword retry allows the thread to abort and block at any point in the transaction:

atomic {
    if (queueSize > 0)
        remove item from queue
    else
        retry;
}


Conditional Critical Regions

STM includes the possibility of an alternative course of action when a transaction fails: the orElse keyword.

atomic {
    delete(T1, item)
    orElse
    delete(T2, item);
    add(T3, item);
}

if delete in T1 fails, a delete in T2 is attempted.

if delete in T2 fails, the whole transaction retries.

add is only performed after a successful delete from either T1 or T2.


Software Transactional Memory

Benefits of STM

Optimistic: increased concurrency

Composable: define atomic set of operations

Conditional Critical Regions

Problems with STM

overhead for conflict detection, both computational and memory

overhead from commit

cannot be used when operations cannot be undone (e.g., I/O)


STM Overhead

Advantage of STM: far fewer conflicts ⇒ most transactions will commit.

However, every commit has a potentially large overhead...

Note that if there are no conflicts, the only overhead of the mutex approach is in locking and unlocking.


Implementation Issues

Transaction Log

each read and write in a transaction is logged to a thread-local transaction log

writes go to the log only, not to memory

at the end, the transaction tries to commit to memory

in case commit fails, discard log and retry transaction

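A deliberately simplified sketch of such a log (our own illustration; a single global commit lock keeps it short, whereas real STMs validate and publish with much finer-grained synchronization):

import java.util.HashMap;
import java.util.Map;

// Word-based transaction log: writes are buffered, reads are recorded,
// and commit re-validates all reads before publishing the writes.
class TxLog {
    static final Map<Integer, Integer> memory = new HashMap<>();  // the shared "memory"
    static final Object commitLock = new Object();

    private final Map<Integer, Integer> readSet = new HashMap<>();   // address -> value seen
    private final Map<Integer, Integer> writeSet = new HashMap<>();  // address -> value to write

    int read(int addr) {
        if (writeSet.containsKey(addr))        // read-your-own-write
            return writeSet.get(addr);
        int v;
        synchronized (commitLock) { v = memory.getOrDefault(addr, 0); }
        readSet.putIfAbsent(addr, v);          // remember what we observed
        return v;
    }

    void write(int addr, int value) {
        writeSet.put(addr, value);             // goes to the log only, not to memory
    }

    boolean commit() {
        synchronized (commitLock) {
            for (Map.Entry<Integer, Integer> r : readSet.entrySet())
                if (!memory.getOrDefault(r.getKey(), 0).equals(r.getValue()))
                    return false;              // conflict: caller discards the log and retries
            memory.putAll(writeSet);           // publish all writes atomically
            return true;
        }
    }
}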

Implementation Issues

Commit-time Locking

uses a global clock

each memory location maintains an access time, Ta

marks time at beginning of transaction, Tt

for every read/write, if Ta > Tt, abort transaction

during commit, all write locations are locked and access times updated

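A sketch of the validation rule in Java (our own illustration, in the spirit of TL2-style STMs; a real implementation also re-checks the access time after reading the value and takes per-location locks at commit):

import java.util.concurrent.atomic.AtomicLong;

class AbortException extends RuntimeException {}

// One memory location with its access time Ta; transactions abort when
// they touch a location written after their start time Tt.
class VersionedCell {
    static final AtomicLong globalClock = new AtomicLong();

    volatile int value;
    volatile long accessTime;            // Ta: commit time of the last writer

    int readIn(long startTime) {         // startTime is Tt
        int v = value;
        if (accessTime > startTime)      // Ta > Tt: written since we started
            throw new AbortException();  // abort transaction
        return v;
    }

    // called during commit, with this location locked
    void commitWrite(int newValue) {
        value = newValue;
        accessTime = globalClock.incrementAndGet();   // stamp new access time
    }
}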

Implementation Issues

Commit-time locking: roll-back simple; commit expensive...

Encounter-time Locking

memory positions inside a transaction are locked

thread has exclusive access to them during execution of transaction

remaining threads abort immediately when accessing one of these positions


Hardware Support

Hardware support for transactional memories has been proposed long ago.

Performance of STM would improve considerably!

Sun’s Rock Processor

First multicore (16 cores) designed for hardware transactional memory. Special assembly instructions:

chkpt <fail pc>: begin a transaction
commit: commit the transaction

Canceled! (made official in November 2009)


Distributed STM

The concept of Software Transactional Memory can be extended to distributed systems.

IST’s Fenix is based on a distributed STM concept.


STM in Fenix

Fenix is a large web application, with a rich domain model

Before STM (2005), Fenix had major problems:

frequent bugs

poor performance

Root of the problems: Locks used for concurrency control

Idea: Wrap each HTTP request with a transaction.


STM in Fenix

The JVSTM went into production by September 2005.

Major benefits:

The data-corruption errors disappeared

they were caused mostly by misplaced locks

There was a perceived increase in performance

after an initial warm-up

New functionalities are developed significantly faster

it requires less coding and less debugging


Review

Document Classification

MapReduce

Software Transactional Memory


Next Classes

Cache Coherent NUMA
