parallel programming patterns (ua)
Presentation at ITEvent 2011: http://itevent.if.ua/programa-it-event-2011-osin
Contents
- Trend
- Basic terms
- Managing state
- Parallelism
- Tools
Yesterday
Today
Tomorrow
What is happening?
- CPU clock-frequency growth has slowed down
  - because of physical limits
- The free lunch is over
  - software no longer gets faster by itself
Current trends
- Manycore, multicore
- GPGPU, GPU acceleration, heterogeneous computing
- Distributed computing, HPC
Basic concepts
- Concurrency
  - Many interleaved threads of control
- Parallelism
  - Same result, but faster
- Concurrency != Parallelism
  - It is not always necessary to care about concurrency while implementing parallelism
- Multithreading
- Asynchrony
Tasks
- CPU-bound
  - number crunching
- I/O-bound
  - network, disk
State
- Shared
  - accessible by more than one thread
  - sharing is transitive
- Private
  - used by a single thread only
Task-based program
Application
Tasks (CPU, I/O)
Runtime (queuing, scheduling)
Processors (threads, processes)
Managing state
Isolation
- Avoiding shared state
- Own copy of state
- Examples:
  - process isolation
  - intraprocess isolation
  - by convention
Immutability
- Concurrent reads are not a problem!
- All functions are pure
- Requires immutable collections
- The functional way: Haskell, F#, Lisp
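A minimal sketch of the immutability idea in C#, the language of the code samples in this deck. The `Settings` type and the `ImmutabilityDemo` class are hypothetical names invented for illustration. The point is only that an object whose state never changes after construction can be read by many threads without any lock.

```csharp
using System;
using System.Linq;

// Hypothetical immutable type: every property is set once, in the constructor.
public sealed class Settings
{
    public int Threshold { get; }
    public string Name { get; }

    public Settings(int threshold, string name)
    {
        Threshold = threshold;
        Name = name;
    }

    // "Modification" returns a new object instead of mutating this one.
    public Settings WithThreshold(int threshold)
    {
        return new Settings(threshold, Name);
    }
}

public static class ImmutabilityDemo
{
    // Pure function: the result depends only on the arguments.
    public static bool IsValid(Settings settings, int value)
    {
        return value > settings.Threshold;
    }

    public static int CountValid(Settings settings, int[] values)
    {
        // Many threads read the same Settings instance; no lock is needed
        // because nothing can change it.
        return values.AsParallel().Count(v => IsValid(settings, v));
    }

    public static void Main()
    {
        var settings = new Settings(42, "default");
        var values = Enumerable.Range(0, 100).ToArray();
        Console.WriteLine(CountValid(settings, values));
        Console.WriteLine(CountValid(settings.WithThreshold(90), values));
    }
}
```

The trade-off is allocation: every "change" creates a new object, which is usually cheap for small values but is the reason immutable collections need structural sharing.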
Synchronization
- The only remaining option for dealing with shared mutable state
- Kinds:
  - data synchronization
  - control synchronization
Data synchronization
- Why? To avoid race conditions and data corruption
- How? Mutual exclusion
- Data remains consistent
- Critical regions
  - locks, monitors, critical sections, spin locks
- Code-centered
  - rather than associated with data
Critical region

Thread 1:
    // ...
    lock (locker)
    {
        // ...
        data.Operation();
        // ...
    }
    // ...

Thread 2 (reaches its lock later and enters only after Thread 1 has left the region):
    // ...
    lock (locker)
    {
        // ...
        data.Operation();
        // ...
    }
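A small runnable sketch of the same critical-region pattern. The `LockDemo` class and its shared counter are assumptions made for illustration, not part of the original slides. Without the `lock`, the `counter++` read-modify-write would be a data race and updates could be lost.

```csharp
using System;
using System.Threading.Tasks;

public static class LockDemo
{
    // The lock object protects the code region, not the data itself.
    private static readonly object locker = new object();

    public static int CountWithLock(int iterations)
    {
        int counter = 0;
        Parallel.For(0, iterations, i =>
        {
            // Critical region: only one thread at a time runs the increment.
            lock (locker)
            {
                counter++;
            }
        });
        return counter;
    }

    public static void Main()
    {
        Console.WriteLine(CountWithLock(100000));
    }
}
```

Every iteration contends for the same lock here, so this version is correct but scales poorly. That cost is the "contention" the deck lists among the drawbacks of synchronization.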
Control synchronization
- To coordinate control flow
  - exchange data
  - orchestrate threads
- Waiting, notifications
  - spin waiting
  - events
  - alternative: continuations
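Waiting and notification with an event can be sketched as follows. `SignalDemo` and the computed value are hypothetical; `ManualResetEventSlim` is one of several event types in .NET that could be used here.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

public static class SignalDemo
{
    public static int ProduceAndWait()
    {
        int result = 0;
        using (var ready = new ManualResetEventSlim(false))
        {
            var worker = Task.Factory.StartNew(() =>
            {
                result = 6 * 7; // prepare the data
                ready.Set();    // notify the waiting thread
            });

            ready.Wait();          // block until the worker signals
            int observed = result; // safe to read: the signal has arrived
            worker.Wait();         // join before the event is disposed
            return observed;
        }
    }

    public static void Main()
    {
        Console.WriteLine(ProduceAndWait());
    }
}
```

The continuation alternative mentioned above avoids blocking a thread in `Wait()`: the follow-up work is attached to the task and runs when it completes.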
Three ways to manage state
- Isolation: simple, loosely coupled, highly scalable, right data structures, locality
- Immutability: avoids sync
- Synchronization: complex, runtime overheads, contention
- Prefer them in that order
Parallelism
Approaches to task decomposition
- Data parallelism
- Task parallelism
- Message-based parallelism
Data parallelism
How?
- Data is divided up among hardware processors
- The same operation is performed on all elements
- Optionally, a final aggregation
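The three steps above (partition, apply the same operation, aggregate) can be sketched with the thread-local overload of `Parallel.For`. The `AggregationDemo` class and the sum-of-squares operation are assumptions chosen for illustration.

```csharp
using System;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

public static class AggregationDemo
{
    public static long SumOfSquares(int[] items)
    {
        long total = 0;
        Parallel.For(0, items.Length,
            // Each worker starts with its own private subtotal.
            () => 0L,
            // The same operation is applied to every element; no sharing here.
            (i, loopState, subtotal) => subtotal + (long)items[i] * items[i],
            // Final aggregation: merge each private subtotal exactly once.
            subtotal => Interlocked.Add(ref total, subtotal));
        return total;
    }

    public static void Main()
    {
        var items = Enumerable.Range(1, 10).ToArray();
        Console.WriteLine(SumOfSquares(items));
    }
}
```

Keeping the subtotal private to each worker means synchronization happens once per worker rather than once per element.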
Data parallelism
When?
- Large amounts of data
- The processing operation is costly
- or both
Data parallelism
Why?
- To achieve speedup
- For example, with GPU acceleration:
  - hours instead of days!
Data parallelism
Embarrassingly parallel problems
- parallelizable loops
- image processing
Non-embarrassingly parallel problems
- parallel QuickSort
Data parallelism
[Diagram: Thread 1, Thread 2]
Data parallelism
Structured parallelism
- Well-defined begin and end points
- Examples:
  - CoBegin
  - ForAll
CoBegin
    var firstDataset = new DataItem[1000];
    var secondDataset = new DataItem[1000];
    var thirdDataset = new DataItem[1000];

    Parallel.Invoke(
        () => Process(firstDataset),
        () => Process(secondDataset),
        () => Process(thirdDataset));
Parallel For
    var items = new DataItem[1000 * 1000];
    // ...
    Parallel.For(0, items.Length, i =>
    {
        Process(items[i]);
    });
Parallel ForEach
    var tickers = GetNasdaqTickersStream();
    Parallel.ForEach(tickers, ticker =>
    {
        Process(ticker);
    });
Striped partitioning
[Diagram: Thread 1, Thread 2]
Iterate complex data structures
    var tree = new TreeNode();
    // ...
    Parallel.ForEach(
        TraversePreOrder(tree),
        node =>
        {
            Process(node);
        });
Iterate complex data structures
[Diagram: Thread 1, Thread 2]
Declarative parallelism
    var items = new DataItem[1000 * 1000];
    // ...
    var validItems =
        from item in items.AsParallel()
        let processedItem = Process(item)
        where processedItem.Property > 42
        select Convert(processedItem);

    foreach (var item in validItems)
    {
        // ...
    }
Data parallelism
Challenges
- Partitioning
- Scheduling
- Ordering
- Merging
- Aggregation
- Concurrency hazards: data races, contention
Task parallelism
How?
- Programs are already functionally partitioned: statements, methods, etc.
- Run independent pieces in parallel
- Control synchronization
- State isolation
Task parallelism
Why?
- To achieve speedup
Task parallelism
Kinds
- Structured
  - clear begin and end points
- Unstructured
  - often demands explicit synchronization
Fork/join
- Fork: launch tasks asynchronously
- Join: wait until they complete
- CoBegin, ForAll
- Recursive decomposition
Fork/join
[Diagram: Task 1, Task 2 and Task 3 between two sequential (Seq) sections]
Fork/join
    Parallel.Invoke(
        () => LoadDataFromFile(),
        () => SavePreviousDataToDB(),
        () => RenewOtherDataFromWebService());
Fork/join
    Task loadData = Task.Factory.StartNew(() =>
    {
        // ...
    });
    Task saveAnotherDataToDB = Task.Factory.StartNew(() =>
    {
        // ...
    });
    // ...
    Task.WaitAll(loadData, saveAnotherDataToDB);
    // ...
Fork/join
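    void Walk(TreeNode node)
    {
        // Stop at missing children; without this guard the recursion
        // dereferences null below the leaves.
        if (node == null) return;

        var tasks = new[]
        {
            Task.Factory.StartNew(() => Process(node.Value)),
            Task.Factory.StartNew(() => Walk(node.Left)),
            Task.Factory.StartNew(() => Walk(node.Right))
        };
        Task.WaitAll(tasks);
    }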
Fork/join recursive
[Diagram: recursive fork/join over a tree (Root, then Node, Left and Right at each level) between two sequential (Seq) sections]
Dataflow parallelism: Futures
    Task<DataItem[]> loadDataFuture = Task.Factory.StartNew(() =>
    {
        //...
        return LoadDataFromFile();
    });

    var dataIdentifier = SavePreviousDataToDB();
    RenewOtherDataFromWebService(dataIdentifier);
    //...
    DisplayDataToUser(loadDataFuture.Result);
Dataflow parallelism: Futures
[Diagram: one Future running alongside sequential (Seq) sections]
Dataflow parallelism: Futures
[Diagram: three Futures running alongside sequential (Seq) sections]
Continuations
[Diagram: three Tasks and sequential (Seq) sections]
Continuations
    var loadData = Task.Factory.StartNew(() =>
    {
        return LoadDataFromFile();
    });

    var writeToDB = loadData.ContinueWith(dataItems =>
    {
        WriteToDatabase(dataItems.Result);
    });

    var reportToUser = writeToDB.ContinueWith(t =>
    {
        // ...
    });
    reportToUser.Wait();
Producer/consumer pipeline
[Diagram: reading, parsing and storing stages connected by "lines" and "parsed lines", ending in DB]
Producer/consumer pipeline
[Diagram: lines, parsed lines, DB]
Producer/consumer
    var lines = new BlockingCollection<string>();

    Task.Factory.StartNew(() =>
    {
        foreach (var line in File.ReadLines(...))
            lines.Add(line);
        lines.CompleteAdding();
    });
Producer/consumer
    var dataItems = new BlockingCollection<DataItem>();

    Task.Factory.StartNew(() =>
    {
        foreach (var line in lines.GetConsumingEnumerable())
            dataItems.Add(Parse(line));
        dataItems.CompleteAdding();
    });
Producer/consumer
    var dbTask = Task.Factory.StartNew(() =>
    {
        foreach (var item in dataItems.GetConsumingEnumerable())
            WriteToDatabase(item);
    });

    dbTask.Wait();
Task parallelism
Challenges
- Scheduling
- Cancellation
- Exception handling
- Concurrency hazards: deadlocks, livelocks, priority inversions, etc.
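Two of these challenges, exception handling and cancellation, can be illustrated with a short TPL sketch. `TaskHazardsDemo` and its methods are hypothetical names. The example assumes the usual TPL behaviour: exceptions surface at the join point, and cancellation is cooperative.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

public static class TaskHazardsDemo
{
    // Exceptions thrown inside tasks surface at the join point,
    // wrapped together in one AggregateException.
    public static int CountFailures()
    {
        Action succeed = () => { };
        Action fail = () => { throw new InvalidOperationException("boom"); };

        var tasks = new[]
        {
            Task.Factory.StartNew(succeed),
            Task.Factory.StartNew(fail),
            Task.Factory.StartNew(fail)
        };

        try
        {
            Task.WaitAll(tasks);
            return 0;
        }
        catch (AggregateException ex)
        {
            return ex.InnerExceptions.Count; // one entry per failed task
        }
    }

    // Cancellation is cooperative: the task has to poll the token itself.
    public static bool CancelWorker()
    {
        using (var cts = new CancellationTokenSource())
        {
            CancellationToken token = cts.Token;
            Action loop = () =>
            {
                while (true)
                {
                    token.ThrowIfCancellationRequested();
                    Thread.Sleep(1);
                }
            };

            Task worker = Task.Factory.StartNew(loop, token);
            cts.Cancel();
            try
            {
                worker.Wait();
            }
            catch (AggregateException)
            {
                // Waiting on a cancelled task throws; the state is checked below.
            }
            return worker.IsCanceled;
        }
    }

    public static void Main()
    {
        Console.WriteLine(CountFailures());
        Console.WriteLine(CancelWorker());
    }
}
```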
Message-based parallelism
- Accessing shared state vs. local state
  - No distinction, unfortunately
- Idea: encapsulate shared state changes into messages
- Async events
- Actors, agents
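The idea of turning shared-state changes into messages can be sketched with a very small agent. `CounterAgent` is a hypothetical class written for this illustration, not an API of any actor library. Its state lives inside one task and is changed only by messages taken from a mailbox.

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading.Tasks;

// Hypothetical agent: owns its state privately and changes it only
// in response to messages taken from its mailbox.
public sealed class CounterAgent
{
    private readonly BlockingCollection<int> mailbox = new BlockingCollection<int>();
    private readonly Task<int> worker;

    public CounterAgent()
    {
        worker = Task.Factory.StartNew(() =>
        {
            int total = 0; // private state, touched by this task only
            foreach (var amount in mailbox.GetConsumingEnumerable())
                total += amount;
            return total;
        }, TaskCreationOptions.LongRunning);
    }

    // Any thread may post; senders never touch the state directly.
    public void Post(int amount)
    {
        mailbox.Add(amount);
    }

    // Close the mailbox and return the final state.
    public int Complete()
    {
        mailbox.CompleteAdding();
        return worker.Result;
    }
}

public static class AgentDemo
{
    public static int SumViaAgent(int upTo)
    {
        var agent = new CounterAgent();
        Parallel.For(1, upTo + 1, i => agent.Post(i));
        return agent.Complete();
    }

    public static void Main()
    {
        Console.WriteLine(SumViaAgent(100));
    }
}
```

Only the mailbox is shared between threads, so the agent's own logic needs no locks.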
Tools
Concurrent data structures
- Concurrent queues, stacks, sets, lists
- Blocking collections
- Work-stealing queues
- Lock-free data structures
- Immutable data structures
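As one illustration of a concurrent data structure, the sketch below counts words by first letter with `ConcurrentDictionary`. `ConcurrentCollectionsDemo` and the sample words are assumptions made for the example.

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading.Tasks;

public static class ConcurrentCollectionsDemo
{
    public static ConcurrentDictionary<char, int> CountFirstLetters(string[] words)
    {
        var counts = new ConcurrentDictionary<char, int>();
        Parallel.ForEach(words, word =>
        {
            // AddOrUpdate handles concurrent writers internally,
            // so the caller takes no explicit lock.
            counts.AddOrUpdate(word[0], 1, (key, old) => old + 1);
        });
        return counts;
    }

    public static void Main()
    {
        var words = new[] { "apple", "avocado", "banana", "blueberry", "cherry", "apricot" };
        var counts = CountFirstLetters(words);
        Console.WriteLine(counts['a']);
        Console.WriteLine(counts['b']);
        Console.WriteLine(counts['c']);
    }
}
```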
Synchronization primitives
- Critical sections
- Monitors
- Auto- and manual-reset events
- Countdown events
- Mutexes
- Semaphores
- Timers
- RW locks
- Barriers
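As an example of one primitive from this list, a semaphore can cap how many tasks run a section at the same time. `SemaphoreDemo`, the sleep interval and the peak-tracking logic are assumptions for illustration only.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

public static class SemaphoreDemo
{
    // Runs `workers` tasks but lets at most `limit` of them be inside
    // the guarded section at once; returns the highest concurrency seen.
    public static int PeakConcurrency(int workers, int limit)
    {
        int current = 0, peak = 0;
        var gate = new object();

        using (var semaphore = new SemaphoreSlim(limit))
        {
            var tasks = new Task[workers];
            for (int i = 0; i < workers; i++)
            {
                tasks[i] = Task.Factory.StartNew(() =>
                {
                    semaphore.Wait(); // take one of the `limit` slots
                    try
                    {
                        lock (gate)
                        {
                            current++;
                            if (current > peak) peak = current;
                        }
                        Thread.Sleep(10); // simulated work
                        lock (gate)
                        {
                            current--;
                        }
                    }
                    finally
                    {
                        semaphore.Release(); // give the slot back
                    }
                });
            }
            Task.WaitAll(tasks);
        }
        return peak;
    }

    public static void Main()
    {
        Console.WriteLine(PeakConcurrency(8, 2) <= 2);
    }
}
```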
Thread local state
- A way to achieve isolation
var parser = new ThreadLocal<Parser>( () => CreateParser());
Parallel.ForEach(items, item => parser.Value.Parse(item));
Thread pools
    ThreadPool.QueueUserWorkItem(_ =>
    {
        // do some work
    });
Async
    Task.Factory.StartNew(() =>
        {
            //...
            return LoadDataFromFile();
        })
        .ContinueWith(dataItems =>
        {
            WriteToDatabase(dataItems.Result);
        })
        .ContinueWith(t =>
        {
            // ...
        });
Async
    var dataItems = await LoadDataFromFileAsync();
    textBox.Text = dataItems.Count.ToString();
    await WriteToDatabaseAsync(dataItems);
    // continue work
Technologies
- TPL, PLINQ, C# async, TPL Dataflow
- PPL, Intel TBB, OpenMP
- CUDA, OpenCL, C++ AMP
- Actors, STM
- Many others
Summary
- Programming for many CPUs
- Concurrency != parallelism
- CPU-bound vs. I/O-bound tasks
- Private vs. shared state
Summary
- Managing state:
  - Isolation
  - Immutability
  - Synchronization
    - Data: mutual exclusion
    - Control: notifications
Summary
- Parallelism:
  - Data parallelism: scalable
  - Task parallelism: less scalable
  - Message-based parallelism
Summary
- Data parallelism
  - CoBegin
  - Parallel ForAll
  - Parallel ForEach
  - Parallel ForEach over complex data structures
  - Declarative data parallelism
  - Challenges: partitioning, scheduling, ordering, merging, aggregation, concurrency hazards
Summary
- Task parallelism: structured, unstructured
  - Fork/join
    - CoBegin
    - Recursive decomposition
  - Futures
  - Continuations
  - Producer/consumer (pipelines)
  - Challenges: scheduling, cancellation, exceptions, concurrency hazards
Summary
- Tools
  - Compilers, libraries
  - Concurrent data structures
  - Synchronization primitives
  - Thread local state
  - Thread pools
  - Async invocations
  - ...
Q/A