Parallel programming patterns (UA)


Post on 21-Jun-2015







Presentation at ITEvent 2011


1. Parallel Programming Patterns

2. Agenda
- Motivation
- Terminology
- Managing state
- Kinds of parallelism
- Patterns and mechanisms

6. Why parallelism?
- CPU clock speeds no longer grow
- The free lunch is over

7. Trends
- Manycore, multicore
- GPGPU, GPU acceleration, heterogeneous computing
- Distributed computing, HPC

8. Terminology
- Concurrency: many interleaved threads of control
- Parallelism: the same result, but faster
- Concurrency != parallelism: it is not always necessary to care about concurrency while implementing parallelism
- Multithreading
- Asynchrony

9. Kinds of tasks
- CPU-bound: number crunching
- I/O-bound: network, disk

10. Kinds of state
- Shared: accessible by more than one thread; sharing is transitive
- Private: used by a single thread only

11. Task-based program

Application
  Tasks (CPU, I/O)
  Runtime (queuing, scheduling)
  Processors (threads, processes)

12. Managing state

13. Isolation
- Avoiding shared state
- Each thread works on its own copy of the state
- Examples: process isolation, intraprocess isolation, isolation by convention

14. Immutability
- Multiple readers are not a problem!
- All functions are pure
- Requires immutable collections
- The functional way: Haskell, F#, Lisp

15. Synchronization
- The only remaining way to deal with shared mutable state
- Kinds: data synchronization, control synchronization

16. Data synchronization
- Why? To avoid race conditions and data corruption
- How? Mutual exclusion
- Data remains consistent
- Critical regions: locks, monitors, critical sections, spin locks
- Code-centered, rather than associated with the data

17. Critical region

Thread 1             | Thread 2
// ...               | // ...
lock (locker)        |
{                    |
    // ...           |
    data.Operation();|
    // ...           |
}                    |
// ...               | lock (locker)
                     | {
                     |     // ...
                     |     data.Operation();
                     |     // ...
                     | }

18. Control synchronization
- To coordinate control flow: exchange data, orchestrate threads
- Waiting and notifications: spin waiting, events
- Alternative: continuations

19. Three ways to manage state
- Isolation: simple, loosely coupled, highly scalable, the right data structures, locality
- Immutability: avoids synchronization altogether
- Synchronization: complex, runtime overheads, contention
- Prefer them in that order
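The critical-region idea above can be made concrete with a small runnable sketch; the shared counter, the lock object and the task count are illustrative, not from the slides:

```csharp
using System;
using System.Threading.Tasks;

class Program
{
    // Shared mutable state: accessible by more than one thread.
    static int _counter;
    static readonly object _locker = new object();

    static void Main()
    {
        var tasks = new Task[4];
        for (int t = 0; t < tasks.Length; t++)
        {
            tasks[t] = Task.Factory.StartNew(() =>
            {
                for (int i = 0; i < 100000; i++)
                {
                    // Critical region: only one thread at a time
                    // executes the read-modify-write below.
                    lock (_locker)
                    {
                        _counter++;
                    }
                }
            });
        }
        Task.WaitAll(tasks);
        Console.WriteLine(_counter); // 400000 with the lock; unpredictable without it
    }
}
```

Removing the `lock` turns the increment into a data race, which is exactly the corruption the slide warns about.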
21. Kinds of parallelism
- Data parallelism
- Task parallelism
- Message-based parallelism

22. Data parallelism: how?
- The data is divided up among hardware processors
- The same operation is performed on the elements
- Optionally, a final aggregation step

23. Data parallelism: when?
- Large amounts of data
- A costly processing operation
- or both

24. Data parallelism: why?
- To achieve speedup
- For example, with GPU acceleration: hours instead of days!

25. Data parallelism
- Embarrassingly parallel problems: parallelizable loops, image processing
- Non-embarrassingly parallel problems: parallel QuickSort

26. (diagram: the data set is split into chunks processed by Thread 1 and Thread 2)

27. Structured parallelism
- Well-defined begin and end points
- Examples: CoBegin, ForAll

28. CoBegin

var firstDataset = new DataItem[1000];
var secondDataset = new DataItem[1000];
var thirdDataset = new DataItem[1000];

Parallel.Invoke(
    () => Process(firstDataset),
    () => Process(secondDataset),
    () => Process(thirdDataset));

29. Parallel For

var items = new DataItem[1000 * 1000];
// ...
Parallel.For(0, items.Length, i =>
{
    Process(items[i]);
});

30. Parallel ForEach

var tickers = GetNasdaqTickersStream();
Parallel.ForEach(tickers, ticker =>
{
    Process(ticker);
});

31. Striped partitioning

(diagram: Thread 1 and Thread 2 process interleaved stripes of the array)

32. Iterating complex data structures

var tree = new TreeNode();
// ...
Parallel.ForEach(TraversePreOrder(tree), node =>
{
    Process(node);
});

33. (diagram: the tree nodes are distributed between Thread 1 and Thread 2)

34. Declarative parallelism

var items = new DataItem[1000 * 1000];
// ...
var validItems =
    from item in items.AsParallel()
    let processedItem = Process(item)
    where processedItem.Property > 42
    select Convert(processedItem);

foreach (var item in validItems)
{
    // ...
}

35. Data parallelism: challenges
- Partitioning
- Scheduling
- Ordering
- Merging
- Aggregation
- Concurrency hazards: data races, contention

36. Task parallelism: how?
- Programs are already functionally partitioned: statements, methods, etc.
- Run independent pieces in parallel
- Control synchronization
- State isolation
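The aggregation challenge from the data-parallelism list above is commonly handled with thread-local partial results, so each worker accumulates privately and only the final merge is synchronized. A minimal sketch using the `Parallel.For` overload with `localInit`/`localFinally` (the array contents and the sum are illustrative):

```csharp
using System;
using System.Threading.Tasks;

class Program
{
    static void Main()
    {
        var items = new int[1000];
        for (int i = 0; i < items.Length; i++)
            items[i] = i;

        long total = 0;
        object locker = new object();

        Parallel.For(0, items.Length,
            () => 0L,                       // localInit: a private partial sum per worker
            (i, loopState, localSum) =>     // body: touches no shared state
                localSum + items[i],
            localSum =>                     // localFinally: one synchronized merge per worker
            {
                lock (locker)
                    total += localSum;
            });

        Console.WriteLine(total); // 499500 == 0 + 1 + ... + 999
    }
}
```

Locking inside the loop body instead would reintroduce the contention hazard the slide lists.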
37. Task parallelism: why?
- To achieve speedup

38. Task parallelism: kinds
- Structured: clear begin and end points
- Unstructured: often demands explicit synchronization

39. Fork/join
- Fork: launch tasks asynchronously
- Join: wait until they complete
- CoBegin, ForAll
- Recursive decomposition

40. (diagram: sequential code forks into Task 1, Task 2 and Task 3, then joins back into sequential code)

41. Fork/join

Parallel.Invoke(
    () => LoadDataFromFile(),
    () => SavePreviousDataToDB(),
    () => RenewOtherDataFromWebService());

42. Fork/join

Task loadData = Task.Factory.StartNew(() =>
{
    // ...
});

Task saveAnotherDataToDB = Task.Factory.StartNew(() =>
{
    // ...
});

// ...
Task.WaitAll(loadData, saveAnotherDataToDB);
// ...

43. Fork/join

void Walk(TreeNode node)
{
    var tasks = new[]
    {
        Task.Factory.StartNew(() => Process(node.Value)),
        Task.Factory.StartNew(() => Walk(node.Left)),
        Task.Factory.StartNew(() => Walk(node.Right))
    };
    Task.WaitAll(tasks);
}

44. (diagram: recursive fork/join over a tree: each node forks tasks for its value and its left and right subtrees)

45. Dataflow parallelism: futures

var loadDataFuture = Task.Factory.StartNew(() =>
{
    // ...
    return LoadDataFromFile();
});

var dataIdentifier = SavePreviousDataToDB();
RenewOtherDataFromWebService(dataIdentifier);
// ...
DisplayDataToUser(loadDataFuture.Result);

46. (diagram: a future computes in parallel with the sequential code until its result is needed)

47. (diagram: several futures overlapping with the sequential code)

48. Continuations

(diagram: tasks chained one after another by continuations)

49. Continuations

var loadData = Task.Factory.StartNew(() =>
{
    return LoadDataFromFile();
});

var writeToDB = loadData.ContinueWith(dataItems =>
{
    WriteToDatabase(dataItems.Result);
});

var reportToUser = writeToDB.ContinueWith(t =>
{
    // ...
});

reportToUser.Wait();

50. Producer/consumer pipeline

(diagram: reading -> lines -> parsing -> parsed lines -> storing -> DB)

52. Producer/consumer

var lines = new BlockingCollection<string>();

Task.Factory.StartNew(() =>
{
    foreach (var line in File.ReadLines(...))
        lines.Add(line);
    lines.CompleteAdding();
});
53. Producer/consumer

var dataItems = new BlockingCollection<DataItem>();

Task.Factory.StartNew(() =>
{
    foreach (var line in lines.GetConsumingEnumerable())
        dataItems.Add(Parse(line));
    dataItems.CompleteAdding();
});

54. Producer/consumer

var dbTask = Task.Factory.StartNew(() =>
{
    foreach (var item in dataItems.GetConsumingEnumerable())
        WriteToDatabase(item);
});

dbTask.Wait();

55. Task parallelism: challenges
- Scheduling
- Cancellation
- Exception handling
- Concurrency hazards: deadlocks, livelocks, priority inversions, etc.

56. Message-based parallelism
- Accessing shared state looks just like accessing local state: no distinction, unfortunately
- Idea: encapsulate shared-state changes into messages
- Async events
- Actors, agents

58. Concurrent data structures
- Concurrent queues, stacks, sets, lists
- Blocking collections
- Work-stealing queues
- Lock-free data structures
- Immutable data structures

59. Synchronization primitives
- Critical sections
- Monitors
- Auto- and manual-reset events
- Countdown events
- Mutexes
- Semaphores
- Timers
- Reader/writer locks
- Barriers

60. Thread-local state
- A way to achieve isolation

var parser = new ThreadLocal<Parser>(() => CreateParser());
Parallel.ForEach(items, item => parser.Value.Parse(item));

61. Thread pools

ThreadPool.QueueUserWorkItem(_ =>
{
    // do some work
});

62. Async

Task.Factory.StartNew(() =>
{
    // ...
    return LoadDataFromFile();
})
.ContinueWith(dataItems =>
{
    WriteToDatabase(dataItems.Result);
})
.ContinueWith(t =>
{
    // ...
});

63. Async

var dataItems = await LoadDataFromFileAsync();
textBox.Text = dataItems.Count.ToString();
await WriteToDatabaseAsync(dataItems);
// continue work

64. Technologies
- TPL, PLINQ, C# async, TPL Dataflow
- PPL, Intel TBB, OpenMP
- CUDA, OpenCL, C++ AMP
- Actors, STM
- Many others

65. Summary
- CPU clock speeds no longer grow
- Concurrency != parallelism
- CPU-bound vs. I/O-bound tasks
- Private vs. shared state

66. Summary: managing state
- Isolation
- Immutability
- Synchronization
  - Data: mutual exclusion
  - Control: notifications
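The message-based style covered above, where shared state is owned by a single agent and changed only in response to messages, can be sketched with a `BlockingCollection` as the mailbox; the message type and the running total are illustrative, not from the slides:

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading.Tasks;

class Program
{
    static void Main()
    {
        // The mailbox: producers post messages instead of touching shared state.
        var mailbox = new BlockingCollection<int>();

        // The agent: the only task that touches `total`, so no locking is needed.
        var agent = Task.Factory.StartNew(() =>
        {
            long total = 0;
            foreach (var amount in mailbox.GetConsumingEnumerable())
                total += amount;
            return total;
        });

        // Several producers send messages concurrently.
        var producers = new Task[3];
        for (int p = 0; p < producers.Length; p++)
            producers[p] = Task.Factory.StartNew(() =>
            {
                for (int i = 1; i <= 100; i++)
                    mailbox.Add(i);
            });

        Task.WaitAll(producers);
        mailbox.CompleteAdding();

        Console.WriteLine(agent.Result); // 3 * (1 + ... + 100) = 15150
    }
}
```

The state stays private to the agent, which is the isolation-by-messaging idea behind actors and agents.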
67. Summary: kinds of parallelism
- Data parallelism: scalable
- Task parallelism: less scalable
- Message-based parallelism

68. Summary: data parallelism
- CoBegin
- Parallel ForAll
- Parallel ForEach
- Parallel ForEach over complex data structures
- Declarative data parallelism
- Challenges: partitioning, scheduling, ordering, merging, aggregation, concurrency hazards

69. Summary: task parallelism
- Structured, unstructured
- Fork/join: CoBegin, recursive decomposition
- Futures
- Continuations
- Producer/consumer (pipelines)
- Challenges: scheduling, cancellation, exceptions, concurrency hazards

70. Summary: mechanisms
- Concurrent data structures
- Synchronization primitives
- Thread-local state
- Thread pools
- Async invocations
- ...

71. Q/A

