developing scalable - amddeveloper.amd.com/wordpress/media/2013/06/2261_final.pdf · developing...

22

Upload: vuhanh

Post on 22-Aug-2019

218 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: DEVELOPING SCALABLE - AMDdeveloper.amd.com/wordpress/media/2013/06/2261_final.pdf · DEVELOPING SCALABLE APPLICATIONS with Microsoft’s C++ Concurrency Runtime Dana Groff, Microsoft
Page 2: DEVELOPING SCALABLE - AMDdeveloper.amd.com/wordpress/media/2013/06/2261_final.pdf · DEVELOPING SCALABLE APPLICATIONS with Microsoft’s C++ Concurrency Runtime Dana Groff, Microsoft

DEVELOPING SCALABLE APPLICATIONS with Microsoft’s C++ Concurrency Runtime Dana Groff, MicrosoftSenior Program Manager

Page 3: DEVELOPING SCALABLE - AMDdeveloper.amd.com/wordpress/media/2013/06/2261_final.pdf · DEVELOPING SCALABLE APPLICATIONS with Microsoft’s C++ Concurrency Runtime Dana Groff, Microsoft

3 | Developing Scalable Applications with Microsoft’s C++ Concurrency Runtime| June 15, 2011

DEVELOPING SCALABLE APPLICATIONS WITH

MICROSOFT’S C++ CONCURRENCY RUNTIME

Page 4: DEVELOPING SCALABLE - AMDdeveloper.amd.com/wordpress/media/2013/06/2261_final.pdf · DEVELOPING SCALABLE APPLICATIONS with Microsoft’s C++ Concurrency Runtime Dana Groff, Microsoft

4 | Developing Scalable Applications with Microsoft’s C++ Concurrency Runtime| June 15, 2011

What we are talking about…

This is a C++ talk

I will not talk about C++ AMP in this session– Go to Daniel Moth’s session at 2pm in 406 titled “Blazing-fast code using GPUs and more, with

Microsoft Visual C++”

The focus of this talk is on refactoring using:– Parallel Patterns Library (PPL) – Asynchronous Agents Library (Agents)– Performance analysis with Visual Studio– And some debugging…

Page 5: DEVELOPING SCALABLE - AMDdeveloper.amd.com/wordpress/media/2013/06/2261_final.pdf · DEVELOPING SCALABLE APPLICATIONS with Microsoft’s C++ Concurrency Runtime Dana Groff, Microsoft

5 | Developing Scalable Applications with Microsoft’s C++ Concurrency Runtime| June 15, 2011

Map of C++ Concurrency

Parallel Pattern Library

Resource Manager

Task Scheduler

Concurrency Runtime

Programming Models

Dat

a St

ruct

ures

AgentsLibrary

Tools

ParallelTasksStacks

Debugger Windows

Concurrency Visualizer

Page 6: DEVELOPING SCALABLE - AMDdeveloper.amd.com/wordpress/media/2013/06/2261_final.pdf · DEVELOPING SCALABLE APPLICATIONS with Microsoft’s C++ Concurrency Runtime Dana Groff, Microsoft

6 | Developing Scalable Applications with Microsoft’s C++ Concurrency Runtime| June 15, 2011

Parallel Programming is Hard

Success hinges on:• Experience • Friends• Luck

…and cool tools

Page 7: DEVELOPING SCALABLE - AMDdeveloper.amd.com/wordpress/media/2013/06/2261_final.pdf · DEVELOPING SCALABLE APPLICATIONS with Microsoft’s C++ Concurrency Runtime Dana Groff, Microsoft

7 | Developing Scalable Applications with Microsoft’s C++ Concurrency Runtime| June 15, 2011

PERFORMANCE: that’s what counts…

This is what you want…

Page 8: DEVELOPING SCALABLE - AMDdeveloper.amd.com/wordpress/media/2013/06/2261_final.pdf · DEVELOPING SCALABLE APPLICATIONS with Microsoft’s C++ Concurrency Runtime Dana Groff, Microsoft

8 | Developing Scalable Applications with Microsoft’s C++ Concurrency Runtime| June 15, 2011

PERFORMANCE: that’s what counts…

This is what you have…

Page 9: DEVELOPING SCALABLE - AMDdeveloper.amd.com/wordpress/media/2013/06/2261_final.pdf · DEVELOPING SCALABLE APPLICATIONS with Microsoft’s C++ Concurrency Runtime Dana Groff, Microsoft

9 | Developing Scalable Applications with Microsoft’s C++ Concurrency Runtime| June 15, 2011

Refactoring your code to scaleIs it your code?– I need to understand what is going

on

Where do you start?– I need tools to guide where I focus

my time

How much improvement can I expect to achieve?– Without spending too much time– …and (re)writing a lot of code

Page 10: DEVELOPING SCALABLE - AMDdeveloper.amd.com/wordpress/media/2013/06/2261_final.pdf · DEVELOPING SCALABLE APPLICATIONS with Microsoft’s C++ Concurrency Runtime Dana Groff, Microsoft

10 | Developing Scalable Applications with Microsoft’s C++ Concurrency Runtime| June 15, 2011

REMEMBER AMDAHL

Set realistic expectations

Focus your time where it counts 0%

500%

1000%

1500%

2000%

2500%

2 4 8 16 32 64 128

256

512

1024

2048

4096

8192

1638

432

768

Obs

erve

d Sp

eedu

p

Linear speedup (Cores)

10%

25%

50%

70%

90%

95%

1

(1-P)+P/S

Page 11: DEVELOPING SCALABLE - AMDdeveloper.amd.com/wordpress/media/2013/06/2261_final.pdf · DEVELOPING SCALABLE APPLICATIONS with Microsoft’s C++ Concurrency Runtime Dana Groff, Microsoft

11 | Developing Scalable Applications with Microsoft’s C++ Concurrency Runtime| June 15, 2011

DEMO: REFACTORING OCEANS

…A STUDY IN REFACTORING UNFAMILIAR CODE FOR

SCALING

Page 12: DEVELOPING SCALABLE - AMDdeveloper.amd.com/wordpress/media/2013/06/2261_final.pdf · DEVELOPING SCALABLE APPLICATIONS with Microsoft’s C++ Concurrency Runtime Dana Groff, Microsoft

12 | Developing Scalable Applications with Microsoft’s C++ Concurrency Runtime| June 15, 2011

Lessons Learned

KISS • Optimize Serial Code First• Optimize your loop size on serial code

One step at a time

• Add parallel code one piece at a time• Focus on the high value code first• What is high value changes for each run

Fill in the gaps

• Add asynchronous code to leverage both CPU and GPU at the same time.

Don’t cut corners

• Sloppy coding even when working with spaghetti will come to haunt you

• Never ever use [&] in a lambda

Page 13: DEVELOPING SCALABLE - AMDdeveloper.amd.com/wordpress/media/2013/06/2261_final.pdf · DEVELOPING SCALABLE APPLICATIONS with Microsoft’s C++ Concurrency Runtime Dana Groff, Microsoft

13 | Developing Scalable Applications with Microsoft’s C++ Concurrency Runtime| June 15, 2011

What did you just see?

Guided refactoring using Visual Studio Performance analysis

Easy scaling without a lot of rewriting using PPL

Understanding the concurrency using the concurrency visualizer and debugger

Page 14: DEVELOPING SCALABLE - AMDdeveloper.amd.com/wordpress/media/2013/06/2261_final.pdf · DEVELOPING SCALABLE APPLICATIONS with Microsoft’s C++ Concurrency Runtime Dana Groff, Microsoft

14 | Developing Scalable Applications with Microsoft’s C++ Concurrency Runtime| June 15, 2011

Agents and Data Flow Concepts

Agents• Has a queryable lifetime state• You can wait for an agent to finish• You can cancel an agent

Data Flow• Connect message blocks together• Filters may be defined • Control flow with joins and choices

6/23/2011 14

Page 15: DEVELOPING SCALABLE - AMDdeveloper.amd.com/wordpress/media/2013/06/2261_final.pdf · DEVELOPING SCALABLE APPLICATIONS with Microsoft’s C++ Concurrency Runtime Dana Groff, Microsoft

15 | Developing Scalable Applications with Microsoft’s C++ Concurrency Runtime| June 15, 2011

Asynchronous Agents Library

15

Message Blocks for Storing Data• unbounded_buffer<T>• overwrite_buffer<T>• single_assignment<T>

Agent base class• Start, stop• State management

Message Control flow• Choice, coin• transformer<T,U>, call<T>• link_target

Send and receive• co-operatively send & receive

messages• asend, send• receive, try_receive

Page 16: DEVELOPING SCALABLE - AMDdeveloper.amd.com/wordpress/media/2013/06/2261_final.pdf · DEVELOPING SCALABLE APPLICATIONS with Microsoft’s C++ Concurrency Runtime Dana Groff, Microsoft

16 | Developing Scalable Applications with Microsoft’s C++ Concurrency Runtime| June 15, 2011

Parallel Pattern Library Concepts

Task • a computation that may be internally decomposed into additional tasks.

Task group • a collection of tasks that form a logical computation or sub-computation.

Parallel Loops

• STL style algorithms with support for cancellation and composition.

Continuation • a task that runs after a task; allows for wait-free algorithms

Page 17: DEVELOPING SCALABLE - AMDdeveloper.amd.com/wordpress/media/2013/06/2261_final.pdf · DEVELOPING SCALABLE APPLICATIONS with Microsoft’s C++ Concurrency Runtime Dana Groff, Microsoft

17 | Developing Scalable Applications with Microsoft’s C++ Concurrency Runtime| June 15, 2011

Parallel Pattern Library

Task Parallelism • task_handle• task_group• structured_task_group• task<T>

• continue_with<T>• when_all<T>• when_any<T>• operators &&, ||

Concurrent Containers• concurrent_queue• concurrent_vector• combinable • concurrent_unordered_map• concurrent_unordered_multimap• concurrent_unordered_set• concurrent_unordered_multisetl

Parallel Algorithms• parallel_for• parallel_for_each• parallel_invoke

Sample Pack• parallel_for_fixed• parallel_transform• parallel_reduce• parallel_sort• parallel_buffered_sort• parallel_radixsort

Sample Pack• task<T>

• continue_with<T>• when_all<T>• when_any<T>• operators &&, ||

Sample Pack• concurrent_unordered_map• concurrent_unordered_multimap• concurrent_unordered_set• concurrent_unordered_multisetl

Page 18: DEVELOPING SCALABLE - AMDdeveloper.amd.com/wordpress/media/2013/06/2261_final.pdf · DEVELOPING SCALABLE APPLICATIONS with Microsoft’s C++ Concurrency Runtime Dana Groff, Microsoft

18 | Developing Scalable Applications with Microsoft’s C++ Concurrency Runtime| June 15, 2011

Parallel Pattern Library

Task Parallelism • task_handle• task_group• structured_task_group• task<T>

• continue_with<T>• when_all<T>• when_any<T>• operators &&, ||

Concurrent Containers• concurrent_queue• concurrent_vector• combinable • concurrent_unordered_map• concurrent_unordered_multimap• concurrent_unordered_set• concurrent_unordered_multisetl

Parallel Algorithms• parallel_for• parallel_for_each• parallel_invoke

V.next• partitioners• parallel_transform• parallel_reduce• parallel_sort• parallel_buffered_sort• parallel_radixsort

V.next• task<T>

• then<T>• when_all<T>• when_any<T>• operators &&, ||

V.next• concurrent_unordered_map• concurrent_unordered_multimap• concurrent_unordered_set• concurrent_unordered_multisetl

Page 19: DEVELOPING SCALABLE - AMDdeveloper.amd.com/wordpress/media/2013/06/2261_final.pdf · DEVELOPING SCALABLE APPLICATIONS with Microsoft’s C++ Concurrency Runtime Dana Groff, Microsoft

19 | Developing Scalable Applications with Microsoft’s C++ Concurrency Runtime| June 15, 2011

Other Planned Improvements

Control NUMA workloads through locality

Control Process Affinity

No blocking when passing asynchronous messages

Lower runtime overhead on large core-count machines

Page 20: DEVELOPING SCALABLE - AMDdeveloper.amd.com/wordpress/media/2013/06/2261_final.pdf · DEVELOPING SCALABLE APPLICATIONS with Microsoft’s C++ Concurrency Runtime Dana Groff, Microsoft

20 | Developing Scalable Applications with Microsoft’s C++ Concurrency Runtime| June 15, 2011

Resources

ConcRT Extras / Sample Pack–http://code.msdn.com/ConcRTExtras

Native Concurrency Forum–http://social.msdn.microsoft.com/Forums/parallelcppnative/

Native Concurrency Team Blog–http://blogs.msdn.com/nativeconcurrency

C++ Team Blog–http://blogs.msdn.com/vcblog

MSDN Concurrency–http://www.msdn.com/concurrency

Page 21: DEVELOPING SCALABLE - AMDdeveloper.amd.com/wordpress/media/2013/06/2261_final.pdf · DEVELOPING SCALABLE APPLICATIONS with Microsoft’s C++ Concurrency Runtime Dana Groff, Microsoft

21 | Developing Scalable Applications with Microsoft’s C++ Concurrency Runtime| June 15, 2011

Questions?

Parallel Pattern Library

Resource Manager

Task Scheduler

[email protected]

Concurrency Runtime

Programming Models

Dat

a St

ruct

ures

AgentsLibrary

Tools

ParallelTasksStacks

Debugger Windows

Concurrency Visualizer

Page 22: DEVELOPING SCALABLE - AMDdeveloper.amd.com/wordpress/media/2013/06/2261_final.pdf · DEVELOPING SCALABLE APPLICATIONS with Microsoft’s C++ Concurrency Runtime Dana Groff, Microsoft

22 | Developing Scalable Applications with Microsoft’s C++ Concurrency Runtime| June 15, 2011

Disclaimer & AttributionThe information presented in this document is for informational purposes only and may contain technical inaccuracies, omissions and typographical errors.

The information contained herein is subject to change and may be rendered inaccurate for many reasons, including but not limitedto product and roadmap changes, component and motherboard version changes, new model and/or product releases, product differences between differing manufacturers, software changes, BIOS flashes, firmware upgrades, or the like. There is no obligation to update or otherwise correct or revise this information. However, we reserve the right to revise this information and to make changes from time to time to the content hereof without obligation to notify any person of such revisions or changes.

NO REPRESENTATIONS OR WARRANTIES ARE MADE WITH RESPECT TO THE CONTENTS HEREOF AND NO RESPONSIBILITY IS ASSUMED FOR ANY INACCURACIES, ERRORS OR OMISSIONS THAT MAY APPEAR IN THIS INFORMATION.

ALL IMPLIED WARRANTIES OF MERCHANTABILITY OR FITNESS FOR ANY PARTICULAR PURPOSE ARE EXPRESSLY DISCLAIMED. IN NO EVENT WILL ANY LIABILITY TO ANY PERSON BE INCURRED FOR ANY DIRECT, INDIRECT, SPECIAL OR OTHER CONSEQUENTIAL DAMAGES ARISING FROM THE USE OF ANY INFORMATION CONTAINED HEREIN, EVEN IF EXPRESSLY ADVISED OF THE POSSIBILITY OF SUCH DAMAGES.

AMD, the AMD arrow logo, and combinations thereof are trademarks of Advanced Micro Devices, Inc. All other names used in this presentation are for informational purposes only and may be trademarks of their respective owners.

The contents of this presentation were provided by individual(s) and/or company listed on the title page. The information and opinions presented in this presentation may not represent AMD’s positions, strategies or opinions. Unless explicitly stated, AMD is not responsible for the content herein and no endorsements are implied.