효율적인모델기반설계를위한최적화코드생성 · row-major vs. column-major....

48
효율적인 모델 기반 설계를 위한 최적화 코드 생성 김학범, MathWorks Korea

Upload: others

Post on 29-Jul-2020

1 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: 효율적인모델기반설계를위한최적화코드생성 · Row-Major vs. Column-Major. Execution Speed Row-Major layout – Elements of the rows are contiguous – C and C++

효율적인모델기반설계를위한최적화코드생성김학범, MathWorks Korea

Page 2: 효율적인모델기반설계를위한최적화코드생성 · Row-Major vs. Column-Major. Execution Speed Row-Major layout – Elements of the rows are contiguous – C and C++

Alenia Aermacchi develops

autopilot software for DO-

178B level A certification

ITK engineering develops IEC

62304 compliant controller for

dental drill motor with MBD

Stem accelerates development

of power electronics control

system with MBD

IDNEO develops embedded

computer vision and

machine learning algorithms

for interpreting blood type

results

The new XC 90 is build on

SPA platform utilizing

Model-Based Design and

AUTOSAR in Volvo

Code Generation Utilized in Various Applications and Industries

1

Page 3: 효율적인모델기반설계를위한최적화코드생성 · Row-Major vs. Column-Major. Execution Speed Row-Major layout – Elements of the rows are contiguous – C and C++

Test &

Verification

Modeling &

Simulation

Code

Generation

• Accelerate development process

• Reduce translation error

• Enable rapid iterative workflows

2

Code Generation Connects Model-Based Design Workflows

Page 4: 효율적인모델기반설계를위한최적화코드생성 · Row-Major vs. Column-Major. Execution Speed Row-Major layout – Elements of the rows are contiguous – C and C++

Design an Embedded Controller

3

+ Plant orEnvironment

EmbeddedSystem

Controller orApplication

-

How to optimize embedded software?

Page 5: 효율적인모델기반설계를위한최적화코드생성 · Row-Major vs. Column-Major. Execution Speed Row-Major layout – Elements of the rows are contiguous – C and C++

4

▪ Model level & Algorithm level analysis

▪ Application-aware optimizations (modeling pattern)

Model Analysis

▪ Implementation level analysis

▪ Target-aware optimizations (resources)

Code Generation

Approach to Code Efficiency with Model-Based Design

Page 6: 효율적인모델기반설계를위한최적화코드생성 · Row-Major vs. Column-Major. Execution Speed Row-Major layout – Elements of the rows are contiguous – C and C++

Demo: Embedded Coder Quick Start

5

Page 7: 효율적인모델기반설계를위한최적화코드생성 · Row-Major vs. Column-Major. Execution Speed Row-Major layout – Elements of the rows are contiguous – C and C++

6

Lines of code

Usage of

global variables

Estimated stack size

/Cyclomatic complexity

Static Code Metrics Report

Page 8: 효율적인모델기반설계를위한최적화코드생성 · Row-Major vs. Column-Major. Execution Speed Row-Major layout – Elements of the rows are contiguous – C and C++

Minimize global

accesses, Line Of Code

ROM

Efficiency

Execution

Speed

RAM

Efficiency

RAM, ROM and Execution Performance

Data copy

reduction

Buffer reuse

ROM

Efficiency

7

Page 9: 효율적인모델기반설계를위한최적화코드생성 · Row-Major vs. Column-Major. Execution Speed Row-Major layout – Elements of the rows are contiguous – C and C++

ROM

Efficiency

Execution

Speed

RAM

Efficiency

RAM, ROM and Execution Performance

ROM

Efficiency

8

Page 10: 효율적인모델기반설계를위한최적화코드생성 · Row-Major vs. Column-Major. Execution Speed Row-Major layout – Elements of the rows are contiguous – C and C++

Challenge: Maintaining Large and Complex Systems ROM

Efficiency

▪ Size and complexity of systems are

increasing

– “Typical ECU contains 2000 function

components that are each developed by a

different person” Automotive customer

▪ ”Enforcement of low complexity”

required for model standards

– ISO 26262-6 “Product Development at the

Software Level”, Table 1

9

Page 11: 효율적인모델기반설계를위한최적화코드생성 · Row-Major vs. Column-Major. Execution Speed Row-Major layout – Elements of the rows are contiguous – C and C++

Challenge: Maintaining Large and Complex Systems

10

▪ Studies estimate 13-20% of code in large

systems are cloned *

▪ Old fashion modeling patterns appear:

System

constant

Old New

* Source: Roy and Cordy A Survey on Software Clone Detection Research, Sept 2007

Baker. On Finding Duplication and Near-Duplication in Large Software Systems. In Proceedings of the Second Working Conference on Reverse Engineering (WCRE’95), July 1995.

Copied

Subsystems

ROM

Efficiency

Page 12: 효율적인모델기반설계를위한최적화코드생성 · Row-Major vs. Column-Major. Execution Speed Row-Major layout – Elements of the rows are contiguous – C and C++

Clone Detection

• Find duplicate model content in your design

• Locate opportunities to optimize with a library

Refactoring

• Replace exact clones with library blocks

• Improve reuse and maintainability

With Identified

Clones Highlighted:Original

model

Refactored Model Clones replaced with

library block

11

Clone Detection & Refactoring ROM

Efficiency

Page 13: 효율적인모델기반설계를위한최적화코드생성 · Row-Major vs. Column-Major. Execution Speed Row-Major layout – Elements of the rows are contiguous – C and C++

12

DEMO: Detect Clone in Model ROM

Efficiency

Page 14: 효율적인모델기반설계를위한최적화코드생성 · Row-Major vs. Column-Major. Execution Speed Row-Major layout – Elements of the rows are contiguous – C and C++

Review Generated Code

13

SS3

SS5

Refactoring SS3,5

Before After

ROM

Efficiency

ROM

Efficiency

Page 15: 효율적인모델기반설계를위한최적화코드생성 · Row-Major vs. Column-Major. Execution Speed Row-Major layout – Elements of the rows are contiguous – C and C++

Library-Based Subsystem Code Generation

▪ System complexity → Unit testing

– Models

– Subsystems

▪ Library-based Subsystem Code Generation

– Lock down function interfaces

– Generate small reusable sub-functions

– Verify usage within a model using SIL/PIL

– SIL/PIL unit test in library with code coverageNormal

SIL

PIL

Library

C/C++

Code

Generate

14

ROM

Efficiency

Page 16: 효율적인모델기반설계를위한최적화코드생성 · Row-Major vs. Column-Major. Execution Speed Row-Major layout – Elements of the rows are contiguous – C and C++

Minimize global

Accesses, line of code

RAM

Efficiency

Execution

Speed

ROM

EfficiencyRAM

Efficiency

Data copy

reduction

Buffer reuse

RAM, ROM and Execution Performance

15

Page 17: 효율적인모델기반설계를위한최적화코드생성 · Row-Major vs. Column-Major. Execution Speed Row-Major layout – Elements of the rows are contiguous – C and C++

RAM

Efficiency

Execution

Speed

ROM

EfficiencyRAM

Efficiency

RAM, ROM and Execution Performance

16

Page 18: 효율적인모델기반설계를위한최적화코드생성 · Row-Major vs. Column-Major. Execution Speed Row-Major layout – Elements of the rows are contiguous – C and C++

Reuse Local Block Output RAM

Efficiency

Model_B.yModel_B.Add Model_B.Subsystem1

Model_B.Add Model_B.y

3 buffers

2 buffers

Model_B.y 1 buffer

17

RAM

Efficiency

Page 19: 효율적인모델기반설계를위한최적화코드생성 · Row-Major vs. Column-Major. Execution Speed Row-Major layout – Elements of the rows are contiguous – C and C++

18

▪ Reuse buffers with different sizes and/or shapes (dimensions)

– Different buffers collapse to one, the biggest size is kept

Model_B.subsys Model_B.subsys1 Model_Y.Out1

Model_Y.Out1

Reuse Local Block Output RAM

Efficiency

Page 20: 효율적인모델기반설계를위한최적화코드생성 · Row-Major vs. Column-Major. Execution Speed Row-Major layout – Elements of the rows are contiguous – C and C++

19

Model.Switch1

Model.Switch2

Reuse signal storage

Reuse Local Block Output RAM

Efficiency

Page 21: 효율적인모델기반설계를위한최적화코드생성 · Row-Major vs. Column-Major. Execution Speed Row-Major layout – Elements of the rows are contiguous – C and C++

20

▪ Review Code Generation Report

Use only 3 variables for 9 blocks

-. Reduce 6 variables

Reuse Local Block Output RAM

Efficiency

Page 22: 효율적인모델기반설계를위한최적화코드생성 · Row-Major vs. Column-Major. Execution Speed Row-Major layout – Elements of the rows are contiguous – C and C++

21

Before After

Lines of

Code33 30

Total Lines 89 78

▪ Review Static Code Metrics Report

Reuse Local Block Output RAM

Efficiency

Page 23: 효율적인모델기반설계를위한최적화코드생성 · Row-Major vs. Column-Major. Execution Speed Row-Major layout – Elements of the rows are contiguous – C and C++

22

▪ Using Signal Labels to Guide Buffer Reuse

– Case#1: Same variable for the Atomic Subsystem and Saturation block outputs

– Case#2: Same variable for the Atomic Subsystem and Gain block outputs

Reuse buffer case#2

Reuse buffer case#1

Reuse Buffer Using Signal Labels RAM

Efficiency

Page 24: 효율적인모델기반설계를위한최적화코드생성 · Row-Major vs. Column-Major. Execution Speed Row-Major layout – Elements of the rows are contiguous – C and C++

23

▪ Using Signal Labels to Guide Buffer Reuse

– Same variable for the Atomic Subsystem and Saturation block outputs

Reuse same buffer

Reuse Buffer Using Signal Labels RAM

Efficiency

Page 25: 효율적인모델기반설계를위한최적화코드생성 · Row-Major vs. Column-Major. Execution Speed Row-Major layout – Elements of the rows are contiguous – C and C++

Subsystem

24

Reduce Code Complexity by Refactoring Subsystem RAM

Efficiency

Page 26: 효율적인모델기반설계를위한최적화코드생성 · Row-Major vs. Column-Major. Execution Speed Row-Major layout – Elements of the rows are contiguous – C and C++

▪ Review Code Generation Report

Create subsystem function

Reduce Code Complexity by Refactoring Subsystem RAM

Efficiency

25

Page 27: 효율적인모델기반설계를위한최적화코드생성 · Row-Major vs. Column-Major. Execution Speed Row-Major layout – Elements of the rows are contiguous – C and C++

Before After

Complexity 6 4

Lines of

Code33 11

Total Lines 78 58

26

Reduce Code Complexity by Refactoring Subsystem RAM

Efficiency

▪ Review Static Code Metrics Report

Page 28: 효율적인모델기반설계를위한최적화코드생성 · Row-Major vs. Column-Major. Execution Speed Row-Major layout – Elements of the rows are contiguous – C and C++

Optimize Generate Code in Reusable Subsystem

27

RAM

Efficiency

▪ Passing Reusable Subsystem Outputs as Structure Reference

Data StoreData Copy

Reference

Subsystem

StorageExternal

Output

Page 29: 효율적인모델기반설계를위한최적화코드생성 · Row-Major vs. Column-Major. Execution Speed Row-Major layout – Elements of the rows are contiguous – C and C++

Optimize Generate Code in Reusable Subsystem

28

RAM

Efficiency

▪ Passing Reusable Subsystem Outputs as Individual Argument

Passing data

as argument

Reference

External

Output

• Save buffer storage

• Reduce data copy

• RAM efficiency

• Execution speed

Page 30: 효율적인모델기반설계를위한최적화코드생성 · Row-Major vs. Column-Major. Execution Speed Row-Major layout – Elements of the rows are contiguous – C and C++

Optimize Generate Code in Reusable Subsystem

29

RAM

Efficiency

Page 31: 효율적인모델기반설계를위한최적화코드생성 · Row-Major vs. Column-Major. Execution Speed Row-Major layout – Elements of the rows are contiguous – C and C++

Optimize Generate Code in Reusable Subsystem

30

▪ Review Code Generation Report

3 times data copy from

subsystem structure

1 times data passing

using arguments

[Individual argument][Structure reference]

RAM

Efficiency

Page 32: 효율적인모델기반설계를위한최적화코드생성 · Row-Major vs. Column-Major. Execution Speed Row-Major layout – Elements of the rows are contiguous – C and C++

Optimize Generate Code in Reusable Subsystem

31

▪ Review Static Code Metrics Report

RAM

Efficiency

Global variables: -24bytes

Read/Write: -4 times

[Individual argument][Structure reference]

Page 33: 효율적인모델기반설계를위한최적화코드생성 · Row-Major vs. Column-Major. Execution Speed Row-Major layout – Elements of the rows are contiguous – C and C++

32

Off

Automatically

checked options

Easy to Configure Options for Optimizing Code RAM

Efficiency

Page 34: 효율적인모델기반설계를위한최적화코드생성 · Row-Major vs. Column-Major. Execution Speed Row-Major layout – Elements of the rows are contiguous – C and C++

Minimize global

Accesses, line of code

ROM

EfficiencyRAM

Efficiency

Data copy

reduction

Buffer reuse

Execution

Speed

RAM, ROM and Execution Performance

33

Page 35: 효율적인모델기반설계를위한최적화코드생성 · Row-Major vs. Column-Major. Execution Speed Row-Major layout – Elements of the rows are contiguous – C and C++

ROM

EfficiencyRAM

Efficiency

Execution

Speed

RAM, ROM and Execution Performance

34

Page 36: 효율적인모델기반설계를위한최적화코드생성 · Row-Major vs. Column-Major. Execution Speed Row-Major layout – Elements of the rows are contiguous – C and C++

Row-Major vs. Column-Major Execution

Speed

▪ Row-Major layout

– Elements of the rows are contiguous

– C and C++ use row-major layout

▪ Column-Major layout

– Elements of the columns are contiguous

– MATLAB® and Fortran use column-major layout

X =

𝑥1 𝑥2 𝑥3𝑥4 𝑥5 𝑥6𝑥7 𝑥8 𝑥9

The elements of the array are stored :

𝑥1 𝑥4 𝑥7 𝑥2 𝑥5 𝑥8 𝑥3 𝑥6 𝑥9

X =

𝑥1 𝑥2 𝑥3𝑥4 𝑥5 𝑥6𝑥7 𝑥8 𝑥9

The elements of the array are stored :

𝑥1 𝑥2 𝑥3 𝑥4 𝑥5 𝑥6 𝑥7 𝑥8 𝑥9

Page 37: 효율적인모델기반설계를위한최적화코드생성 · Row-Major vs. Column-Major. Execution Speed Row-Major layout – Elements of the rows are contiguous – C and C++

MATLAB: M =

11 12

21 22

31 32

Column-major

code

generation:

M[] = {11, 21, 31, 12, 22, 32}; M[2] = 31

Row-major

code

generation:

M[] = {11, 12, 21, 22, 31, 32}; M[2] = 21 Row-major

indexing

Column-major

indexing

Row-Major vs. Column-Major Execution

Speed

36

Page 38: 효율적인모델기반설계를위한최적화코드생성 · Row-Major vs. Column-Major. Execution Speed Row-Major layout – Elements of the rows are contiguous – C and C++

Row-Major vs. Column-Major

37

A = 𝑎11 𝑎12 𝑎13𝑎21 𝑎22 𝑎23

Address Access Value

0 A[0][0] 𝑎11

1 A[0][1] 𝑎12

2 A[0][2] 𝑎13

3 A[1][0] 𝑎21

4 A[1][1] 𝑎22

5 A[1][2] 𝑎23

[Raw-major order]

Sequentially

data read

𝑎11,𝑎12, 𝑎13,…….

Execution

Speed

Programmed by C

▪ CPUs Process Sequential Data More Efficiently than nonsequential data

Page 39: 효율적인모델기반설계를위한최적화코드생성 · Row-Major vs. Column-Major. Execution Speed Row-Major layout – Elements of the rows are contiguous – C and C++

Row-Major vs. Column-Major

38

Address Access Value

0 A[0][0] 𝑎11

1 A[1][0] 𝑎21

2 A[0][1] 𝑎12

3 A[1][1] 𝑎22

4 A[0][2] 𝑎13

5 A[1][2] 𝑎23

[Column major order]

Need data indexing!!

𝑎11,𝑎12, 𝑎13,……. Memory access times increase!!

Execution

Speed

Programmed by C

A = 𝑎11 𝑎12 𝑎13𝑎21 𝑎22 𝑎23

▪ CPUs Process Sequential Data More Efficiently than nonsequential data

Page 40: 효율적인모델기반설계를위한최적화코드생성 · Row-Major vs. Column-Major. Execution Speed Row-Major layout – Elements of the rows are contiguous – C and C++

Column-major layout Row-major layout

P =

1 2

3 4

5 6

Row-Major vs. Column-Major Execution

Speed

39

Page 41: 효율적인모델기반설계를위한최적화코드생성 · Row-Major vs. Column-Major. Execution Speed Row-Major layout – Elements of the rows are contiguous – C and C++

Row-major layout

P =

1 2

3 4

5 6

Multi-Dimensional layout

Row-Major and Multi-Dimension Indexing Execution

Speed

40

Page 42: 효율적인모델기반설계를위한최적화코드생성 · Row-Major vs. Column-Major. Execution Speed Row-Major layout – Elements of the rows are contiguous – C and C++

Generating Row-Major Code

41

Execution

Speed

Page 43: 효율적인모델기반설계를위한최적화코드생성 · Row-Major vs. Column-Major. Execution Speed Row-Major layout – Elements of the rows are contiguous – C and C++

Code Execution Profiling with SIL and PIL Execution

Speed

▪ Produce execution time metric for tasks and functions in the generated code

– Measure execution time, self time, CPU utilization and number of calls

– Identify tasks that require the most execution time

– In these tasks, investigate code sections that require the most execution time

42

Page 44: 효율적인모델기반설계를위한최적화코드생성 · Row-Major vs. Column-Major. Execution Speed Row-Major layout – Elements of the rows are contiguous – C and C++

Code Execution Profiling with SIL and PIL

▪ How to Generate Execution-Time Metrics in SIL/PIL Manager

Execution

Speed

43

Page 45: 효율적인모델기반설계를위한최적화코드생성 · Row-Major vs. Column-Major. Execution Speed Row-Major layout – Elements of the rows are contiguous – C and C++

Improving Code and Model Performance Execution

Speed

44

Page 46: 효율적인모델기반설계를위한최적화코드생성 · Row-Major vs. Column-Major. Execution Speed Row-Major layout – Elements of the rows are contiguous – C and C++

C

Improving Code and Model Performance Execution

Speed

45

Page 47: 효율적인모델기반설계를위한최적화코드생성 · Row-Major vs. Column-Major. Execution Speed Row-Major layout – Elements of the rows are contiguous – C and C++

Key Takeaway

46

Execution

Speed

RAM

Efficiency

ROM

Efficiency

▪ Improving Modeling Patterns for Efficiency

– Clone detection, memory efficiency

▪ RAM and Data Copy Reduction

– Buffer reuse, reduction data copy

▪ Execution Speed

– Row-major and column-major

– Code execution profiling

Page 48: 효율적인모델기반설계를위한최적화코드생성 · Row-Major vs. Column-Major. Execution Speed Row-Major layout – Elements of the rows are contiguous – C and C++