tms320 dr. naim dahnoun, bristol university, (c) texas instruments 2004 overview a.a.2009-10

47
TMS320 m Dahnoun, Bristol University, (c) Texas Instruments 2004 Overview A.A.2009- 10

Upload: kari-chatfield

Post on 14-Dec-2015

225 views

Category:

Documents


2 download

TRANSCRIPT

Page 1: TMS320 Dr. Naim Dahnoun, Bristol University, (c) Texas Instruments 2004 Overview A.A.2009-10

TMS320Dr. Naim Dahnoun, Bristol University, (c) Texas Instruments 2004

Overview

A.A.2009-10

Page 2: TMS320 Dr. Naim Dahnoun, Bristol University, (c) Texas Instruments 2004 Overview A.A.2009-10

A.A.2009-10

2

VoIP

Texas Instruments’ TMS320 Family

C2000

C5000

C6000

Lowest CostControl Systems- Motor Control- Storage- Digital Ctrl

Systems

Efficiency Best MIPS per Watt /

Dollar / Size- Wireless phones- Internet audio

players- Digital still cameras - Modems- Telephony- VoIP

Performance &Best Ease-of-Use

- Multi Channel and Multi

Function App's- Comm Infrastructure- Wireless Base-

stations- DSL- Imaging- Multi-media Servers- Video

Digital Signal Processor

Page 3: TMS320 Dr. Naim Dahnoun, Bristol University, (c) Texas Instruments 2004 Overview A.A.2009-10

3

C64x: The C64x fixed-point DSPs offer the industry's highest level of performance to address the demands of the digital age. At clock rates of up to 1 GHz, C64x DSPs can process information at rates up to 8000 MIPS with costs as low as $19.95. In addition to a high clock rate, C64x DSPs can do more work each cycle with built-in extensions. These extensions include new instructions to accelerate performance in key application areas such as digital communications infrastructure and video and image processing.

c62x: These first-generation fixed-point DSPs represent breakthrough technology that enables new equipments and energizes existing implementations for multi-channel, multi-function applications, such as wireless base stations, remote access servers (RAS), digital subscriber loop (xDSL) systems, personalized home security systems, advanced imaging/biometrics, industrial scanners, precision instrumentation and multi-channel telephony systems.

C67x:  For designers of high-precision applications, C67x floating-point DSPs offer the speed, precision, power savings and dynamic range to meet a wide variety of design needs. These dynamic DSPs are the ideal solution for demanding applications like audio, medical imaging, instrumentation and automotive.

Digital Signal Processor

A.A.2009-10

C6000

Page 4: TMS320 Dr. Naim Dahnoun, Bristol University, (c) Texas Instruments 2004 Overview A.A.2009-10

TMS320C6000Dr. Naim Dahnoun, Bristol University, (c) Texas Instruments 2004

Architectural Overview

A.A.2009-10

Page 5: TMS320 Dr. Naim Dahnoun, Bristol University, (c) Texas Instruments 2004 Overview A.A.2009-10

5

TMS320 C6xx

Y =N

å an xnn = 1

*

Prodotto Scalare tra due vettori

Instruction Set

DSP System Block Diagram

A.A.2009-10

Page 6: TMS320 Dr. Naim Dahnoun, Bristol University, (c) Texas Instruments 2004 Overview A.A.2009-10

6

TMS320 C6xx

It has been shown in Introduction that SOP is the key element for most

DSP algorithms.

Two basic Operations

are required for this algorithm.

- Multiplication

- Addition Therefore two basic

Instructions are required

Scalar Product of Vectors

The implementation in this module will be done in

assembly for C6xx

= a1 * x1 + a2 * x2 +... + aN * xN

Y =N

å an xnn = 1

*

SOP Algorithm

A.A.2009-10

Page 7: TMS320 Dr. Naim Dahnoun, Bristol University, (c) Texas Instruments 2004 Overview A.A.2009-10

7

Multiply (MPY)

The multiplication of a1 by x1 is done in assembly by the following instruction:

MPY a1, x1, Y

This instruction is performed by a multiplier

unit that is called “.M”

Y =N

å an xnn = 1

*

= a1 * x1 + a2 * x2 +... + aN * xN

A.A.2009-10

Page 8: TMS320 Dr. Naim Dahnoun, Bristol University, (c) Texas Instruments 2004 Overview A.A.2009-10

8

Multiply (.M unit)

.M.M

Y =40

å an xnn = 1

*

The .M unit performs multiplications in hardware

MPY .M a1, x1, Y

Note: 16-bit by 16-bit multiplier provides a 32-bit result.

32-bit by 32-bit multiplier provides a 64-bit result.

A.A.2009-10

Page 9: TMS320 Dr. Naim Dahnoun, Bristol University, (c) Texas Instruments 2004 Overview A.A.2009-10

9

Addition (.L unit)

.M.M

.L.L

Y =40

å an xnn = 1

*

MPY .M a1, x1, prod

ADD .L Y, prod, Y

RISC processors such as the C6000 use registers to hold the operands, so

lets change this code.

RISC processors such as the C6000 use registers to hold the operands, so

lets change this code.A.A.2009-10

Page 10: TMS320 Dr. Naim Dahnoun, Bristol University, (c) Texas Instruments 2004 Overview A.A.2009-10

10

Register File A

Y =40

å an xnn = 1

*

MPY .M a1, x1, prod

ADD .L Y, prod, Y

MPY .M A0, A1, A3

ADD .L A4, A3, A4

MPY .M A0, A1, A3

ADD .L A4, A3, A4

.M.M

.L.L

A0

A1

A2

A3

A4

A15

Register File A

.

.

.

a1x1

prod

32-bits

Y

.

.

.

Let us correct this by replacing a, x, prod and Y by the

registers as shown above.

Let us correct this by replacing a, x, prod and Y by the

registers as shown above.

A.A.2009-10

Page 11: TMS320 Dr. Naim Dahnoun, Bristol University, (c) Texas Instruments 2004 Overview A.A.2009-10

11

Specifying Register Names

Register File A contains 16 registers (A0 -A15) which are

32-bits wide.

.M.M

.L.L

A0

A1

A2

A3

A4

A15

Register File A

.

.

.

a1x1

prod

32-bits

Y

.

.

.

A.A.2009-10

Page 12: TMS320 Dr. Naim Dahnoun, Bristol University, (c) Texas Instruments 2004 Overview A.A.2009-10

12

Data loading (load unit .D)

Q: How do we load the operands into the registers?

.M.M

.L.L

A0

A1

A2

A3

A4

A15

Register File A

.

.

.

a1x1

prod

32-bits

Y

.

.

.

A: The operands are loaded into the registers by loading them from the memory using the .D unit.

.D.D

Data MemoryData Memory

It is worth noting at this stage that the only way to access memory is through

the .D unit.

A.A.2009-10

Page 13: TMS320 Dr. Naim Dahnoun, Bristol University, (c) Texas Instruments 2004 Overview A.A.2009-10

13

Load Instruction

Q: Which instruction(s) can be used for loading operands from the memory to the registers?.M.M

.L.L

A0

A1

A2

A3

A4

A15

Register File A

.

.

.

a1x1

prod

32-bits

Y

.

.

.

.D.D

Data MemoryData Memory

A.A.2009-10

Page 14: TMS320 Dr. Naim Dahnoun, Bristol University, (c) Texas Instruments 2004 Overview A.A.2009-10

14

Load Instructions

A: The load instructions.

Q: Which instruction(s) can be used for loading operands from the memory to the registers?.M.M

.L.L

A0

A1

A2

A3

A4

A15

Register File A

.

.

.

a1x1

prod

32-bits

Y

.

.

.LDB, LDH

LDW,LDDW.D.D

Data MemoryData Memory

A.A.2009-10

Page 15: TMS320 Dr. Naim Dahnoun, Bristol University, (c) Texas Instruments 2004 Overview A.A.2009-10

15

Using the Load Instructions

Before using the load unit you have to be

aware that this processor is byte addressable,

which means that each byte is represented by a

unique address.

Also the addresses are 32-bit wide.

00000000

00000002

00000004

00000006

00000008

Data

16-bits

32 bits address

FFFFFFFF

Byte

A.A.2009-10

Page 16: TMS320 Dr. Naim Dahnoun, Bristol University, (c) Texas Instruments 2004 Overview A.A.2009-10

16

The syntax for the load instruction is:

Where:

Rn is a register that contains the address of the operand to be loaded

and

Rm is the destination register.

LD *Rn,Rm

00000000

00000002

00000004

00000006

00000008

Data

16-bits

address

FFFFFFFF

a1x1

prodY

Using the Load Instructions

A.A.2009-10

Page 17: TMS320 Dr. Naim Dahnoun, Bristol University, (c) Texas Instruments 2004 Overview A.A.2009-10

17

The syntax for the load instruction is:

The question now is how many bytes are going to be loaded into the destination register?

LD *Rn,Rm

00000000

00000002

00000004

00000006

00000008

Data

16-bits

address

FFFFFFFF

a1x1

prodY

Using the Load Instructions

A.A.2009-10

Page 18: TMS320 Dr. Naim Dahnoun, Bristol University, (c) Texas Instruments 2004 Overview A.A.2009-10

18

The syntax for the load instruction is:

LD *Rn,Rm

The answer, is that it depends on the instruction you choose:• LDB: loads one byte (8-bit)

• LDH: loads half word (16-bit)

• LDW: loads a word (32-bit)

• LDDW: loads a double word (64-bit)

Note: LD on its own does not exist.

00000000

00000002

00000004

00000006

00000008

Data

16-bits

address

FFFFFFFF

a1x1

prodY

Using the Load Instructions

A.A.2009-10

Page 19: TMS320 Dr. Naim Dahnoun, Bristol University, (c) Texas Instruments 2004 Overview A.A.2009-10

19

Example:If we assume that A5 = 0x4 then:

(1) LDB *A5, A7 ;

gives A7 = 0x00000001

(2) LDH *A5,A7;

gives A7 = 0x00000201

(3) LDW *A5,A7;

gives A7 = 0x04030201

(4) LDDW *A5,A7:A6;

gives A7:A6 = 0x0807060504030201

00000000

00000002

00000004

00000006

00000008

Data

16-bits

address

FFFFFFFF

0xB0xA

0xD0xC

0x010x02

0x030x04

0x050x06

0x070x08

01The syntax for the load instruction is:

LD *Rn,Rm

Little Endianla parte meno significativava memorizzata per prima,

all'indirizzo più bassodi memoria

Using the Load Instructions

A.A.2009-10

Page 20: TMS320 Dr. Naim Dahnoun, Bristol University, (c) Texas Instruments 2004 Overview A.A.2009-10

20

Q: If data can only be accessed by the load instruction and the .D unit, how can we load the register pointer Rn in the first place?

00000000

00000002

00000004

00000006

00000008

Data

16-bits

address

FFFFFFFF

0xB0xA

0xD0xC

0x10x2

0x30x4

0x50x6

0x70x8

01The syntax for the load instruction is:

LD *Rn,Rm

Using the Load Instructions

A.A.2009-10

Page 21: TMS320 Dr. Naim Dahnoun, Bristol University, (c) Texas Instruments 2004 Overview A.A.2009-10

21

The instruction MVKL will allow a move of a 16-bit constant into a register as shown below:

MVKL .? a, A5(‘a’ is a constant or label)

How many bits represent a full address?

32 bits

So why does the instruction not allow a 32-bit move?

All instructions are 32-bit wide (see instruction opcode).

Loading the Pointer *Rn

A.A.2009-10

Page 22: TMS320 Dr. Naim Dahnoun, Bristol University, (c) Texas Instruments 2004 Overview A.A.2009-10

22

To solve this problem another instruction is available: MVKH

eg. MVKH .? a, A5(‘a’ is a constant or label)

ah

ah x

al a

A5

Finally, to move the 32-bit address to a register we can use:

MVKL a, A5

MVKH a, A5

Loading the Pointer *Rn

A.A.2009-10

Page 23: TMS320 Dr. Naim Dahnoun, Bristol University, (c) Texas Instruments 2004 Overview A.A.2009-10

23

Always use MVKL then MVKH, look at the following examples:

MVKL 0x1234FABC, A5

A5 = 0xFFFFFABC Wrong

Example 1 (A5 = 0x87654321)

MVKL 0x1234FABC, A5

A5 = 0xFFFFFABC (sign extension)MVKH 0x1234FABC, A5

A5 = 0x1234FABC OK

Example 2 (A5 = 0x87654321)

MVKH 0x1234FABC, A5

A5 = 0x12344321

Loading the Pointer *Rn

A.A.2009-10

Page 24: TMS320 Dr. Naim Dahnoun, Bristol University, (c) Texas Instruments 2004 Overview A.A.2009-10

24

MVKL pt1, A5 MVKH pt1, A5

MVKL pt2, A6 MVKH pt2, A6

LDH .D *A5, A0

LDH .D *A6, A1

MPY .M A0, A1, A3

ADD .L A4, A3, A4

pt1 and pt2 point to some locations

in the data memory.

.M.M

.L.L

A0

A1

A2

A3

A4

A15

Register File A

.

.

.

a1x1

prod

32-bits

Y

.

.

.

Le componentiai e xi sono numeri

interi da 16 bits

.D.D

Data MemoryData Memory

LDH, MVKL and MVKHIPOTESI

A.A.2009-10

Page 25: TMS320 Dr. Naim Dahnoun, Bristol University, (c) Texas Instruments 2004 Overview A.A.2009-10

25

Page 26: TMS320 Dr. Naim Dahnoun, Bristol University, (c) Texas Instruments 2004 Overview A.A.2009-10

26

Creating a loop

MVKL pt1, A5 MVKH pt1, A5

MVKL pt2, A6 MVKH pt2, A6

LDH .D *A5, A0

LDH .D *A6, A1

MPY .M A0, A1, A3

ADD .L A4, A3, A4

So far we have only implemented the SOP for one tap only, i.e.

Y= a1 * x1

So let’s create a loop so that we can implement the SOP for N Taps.

A.A.2009-10

Page 27: TMS320 Dr. Naim Dahnoun, Bristol University, (c) Texas Instruments 2004 Overview A.A.2009-10

27

With the C6000 processors there are no dedicated

instructions such as block repeat. The loop is created

using the B instruction.

So far we have only implemented the SOP for one tap only, i.e.

Y= a1 * x1

So let’s create a loop so that we can implement the SOP for N Taps.

Creating a loop – B instruction

A.A.2009-10

Page 28: TMS320 Dr. Naim Dahnoun, Bristol University, (c) Texas Instruments 2004 Overview A.A.2009-10

28

Create a label to branch to.

Add a branch instruction, B.

Create a loop counter (LC).

Add an instruction to decrement the LC.

Make the branch conditional based on LC.

What are the steps for creating a loop ?

A.A.2009-10

Page 29: TMS320 Dr. Naim Dahnoun, Bristol University, (c) Texas Instruments 2004 Overview A.A.2009-10

29

1. Create a label to branch to

MVKL pt1, A5 MVKH pt1, A5

MVKL pt2, A6 MVKH pt2, A6

loop LDH .D *A5, A0

LDH .D *A6, A1

MPY .M A0, A1, A3

ADD .L A4, A3, A4

A.A.2009-10

Page 30: TMS320 Dr. Naim Dahnoun, Bristol University, (c) Texas Instruments 2004 Overview A.A.2009-10

30

MVKL pt1, A5 MVKH pt1, A5

MVKL pt2, A6 MVKH pt2, A6

loop LDH .D *A5, A0

LDH .D *A6, A1

MPY .M A0, A1, A3

ADD .L A4, A3, A4

B .? loop

2. Add a branch instruction, B.

A.A.2009-10

Page 31: TMS320 Dr. Naim Dahnoun, Bristol University, (c) Texas Instruments 2004 Overview A.A.2009-10

31

Which unit is used by the B instruction?

MVKL .S pt1, A5 MVKH .S pt1, A5

MVKL .S pt2, A6 MVKH .S pt2, A6

loop LDH .D *A5, A0

LDH .D *A6, A1

MPY .M A0, A1, A3

ADD .L A4, A3, A4

B .S loop

.M.M

.L.L

A0

A1

A2

A3

A15

Register File A

.

.

.

ax

prod

32-bits

Y

.D.D

.M.M

.L.L

A0

A1

A2

A3

A15

Register File A

.

.

.

ax

prod

32-bits

Y

.S.S

.D.D

Data MemoryData MemoryA.A.2009-10

Page 32: TMS320 Dr. Naim Dahnoun, Bristol University, (c) Texas Instruments 2004 Overview A.A.2009-10

32

3. Create a loop counter.

MVKL .S pt1, A5 MVKH .S pt1, A5

MVKL .S pt2, A6 MVKH .S pt2, A6

MVKL .S count, B0

loop LDH .D *A5, A0

LDH .D *A6, A1

MPY .M A0, A1, A3

ADD .L A4, A3, A4

B .S loop

B registers will be introduced later

.M.M

.L.L

A0

A1

A2

A3

A15

Register File A

.

.

.

ax

prod

32-bits

Y

.D.D

.M.M

.L.L

A0

A1

A2

A3

A15

Register File A

.

.

.

ax

prod

32-bits

Y

.S.S

.D.D

Data MemoryData MemoryA.A.2009-10

Page 33: TMS320 Dr. Naim Dahnoun, Bristol University, (c) Texas Instruments 2004 Overview A.A.2009-10

33

4. Decrement the loop counter

MVKL .S pt1, A5 MVKH .S pt1, A5

MVKL .S pt2, A6 MVKH .S pt2, A6

MVKL .S count, B0

loop LDH .D *A5, A0

LDH .D *A6, A1

MPY .M A0, A1, A3

ADD .L A4, A3, A4

SUB .S B0, 1, B0

B .S loop

.M.M

.L.L

A0

A1

A2

A3

A15

Register File A

.

.

.

ax

prod

32-bits

Y

.D.D

.M.M

.L.L

A0

A1

A2

A3

A15

Register File A

.

.

.

ax

prod

32-bits

Y

.S.S

.D.D

Data MemoryData MemoryA.A.2009-10

Page 34: TMS320 Dr. Naim Dahnoun, Bristol University, (c) Texas Instruments 2004 Overview A.A.2009-10

34

What is the syntax for making instruction conditional?

[condition] InstructionLabel

e.g. [B1] B loop

(1) The condition can be one of the following registers: A1, A2, B0, B1, B2.

(2) Any instruction can be conditional.

5. Make the branch conditional based on the value in the loop counter

A.A.2009-10

Page 35: TMS320 Dr. Naim Dahnoun, Bristol University, (c) Texas Instruments 2004 Overview A.A.2009-10

35

The condition can be inverted by adding the exclamation symbol “!” as follows:

[!condition] InstructionLabel

e.g.

[!B0]B loop; branch if B0 = 0

[B0] B loop; branch if B0 != 0

5. Make the branch conditional based on the value in the loop counter

A.A.2009-10

Page 36: TMS320 Dr. Naim Dahnoun, Bristol University, (c) Texas Instruments 2004 Overview A.A.2009-10

36

MVKL .S2 pt1, A5 MVKH .S2 pt1, A5

MVKL .S2 pt2, A6 MVKH .S2 pt2, A6

MVKL .S2 count, B0

loop LDH .D *A5, A0

LDH .D *A6, A1

MPY .M A0, A1, A3

ADD .L A4, A3, A4

SUB .S B0, 1, B0

[B0] B .S loop

5. Make the branch conditional

.M.M

.L.L

A0

A1

A2

A3

A15

Register File A

.

.

.

ax

prod

32-bits

Y

.D.D

.M.M

.L.L

A0

A1

A2

A3

A15

Register File A

.

.

.

ax

prod

32-bits

Y

.S.S

.D.D

Data MemoryData MemoryA.A.2009-10

Page 37: TMS320 Dr. Naim Dahnoun, Bristol University, (c) Texas Instruments 2004 Overview A.A.2009-10

37

With this processor all the instructions are encoded in a 32-bit.

Therefore the label must have a dynamic range of less than 32-bit as the instruction B has to be coded.

Case 1: B .S1 label Relative branch. Label limited to +/- 220 offset.

More on the Branch Instruction (1)

21-bit relative addressB

32-bit

A.A.2009-10

Page 38: TMS320 Dr. Naim Dahnoun, Bristol University, (c) Texas Instruments 2004 Overview A.A.2009-10

38

More on the Branch Instruction (2)

By specifying a register as an operand instead of a label, it is possible to have an absolute branch.

This will allow a dynamic range of 232.

Case 2: B .S2 register Absolute branch. Operates on .S2 ONLY!

5-bit register codeB

32-bit

A.A.2009-10

Page 39: TMS320 Dr. Naim Dahnoun, Bristol University, (c) Texas Instruments 2004 Overview A.A.2009-10

39

Testing the code

This code performs

the following operations:

a0*x0 + a0*x0 + a0*x0 +

… + a0*x0

However, we would like to perform:

a0*x0 + a1*x1 + a2*x2 +

… + aN*xN

MVKL .S2 pt1, A5 MVKH .S2 pt1, A5

MVKL .S2 pt2, A6 MVKH .S2 pt2, A6

MVKL .S2 count, B0

loop LDH .D *A5, A0

LDH .D *A6, A1

MPY .M A0, A1, A3

ADD .L A4, A3, A4

SUB .S B0, 1, B0

[B0] B .S loop

A.A.2009-10

Page 40: TMS320 Dr. Naim Dahnoun, Bristol University, (c) Texas Instruments 2004 Overview A.A.2009-10

40

Modifying the pointers

The solution is

to modify

the pointers

A5 and A6

MVKL .S2 pt1, A5 MVKH .S2 pt1, A5

MVKL .S2 pt2, A6 MVKH .S2 pt2, A6

MVKL .S2 count, B0

loop LDH .D *A5, A0

LDH .D *A6, A1

MPY .M A0, A1, A3

ADD .L A4, A3, A4

SUB .S B0, 1, B0

[B0] B .S loop

A.A.2009-10

Page 41: TMS320 Dr. Naim Dahnoun, Bristol University, (c) Texas Instruments 2004 Overview A.A.2009-10

41

Indexing Pointers

Description

Pointer

+ Pre-offset

- Pre-offset

Pre-increment

Pre-decrement

Post-increment

Post-decrement

Syntax PointerModified

*R*+R[disp]*-R[disp]*++R[disp]*--R[disp]*R++[disp]*R--[disp]

No

No

No

Yes

Yes

Yes

Yes

[disp] specifies # elements - size in DW, W, H, or B. disp = R or 5-bit constant. R can be any register.

A.A.2009-10

Page 42: TMS320 Dr. Naim Dahnoun, Bristol University, (c) Texas Instruments 2004 Overview A.A.2009-10

42

Modify and testing the code

This code now performs

the following operations:

a0*x0 + a1*x1 + a2*x2 +

... + aN*xN

MVKL .S2 pt1, A5 MVKH .S2 pt1, A5

MVKL .S2 pt2, A6 MVKH .S2 pt2, A6

MVKL .S2 count, B0

loop LDH .D *A5++, A0

LDH .D *A6++, A1

MPY .M A0, A1, A3

ADD .L A4, A3, A4

SUB .S B0, 1, B0

[B0] B .S loop

A.A.2009-10

Page 43: TMS320 Dr. Naim Dahnoun, Bristol University, (c) Texas Instruments 2004 Overview A.A.2009-10

43

Store the final result

This code now performs

the following operations:

a0*x0 + a1*x1 + a2*x2 +

... + aN*xN

MVKL .S2 pt1, A5 MVKH .S2 pt1, A5

MVKL .S2 pt2, A6 MVKH .S2 pt2, A6

MVKL .S2 count, B0

loop LDH .D *A5++, A0

LDH .D *A6++, A1

MPY .M A0, A1, A3

ADD .L A4, A3, A4

SUB .S B0, 1, B0

[B0] B .S loop

STH .D A4, *A7

A.A.2009-10

Page 44: TMS320 Dr. Naim Dahnoun, Bristol University, (c) Texas Instruments 2004 Overview A.A.2009-10

44

Store the final result

The Pointer A7

is now initialised.

MVKL .S2 pt1, A5 MVKH .S2 pt1, A5

MVKL .S2 pt2, A6 MVKH .S2 pt2, A6

MVKL .S2 pt3, A7MVKH .S2 pt3, A7MVKL .S2 count, B0

loop LDH .D *A5++, A0

LDH .D *A6++, A1

MPY .M A0, A1, A3

ADD .L A4, A3, A4

SUB .S B0, 1, B0

[B0] B .S loop

STH .D A4, *A7

A.A.2009-10

Page 45: TMS320 Dr. Naim Dahnoun, Bristol University, (c) Texas Instruments 2004 Overview A.A.2009-10

45

What is the initial value of A4?

A4 is used as

an accumulator,

so it needs to be

reset to zero.

MVKL .S2 pt1, A5 MVKH .S2 pt1, A5

MVKL .S2 pt2, A6 MVKH .S2 pt2, A6

MVKL .S2 pt3, A7MVKH .S2 pt3, A7MVKL .S2 count, B0ZERO .L A4

loop LDH .D *A5++, A0

LDH .D *A6++, A1

MPY .M A0, A1, A3

ADD .L A4, A3, A4

SUB .S B0, 1, B0

[B0] B .S loop

STH .D A4, *A7

A.A.2009-10

Page 46: TMS320 Dr. Naim Dahnoun, Bristol University, (c) Texas Instruments 2004 Overview A.A.2009-10

46 TMS320 C6xx

Codex Complete

MVKL .S2 pt1, A5 MVKH .S2 pt1, A5

MVKL .S2 pt2, A6 MVKH .S2 pt2, A6

MVKL .S2 pt3, A7MVKH .S2 pt3, A7

MVKL .S2 N, B0

ZERO .L1 A4

loop LDH .D1 *A5++, A0

LDH .D1 *A6++, A1

MPY .M1 A0, A1, A3

ADD .L1 A4, A3, A4

SUB .S2 B0, 1, B0

[B0] B .S1 loop

STH .D1 A4, *A7

32 bits 32 bits32 bits16 bits

*A5 = an (16 bits)

16 bits

(pt1)

16 bits

(pt2)

16 bits

(pt3)

*A6 = xn (16 bits)

*A7 = Y (16 bits)

B0 = N (16 bits)

A0

A1A4

A.A.2009-10

Page 47: TMS320 Dr. Naim Dahnoun, Bristol University, (c) Texas Instruments 2004 Overview A.A.2009-10

47

Y =40å an xn

n = 1*

Code Review(using side A only)

MVK .S1 40, A2 ; A2 = 40, loop countloop: LDH .D1 *A5++, A0 ; A0 = a(n)

LDH .D1 *A6++, A1 ; A1 = x(n)MPY .M1 A0, A1, A3 ; A3 = a(n) * x(n)ADD .L1 A4, A3, A4 ; Y = Y + A3SUB .L1 A2, 1, A2 ; decrement loop count

[A2] B .S1 loop ; if A2 0, branchSTH .D1 A4, *A7 ; *A7 = Y

Note: Assume that A4 was previously cleared and the pointers are initialised. Assume that A2 is B0

A.A.2009-10