matrix multiplication algorithms

Matrix Multiplication Algorithms

U.A.Nuli

Sequential Matrix Multiplication Algorithm
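As a baseline for the parallel versions that follow, the standard triple-loop method can be sketched as below. This is a minimal illustration, not taken from the slides; the function name `matmul_sequential` and the list-of-lists matrix representation are assumptions made for the example.

```python
def matmul_sequential(a, b):
    """Multiply two n x n matrices stored as lists of lists (assumed layout)."""
    n = len(a)
    c = [[0] * n for _ in range(n)]
    for i in range(n):              # row of the result
        for j in range(n):          # column of the result
            total = 0
            for k in range(n):      # inner product of row i of a and column j of b
                total += a[i][k] * b[k][j]
            c[i][j] = total
    return c


if __name__ == "__main__":
    print(matmul_sequential([[1, 2], [3, 4]], [[5, 6], [7, 8]]))
```

The three nested loops perform n multiplications for each of the n × n result elements, which is the work the parallel algorithms distribute over processors.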

Matrix Multiplication on 2D SIMD Mesh


2D Mesh with Wraparound Connections

MATRIX MULTIPLICATION (HYPERCUBE SIMD)

Given the hypercube SIMD model with $n^3 = 2^{3q}$ processors, two n × n matrices can be multiplied in $\Theta(\log n)$ time.

The processing elements can be thought of as filling an n × n × n lattice. Processor $P_m$, where $0 \le m \le 2^{3q} - 1$, has local memory locations a, b, c, s, and t.

Initially, matrix elements a(i, j) and b(i, j) are stored in variables a and b of processor $P(2^q i + j)$.
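The placement rule can be made concrete with a small sketch. The helper names `processor_index` and `lattice_coords` are hypothetical, and reading the processor number m as lattice coordinates (k, i, j) with m = 2^(2q) k + 2^q i + j is an assumption consistent with the n × n × n lattice picture, not something the slides state explicitly.

```python
def processor_index(i, j, q):
    """Number of the processor that initially holds a(i, j) and b(i, j): 2^q * i + j."""
    return (i << q) + j


def lattice_coords(m, q):
    """Split processor number m into assumed lattice coordinates (k, i, j)."""
    mask = (1 << q) - 1
    k = m >> (2 * q)          # top q bits
    i = (m >> q) & mask       # middle q bits
    j = m & mask              # low q bits
    return k, i, j


if __name__ == "__main__":
    q = 2                     # 4 x 4 matrices, 64 processors
    print(processor_index(2, 3, q))
    print(lattice_coords(processor_index(2, 3, q), q))
```

Under this reading, the input matrices start out in the layer k = 0 of the lattice.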

MATRIX MULTIPLICATION (HYPERCUBE SIMD)
Parameter: q {Matrix size is 2^q × 2^q}
Global: l
Local: a, b, c, s, t
begin
  { Phase 1: Broadcast matrices A and B }
  for l ← 3q − 1 downto 2q do
    for all Pm, where BIT(m, l) = 1 do
      t ← BIT COMPLEMENT(m, l)
      a ⇐ [t]a
      b ⇐ [t]b
    endfor
  endfor


  for l ← q − 1 downto 0 do
    for all Pm, where BIT(m, l) ≠ BIT(m, 2q + l) do
      t ← BIT COMPLEMENT(m, l)
      a ⇐ [t]a
    endfor
  endfor


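Routing of a for l = 0 when q = 1 (so 2q + l = 2). The partner t is listed only for processors where BIT(ID,0) ≠ BIT(ID,2).

Processor ID   BIT(ID,0)   BIT(ID,2)   t
0              0           0           -
1              1           0           0
2              0           0           -
3              1           0           2
4              0           1           5
5              1           1           -
6              0           1           7
7              1           1           -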
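The routing table for a given step can be reproduced with a few lines of code. This is a sketch only: `bit`, `bit_complement`, and `routing_partners` are illustrative names, with the first two mirroring the BIT and BIT COMPLEMENT operations of the pseudocode.

```python
def bit(m, l):
    """Value of bit l of processor number m."""
    return (m >> l) & 1


def bit_complement(m, l):
    """Processor number obtained from m by flipping bit l."""
    return m ^ (1 << l)


def routing_partners(q, l, other):
    """Map each active processor m to its partner t.

    A processor is active when BIT(m, l) differs from BIT(m, other).
    """
    return {
        m: bit_complement(m, l)
        for m in range(1 << (3 * q))
        if bit(m, l) != bit(m, other)
    }


if __name__ == "__main__":
    # q = 1, l = 0, compared against bit 2q + l = 2
    print(routing_partners(1, 0, 2))
```

For q = 1, calling `routing_partners(1, 0, 2)` gives the partners used when routing a, and `routing_partners(1, 1, 2)` gives those used when routing b.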


[Figure: the eight processors P0 to P7]

  for l ← 2q − 1 downto q do
    for all Pm, where BIT(m, l) ≠ BIT(m, q + l) do
      t ← BIT COMPLEMENT(m, l)
      b ⇐ [t]b
    endfor
  endfor


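Routing of b for l = 1 when q = 1 (so q + l = 2). The partner t is listed only for processors where BIT(ID,1) ≠ BIT(ID,2).

Processor ID   BIT(ID,1)   BIT(ID,2)   t
0              0           0           -
1              0           0           -
2              1           0           0
3              1           0           1
4              0           1           6
5              0           1           7
6              1           1           -
7              1           1           -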


[Figure: the eight processors P0 to P7]

  { Phase 2: Do the multiplications in parallel }
  for all Pm do
    c ← a × b
  endfor

  { Phase 3: Sum the products }
  for l ← 2q to 3q − 1 do
    for all Pm do
      t ← BIT COMPLEMENT(m, l)
      s ⇐ [t]c
      c ← c + s
    endfor
  endfor
end
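The three phases can be checked with a sequential simulation of the SIMD machine. The sketch below is an illustration under stated assumptions, not the author's implementation: each processor's local variables are modelled as entries of Python lists indexed by processor number, every parallel step reads from a snapshot of the previous state to mimic lockstep execution, and the names `hypercube_matmul` and `bit` are invented for the example.

```python
def bit(m, l):
    """Value of bit l of processor number m."""
    return (m >> l) & 1


def hypercube_matmul(A, B, q):
    """Simulate the hypercube SIMD algorithm for 2^q x 2^q matrices A and B."""
    n, p = 1 << q, 1 << (3 * q)
    a, b = [0] * p, [0] * p
    # Initial placement: a(i, j) and b(i, j) live in processor 2^q * i + j.
    for i in range(n):
        for j in range(n):
            a[(i << q) + j] = A[i][j]
            b[(i << q) + j] = B[i][j]

    # Phase 1a: broadcast A and B along the top q bits.
    for l in range(3 * q - 1, 2 * q - 1, -1):
        old_a, old_b = a[:], b[:]
        for m in range(p):
            if bit(m, l) == 1:
                t = m ^ (1 << l)
                a[m], b[m] = old_a[t], old_b[t]

    # Phase 1b: route a where bit l differs from bit 2q + l.
    for l in range(q - 1, -1, -1):
        old_a = a[:]
        for m in range(p):
            if bit(m, l) != bit(m, 2 * q + l):
                a[m] = old_a[m ^ (1 << l)]

    # Phase 1c: route b where bit l differs from bit q + l.
    for l in range(2 * q - 1, q - 1, -1):
        old_b = b[:]
        for m in range(p):
            if bit(m, l) != bit(m, q + l):
                b[m] = old_b[m ^ (1 << l)]

    # Phase 2: every processor multiplies its local pair.
    c = [a[m] * b[m] for m in range(p)]

    # Phase 3: sum the products along the top q bits.
    for l in range(2 * q, 3 * q):
        old_c = c[:]
        for m in range(p):
            c[m] = old_c[m] + old_c[m ^ (1 << l)]

    # Read c(i, j) back from processor 2^q * i + j.
    return [[c[(i << q) + j] for j in range(n)] for i in range(n)]


if __name__ == "__main__":
    print(hypercube_matmul([[1, 2], [3, 4]], [[5, 6], [7, 8]], 1))
```

Each of the loops over l runs q times, so the simulated machine takes a number of parallel steps proportional to q = log n, which matches the stated time bound.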


Matrix Multiplication Algorithm for UMA Multiprocessor


Total Processors = P

Matrix Size = N*N

Processor P0 (m=0) works on row numbers 0, 0+P, 0+2P, ... < N

Processor P1 (m=1) works on row numbers 1, 1+P, 1+2P, ... < N

Processor P2 (m=2) works on row numbers 2, 2+P, 2+2P, ... < N
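The interleaved row assignment can be sketched with threads that share the result matrix, standing in for processors that share memory. This is only an illustration under stated assumptions: the names `rows_for` and `matmul_uma` are hypothetical, rows are numbered from 0, and Python threads are used to show the work division rather than to obtain real speedup.

```python
import threading


def rows_for(m, num_procs, n):
    """Rows handled by processor m: m, m + P, m + 2P, ... below n."""
    return list(range(m, n, num_procs))


def matmul_uma(a, b, num_procs):
    """Multiply n x n matrices with num_procs workers sharing the result c."""
    n = len(a)
    c = [[0] * n for _ in range(n)]   # shared memory: every worker writes here

    def worker(m):
        # Each worker writes only its own rows, so no locking is needed.
        for i in rows_for(m, num_procs, n):
            for j in range(n):
                c[i][j] = sum(a[i][k] * b[k][j] for k in range(n))

    threads = [threading.Thread(target=worker, args=(m,)) for m in range(num_procs)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return c


if __name__ == "__main__":
    print(rows_for(2, 4, 10))
    print(matmul_uma([[1, 2], [3, 4]], [[5, 6], [7, 8]], 2))
```

Because the rows assigned to different workers never overlap, the workers only need to synchronize once, when all of them have finished.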

Matrix Multiplication Algorithm for Multicomputers

Row-Column-Oriented Algorithm

Block-Oriented Algorithm

Row-Column-Oriented Algorithm
