Post on 20-Dec-2015
Trident Processor
A Scalable Architecture for Scalar, Vector, and Matrix Operations
Eng. M. Soliman, Prof. S. Sedukhin
Contents
The factors that influence the processor architecture
The idea of our proposed Trident processor
The Trident parallelism
The architecture of the Trident processor
The features of the Trident processor
Conclusion and Future work
Important Factors Affecting the Processor Architecture
[Diagram: Technology and Application Characteristics both feed into Processor Architecture]
Fast-improving Technology
Moore's law: the number of transistors per integrated circuit doubles roughly every 18 months
Application Characteristics
In response to the increasing importance of multimedia applications, major processor vendors have announced extensions to their general-purpose processors in an effort to improve their multimedia performance

| Processor | Multimedia extension |
| --- | --- |
| Intel Pentium II, III, and 4 | MMX, SSE, and SSE2 |
| Motorola PowerPC | AltiVec |
| Silicon Graphics MIPS | MDMX |
| Sun SPARC | VIS |
| Hewlett-Packard PA-RISC | MAX |
The Idea of the Trident Processor
The huge transistor budget (within a few years it will be possible to integrate a billion transistors on a single chip)
The requirements of future applications (scientific, engineering, and multimedia applications, among others, are based on vector and matrix operations)
We Propose the Trident Processor
Trident: a general-purpose processor with three instruction sets (IS): scalar, vector, and matrix
Scalar IS: 1 operation per instruction
Vector IS: $n$ operations per instruction
Matrix IS: $n^2$ or $n^3$ operations per instruction
The Trident Instruction Sets

| Instruction set | Example | Scalar code | Scalar ops |
| --- | --- | --- | --- |
| Scalar | Addition | `z = x + y;` | 1 |
| Vector | Addition | `for (i = 0; i < n; i++) z[i] = x[i] + y[i];` | $O(n)$ |
| Vector | Dot product | `s = 0; for (i = 0; i < n; i++) s += x[i] * y[i];` | $O(n)$ |
| Matrix | Addition | `for (i = 0; i < n; i++) for (j = 0; j < n; j++) z[i][j] = x[i][j] + y[i][j];` | $O(n^2)$ |
| Matrix | Matrix-vector multiplication | `for (i = 0; i < n; i++) { s = 0; for (j = 0; j < n; j++) s += x[i][j] * y[j]; z[i] = s; }` | $O(n^2)$ |
| Matrix | Matrix-matrix multiplication | `for (i = 0; i < n; i++) for (j = 0; j < n; j++) { s = 0; for (k = 0; k < n; k++) s += x[i][k] * y[k][j]; z[i][j] = s; }` | $O(n^3)$ |
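The scalar operation counts listed for each instruction set can be checked with a small C sketch. This is only an illustration: the counter `scalar_ops`, the fixed size `N`, and the function names are assumptions made for the example and are not part of the Trident instruction sets.

```c
#include <assert.h>
#include <stddef.h>

#define N 4  /* illustrative vector length and matrix order */

/* Hypothetical counter: bumped once per innermost-loop body. */
static unsigned long scalar_ops;

/* Vector IS example: z = x + y, n scalar additions. */
void vec_add(const double *x, const double *y, double *z, size_t n) {
    assert(x && y && z);
    for (size_t i = 0; i < n; i++) {
        z[i] = x[i] + y[i];
        scalar_ops++;
    }
}

/* Vector IS example: dot product, n multiply-accumulate steps. */
double dot(const double *x, const double *y, size_t n) {
    double s = 0.0;
    for (size_t i = 0; i < n; i++) {
        s += x[i] * y[i];
        scalar_ops++;
    }
    return s;
}

/* Matrix IS example: z = x * y (matrix times vector), N^2 steps. */
void mat_vec(double x[N][N], const double *y, double *z) {
    for (size_t i = 0; i < N; i++) {
        double s = 0.0;
        for (size_t j = 0; j < N; j++) {
            s += x[i][j] * y[j];
            scalar_ops++;
        }
        z[i] = s;
    }
}

/* Matrix IS example: z = x * y (matrix times matrix), N^3 steps. */
void mat_mul(double x[N][N], double y[N][N], double z[N][N]) {
    for (size_t i = 0; i < N; i++)
        for (size_t j = 0; j < N; j++) {
            double s = 0.0;
            for (size_t k = 0; k < N; k++) {
                s += x[i][k] * y[k][j];
                scalar_ops++;
            }
            z[i][j] = s;
        }
}
```

If `scalar_ops` is reset to zero before each call, it reads `N` after `vec_add` or `dot`, `N * N` after `mat_vec`, and `N * N * N` after `mat_mul`, which is the growth a single vector or matrix instruction is meant to absorb.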
The Trident Parallelism
The Trident processor exploits a significant amount (up to three levels) of data parallelism
The advantages of using data parallelism:
Compact: a single short instruction can describe an array of scalar operations
Expressive: a single instruction can pass valuable information about an array of scalar operations to the hardware
Scalable: adding more hardware can increase performance by processing longer arrays
The Trident Architecture
Vector Processing
A vector pipeline can perform the fundamental vector operations, such as addition, subtraction, multiplication, and division
Vector data are stored in ring vector registers
Multiple vector instructions can execute concurrently on the parallel vector pipelines
Example: vector addition
VR2 ← VR0 + VR1

| Step | Input | Output |
| --- | --- | --- |
| 0 | a0, b0 | |
| 1 | a1, b1 | a0 + b0 |
| 2 | a2, b2 | a1 + b1 |
| 3 | a3, b3 | a2 + b2 |
| 4 | a0, b0 | a3 + b3 |
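The step-by-step vector addition can be mimicked in software. The sketch below is a behavioural model under stated assumptions, not a description of the actual hardware: the type `ring_reg`, the helper `rotate`, the register length `VLEN`, and the one-step pipeline latency are all chosen for illustration.

```c
#include <assert.h>

#define VLEN 4  /* elements per vector register (assumed) */

/* A ring register: elem[0] is the head that feeds the pipeline. */
typedef struct { double elem[VLEN]; } ring_reg;

/* Rotate by one position: the head moves to the tail. */
static void rotate(ring_reg *r) {
    double head = r->elem[0];
    for (int i = 0; i + 1 < VLEN; i++) r->elem[i] = r->elem[i + 1];
    r->elem[VLEN - 1] = head;
}

/* VR2 <- VR0 + VR1 on one pipeline with a one-step latency.
   Returns the number of steps taken (steps 0..VLEN). */
int ring_vadd(ring_reg *vr2, ring_reg *vr0, ring_reg *vr1) {
    double latch = 0.0;  /* result in flight inside the pipeline */
    int steps = 0;
    assert(vr2 != vr0 && vr2 != vr1);
    for (int step = 0; step <= VLEN; step++, steps++) {
        if (step > 0) {            /* output produced by the previous step */
            rotate(vr2);
            vr2->elem[VLEN - 1] = latch;
        }
        if (step < VLEN) {         /* input consumed in this step */
            latch = vr0->elem[0] + vr1->elem[0];
            rotate(vr0);
            rotate(vr1);
        }
    }
    return steps;
}
```

After `VLEN` rotations the source registers are back in their original order, which matches a0 and b0 reappearing at the register heads in step 4.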
Matrix Processing
Using parallel vector pipelines and a ring matrix register file, the fundamental matrix operations, such as addition, subtraction, multiplication, and inversion, can be performed
Example: Matrix addition
MR2 ← MR0 + MR1
Each step feeds one pair of elements to each of the four pipelines P0 to P3

| Step | Input | Output |
| --- | --- | --- |
| 0 | a00, b00; a10, b10; a20, b20; a30, b30 | |
| 1 | a01, b01; a11, b11; a21, b21; a31, b31 | a00 + b00; a10 + b10; a20 + b20; a30 + b30 |
| 2 | a02, b02; a12, b12; a22, b22; a32, b32 | a01 + b01; a11 + b11; a21 + b21; a31 + b31 |
| 3 | a03, b03; a13, b13; a23, b23; a33, b33 | a02 + b02; a12 + b12; a22 + b22; a32 + b32 |
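The matrix addition example extends the single-pipeline case to several pipelines working in lockstep. The following C sketch models that behaviour under assumptions that the slide does not confirm: pipeline `p` is taken to hold row `p` of each matrix, the ring rotation is modelled by indexing with the step number, and the names `matrix_reg` and `ring_madd` are hypothetical.

```c
#include <assert.h>

#define LANES 4  /* parallel vector pipelines P0..P3 */
#define COLS  4  /* elements held per pipeline (assumed) */

/* A matrix register modelled as one row of elements per pipeline. */
typedef struct { double lane[LANES][COLS]; } matrix_reg;

/* MR2 <- MR0 + MR1. In each step every pipeline consumes one input
   pair and emits the sum of the pair it consumed one step earlier.
   Returns the number of steps taken (steps 0..COLS). */
int ring_madd(matrix_reg *mr2, const matrix_reg *mr0, const matrix_reg *mr1) {
    double latch[LANES] = {0};  /* one in-flight result per pipeline */
    int step;
    assert(mr2 && mr0 && mr1);
    for (step = 0; step <= COLS; step++) {
        /* In hardware the lanes operate in parallel; here they are looped. */
        for (int p = 0; p < LANES; p++) {
            if (step > 0)
                mr2->lane[p][step - 1] = latch[p];
            if (step < COLS)
                latch[p] = mr0->lane[p][step] + mr1->lane[p][step];
        }
    }
    return step;
}
```

In this model a matrix addition takes the same number of steps as a vector addition, because the extra rows are handled by the extra pipelines rather than by extra time.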
Matrix-matrix Multiplication
The basic matrix operation is matrix-matrix multiplication
Chaining
Dot product: $c = \sum_{i=0}^{n-1} a_i b_i$
Matrix-vector multiplication: $c_i = \sum_{j=0}^{n-1} a_{ij} b_j$
Matrix-matrix multiplication: $c_{ij} = \sum_{k=0}^{n-1} a_{ik} b_{kj}$
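The three sums on this slide nest inside one another: a matrix-vector product is a dot product per row, and a matrix-matrix product is a matrix-vector product per column. The C sketch below shows only this mathematical nesting; how the Trident hardware actually chains its pipelines is not stated here, and the flat row-major layout, the stride parameters, and the function names are assumptions for the example.

```c
#include <assert.h>
#include <stddef.h>

/* c = sum_i a_i * b_i, reading b with a stride so that a matrix
   column can be passed as a vector. */
double dot(const double *a, const double *b, size_t stride_b, size_t n) {
    double c = 0.0;
    for (size_t i = 0; i < n; i++)
        c += a[i] * b[i * stride_b];
    return c;
}

/* c_i = sum_j a_ij * b_j: one dot product per row of the n x n
   row-major matrix a. */
void mat_vec(const double *a, const double *b, size_t stride_b,
             double *c, size_t stride_c, size_t n) {
    for (size_t i = 0; i < n; i++)
        c[i * stride_c] = dot(a + i * n, b, stride_b, n);
}

/* c_ij = sum_k a_ik * b_kj: one matrix-vector product per column of b. */
void mat_mul(const double *a, const double *b, double *c, size_t n) {
    assert(c != a && c != b);  /* the output must not alias an input */
    for (size_t j = 0; j < n; j++)
        mat_vec(a, b + j, n, c + j, n, n);
}
```

For example, calling `mat_mul(a, a, c, 3)` with `a` holding 1 through 9 in row-major order fills the first row of `c` with 30, 36, 42.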
Matrix-matrix Multiplication Complexity

| | Scalar IS | Vector IS | Matrix IS |
| --- | --- | --- | --- |
| Instructions | $O(n^3)$ | $O(n^2)$ | $O(1)$ |
| Load | $O(n^3)$ | $O(n^3)$ | $O(n^2)$ |
| Store | $O(n^2)$ | $O(n^2)$ | $O(n^2)$ |
| Multiply-accumulate | $O(n^3)$ | $O(n^3)$ | $O(n^2)$ |
| Branch | $O(n^3)$ | $O(n^2)$ | 0 |
| Address computation | $O(n^3)$ | $O(n^2)$ | $O(1)$ |
| Add/subtract | $O(n^3)$ | $O(n^2)$ | 0 |
| Register initialization | $O(n^2)$ | $O(n)$ | 0 |
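The orders of growth in the scalar column can be illustrated by tallying a naive triple loop. The sketch below is a rough model, not the authors' measurement: the tally rules (two loads, one address update, one counter increment, and one branch per inner iteration) and the names `op_counts` and `count_scalar_matmul` are assumptions, and a real compiler or ISA would give different constants.

```c
#include <assert.h>

/* Hypothetical tally of operation categories for scalar code. */
typedef struct {
    unsigned long load, store, mul_acc, branch, addr, add_sub, reg_init;
} op_counts;

/* Walk the loop nest of an n x n matrix-matrix multiplication and
   count what a simple scalar machine would execute under the
   assumed tally rules. No arithmetic on matrix data is performed. */
op_counts count_scalar_matmul(unsigned n) {
    op_counts c = {0};
    assert(n > 0);
    for (unsigned i = 0; i < n; i++) {
        for (unsigned j = 0; j < n; j++) {
            c.reg_init++;            /* s = 0 */
            for (unsigned k = 0; k < n; k++) {
                c.addr++;            /* advance the operand addresses */
                c.load += 2;         /* x[i][k] and y[k][j] */
                c.mul_acc++;         /* s += x[i][k] * y[k][j] */
                c.add_sub++;         /* k++ */
                c.branch++;          /* inner loop back-edge */
            }
            c.store++;               /* z[i][j] = s */
            c.add_sub++;             /* j++ */
            c.branch++;              /* middle loop back-edge */
        }
        c.add_sub++;                 /* i++ */
        c.branch++;                  /* outer loop back-edge */
    }
    return c;
}
```

Under these rules the load, multiply-accumulate, branch, address, and add/subtract tallies grow as $n^3$, while the store and register-initialization tallies grow as $n^2$, in line with the scalar column of the table.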
8×8 Matrix-matrix Multiplication
[Bar chart: number of instructions for the (1) scalar, (2) vector, and (3) matrix instruction sets; vertical axis from 0 to 4000]
Continued
[Bar chart: instruction counts per category for the scalar, vector, and matrix instruction sets; vertical axis from 0 to 1400]
Categories: (1) load, (2) store, (3) multiply-accumulate steps, (4) branch, (5) address computations, (6) addition/subtraction, and (7) register initializations
What does this mean?
Fewer instruction cache misses
Fewer instruction fetches and decodes
Fewer branches and fewer mispredicted branches
More predictable memory accesses
Fewer hazards
We can say that Trident code is compact code with powerful instructions for high performance
The Trident Processor Features
The Trident processor consists mainly of datapath circuitry and register files
Advances in VLSI fabrication technology can be directly applied to support more parallelism
Simple control unit
Many applications can benefit from executing on the Trident processor, such as scientific, engineering, and multimedia applications
Future Work
Simulating the Trident processor
Evaluating the performance of the Trident processor on some multimedia and numerical applications
Comparing the performance of the Trident processor with superscalar processors
Thank you