soc 5.1
Chapter 5 Interconnect
Computer System Design
System-on-Chip by M. Flynn & W. Luk
Pub. Wiley 2011 (copyright 2011)
soc 5.2
SOC interconnect design approach
soc 5.3
Interconnect design
• find the cost and performance of alternatives
• iterate to find the least expensive design that meets the requirements
• consider the larger issues: reliability, scalability, design costs, availability of IP
soc 5.4
SOC module with interconnect
soc 5.5
Many alternatives
• find requirements: number of nodes, performance requirements, marginal and development cost
• bus based: purchased IP or proprietary
• NOC based: static vs dynamic
soc 5.6
AMBA bus based system
soc 5.7
Bus terminology
• protocol
• master / slave; agents on the bus
• arbitration / arbitrator: assigns bus ownership
• bridge: communications between protocols
• physical configuration: wires, bidirectional
• synchronization: clock management
• bus wrapper: manages multiple protocols
soc 5.8
MUX connects 3 masters, 4 slaves
soc 5.9
Simple AHB transfer
soc 5.10
CoreConnect SOC
soc 5.11
PLB transfer protocol
soc 5.12
Bus types and ideal performance
(a) Simple Bus
(b) Bus with arbitration support
(c) Tenured split bus: 4 bytes wide
(d) Tenured split bus: 16 bytes wide
bus transmission time: 1 cycle
soc 5.13
OCP and bus wrappers
soc 5.14
Sonics microNetwork
soc 5.15
Hardware gates for write buffer
Performance of buffer; burst mode
soc 5.16
Analyzing bus performance
• find offered occupancy ($\rho$) for each source (master); find the number of sources ($n$)
– note a complex superscalar can have multiple sources, as I and D caches can prefetch independently
• does the source immediately resubmit a request if it is denied?
• find achieved occupancy ($\rho_a$)
• overall, the system's performance is reduced by the factor $\rho_a/\rho$
soc 5.17
Without resubmissions
• Prob(processor does not access bus) = $1 - \rho$
• Prob($n$ processors do not access bus) = $(1 - \rho)^n$
• Prob(bus is busy) = $1 - (1 - \rho)^n$ = bus bandwidth = $B(\rho, n)$
• $B_w = B(\rho, n) / T_{bw}$
• achieved bandwidth per processor $\rho_a$ is
$n\rho_a = B(\rho, n)$
$\rho_a = B(\rho, n) / n$
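The no-resubmission model above can be checked numerically. The following is a minimal Python sketch, assuming the offered occupancy is $\rho$ and there are $n$ independent sources; the function names and the example values (0.25 and 4) are illustrative and not taken from the slides.

```python
def bus_busy_probability(rho: float, n: int) -> float:
    """B(rho, n) = 1 - (1 - rho)^n: probability that at least one
    of n independent sources requests the bus in a cycle."""
    return 1.0 - (1.0 - rho) ** n


def achieved_occupancy(rho: float, n: int) -> float:
    """rho_a = B(rho, n) / n: bus occupancy each source actually achieves
    when denied requests are NOT resubmitted."""
    return bus_busy_probability(rho, n) / n


if __name__ == "__main__":
    rho, n = 0.25, 4  # illustrative values only
    rho_a = achieved_occupancy(rho, n)
    print("B(rho, n)   =", bus_busy_probability(rho, n))
    print("rho_a       =", rho_a)
    print("rho_a / rho =", rho_a / rho)  # relative performance
```

With a single source the bus is never contended, so `achieved_occupancy(rho, 1)` returns `rho` unchanged.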
soc 5.18
Resubmissions: iterate to find $\rho_a$
• let the actual offered occupancy be $a$; initially set $a = \rho$
• find new $a = \rho / (\rho + (\rho_a / a)(1 - \rho))$
• $n\rho_a = 1 - (1 - a)^n$
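The iteration can be sketched in Python as follows. This is a hedged sketch, not the authors' code: it assumes the update rule $a = \rho / (\rho + (\rho_a/a)(1-\rho))$ together with $n\rho_a = 1 - (1-a)^n$, and the function name, tolerance, and iteration cap are illustrative choices.

```python
def occupancy_with_resubmission(rho: float, n: int,
                                tol: float = 1e-9,
                                max_iter: int = 1000) -> tuple[float, float]:
    """Return (a, rho_a): actual offered occupancy and achieved occupancy
    per source when denied requests are resubmitted.

    Assumed fixed-point iteration (see lead-in):
        rho_a = (1 - (1 - a)^n) / n
        a     = rho / (rho + (rho_a / a) * (1 - rho))
    """
    if rho <= 0.0:
        return 0.0, 0.0          # no requests, nothing to iterate
    a = rho                      # initial guess: a = rho
    for _ in range(max_iter):
        rho_a = (1.0 - (1.0 - a) ** n) / n
        a_new = rho / (rho + (rho_a / a) * (1.0 - rho))
        converged = abs(a_new - a) < tol
        a = a_new
        if converged:
            break
    rho_a = (1.0 - (1.0 - a) ** n) / n   # achieved occupancy at final a
    return a, rho_a


if __name__ == "__main__":
    a, rho_a = occupancy_with_resubmission(0.25, 4)  # illustrative values
    print("a =", a, "rho_a =", rho_a, "rho_a/rho =", rho_a / 0.25)
```

Because resubmitted requests raise the effective request rate, the returned `a` is at least `rho`, and the achieved occupancy lies between the no-resubmission value and `rho`.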
soc 5.19
SOC interconnect switches (NOC)
• nodes are the units to be connected
• links are the connections
– width, $w$ bits
– cycle time, $T_{ch}$, determines bandwidth
– they can be uni- or bidirectional
• message consists of a header (target node address, $H$ bits) and a payload ($l$ bits)
– transmission: $T_{ch}(H/w + l/w)$
– $h = H/w$ usually assumed to be 1
• links can be
– static: links between nodes fixed
– dynamic: links vary, as in crossbar
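As a quick worked example of the transmission-time expression $T_{ch}(H/w + l/w)$, here is a small Python sketch. The function and parameter names are hypothetical, and the example sizes (32-bit header, 256-bit payload, 32-bit link) are chosen only for illustration.

```python
def link_transmission_time(t_ch: float, header_bits: int,
                           payload_bits: int, width_bits: int) -> float:
    """Time to send one message over a single link:
    T_ch * (H/w + l/w), following the slide's formula literally
    (no rounding up to whole link cycles)."""
    return t_ch * (header_bits / width_bits + payload_bits / width_bits)


if __name__ == "__main__":
    # Illustrative values: h = H/w = 1 header cycle, l/w = 8 payload cycles.
    print(link_transmission_time(t_ch=1.0, header_bits=32,
                                 payload_bits=256, width_bits=32))
```

With these values the header occupies one link cycle and the payload eight, so the message takes 9 link cycles.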
soc 5.20
Static: nodes, links and fanout
soc 5.21
Static (k,d) networks
• networks with
– $k$ nodes per dimension
– $d$ dimensions $(k,d)$
• total nodes, $N = k^d$
– in hypercube $k=2$
• most $(k,d)$ have end-around closure
– fanout = $2d$ ($k>2$)
• diameter
– (max internode distance with closure) = $dk/2$
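The $(k,d)$ quantities above are easy to tabulate. The Python sketch below is illustrative: the function name is hypothetical, the fanout of $d$ for $k=2$ (hypercube, where both directions in a dimension reach the same neighbor) is a standard fact added here rather than stated on the slide, and the diameter uses integer division so that it equals $dk/2$ for even $k$.

```python
def kd_network_properties(k: int, d: int) -> dict:
    """Basic properties of a static (k, d) network with end-around closure."""
    if k < 2 or d < 1:
        raise ValueError("need k >= 2 and d >= 1")
    return {
        "nodes": k ** d,                      # N = k^d
        "fanout": 2 * d if k > 2 else d,      # 2d links per node for k > 2
        "diameter": d * (k // 2),             # d*k/2 for even k
    }


if __name__ == "__main__":
    print(kd_network_properties(4, 2))  # 4x4 torus
    print(kd_network_properties(2, 3))  # 3-dimensional hypercube
```

For the 4x4 torus this gives 16 nodes, fanout 4, and diameter 4; for the 3-cube it gives 8 nodes, fanout 3, and diameter 3.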
soc 5.22
Static network
soc 5.23
Examples of static networks
soc 5.24
Static network analysis
• for a static $(k,n)$ network ($n$ dimensions)
– let $k_d$ be the average number of network hops for a message to transit a single dimension
– for bidirectional network with closure $k_d = k/4$, $k$ even
• time to transmit message without contention, $T_c$
– $T_c = n \times k_d + (l/w)$ in network cycles
– for $h = 1$
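A short Python sketch of the uncontended latency $T_c = n \times k_d + l/w$ follows, assuming a bidirectional network with closure, even $k$ (so $k_d = k/4$), and $h = 1$. The function name and the example sizes are illustrative only.

```python
def uncontended_latency(k: int, n_dims: int,
                        payload_bits: int, width_bits: int) -> float:
    """T_c = n * k_d + l/w in network cycles, with k_d = k/4
    (bidirectional links, end-around closure, k even, h = 1)."""
    if k % 2 != 0:
        raise ValueError("k_d = k/4 is only given for even k")
    k_d = k / 4                      # average hops per dimension
    return n_dims * k_d + payload_bits / width_bits


if __name__ == "__main__":
    # Illustrative: 4x4 torus, 256-bit payload on 32-bit links.
    print(uncontended_latency(k=4, n_dims=2, payload_bits=256, width_bits=32))
```

Here the message crosses on average one hop in each of two dimensions and then streams 8 payload cycles, for 10 network cycles in total.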
soc 5.25
Dynamic network
soc 5.26
Switch based interconnect
soc 5.27
Dynamic, indirect network
soc 5.28
Crossbar 2x2, kxk
soc 5.29
Dynamic, Indirect Networks
• switches are separate from the nodes
- centralized as a MIN (Multistage Interconnection Network)
• a switch
- k x k crossbar with no storage
• an N-node (1 channel/node) network
- has (N/k)w switches per stage.
• min. number of stages to connect N to N
- $\lceil \log_k N \rceil$
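The minimum stage count $\lceil \log_k N \rceil$ can be computed with integer arithmetic, which avoids floating-point rounding in the logarithm. This is a minimal Python sketch with a hypothetical function name; it covers only the stage count, not the number of switches per stage.

```python
def min_stages(n_nodes: int, k: int) -> int:
    """Smallest number of stages of k x k switches needed to connect
    N inputs to N outputs: ceil(log_k N), computed without floats."""
    if n_nodes < 1 or k < 2:
        raise ValueError("need n_nodes >= 1 and k >= 2")
    stages, reach = 0, 1
    while reach < n_nodes:   # each stage multiplies reachable outputs by k
        reach *= k
        stages += 1
    return stages


if __name__ == "__main__":
    for n_nodes, k in [(64, 2), (64, 4), (10, 2)]:
        print(n_nodes, k, min_stages(n_nodes, k))
```

For example, 64 nodes need 6 stages of 2x2 switches but only 3 stages of 4x4 switches.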
soc 5.30
Baseline dynamic network address selects output
soc 5.31
Xfabric (direct network w 2D grid)
soc 5.32
Xfabric Junction
soc 5.33
Format of Nexus burst
soc 5.34
NOC layer architecture
soc 5.35
Typical layered NOC
soc 5.36
NOC layered architecture
• physical layer
– how packets are transmitted over physical wires
• transport layer
– packet routing
• transaction layer
– NIU provides service to the IP
• each layer is transparent to the others
soc 5.37
NOC layered advantages
• layers can be independently optimized
• scalable
• better Quality of Service control
– more optimization points of control
• flexible throughput
– can reallocate physical layer resources as required
• multiple clock domain operation
soc 5.38
Transaction, transport and physical layers of an NOC
soc 5.39
PivotPoint Architecture, 3x3 crossbar
soc 5.40
Dynamic vs Static
• Section 5.9 assumes
– h=1 (header sent in 1 cycle)
– wormhole routing (message can begin to leave node after h=1 cycles)
• spreadsheet can be used to compare configurations
soc 5.41
Message and header
soc 5.42
Bus pros (+) and cons (-)
• Every unit attached adds parasitic capacitance (-)
• Bus timing is difficult in deep sub-micron process (-)
• Bus testability is problematic and slow (-)
• Bus arbiter delay grows with the number of masters. The arbiter is also instance-specific (-)
• Bandwidth is limited and shared by all units attached (-)
• Bus latency is zero once arbiter has granted control (+)
• The silicon cost of a bus is low for small systems (+)
• Any bus is directly compatible with most IPs, including software running on CPUs (+)
• The concepts are simple and well understood (+)
soc 5.43
NOC pros (+) and cons (-)
• Only point-to-point one-way wires are used for all network sizes (+)
• Network wires can be pipelined because the network protocol is globally asynchronous (+)
• Dedicated BIST is fast and complete (+)
• Routing decisions are distributed and the same router is reinstantiated for all network sizes (+)
• Aggregated bandwidth scales with the network size (+)
• Internal network contention causes a small latency (-)
• Network uses significant silicon area (-)
• Software needs clean synchronization in multiprocessor systems (-)
• System designers need re-education for new concepts (-)
soc 5.44
Summary
• SOC interconnect design
– find the cost and performance of alternatives
• common choices include
– buses, e.g. AMBA, CoreConnect
– Network-on-Chip (NOC), static/dynamic networks
• iterate to find the least expensive design that meets the requirements
• consider the larger issues: reliability, scalability, design costs, availability of IP