csci 136 computer architecture ii – buses and io xiuzhen cheng [email protected]
TRANSCRIPT
![Page 1: Csci 136 Computer Architecture II – Buses and IO Xiuzhen Cheng cheng@gwu.edu](https://reader036.vdocuments.mx/reader036/viewer/2022062409/5697bf881a28abf838c891c7/html5/thumbnails/1.jpg)
Csci 136 Computer Architecture IICsci 136 Computer Architecture II– Buses and IO– Buses and IO
Xiuzhen [email protected]
![Page 2: Csci 136 Computer Architecture II – Buses and IO Xiuzhen Cheng cheng@gwu.edu](https://reader036.vdocuments.mx/reader036/viewer/2022062409/5697bf881a28abf838c891c7/html5/thumbnails/2.jpg)
Announcement
Homework assignment #14, Due time – .Readings: Sections 8.4 – 8.6
Problems: 8.8, 8.10 – 8.22
Examples in Sections 8.4 and 8.5 In lab sessions???
Final: Tuesday, May 4th, 11:00-1:00PM
Note: you must pass final to pass this course!
![Page 3: Csci 136 Computer Architecture II – Buses and IO Xiuzhen Cheng cheng@gwu.edu](https://reader036.vdocuments.mx/reader036/viewer/2022062409/5697bf881a28abf838c891c7/html5/thumbnails/3.jpg)
A Bus Is:
shared communication link
single set of wires used to connect multiple subsystems
A Bus is also a fundamental tool for composing large, complex systems
systematic means of abstraction
Control
Datapath
Memory
ProcessorInput
Output
What is a bus?
![Page 4: Csci 136 Computer Architecture II – Buses and IO Xiuzhen Cheng cheng@gwu.edu](https://reader036.vdocuments.mx/reader036/viewer/2022062409/5697bf881a28abf838c891c7/html5/thumbnails/4.jpg)
Versatility:New devices can be added easily
Peripherals can be moved between computersystems that use the same bus standard
Low Cost:A single set of wires is shared in multiple ways
MemoryProcesser
I/O Device
I/O Device
I/O Device
Advantages of Buses
![Page 5: Csci 136 Computer Architecture II – Buses and IO Xiuzhen Cheng cheng@gwu.edu](https://reader036.vdocuments.mx/reader036/viewer/2022062409/5697bf881a28abf838c891c7/html5/thumbnails/5.jpg)
It creates a communication bottleneckThe bandwidth of that bus can limit the maximum I/O throughput
The maximum bus speed is largely limited by:The length of the bus
The number of devices on the bus
The need to support a range of devices with:Widely varying latencies
Widely varying data transfer rates
MemoryProcesser
I/O Device
I/O Device
I/O Device
Disadvantage of Buses
![Page 6: Csci 136 Computer Architecture II – Buses and IO Xiuzhen Cheng cheng@gwu.edu](https://reader036.vdocuments.mx/reader036/viewer/2022062409/5697bf881a28abf838c891c7/html5/thumbnails/6.jpg)
Control lines:Signal requests and acknowledgments
Indicate what type of information is on the data lines
Data lines carry information between the source and the destination:
Data and Addresses
Complex commands
A bus transaction includes two parts:Issuing the command (and address) – request
Transferring the data – action
Data Lines
Control Lines
The General Organization of a Bus
![Page 7: Csci 136 Computer Architecture II – Buses and IO Xiuzhen Cheng cheng@gwu.edu](https://reader036.vdocuments.mx/reader036/viewer/2022062409/5697bf881a28abf838c891c7/html5/thumbnails/7.jpg)
Types of BusesProcessor-Memory Bus (design specific)
Short and high speed
Only need to match the memory systemMaximize memory-to-processor bandwidth
Connects directly to the processor
Optimized for cache block transfers
I/O Bus (industry standard)Usually is lengthy and slower
Need to match a wide range of I/O devices
Connects to the processor-memory bus or backplane bus
Backplane Bus (standard or proprietary)Backplane: an interconnection structure within the chassis
Allow processors, memory, and I/O devices to coexist
Cost advantage: one bus for all components
![Page 8: Csci 136 Computer Architecture II – Buses and IO Xiuzhen Cheng cheng@gwu.edu](https://reader036.vdocuments.mx/reader036/viewer/2022062409/5697bf881a28abf838c891c7/html5/thumbnails/8.jpg)
Processor/MemoryBus
PCI Bus
I/O Busses
Example: Pentium System Organization
![Page 9: Csci 136 Computer Architecture II – Buses and IO Xiuzhen Cheng cheng@gwu.edu](https://reader036.vdocuments.mx/reader036/viewer/2022062409/5697bf881a28abf838c891c7/html5/thumbnails/9.jpg)
A Computer System with One Bus: Backplane Bus
A single bus (the backplane bus) is used for:Processor to memory communication
Communication between I/O devices and memory
Advantages: Simple and low cost
Disadvantages: slow and the bus can become a major bottleneck
Example: IBM PC - AT
Processor Memory
I/O Devices
Backplane Bus
![Page 10: Csci 136 Computer Architecture II – Buses and IO Xiuzhen Cheng cheng@gwu.edu](https://reader036.vdocuments.mx/reader036/viewer/2022062409/5697bf881a28abf838c891c7/html5/thumbnails/10.jpg)
A Two-Bus System
I/O buses tap into the processor-memory bus via bus adaptors:
Processor-memory bus: mainly for processor-memory trafficI/O buses: provide expansion slots for I/O devices
Apple Macintosh-IINuBus: Processor, memory, and a few selected I/O devicesSCCI Bus: the rest of the I/O devices
Processor Memory
I/OBus
Processor Memory Bus
BusAdaptor
BusAdaptor
BusAdaptor
I/OBus
I/OBus
![Page 11: Csci 136 Computer Architecture II – Buses and IO Xiuzhen Cheng cheng@gwu.edu](https://reader036.vdocuments.mx/reader036/viewer/2022062409/5697bf881a28abf838c891c7/html5/thumbnails/11.jpg)
A Three-Bus System (+ backside cache)
A small number of backplane buses tap into the processor-memory bus
Processor-memory bus is only used for processor-memory traffic
I/O buses are connected to the backplane bus
Advantage: loading on the processor bus is greatly reduced
Processor Memory
Processor Memory Bus
BusAdaptor
BusAdaptor
BusAdaptor
I/O Bus
BacksideCache bus
I/O BusL2 Cache
![Page 12: Csci 136 Computer Architecture II – Buses and IO Xiuzhen Cheng cheng@gwu.edu](https://reader036.vdocuments.mx/reader036/viewer/2022062409/5697bf881a28abf838c891c7/html5/thumbnails/12.jpg)
Synchronous Bus:Includes a clock in the control lines
A fixed protocol for communication that is relative to the clock
Advantage: involves very little logic and can run very fast
Disadvantages:Every device on the bus must run at the same clock rate
To avoid clock skew, they cannot be long if they are fast
Asynchronous Bus:It is not clocked
It can accommodate a wide range of devices
It can be lengthened without worrying about clock skew
It requires a handshaking protocolNeeds additional set of control lines
Synchronous and Asynchronous Bus
![Page 13: Csci 136 Computer Architecture II – Buses and IO Xiuzhen Cheng cheng@gwu.edu](https://reader036.vdocuments.mx/reader036/viewer/2022062409/5697bf881a28abf838c891c7/html5/thumbnails/13.jpg)
An asynchronous handshaking protocol
Figure 8.10: 7 steps to read a word from memory and receive it in an I/O device
![Page 14: Csci 136 Computer Architecture II – Buses and IO Xiuzhen Cheng cheng@gwu.edu](https://reader036.vdocuments.mx/reader036/viewer/2022062409/5697bf881a28abf838c891c7/html5/thumbnails/14.jpg)
All agents operate synchronously
All can source / sink data at same rate
=> simple protocoljust manage the source and target
Simplest bus paradigm
![Page 15: Csci 136 Computer Architecture II – Buses and IO Xiuzhen Cheng cheng@gwu.edu](https://reader036.vdocuments.mx/reader036/viewer/2022062409/5697bf881a28abf838c891c7/html5/thumbnails/15.jpg)
Separate versus multiplexed address and data lines:
Address and data can be transmitted in one bus cycleif separate address and data lines are available
Cost: (a) more bus lines, (b) increased complexity
Data bus width:By increasing the width of the data bus, transfers of multiple words require fewer bus cycles
Example: SPARCstation 20’s memory bus is 128 bit wide
Cost: more bus lines
Block transfers:Allow the bus to transfer multiple words in back-to-back bus cycles
Only one address needs to be sent at the beginning
The bus is not released until the last word is transferred
Cost: (a) increased complexity (b) decreased response time for request
Increasing the Bus Bandwidth
![Page 16: Csci 136 Computer Architecture II – Buses and IO Xiuzhen Cheng cheng@gwu.edu](https://reader036.vdocuments.mx/reader036/viewer/2022062409/5697bf881a28abf838c891c7/html5/thumbnails/16.jpg)
Bus Transaction
Arbitration: Who gets the busAvoid chaos – multiple transactions simultaneously
Request: What do we want to do
Action: What happens in response
![Page 17: Csci 136 Computer Architecture II – Buses and IO Xiuzhen Cheng cheng@gwu.edu](https://reader036.vdocuments.mx/reader036/viewer/2022062409/5697bf881a28abf838c891c7/html5/thumbnails/17.jpg)
A bus transaction includes two parts:Issuing the command (and address) – request
Transferring the data – action
Master is the one who starts the bus transaction by:issuing the command (and address)
Slave is the one who responds to the address by:Sending data to the master if the master ask for data
Receiving data from the master if the master wants to send data
BusMaster
BusSlave
Master issues command
Data can go either way
Master versus Slave
![Page 18: Csci 136 Computer Architecture II – Buses and IO Xiuzhen Cheng cheng@gwu.edu](https://reader036.vdocuments.mx/reader036/viewer/2022062409/5697bf881a28abf838c891c7/html5/thumbnails/18.jpg)
One of the most important issues in bus design:How is the bus reserved by a device that wishes to use it?
Chaos is avoided by a master-slave arrangement:Only the bus master can control access to the bus:
It initiates and controls all bus requests
A slave responds to read and write requests
The simplest system:Processor is the only bus master
All bus requests must be controlled by the processor
Major drawback: the processor is involved in every transaction
BusMaster
BusSlave
Control: Master initiates requests
Data can go either way
Arbitration: Obtaining Access to the Bus
![Page 19: Csci 136 Computer Architecture II – Buses and IO Xiuzhen Cheng cheng@gwu.edu](https://reader036.vdocuments.mx/reader036/viewer/2022062409/5697bf881a28abf838c891c7/html5/thumbnails/19.jpg)
Multiple Potential Bus Masters: the Need for Arbitration
Bus arbitration scheme:A bus master wanting to use the bus asserts the bus request
A bus master cannot use the bus until its request is granted
A bus master must signal to the arbiter after finish using the bus
Bus arbitration schemes usually try to balance two factors:
Bus priority: the highest priority device should be serviced first
Fairness: Even the lowest priority device should never be completely locked out from the bus
Bus arbitration schemes can be divided into four broad classes:
Daisy chain arbitration
Centralized, parallel arbitration
Distributed arbitration by self-selection: each device wanting the bus places a code indicating its identity on the bus.
Distributed arbitration by collision detection: Each device just “goes for it”. Problems found after the fact.
![Page 20: Csci 136 Computer Architecture II – Buses and IO Xiuzhen Cheng cheng@gwu.edu](https://reader036.vdocuments.mx/reader036/viewer/2022062409/5697bf881a28abf838c891c7/html5/thumbnails/20.jpg)
The Daisy Chain Bus Arbitrations Scheme
Advantage: simple
Disadvantages:Cannot assure fairness: A low-priority device may be locked out indefinitely
The use of the daisy chain grant signal also limits the bus speed
BusArbiter
Device 1HighestPriority
Device NLowestPriority
Device 2
Grant Grant Grant
Release
Request
wired-OR
![Page 21: Csci 136 Computer Architecture II – Buses and IO Xiuzhen Cheng cheng@gwu.edu](https://reader036.vdocuments.mx/reader036/viewer/2022062409/5697bf881a28abf838c891c7/html5/thumbnails/21.jpg)
Used in essentially all processor-memory busses and in high-speed I/O busses
BusArbiter
Device 1 Device NDevice 2
Grant Req
Centralized Parallel Arbitration
![Page 22: Csci 136 Computer Architecture II – Buses and IO Xiuzhen Cheng cheng@gwu.edu](https://reader036.vdocuments.mx/reader036/viewer/2022062409/5697bf881a28abf838c891c7/html5/thumbnails/22.jpg)
Overlapped arbitrationperform arbitration for next transaction during current transaction
Bus parkingmaster can holds onto bus and performs multiple transactions as long as no other master makes request
Overlapped address / data phasesrequires one of the above techniques
Split-phase (or packet switched) buscompletely separate address and data phases
arbitrate separately for each
address phase yield a tag which is matched with data phase
”All of the above” in most modern buses
Increasing Transaction Rate on Multimaster Bus
![Page 23: Csci 136 Computer Architecture II – Buses and IO Xiuzhen Cheng cheng@gwu.edu](https://reader036.vdocuments.mx/reader036/viewer/2022062409/5697bf881a28abf838c891c7/html5/thumbnails/23.jpg)
° ° °Master Slave
Control LinesAddress LinesData Lines
Bus Master: has ability to control the bus, initiates transaction
Bus Slave: module activated by the transaction
Bus Communication Protocol: specification of sequence of events and timing requirements in transferring information.
Asynchronous Bus Transfers: control lines (req, ack) serve to orchestrate sequencing.
Synchronous Bus Transfers: sequence relative to common clock.
Busses so far
![Page 24: Csci 136 Computer Architecture II – Buses and IO Xiuzhen Cheng cheng@gwu.edu](https://reader036.vdocuments.mx/reader036/viewer/2022062409/5697bf881a28abf838c891c7/html5/thumbnails/24.jpg)
Giving Commands to I/O Devices
Two methods are used to address the device:Special I/O instructions
Memory-mapped I/O
Special I/O instructions specify:Both the device number and the command word
Device number: the processor communicates this via aset of wires normally included as part of the I/O bus
Command word: this is usually send on the bus’s data lines
Memory-mapped I/O:Portions of the address space are assigned to I/O device
Read and writes to those addresses are interpretedas commands to the I/O devices
User programs are prevented from issuing I/O operations directly:
The I/O address space is protected by the address translation
Why it is popular?
![Page 25: Csci 136 Computer Architecture II – Buses and IO Xiuzhen Cheng cheng@gwu.edu](https://reader036.vdocuments.mx/reader036/viewer/2022062409/5697bf881a28abf838c891c7/html5/thumbnails/25.jpg)
Single Memory & I/O Bus No Separate I/O Instructions
CPU
Interface Interface
Peripheral Peripheral
Memory
ROM
RAM
I/O$
CPU
L2 $
Memory Bus
Memory Bus Adaptor
I/O bus
Memory Mapped I/O
![Page 26: Csci 136 Computer Architecture II – Buses and IO Xiuzhen Cheng cheng@gwu.edu](https://reader036.vdocuments.mx/reader036/viewer/2022062409/5697bf881a28abf838c891c7/html5/thumbnails/26.jpg)
I/O Device Notifying the OS
The OS needs to know when:The I/O device has completed an operation
The I/O operation has encountered an error
This can be accomplished in two different waysI/O Interrupt:
Whenever an I/O device needs attention from the processor,it interrupts the processor from what it is currently doing.
Polling:The I/O device put information in a status register
The OS periodically check the status register
![Page 27: Csci 136 Computer Architecture II – Buses and IO Xiuzhen Cheng cheng@gwu.edu](https://reader036.vdocuments.mx/reader036/viewer/2022062409/5697bf881a28abf838c891c7/html5/thumbnails/27.jpg)
I/O Interrupt
An I/O interrupt is just like the exceptions except:An I/O interrupt is asynchronous
Further information needs to be conveyed
An I/O interrupt is asynchronous with respect to instruction execution:
I/O interrupt is not associated with any instruction
I/O interrupt does not prevent any instruction from completionYou can pick your own convenient point to take an interrupt
I/O interrupt is more complicated than exception:Needs to convey the identity of the device generating the interrupt
Interrupt requests can have different urgencies:Interrupt request needs to be prioritized
![Page 28: Csci 136 Computer Architecture II – Buses and IO Xiuzhen Cheng cheng@gwu.edu](https://reader036.vdocuments.mx/reader036/viewer/2022062409/5697bf881a28abf838c891c7/html5/thumbnails/28.jpg)
Polling: Programmed I/O
Advantage: Simple: the processor is totally in control and does all the work
Disadvantage:Polling overhead can consume a lot of CPU time
CPU
IOC
device
Memory
Is thedata
ready?
readdata
storedata
yes no
done? no
yes
busy wait loopnot an efficient
way to use the CPUunless the device
is very fast!
but checks for I/O completion can bedispersed among
computation intensive code
![Page 29: Csci 136 Computer Architecture II – Buses and IO Xiuzhen Cheng cheng@gwu.edu](https://reader036.vdocuments.mx/reader036/viewer/2022062409/5697bf881a28abf838c891c7/html5/thumbnails/29.jpg)
Delegating I/O Responsibility from the CPU: DMA
Direct Memory Access (DMA):
External to the CPU
Act as a maser on the bus
Transfer blocks of data to or from memory without CPU intervention
CPU
IOC
device
Memory DMAC
CPU sends a starting address, direction, and length count to DMAC. Then issues "start".
DMAC provides handshakesignals for PeripheralController, and MemoryAddresses and handshakesignals for Memory.
![Page 30: Csci 136 Computer Architecture II – Buses and IO Xiuzhen Cheng cheng@gwu.edu](https://reader036.vdocuments.mx/reader036/viewer/2022062409/5697bf881a28abf838c891c7/html5/thumbnails/30.jpg)
What is DMA (Direct Memory Access)?
Typical I/O devices must transfer large amounts of data to memory or processor:
Disk must transfer complete block (4K? 16K?)Large packets from networkRegions of frame buffer
DMA gives external device ability to write memory directly: much lower overhead than having processor request one word at a time.
Processor (or at least memory system) acts like slave
Issue: Cache coherence:What if I/O devices write data that is currently in processor Cache?
The processor may never see new data!
Solutions: Flush cache on every I/O operation (expensive)Have hardware invalidate cache lines (remember “Coherence” cache misses?)
![Page 31: Csci 136 Computer Architecture II – Buses and IO Xiuzhen Cheng cheng@gwu.edu](https://reader036.vdocuments.mx/reader036/viewer/2022062409/5697bf881a28abf838c891c7/html5/thumbnails/31.jpg)
Delegating I/O Responsibility from the CPU: IOP
CPU IOP
Mem
D1
D2
Dn
. . .main memory
bus
I/Obus
CPU
IOP
(1) Issuesinstructionto IOP
memory
(2)
(3)
Device to/from memorytransfers are controlledby the IOP directly.
IOP steals memory cycles.
OP Device Address
target devicewhere cmnds are
IOP looks in memory for commands
OP Addr Cnt Other
whatto do
whereto putdata
howmuch
specialrequests
(4) IOP interrupts CPU when done
![Page 32: Csci 136 Computer Architecture II – Buses and IO Xiuzhen Cheng cheng@gwu.edu](https://reader036.vdocuments.mx/reader036/viewer/2022062409/5697bf881a28abf838c891c7/html5/thumbnails/32.jpg)
Responsibilities of the Operating System
The operating system acts as the interface between:The I/O hardware and the program that requests I/O
Three characteristics of the I/O systems:The I/O system is shared by multiple program using the processor
I/O systems often use interrupts (external generated exceptions) to communicate information about I/O operations.
Interrupts must be handled by the OS because they cause a transfer to supervisor mode
The low-level control of an I/O device is complex:Managing a set of concurrent events
The requirements for correct device control are very detailed
![Page 33: Csci 136 Computer Architecture II – Buses and IO Xiuzhen Cheng cheng@gwu.edu](https://reader036.vdocuments.mx/reader036/viewer/2022062409/5697bf881a28abf838c891c7/html5/thumbnails/33.jpg)
Operating System Requirements
Provide protection to shared I/O resourcesGuarantees that a user’s program can only access theportions of an I/O device to which the user has rights
Provides abstraction for accessing devices:Supply routines that handle low-level device operation
Handles the interrupts generated by I/O devices
Provide equitable access to the shared I/O resourcesAll user programs must have equal access to the I/O resources
Schedule accesses in order to enhance system throughput
![Page 34: Csci 136 Computer Architecture II – Buses and IO Xiuzhen Cheng cheng@gwu.edu](https://reader036.vdocuments.mx/reader036/viewer/2022062409/5697bf881a28abf838c891c7/html5/thumbnails/34.jpg)
OS and I/O Systems Communication Requirements
The Operating System must be able to prevent:The user program from communicating with the I/O device directly
If user programs could perform I/O directly:Protection to the shared I/O resources could not be provided
Three types of communication are required:The OS must be able to give commands to the I/O devices
The I/O device must be able to notify the OS when the I/O device has completed an operation or has encountered an error
Data must be transferred between memory and an I/O device
![Page 35: Csci 136 Computer Architecture II – Buses and IO Xiuzhen Cheng cheng@gwu.edu](https://reader036.vdocuments.mx/reader036/viewer/2022062409/5697bf881a28abf838c891c7/html5/thumbnails/35.jpg)
Summary ½
Buses are an important technique for building large-scale systems
Their speed is critically dependent on factors such as length, number of devices, etc.Critically limited by capacitanceTricks: esoteric drive technology such as GTL
Important terminology:Master: The device that can initiate new transactionsSlaves: Devices that respond to the master
Two types of bus timing:Synchronous: bus includes clockAsynchronous: no clock, just REQ/ACK strobing
Direct Memory Access (dma) allows fast, burst transfer into processor’s memory:
Processor’s memory acts like a slaveProbably requires some form of cache-coherence so that DMA’ed memory can be invalidated from cache.
Interfacing I/O devices to memory, processor,and OS
![Page 36: Csci 136 Computer Architecture II – Buses and IO Xiuzhen Cheng cheng@gwu.edu](https://reader036.vdocuments.mx/reader036/viewer/2022062409/5697bf881a28abf838c891c7/html5/thumbnails/36.jpg)
Summary 2/2
I/O performance limited by weakest link in chain between OS and device
I/O device notifying the operating system:
Polling: it can waste a lot of processor time
I/O interrupt: similar to exception except it is asynchronous
Delegating I/O responsibility from the CPU:
DMA, or even IOP