Download - I/O InterfaceCS510 Computer ArchitecturesLecture 17 - 1 Lecture 17 I/O Interfaces and I/O Busses

I/O Interface CS510 Computer Architectures Lecture 17 - 1

Lecture 17Lecture 17

I/O Interfaces and I/O Interfaces and I/O BussesI/O Busses

Lecture 17Lecture 17

I/O Interfaces and I/O Interfaces and I/O BussesI/O Busses


Storage System IssuesStorage System IssuesStorage System IssuesStorage System Issues

• Historical Context of Storage I/O• Storage I/O Performance Measures• Secondary and Tertiary Storage Devices• Processor Interface Issues• I/O Buses• Redundant Arrays of Inexpensive Disks (RAID)


Disk Time ExampleDisk Time ExampleDisk Time ExampleDisk Time Example• Disk Parameters:

– Transfer size is 8K bytes

– Advertised average seek is 12 ms

– Disk spins at 7200 RPM

– Transfer rate is 4 MB/sec

– Controller overhead is 2 ms

• Assume that disk is idle so no queuing delay• What is Average Disk Access Time for a Sector?

– Ave seek + ave rot delay + transfer time + controller overhead

– 12 ms + 0.5/(7200 RPM/60) + 8 KB/4 MB/s + 2 ms

– 12 + 4.15 + 2 + 2 = 20 ms

• Advertised seek time assumes no locality: typically 1/4 to 1/3 advertised seek time– Average Disk Access Time for a Sector: 20 ms => 12 ms


Processor Interface IssuesProcessor Interface IssuesProcessor Interface IssuesProcessor Interface Issues

• Interconnections– Busses

• Processor interface– Interrupts

– Memory mapped I/O

• I/O Control Structures– Polling

– Interrupts

– DMA

– I/O Controllers

– I/O Processors

• Capacity, Access Time, Bandwidth

Lecture 17 - 5CS510 Computer ArchitecturesI/O Interface

I/O InterfaceI/O InterfaceI/O InterfaceI/O Interface

Seperate I/O instructions (in,out)

Lines distinguish between I/O and memory transfers

VME busMultibus-IINubus

40 Mbytes/secoptimistically

10 MIPS processorcompletelysaturates the bus!

CPU

Interface Interface

Peripheral Peripheral

Memory

memorybus

Independent I/O Bus

Common Memory& I/O bus

CPU

Interface Interface


Memory


Memory Mapped I/OMemory Mapped I/OMemory Mapped I/OMemory Mapped I/O

Single Memory & I/O Bus No Separate I/O Instructions

CPU

Interface Interface


Memory

L2 $

Memory Bus

Memory Bus Adapter

I/O bus

$

CPU

ROM

RAM

I/OI/O


Programmed I/O (Polling)Programmed I/O (Polling)Programmed I/O (Polling)Programmed I/O (Polling)

CPU

IOC

device

Memory

• not an efficient way to use the CPU, unless the device is very fast!

• but checks for I/O completion can be dispersed among computationally intensive code

• busy on wait loopIs dataready?

readdata

storedata

yesno

done?no

yes


Interrupt Driven Data TransferInterrupt Driven Data TransferInterrupt Driven Data TransferInterrupt Driven Data Transfer

(2) save PC

(3) interruptservice addr

(4)

1000 transfers/second: 1000 interrupts @ 2 sec per interrupt=> 2 msec 1000 interrupt service @ 98 sec each=> 98 msec 100 msec = 0.1 CPU seconds

CPU

IOC

device

Memory

User program halts only during actual transfer

memory

userprogram

interruptserviceroutine

readstore...rti

addsubandornop

(1) I/Ointerrupt


Direct Memory AccessDirect Memory AccessDirect Memory AccessDirect Memory Access

0ROM

RAM

Peripherals

DMACn

Memory Mapped I/O

Time to do 1000 xfers in 1 msec:1 DMA set-up sequence: @ 50 sec1 interrupt: @ 2 sec1 interrupt service sequence: @ 48 sec 100sec .0001 second of CPU time

CPU sends a starting address, direction(R/W), and word count to DMAC. Then issues "start".

DMAC provides; Peripheral controller Handshake signals Memory Addresses Handshake signals

CPU

IOC

device

Memory DMAC

I/O


Input/Output ProcessorsInput/Output ProcessorsInput/Output ProcessorsInput/Output Processors

(2)

(3)

Device to/from memory transfers are controlled by the IOP directly.

IOP steals memory cycles from CPU

OP Device Address

target device

where cmnds are

looks in memory for commands

CPU IOP

Mem

D1

D2

Dn

. . .main memory

bus

I/Obus

OP Addr Cnt Other

whatto do

whereto putdata

howmuch

specialrequests

Commandmemory

CPU

IOP

(1)issues instruction to IOP

interrupts when done(4)


Relationship Relationship to Processor Architectureto Processor Architecture


• I/O instructions and busses have largely disappeared• Interrupt vectors have been replaced by jump tables

PC <- M [ IVA + interrupt number ]

• Interrupts:– Stack replaced by shadow registers

– Handler saves registers and re-enables higher priority interrupts




• Caches required for processor performance cause problems for I/O– Flushing is expensive, I/O pollutes cache

– Solution is borrowed from shared memory multiprocessors "snooping"

• Virtual memory frustrates DMA• Stateful processors are hard to switch context


Interconnect TrendsInterconnect TrendsInterconnect TrendsInterconnect Trends

>1000 m 10 - 100 m 1 mDistance

10 - 100 Mb/s 40 - 1000 Mb/s 320 - 1000+ Mb/sBandwidth

high (>ms) medium low (<µs)Latency

lowExtensive CRC

mediumByte Parity

highByte Parity

Reliability

• Interconnect = glue that interfaces computer system components

• High speed hardware interfaces + logical protocols

• Networks, channels, backplanes

• Memory-mapped• Wide pathways• Centralized arbit.

• Message-based• Narrow pathways• Distributed arbitration

Network Channel Backplane


Backplane ArchitecturesBackplane ArchitecturesBackplane ArchitecturesBackplane Architectures

Distinctions begin to blur:

SCSI channel is like a bus FutureBus is like a channel (disconnect/reconnect) HIPPI forms links in high speed switching fabrics

128No

16 - 32Single/Multiple

MultipleNo

Async25

12.927.913.621

.5 mIEEE 1014

96Yes32

Single/MultipleMultipleOptionalAsync

3715.595.220.820

.5 mIEEE 896

Metric VME FutureBus96Yes32

Single/MultipleMultipleOptional

Sync201040

13.321

.5 mANSI/IEEE 1296

MultiBus IIBus Width (signals)Address data multiplexed?Data WidthXfer Size# of Bus MastersSplit TransactionsClockingBandwidth, Single Word (0 ns mem)Bandwidth, Single Word (150ns mem)Bandwidth Multiple Word (0 ns mem)Bandwidth Multiple Word (150 ns mem)Max # of devicesMax Bus LengthStandard

25na8

Single/MultipleMultipleOptional

Either5, 1.55, 1.55, 1.55, 1.5

725 m

ANSI X3.131

SCSI-I


Bus-Based InterconnectBus-Based InterconnectBus-Based InterconnectBus-Based Interconnect

• Bus: a shared communication link between subsystems– Low cost: a single set of wires is shared multiple ways

– Versatility: Easy to add new devices & peripherals may even be ported between computers using common bus

• Disadvantage– A communication bottleneck, possibly limiting the

maximum I/O throughput

• Bus speed is limited by physical factors– the bus length

– the number of devices (and, hence, bus loading).

– these physical limits prevent arbitrary bus speedup.


Bus-Based InterconnectBus-Based InterconnectBus-Based InterconnectBus-Based Interconnect• Two generic types of busses:

– I/O busses (sometimes called a channel) • lengthy • many types of devices connected • wide range in the data bandwidth• follow a bus standard

– CPU-Memory buses (sometimes called a backplane)• short • high speed, matched to the memory system to maximize memor

y-CPU bandwidth

– To lower costs, low cost (older) systems combine together

• Bus transaction– Sending address & receiving or sending data


Bus ProtocolsBus ProtocolsBus ProtocolsBus Protocols

Multibus: 20 address, 16 data, 5 control, 50ns Pause• Bus Master: has ability to control the bus, initiates a transaction

• Bus Slave: module activated by the transaction

• Bus Communication Protocol: specification of sequence of events and timing requirements in transferring information.

– Asynchronous Bus Transfers: control lines (req., ack.) serve to orchestrate sequencing

– Synchronous Bus Transfers: sequence relative to the common clock

° ° °Master Slave

Control Lines

Address Lines

Data Lines


Synchronous Bus Protocols Synchronous Bus Protocols Synchronous Bus Protocols Synchronous Bus Protocols

Pipelined/Split transaction Bus Protocol

Address

Data

Wait

addr 1

data 0

addr 2

wait 1

data 1

addr 3

OK 1

begin read Read complete

Address

Data

Read

Wait

Clock

Read transaction


Asynchronous HandshakeAsynchronous HandshakeAsynchronous HandshakeAsynchronous HandshakeWrite Transaction

t3: Master releases req

Address

Data

Read

Req.

Ack. 4 Cycle Handshake

t0 t1 t2 t3 t4 t5

Master Asserts Address Next Address

Master Asserts Data

t0: Master has obtained control and asserts address, direction, data; Waits a specified amount of time for slaves to decode target

t2: Slave asserts ack, indicating data receivedt1: Master asserts request line

t4: Slave releases ack


Asynchronous HandshakeAsynchronous HandshakeAsynchronous HandshakeAsynchronous Handshake

Time Multiplexed Bus: address and data share lines

Read Transaction

Address

Data

Read

Req

Ack

Master Asserts Address Next Address

t0 t1 t2 t3 t4 t5

4 Cycle Handshake

t0: Master has obtained control and asserts address, direction; Waits a specified amount of time for slaves to decode

t2: Slave asserts ack, indicating ready to transmit data;Slave asserts data

t3: Master releases req, data received

t1: Master asserts request line

t4: Slave releases ack


Bus ArbitrationBus ArbitrationBus ArbitrationBus ArbitrationParallel (Centralized) Arbitration

BR BG

M

BR BG

M

BR BG

M

Bus Arbiter

MBGi BGo

BRM

BGi BGo

BRM

BGi BGo

BR

BG

BR

A.U.

BR A

M

BR A

M

BR A

M

BR A

A.U.

Serial Arbitration (daisy chaining)

Polling

BR: Bus RequestBG: Bus Grant

Bus sequence:BR

BG

Busy

data


Bus OptionsBus OptionsBus OptionsBus Options

Bus width Separate address Multiplex address & data lines & data lines

Data width Wider is faster Narrower is cheaper (e.g., 32 bits) (e.g., 8 bits)

Transfer size Multiple words has Single-word transfer less bus overhead is simpler

Bus masters Multiple Single master (requires arbitration) (no arbitration)

Split Yes - separate No - continuous transaction?Request and Reply connection is cheaperpackets gets higher and has lower latencybandwidth(needs multiple masters)

Clocking Synchronous Asynchronous

Option High performance Low cost


1990 Bus Survey1990 Bus Survey1990 Bus Survey1990 Bus Survey

Signals 128 96 96 16 8Addr/Data mux no yes yes n/a n/aData width 16 - 32 32 32 16 8Masters multi multi multi single multiClocking Async Async Sync Async eitherMB/s 25 37 20 25 1.5 (asyn) (0ns, word) 5 (sync) 150ns word 12.9 15.5 10 = = 0ns block 27.9 95.2 40 = = 150ns block 13.6 20.8 13.3 = =Max devices 21 20 21 8 7Max meters 0.5 0.5 0.5 50 25Standard IEEE 1014 IEEE 896.1 ANSI/IEEE ANSI X3.129 ANSI X3.131

1296

VME FutureBus Multibus II IPI SCSI


VMEVMEVMEVME

• 3 96-pin connectors• 128 defined as standard, rest customer defined

– 32 address

– 32 data

– 64 command & power/ground lines


SCSI: Small Computer SCSI: Small Computer System InterfaceSystem Interface

SCSI: Small Computer SCSI: Small Computer System InterfaceSystem Interface

• Up to 8 devices to communicate on a bus or “string” at sustained speeds of 4-5 MBytes/sec

• SCSI-2 up to 20 MB/sec• Devices can be slave (“target”) or master(“initiator”)• SCSI protocol: a series of phases, during which specific actions are taken

by the controller and the SCSI disks– Bus Free: No device is currently accessing the bus– Arbitration: When the SCSI bus goes free, multiple devices may request (arbitra

te for) the bus; fixed priority by address– Selection: informs the target that it will participate (Reselection if disconnected)– Command: the initiator reads the SCSI command bytes from host memory and

sends them to the target– Data Transfer: data in or out, initiator: target– Message Phase: message in or out, initiator: target (identify, save/restore data

pointer, disconnect, command complete)– Status Phase: target, just before command complete


SCSI “Bus”: Channel SCSI “Bus”: Channel ArchitectureArchitecture

SCSI “Bus”: Channel SCSI “Bus”: Channel ArchitectureArchitecture

Command SetupArbitrationSelection

Message Out (Identify)CommandDisconnect to seek/¼ ll buf fer

Message In (Disconnect)- - Bus Free - -

ArbitrationReselection

Message In (Identify)

Data Transfer

Data In

Disconnect to ¼ ll buf ferMessage In (Save Data Ptr)

Message In (Disconnect)- - Bus Free - -

ArbitrationReselection

Message In (Identify)

Command CompletionStatus

Message In (Command Complete)

If no disconnect is needed

Completion

Message In (Restore Data Ptr)

peer-to-peer protocolsinitiator/targetlinear byte streamsdisconnect/reconnect


1993 I/O Bus Survey1993 I/O Bus Survey1993 I/O Bus Survey1993 I/O Bus Survey

Originator Sun DEC IBM Intel

Clock Rate (MHz) 16-25 12.5-25 async 33

Addressing Virtual Physical Physical Physical

Data Sizes (bits) 8,16,32 8,16,24,32 8,16,24,32,64 8,16,24,32,64

Master Multi Single Multi Multi

Arbitration Central Central Central Central

32 bit read (MB/s) 33 25 20 33

Peak (MB/s) 89 84 75 111 (222)

Max Power (W) 16 26 13 25

Bus SBus TurboChannel MicroChannel PCI


1993 MP Server Memory Bus 1993 MP Server Memory Bus SurveySurvey

1993 MP Server Memory Bus 1993 MP Server Memory Bus SurveySurvey

Originator HP SGI SunClock Rate (MHz) 60 48 66

Split transaction? Yes Yes Yes? Address lines 48 40 ??Data lines 128 256 144 (parity)Data Sizes (bits) 512 1024 512Clocks/transfer 4 5 4?Peak (MB/s) 960 1200 1056Master Multi Multi MultiArbitration Central Central CentralAddressing Physical Physical PhysicalSlots 16 9 10Busses/system 1 1 2Length 13 inches 12? inches 17 inches

Bus Summit Challenge XDBus


Communications NetworksCommunications NetworksCommunications NetworksCommunications NetworksPerformance limiter is memory system, OS overhead, not protocol

s

• Send/receive queues in processor memories• Network controller copies back and forth via DMA• No host intervention needed• Interrupt host when message sent or received

NodeProcessor

ControlReg. I/F

NetI/F Memory

RequestBlock

ReceiveBlock

Media

Network Controller

Peripheral Backplane Bus

DMA

. . .

Processor MemoryList of request blocks

Data to be transmitted

. . .List of receive blocks

Data receivedDMA

. . .List of free blocks


I/O Controller ArchitectureI/O Controller ArchitectureI/O Controller ArchitectureI/O Controller Architecture

Request/Response block interface Backdoor access to host memory

Peripheral Bus (VME, FutureBus, etc.)

HostMemory

ProcessorCache

HostProcessor

Peripheral Bus Interface/DMA

I/O Channel Interface

BufferMemory

ROM

µProc

I/O Controller


I/O Data FlowI/O Data FlowI/O Data FlowI/O Data Flow

I/O Device Head/Disk Assembly

DMA over Peripheral BusOS Buffers (>10 MByte)

Xfer over Serial InterfaceTrack Buffers (32K - 256KBytes)Embedded Controller

Memory-to-Memory CopyApplication Address Space

Host Processor

Xfer over Disk ChannelHBA Buffers (1 M - 4 MBytes) I/O Controller

Impediment to high performance: multiple copies,

complex hierarchy

Download - I/O InterfaceCS510 Computer ArchitecturesLecture 17 - 1 Lecture 17 I/O Interfaces and I/O Busses

Top Related