
2Systems Architecture, Fifth Edition

Chapter Goals

• Describe the system bus and bus protocol
• Describe how the CPU and bus interact with peripheral devices
• Describe the purpose and function of device controllers
• Describe how interrupt processing coordinates the CPU with secondary storage and I/O devices

3Systems Architecture, Fifth Edition

Chapter Goals (continued)

• Describe how buffers, caches, and data compression improve computer system performance

4Systems Architecture, Fifth Edition

5Systems Architecture, Fifth Edition

System Bus

• Connects CPU with main memory and peripheral devices

• Set of data lines, control lines, and status lines
• Bus protocol
– Number and use of lines
– Procedures for controlling access to the bus

• Subsets of bus lines: data bus, address bus, control bus

6Systems Architecture, Fifth Edition

7Systems Architecture, Fifth Edition

Bus Clock and Data Transfer Rate

• Bus clock pulse
– Common timing reference for all attached devices
– Frequency measured in MHz
• Bus cycle
– Time interval from one clock pulse to the next
• Data transfer rate
– Measure of communication capacity
– Bus capacity = data transfer unit x clock rate (see the sketch below)
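The capacity formula above is simple arithmetic; here is a minimal C sketch of it, using made-up example numbers (a 66 MHz clock and an 8-byte transfer unit), not figures from the text.

```c
#include <stdio.h>

int main(void)
{
    /* Hypothetical figures, for illustration only. */
    double clock_rate_hz       = 66e6;  /* 66 MHz bus clock */
    double transfer_unit_bytes = 8;     /* 64-bit data bus  */

    /* Bus capacity = data transfer unit x clock rate */
    double capacity = transfer_unit_bytes * clock_rate_hz;
    printf("Peak bus capacity: %.0f MB/s\n", capacity / 1e6);
    return 0;
}
```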

8Systems Architecture, Fifth Edition

Bus Protocol

• Governs the format, content, and timing of data, memory addresses, and control messages sent across the bus
• Two devices cannot be allowed to put data on the bus at the same time, so access control is needed
• Approaches for access control
– Master-slave approach (traditional): the CPU is bus master and all other devices are slaves

9Systems Architecture, Fifth Edition

Bus Protocol

• Approaches for access control (transferring data without the CPU):
– Direct memory access (DMA): a DMA controller gets data from the device and stores it in RAM
– Peer-to-peer buses: any device can become master via a bus arbitration protocol

Local Bus vs External Bus

• Traditionally, the local bus connects the CPU, cache, RAM, and other internal devices
• The external bus connects the main processing unit to I/O devices
• The difference between local and external buses is getting fuzzy – new bus protocols can support both

10Systems Architecture, Fifth Edition

Parallel vs Serial Bus

• A parallel bus is the older technology, in which the bus is a set of wires that a device "plugs into"
• A serial bus interconnects one device after another, creating a daisy chain of devices

• Timing skew has become a problem with parallel bus design

11Systems Architecture, Fifth Edition

Serial Bus vs Parallel Bus

12Systems Architecture, Fifth Edition

[Figure: a parallel bus versus a serial (daisy-chain) bus]

Example System Buses

• IBM PC bus: 8-bit data, 20-bit address; used in all early IBM PCs and clones
• PC-AT bus (ISA): compatible with the PC bus, but has a second strip of connectors with an additional 36 lines; these lines give a 16-bit data bus for the 80286 chip
• VESA Local Bus (VL-bus or VLB): found alongside the ISA bus in PCs; acted as a high-speed bus for DMA and memory-mapped I/O (aka the "Very Long Bus"!)

13Systems Architecture, Fifth Edition

Example System Buses

• IBM Micro Channel: bus for the IBM PS/2 computer; closed architecture with high licensing costs
• EISA (Extended Industry Standard Architecture): several non-IBM companies reacted to Micro Channel and designed EISA; provides a 32-bit data bus

14Systems Architecture, Fifth Edition

VME Bus

• Used in SGI systems
• Begun by Motorola; became an IEEE standard (IEEE P1014)
• 32-bit bus, asynchronous design (see next slide)
• No circuitry on the motherboard
• Hundreds of companies design boards for VME; a 300-page set of VME definitions; very stable
• Bus lines provide automatic self-testing and status reporting
• Now VME64 with a 64-bit bus

15Systems Architecture, Fifth Edition

16Systems Architecture, Fifth Edition

PCI Bus

• PCI (Peripheral Component Interconnect): used in PC and Mac systems
• Well defined and fast
• Serves as the local bus in a machine with other buses
• Intel-based design; the CPU bus and peripherals plug directly into the PCI bus
• Allows devices to talk to each other without CPU intervention

17Systems Architecture, Fifth Edition

18Systems Architecture, Fifth Edition

Older architecture

19Systems Architecture, Fifth Edition

Newer architecture

[Figure: newer architecture – the CPU connects to the northbridge chip over the front side bus (system bus) and to cache over the backside bus; the northbridge chip attaches RAM and the video card; the southbridge chip handles the PCI bus, real-time clock, USB, power management, and other devices]

PCI Bus

• Plug-in boards have software settings, not DIP switches

• 532 Mbps transfer speed (PCI v.3.0)
• Synchronous bus (see figure on next slide)
• Initiator and target design (master/slave)
• Address and data lines multiplexed

20Systems Architecture, Fifth Edition

21Systems Architecture, Fifth Edition

PCI Bus

• OS queries all PCI buses at boot time to find out what devices are present and what system resources (interrupt lines, memory, etc.) each needs. It then allocates the resources and tells each device what its allocation is.

• Each device can request up to six areas of memory space or I/O port space
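As a hedged illustration of the boot-time query described above, here is a C sketch of the legacy x86 "configuration mechanism #1" (I/O ports 0xCF8/0xCFC) that reads each device's vendor and device IDs on bus 0. It assumes a Linux/x86 system with root privilege for port I/O, and it is not how a modern OS necessarily enumerates PCI (ACPI tables and memory-mapped configuration space are also used).

```c
#include <stdio.h>
#include <stdint.h>
#include <sys/io.h>   /* Linux/x86 port I/O: iopl(), outl(), inl() */

#define PCI_CONFIG_ADDRESS 0xCF8
#define PCI_CONFIG_DATA    0xCFC

/* Read one 32-bit register from a device's configuration space
   using the legacy "mechanism #1" I/O ports. */
static uint32_t pci_config_read(uint8_t bus, uint8_t dev, uint8_t fn, uint8_t offset)
{
    uint32_t address = 0x80000000u             /* enable bit */
                     | ((uint32_t)bus << 16)
                     | ((uint32_t)dev << 11)
                     | ((uint32_t)fn  << 8)
                     | (offset & 0xFCu);
    outl(address, PCI_CONFIG_ADDRESS);
    return inl(PCI_CONFIG_DATA);
}

int main(void)
{
    if (iopl(3) != 0) { perror("iopl"); return 1; }  /* port I/O needs root */

    /* Walk bus 0; vendor ID 0xFFFF means "no device in this slot". */
    for (int dev = 0; dev < 32; dev++) {
        uint32_t id = pci_config_read(0, (uint8_t)dev, 0, 0x00);
        if ((id & 0xFFFFu) != 0xFFFFu)
            printf("bus 0 dev %2d: vendor %04x device %04x\n",
                   dev, (unsigned)(id & 0xFFFFu), (unsigned)(id >> 16));
    }
    return 0;
}
```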

22Systems Architecture, Fifth Edition

PCI Versions

• 32-bit, 33 MHz (5 V, added in Rev. 2.0)
• 64-bit, 33 MHz (5 V, added in Rev. 2.0)
• 32-bit, 66 MHz (3.3 V only, added in Rev. 2.1)
• 64-bit, 66 MHz (3.3 V only, added in Rev. 2.1)

PCI-X

• PCI-eXtended
• Twice as fast as PCI – 1.06 GB/s
• Designed for servers to support Gigabit Ethernet cards, Fibre Channel, and Ultra320 SCSI controllers
• Backwards compatible with older PCI standards (except the 5 V ones)
• PCI-X only runs as fast as the slowest device
• In 2003 the PCI-SIG ratified PCI-X 2.0, which added 266 MHz and 533 MHz options, roughly 2.15 GB/s and 4.3 GB/s of throughput (but losing ground to PCIe)
• PCI-X 3.0 is in development, but how far will it get given the popularity of PCIe?

PCI-Express

• PCIe or PCI-E

• Not the same as PCI: PCI is a parallel bus, whereas PCIe is a serial bus (like USB)
• A hub on the motherboard acts as a crossbar switch, allowing multiple simultaneous full-duplex connections

• Serial format starting to win out over parallel format due in part to timing skew

• PCIe is a layered protocol, consisting of a Transaction Layer, a Data Link Layer, and a Physical Layer (fairly complex, like USB)

From top to bottom: PCIe x4, x16, x1, x16, and an older PCI connector (image from Wikipedia)

A PCIe card will fit in any slot that is at least as wide as the card

27Systems Architecture, Fifth Edition

SCSI (Small Computer System Interface)

• Family of standard buses designed primarily for secondary storage devices

• Most often used for disk drives, but can interface with pretty much any device

• Implements both a low-level physical I/O protocol and a high-level logical device control protocol

28Systems Architecture, Fifth Edition

SCSI Interfaces – Parallel

• Still common is the older parallel SCSI (aka SPI)
• Popular forms include:

– SCSI-1

– Fast SCSI

– Fast-Wide SCSI

– Ultra Wide SCSI

• See handout on parallel SCSI specs

29Systems Architecture, Fifth Edition

SCSI Interfaces – Serial

• Serial SCSI: a modern addition to the SCSI family
• Faster data rates, hot swapping, and improved fault isolation are among the advantages of serial SCSI
• Once again, the clock-skew problem of high-speed parallel interfaces is driving the change from parallel to serial

30Systems Architecture, Fifth Edition

SCSI Interfaces – iSCSI

• The SCSI command set stays the same; it's just that the physical specifications essentially no longer exist
• The physical "specs" are TCP/IP
• SCSI-3 implemented over a network
• iSCSI competes with Fibre Channel
• Many felt iSCSI would not be as fast as Fibre Channel due to TCP/IP overhead, but systems now use TCP Offload Engines and 10G Ethernet

31Systems Architecture, Fifth Edition

32Systems Architecture, Fifth Edition

33Systems Architecture, Fifth Edition

Desirable Characteristics of a SCSI Bus

• Non-proprietary standard
• High data transfer rate
• Peer-to-peer capability
• High-level (logical) data access commands
• Multiple command execution
• Interleaved command execution
• But typically quite a bit more expensive

34Systems Architecture, Fifth Edition

I/O Ports

• I/O ports are the pathways between the CPU and a peripheral device

• Logical and physical access
– Usually a memory address that can be read/written by the CPU and a single peripheral device
– Also a logical abstraction that enables the CPU and bus to interact with each peripheral device as if the device were a storage device with a linear address space

35Systems Architecture, Fifth Edition

Physical access: System bus is usually physically implemented on a large printed circuit board with attachment points for devices.

36Systems Architecture, Fifth Edition

Logical access: The device, or its controller, translates linear sector address into corresponding physical sector location on a specific track and platter.
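For illustration, the translation the caption describes can be written as a few lines of C; the geometry constants here are invented for the example, not taken from the text.

```c
#include <stdio.h>

#define SECTORS_PER_TRACK 63
#define HEADS             16   /* tracks per cylinder */

/* Map a linear (logical) sector number onto cylinder, head, and sector. */
static void lba_to_chs(unsigned lba, unsigned *cyl, unsigned *head, unsigned *sec)
{
    *cyl  = lba / (HEADS * SECTORS_PER_TRACK);
    *head = (lba / SECTORS_PER_TRACK) % HEADS;
    *sec  = (lba % SECTORS_PER_TRACK) + 1;   /* CHS sector numbers start at 1 */
}

int main(void)
{
    unsigned c, h, s;
    lba_to_chs(123456, &c, &h, &s);
    printf("linear sector 123456 -> cylinder %u, head %u, sector %u\n", c, h, s);
    return 0;
}
```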

37Systems Architecture, Fifth Edition

Device Controllers

• Implement the bus interface and access protocols
• Translate logical addresses into physical addresses
• Enable several devices to share access to a bus connection

38Systems Architecture, Fifth Edition

39Systems Architecture, Fifth Edition

Mainframe Channels

• Advanced type of device controller used in mainframe computers
• Compared with ordinary device controllers:
– Greater data transfer capacity
– Larger maximum number of attached peripheral devices
– Greater variability in the types of devices that can be controlled

40Systems Architecture, Fifth Edition

Interrupt Processing

• Used by application programs to coordinate data transfers to/from peripherals, notify CPU of errors, and call operating system service programs

• When an interrupt is detected, the executing program is suspended: the CPU pushes current register values onto the stack and transfers control to an interrupt handler
• When the interrupt handler finishes executing, the stack is popped and the suspended process resumes from the point of interruption

41Systems Architecture, Fifth Edition

Interrupt Processing

• Secondary storage and I/O devices are much slower than RAM, ROM, cache memory, and the CPU (see table on next slide)

• When the CPU asks for data from an I/O device, what should the CPU do?
– Sit in a wait cycle?
– Go do something else? (see the sketch below)
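To make the two options concrete, here is a toy C simulation (not from the textbook): the "device" is just a countdown, and the polling routine shows how sitting in a wait cycle burns CPU cycles that interrupt-driven I/O would give back to other work.

```c
#include <stdio.h>
#include <stdbool.h>

static int  device_ticks_remaining = 5;   /* pretend the I/O takes 5 cycles */
static bool data_ready = false;

/* Simulated hardware: one call = one elapsed cycle of device activity. */
static void device_tick(void)
{
    if (device_ticks_remaining > 0 && --device_ticks_remaining == 0)
        data_ready = true;                /* a real device would raise an interrupt here */
}

/* Option 1: sit in a wait cycle (polling) - the CPU does no useful work. */
static int wait_by_polling(void)
{
    int wasted_cycles = 0;
    while (!data_ready) {
        device_tick();
        wasted_cycles++;                  /* every iteration is a cycle spent doing nothing */
    }
    return wasted_cycles;
}

int main(void)
{
    printf("Polling wasted %d CPU cycles.\n", wait_by_polling());
    printf("With interrupts, those cycles could have run another program;\n"
           "an interrupt handler would set data_ready when the device finished.\n");
    return 0;
}
```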

Interrupt Processing

42Systems Architecture, Fifth Edition

43Systems Architecture, Fifth Edition

Multiple Types of Interrupts

• Categories of interrupts
– I/O event
– Error condition
– Service request
– Processor-to-processor communication

• Can one interrupt be interrupted by another type of interrupt?

44Systems Architecture, Fifth Edition

45Systems Architecture, Fifth Edition

Buffers and Caches

• Improve overall computer system performance by employing RAM to overcome mismatches in data transfer rate and data transfer unit size

46Systems Architecture, Fifth Edition

Buffers

• Small storage areas (usually DRAM or SRAM) that hold data in transit from one device to another

• Use interrupts to enable devices with different data transfer rates and unit sizes to efficiently coordinate data transfer

• Buffer overflow: occurs when more data arrives than the buffer can hold

47Systems Architecture, Fifth Edition

Classic example of a buffer: a print buffer
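Here is a toy ring buffer of the kind a print buffer uses, written in C purely as an illustration (the names and size are made up): the fast side deposits bytes, the slow side drains them later, and overflow is what happens when the producer outruns the buffer.

```c
#include <stdio.h>

#define BUF_SIZE 8

static unsigned char buf[BUF_SIZE];
static int head = 0, tail = 0, count = 0;

/* Producer side: returns 0 on success, -1 on buffer overflow. */
static int buffer_put(unsigned char c)
{
    if (count == BUF_SIZE) return -1;     /* overflow: data arriving too fast */
    buf[head] = c;
    head = (head + 1) % BUF_SIZE;
    count++;
    return 0;
}

/* Consumer side (the printer): returns -1 if the buffer is empty. */
static int buffer_get(void)
{
    if (count == 0) return -1;
    unsigned char c = buf[tail];
    tail = (tail + 1) % BUF_SIZE;
    count--;
    return c;
}

int main(void)
{
    const char *line = "HELLO";
    for (const char *p = line; *p; p++)
        buffer_put((unsigned char)*p);    /* fast device fills the buffer */
    int c;
    while ((c = buffer_get()) != -1)
        putchar(c);                       /* slow device drains it later */
    putchar('\n');
    return 0;
}
```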

48Systems Architecture, Fifth Edition

Computer system performance improves dramatically with larger buffer.

49Systems Architecture, Fifth Edition

Computer system performance improves dramatically with larger buffer.

Assumes a 32-bit bus

50Systems Architecture, Fifth Edition

Computer system performance improves dramatically with larger buffer.

2 interrupts each time we fill up the buffer.

Buffer will be filled 64KB / buffer size times.

51Systems Architecture, Fifth Edition

Computer system performance improves dramatically with larger buffer.

Sum of bus transfers and bus interrupts

52Systems Architecture, Fifth Edition

Computer system performance improves dramatically with larger buffer.

Assumes 100 CPU cycles to handle an interrupt.
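A small C program can reproduce the kind of trade-off these slides chart, under the stated assumptions (64 KB to move, a 32-bit bus, two interrupts per buffer fill, 100 CPU cycles per interrupt). This is my reading of the figure's assumptions, not code from the text.

```c
#include <stdio.h>

int main(void)
{
    const double total_bytes          = 64 * 1024; /* 64 KB to move            */
    const double bytes_per_transfer   = 4;         /* 32-bit bus               */
    const double interrupts_per_fill  = 2;         /* per the slide annotation */
    const double cycles_per_interrupt = 100;       /* per the slide annotation */

    printf("%10s %14s %12s %14s\n",
           "buffer", "bus ops", "interrupts", "CPU cycles");
    for (double buf = 4; buf <= total_bytes; buf *= 4) {
        double fills      = total_bytes / buf;                /* 64KB / buffer size       */
        double transfers  = total_bytes / bytes_per_transfer; /* data crosses word by word */
        double interrupts = fills * interrupts_per_fill;
        printf("%9.0fB %14.0f %12.0f %14.0f\n",
               buf, transfers + interrupts, interrupts,
               interrupts * cycles_per_interrupt);
    }
    return 0;
}
```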

53Systems Architecture, Fifth Edition

Diminishing Returns

• When multiple resources are required to produce something useful, adding more and more of a single resource produces fewer and fewer benefits

• Applicable to buffer size

54Systems Architecture, Fifth Edition

Law of diminishing returns affects both bus and CPU performance

Similar chart to the last one, but now the amount to transfer is 64B instead of 64KB.

Note how improvement stops once the buffer size equals the transfer amount.

55Systems Architecture, Fifth Edition

Cache

• Differs from a buffer:
– Data content not automatically removed as used

– Used for bidirectional data

– Used only for storage device accesses

– Usually much larger

– Content must be managed intelligently

• Achieves performance improvements differently for read and write accesses

56Systems Architecture, Fifth Edition

Write access: Sending confirmation (2) before data is written to secondary storage device (3) can improve program performance; program can immediately proceed with other processing tasks.

57Systems Architecture, Fifth Edition

Read accesses are routed to cache (1). If data is already in cache, it is accessed from there (2). If data is not in cache, it must be read from the storage device (3). Performance improvement realized only if requested data is already waiting in cache.
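Below is a minimal C sketch of that read path, using a tiny direct-mapped cache of disk sectors; read_sector_from_device() is a hypothetical stand-in for the slow device access in step 3, and the structure is illustrative rather than how any particular cache controller works.

```c
#include <stdint.h>
#include <string.h>
#include <stdbool.h>

#define SECTOR_SIZE 512
#define CACHE_SLOTS 64

struct cache_slot {
    bool     valid;
    uint32_t sector;                 /* which sector this slot currently holds */
    uint8_t  data[SECTOR_SIZE];
};

static struct cache_slot cache[CACHE_SLOTS];

/* Placeholder for the slow path (step 3 in the slide). */
static void read_sector_from_device(uint32_t sector, uint8_t *buf)
{
    (void)sector;
    memset(buf, 0, SECTOR_SIZE);
}

void cached_read(uint32_t sector, uint8_t *buf)
{
    struct cache_slot *slot = &cache[sector % CACHE_SLOTS];   /* step 1: route to cache */

    if (slot->valid && slot->sector == sector) {              /* step 2: hit, serve from cache */
        memcpy(buf, slot->data, SECTOR_SIZE);
        return;
    }
    read_sector_from_device(sector, slot->data);              /* step 3: miss, go to the device */
    slot->valid  = true;
    slot->sector = sector;
    memcpy(buf, slot->data, SECTOR_SIZE);
}

int main(void)
{
    uint8_t buf[SECTOR_SIZE];
    cached_read(7, buf);   /* miss: read from device, then cache */
    cached_read(7, buf);   /* hit: served from cache this time   */
    return 0;
}
```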

58Systems Architecture, Fifth Edition

Cache Controller

• Processor that manages cache content
• Guesses what data will be requested and loads it from the storage device into the cache before it is requested
• Can be implemented in:
– A storage device controller or communication channel
– The operating system

59Systems Architecture, Fifth Edition

Cache

Primary storage cache:
• Can limit wait states by using an SRAM cache between the CPU and SDRAM primary storage
• Level one (L1): within the CPU
• Level two (L2): on-chip
• Level three (L3): off-chip

Secondary storage cache:
• Gives frequently accessed files higher priority for cache retention
• Uses read-ahead caching for files that are read sequentially
• Gives files opened for random access lower priority for cache retention

60Systems Architecture, Fifth Edition

Intel Itanium® 2 microprocessor uses three levels of primary storage caching.

61Systems Architecture, Fifth Edition

Processing Parallelism

• Increases computer system computational capacity; breaks problems into pieces and solves each piece in parallel with separate CPUs

• Techniques (see the sketch after this list):
– Multicore processors
– Multi-CPU architecture
– Clustering
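As a minimal illustration of breaking a problem into pieces, here is a hypothetical example assuming POSIX threads; on a multicore or multi-CPU machine the pieces can run on separate cores, and a cluster would split the work across machines instead.

```c
/* Compile with: cc -pthread parallel_sum.c */
#include <stdio.h>
#include <pthread.h>

#define N        1000000
#define NTHREADS 4

static long data[N];

struct piece { int start, end; long sum; };

/* Each thread sums one piece of the array. */
static void *sum_piece(void *arg)
{
    struct piece *p = arg;
    p->sum = 0;
    for (int i = p->start; i < p->end; i++)
        p->sum += data[i];
    return NULL;
}

int main(void)
{
    for (int i = 0; i < N; i++) data[i] = 1;

    pthread_t    tid[NTHREADS];
    struct piece pieces[NTHREADS];
    for (int t = 0; t < NTHREADS; t++) {
        pieces[t].start = t * (N / NTHREADS);
        pieces[t].end   = (t + 1) * (N / NTHREADS);
        pthread_create(&tid[t], NULL, sum_piece, &pieces[t]);
    }

    long total = 0;
    for (int t = 0; t < NTHREADS; t++) {      /* combine the partial results */
        pthread_join(tid[t], NULL);
        total += pieces[t].sum;
    }
    printf("total = %ld\n", total);
    return 0;
}
```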

62Systems Architecture, Fifth Edition

Multicore Processors

• Include multiple CPUs and shared memory cache in a single microchip

• Typically share memory cache, memory interface, and off-chip I/O circuitry among the cores

• Reduce total transistor count and cost and provide synergistic benefits

63Systems Architecture, Fifth Edition

64Systems Architecture, Fifth Edition

Multi-CPU Architecture

• Employs multiple single or multicore processors sharing main memory and the system bus within a single motherboard or computer system

• Common in midrange computers, mainframe computers, and supercomputers

• Cost-effective for:
– A single system that executes many different application programs and services
– Workstations

65Systems Architecture, Fifth Edition

Scaling Up

• Increasing processing by using larger and more powerful computers

• Used to be most cost-effective
• Still cost-effective when maximal computer power is required and flexibility is not as important

66Systems Architecture, Fifth Edition

Scaling Out

• Partitioning processing among multiple systems
• The speed of communication networks has diminished the relative performance penalty
• Economies of scale have lowered costs
• Distributed organizational structures emphasize flexibility
• Improved software for managing multiprocessor configurations

67Systems Architecture, Fifth Edition

High-Performance Clustering

• Connects separate computer systems with high-speed interconnections

• Used for the largest computational problems (e.g., modeling three-dimensional physical phenomena)

68Systems Architecture, Fifth Edition

Partitioning the problem to match the cluster architecture ensures that most data exchange traverses high-speed paths.

69Systems Architecture, Fifth Edition

Compression

• Reduces number of bits required to encode a data set or stream

• Effectively increases capacity of a communication channel or storage device

• Requires increased processing resources to implement compression/decompression algorithms while reducing resources needed for data storage and/or communication

Trading data size against CPU time

70Systems Architecture, Fifth Edition

Compression Algorithms

• Vary in:
– Type(s) of data for which they are best suited

– Whether information is lost during compression

– Amount by which data is compressed

– Computational complexity

• Lossless versus lossy compression
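As one concrete lossless example (chosen for brevity, not taken from the text), run-length encoding shows the basic trade: extra CPU work in exchange for fewer bytes to store or transmit.

```c
#include <stdio.h>
#include <stddef.h>

/* Encode src[0..n) as (count, byte) pairs; returns the encoded length. */
static size_t rle_encode(const unsigned char *src, size_t n, unsigned char *dst)
{
    size_t out = 0;
    for (size_t i = 0; i < n; ) {
        unsigned char run = 1;
        while (i + run < n && src[i + run] == src[i] && run < 255)
            run++;
        dst[out++] = run;        /* CPU work spent here...                     */
        dst[out++] = src[i];     /* ...buys fewer bytes to store or transmit   */
        i += run;
    }
    return out;
}

int main(void)
{
    unsigned char data[] = "aaaaaaaabbbbccccccccccccdd";
    unsigned char enc[64];                       /* worst case is 2 x input size */
    size_t n = rle_encode(data, sizeof data - 1, enc);
    printf("original %zu bytes, encoded %zu bytes\n", sizeof data - 1, n);
    return 0;
}
```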

71Systems Architecture, Fifth Edition

Compression can be used to reduce disk storage requirements (a) or to increase communication channel capacity (b).

72Systems Architecture, Fifth Edition

MPEG standards address recording and encoding formats for both images and sound.

Exploits varying sensitivity of the ear to sounds to perform lossy compression

Chip Interfacing

• You are working for Nokia on a new cell phone
• This phone will have a processor, one EPROM, one RAM, and an I/O chip to control the display and keyboard
• The processor has a 16-bit address bus
• With 16 bits, you can address 65,536 bytes of storage

73Systems Architecture, Fifth Edition

Chip Interfacing

• For the I/O chip, we could attach it as an I/O device, then connect the CS line on the PIO to the IORQ line on the CPU

• Or we could choose a particular address and have that address go into the CS line of the I/O chip

• The latter form is called memory-mapped I/O

74Systems Architecture, Fifth Edition

Chip Interfacing

• The I/O chip needs 4 bytes of address space (3 I/O ports and 1 status register)

• The EPROM is an 8K chip so it needs 8K of address space (13 bits needed to select 8K)

• Likewise, the RAM needs 8K of address space

75Systems Architecture, Fifth Edition

Chip Interfacing

• You don't want the chips' addresses to overlap, so place the devices in memory as follows:
– EPROM starts at address 0 (0000h) and is 8K (8192, or 2000h) long, so it ends at 1FFFh
– RAM starts at address 32K (32,768, or 8000h) and is 8K long, so it ends at 9FFFh
– I/O starts at address 65532 (FFFCh) and is 4 bytes long, so it ends at 65535 (FFFFh)

76Systems Architecture, Fifth Edition

Chip Interfacing

• So, the hexadecimal address ranges for each chip are:

– EPROM: 0000 – 1FFF

– RAM: 8000 – 9FFF

– I/O: FFFC – FFFF

• That would place the devices at the following binary addresses (see the decoding sketch below):
– EPROM: 000xxxxxxxxxxxxx

– RAM: 100xxxxxxxxxxxxx

– I/O: 11111111111111xx
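Those bit patterns are exactly what the address-decoding logic has to check. A small C sketch of the same decisions (masks chosen to match the ranges above) might look like this:

```c
#include <stdio.h>
#include <stdint.h>

/* Given a 16-bit address, decide which chip's ~CS line would be asserted. */
const char *select_chip(uint16_t addr)
{
    if ((addr & 0xE000) == 0x0000) return "EPROM"; /* 000x xxxx xxxx xxxx -> 0000-1FFF */
    if ((addr & 0xE000) == 0x8000) return "RAM";   /* 100x xxxx xxxx xxxx -> 8000-9FFF */
    if ((addr & 0xFFFC) == 0xFFFC) return "I/O";   /* 1111 1111 1111 11xx -> FFFC-FFFF */
    return "(unmapped)";
}

int main(void)
{
    uint16_t probes[] = { 0x0000, 0x1FFF, 0x8000, 0x9FFF, 0xFFFD, 0x4000 };
    for (int i = 0; i < 6; i++)
        printf("%04Xh -> %s\n", probes[i], select_chip(probes[i]));
    return 0;
}
```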

77Systems Architecture, Fifth Edition

Memory Allocation

78Systems Architecture, Fifth Edition

[Figure: memory allocation map – EPROM occupies 0K to 8K−1, RAM occupies 32K to 40K−1, and I/O occupies addresses 65532–65535]

Interface

79Systems Architecture, Fifth Edition

[Figure: interface logic – low-order address lines A0–A12 go to the chips' address inputs, while the high-order lines A13–A15 are decoded to drive the active-low ~CS (chip select) inputs of the EPROM, RAM, and I/O chips]

80Systems Architecture, Fifth Edition

Summary

• How the CPU uses the system bus and device controllers to communicate with secondary storage and input/output devices

• Hardware and software techniques for improving data efficiency, and thus overall computer system performance: bus protocols, interrupt processing, buffering, caching, and compression
