
Source: bwrcs.eecs.berkeley.edu/Classes/CS252/Notes/Lec11-config-trim.pdf

Lecture 11: Interfaces, I/O and Configurable Processors

Professor Kurt Keutzer

Computer Science 252

Spring 2000

With contributions from Prof. David Patterson

Niraj Shah, Scott Weber


Embedded Systems vs. General Purpose Computing - 1

Embedded System

• Runs a few applications often known at design time

• Not end-user programmable

• Operates under fixed run-time constraints; additional performance may not be useful or valuable

General purpose computing

• Intended to run a fully general set of applications

• End-user programmable

• Faster is always better


Embedded Systems vs. General Purpose Computing - 2

Embedded System

Differentiating features:

- power

- cost

- speed (must be predictable)

General purpose computing

Differentiating features:

- speed (need not be fully predictable)

- speed

- did we mention speed?

- cost (largest component power)


Configurability and Embedded Systems

Advantages of configuration:

• Pay (in power, design time, area) only for what you use

• Gain additional performance by adding features tailored to your application

Particularly for embedded systems:

- Principally in embedded controller microprocessor applications

- Some use in DSP


What to Configure?

What parts of the microcontroller/microprocessor system to configure?

Easy answers:

• Memory and Cache Sizes - get precisely the sizes your application needs

• Register file sizes

• Interrupt handling and addresses

Harder answers:

• Peripherals

• Instructions

But first we need more context


I/O Interrupts

An I/O interrupt is just like an exception, except:

- An I/O interrupt is asynchronous

- Further information needs to be conveyed

An I/O interrupt is asynchronous with respect to instruction execution:

- An I/O interrupt is not associated with any instruction

- An I/O interrupt does not prevent any instruction from completing

  - You can pick your own convenient point to take the interrupt

An I/O interrupt is more complicated than an exception:

- It needs to convey the identity of the device generating the interrupt

- Interrupt requests can have different urgencies:

  - Interrupt requests need to be prioritized
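The two extra requirements on this slide (device identity and prioritization) can be sketched as a toy model in Python. The names `InterruptController`, `raise_irq`, and `next_irq` are hypothetical and chosen only for illustration; this is not a model of any particular controller.

```python
import heapq


class InterruptController:
    """Toy model: pending requests ordered by priority (lower = more urgent)."""

    def __init__(self):
        self._pending = []   # heap of (priority, arrival order, device id)
        self._arrivals = 0   # tie-breaker so equal priorities stay FIFO

    def raise_irq(self, device_id, priority):
        # The request carries the identity of the device that raised it.
        heapq.heappush(self._pending, (priority, self._arrivals, device_id))
        self._arrivals += 1

    def next_irq(self):
        """Return the device id of the most urgent pending request, or None."""
        if not self._pending:
            return None
        _, _, device_id = heapq.heappop(self._pending)
        return device_id


ctrl = InterruptController()
ctrl.raise_irq("uart", priority=2)
ctrl.raise_irq("disk", priority=1)
ctrl.raise_irq("timer", priority=0)
served = [ctrl.next_irq() for _ in range(3)]
print(served)
```

Because the requests are asynchronous, the processor is free to call something like `next_irq` at whatever point between instructions is convenient.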


Example: Device Interrupt

User program:

    ...
    add  $r1,$r2,$r3
    subi $r4,$r1,#4
    slli $r4,$r4,#2
                        <- Hiccup(!): external interrupt arrives here
    lw   $r2,0($r4)
    lw   $r3,4($r4)
    add  $r2,$r2,$r3
    sw   8($r4),$r2
    ...

On the external interrupt: PC saved, Disable All Ints, Supervisor Mode

“Interrupt Handler”:

    Raise priority
    Reenable All Ints
    Save registers
    ...
    lw   $r1,20($r0)
    lw   $r2,0($r1)
    addi $r3,$r0,#5
    sw   $r3,0($r1)
    ...
    Restore registers
    Clear current Int
    Disable All Ints
    Restore priority
    RTI

On return: Restore PC, User Mode

Advantage:

- User program progress is only halted during actual transfer

Disadvantage: special hardware is needed to:

- Cause an interrupt (I/O device)

- Detect an interrupt (processor)

- Save the proper states to resume after the interrupt (processor)


Interrupt Driven Data Transfer

[Figure: CPU, IOC, device, and Memory. Memory holds the user program (add, sub, and, or, nop) and the interrupt service routine (read, store, ..., rti). Numbered steps: (1) I/O interrupt, (2) save PC, (3) interrupt service addr; step (4) is drawn next to the interrupt service routine.]

User program progress only halted during actual transfer

1000 transfers at 1 ms each:

- 1000 interrupts @ 2 µsec per interrupt

- 1000 interrupt service @ 98 µsec each = 0.1 CPU seconds

Device xfer rate = 10 MBytes/sec => 0.1 x 10^-6 sec/byte => 0.1 µsec/byte => 1000 bytes = 100 µsec

1000 transfers x 100 µsecs = 100 ms = 0.1 CPU seconds

Still far from device transfer rate! 1/2 in interrupt overhead
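The arithmetic on this slide can be reproduced in a few lines of Python. The variable names are illustrative; the numbers are the ones given on the slide.

```python
US = 1e-6  # one microsecond, in seconds

transfers = 1000
interrupt_overhead = 2 * US   # per interrupt
service_time = 98 * US        # per interrupt service routine
cpu_seconds = transfers * (interrupt_overhead + service_time)

bytes_per_second = 10e6                    # 10 MBytes/sec device
device_seconds = 1000 / bytes_per_second   # time for the device to move 1000 bytes

print(f"CPU time for 1000 transfers: {cpu_seconds:.4f} s")
print(f"device time for 1000 bytes:  {device_seconds / US:.0f} usec")
```

The CPU spends 100 µsec per transfer, which is as long as the device itself needs to move 1000 bytes.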


Better Way to Handle Interrupts?

Handling all interrupts with the CPU could bring it to a halt in a real-time system

Isn’t there a better way?

Hint: remember the trickle-down theory of embedded processor architecture.


Trickle Down Theory of Embedded Architectures

Mainframe/supercomputers

High-end servers/workstations

High-end personal computers

Personal computers

Laptops/palmtops

Gadgets

Features tend to trickle down:

• #bits: 4->8->16->32->64

• ISAs

• Floating point support

• Dynamic scheduling

• Caches

• I/O controllers/processors

• LIW/VLIW

• Superscalar


I/O Interface

Independent I/O Bus

[Figure: CPU connected to Memory over the memory bus, and to Interfaces and their Peripherals over an independent I/O bus]

Separate I/O instructions (in, out)

Common memory & I/O bus

[Figure: CPU, Memory, and Interfaces with their Peripherals on one bus; lines distinguish between I/O and memory transfers]

VME bus, Multibus-II, Nubus

40 Mbytes/sec optimistically

10 MIP processor completely saturates the bus!


Delegating I/O Responsibility from the CPU: IOP

[Figure: CPU and IOP share the main memory bus with Mem; the IOP drives the I/O bus to devices D1, D2, ..., Dn]

(1) CPU issues instruction to IOP

    OP | Device | Address

    Device: target device; Address: where cmnds are

(2) IOP looks in memory for commands

    OP | Addr | Cnt | Other

    OP: what to do; Addr: where to put data; Cnt: how much; Other: special requests

(3) Device to/from memory transfers are controlled by the IOP directly. IOP steals memory cycles.

(4) IOP interrupts CPU when done
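A minimal sketch of the command block the IOP fetches from memory, using the four fields named on the slide (OP, Addr, Cnt, Other). The class `IopCommand`, the function `run_iop`, and the `"read"`/`"write"` opcodes are assumptions made for illustration, not a real IOP interface.

```python
from dataclasses import dataclass


@dataclass
class IopCommand:
    op: str         # what to do ("read" or "write" in this sketch)
    addr: int       # where to put (or fetch) the data in memory
    cnt: int        # how much to transfer, in bytes
    other: int = 0  # special requests (unused here)


def run_iop(memory, device, cmd):
    """Carry out one command without the CPU; the return value stands in
    for the 'done' interrupt of step (4)."""
    if cmd.op == "read":        # device -> memory
        memory[cmd.addr:cmd.addr + cmd.cnt] = device[:cmd.cnt]
    elif cmd.op == "write":     # memory -> device
        device[:cmd.cnt] = memory[cmd.addr:cmd.addr + cmd.cnt]
    else:
        raise ValueError(f"unknown op {cmd.op!r}")
    return "done"


memory = bytearray(16)
device = bytearray(b"ABCDEFGH")
status = run_iop(memory, device, IopCommand(op="read", addr=4, cnt=3))
print(status, bytes(memory[4:7]))
```

The CPU's only work is building the command and handling the final interrupt; the copy loop belongs to the IOP.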


Memory Mapped I/O

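Single Memory & I/O Bus

No Separate I/O Instructions

[Figure: CPU, Memory, and Interfaces with their Peripherals on a single bus; the address space is divided into ROM, RAM, and I/O regions]

[Figure: CPU with $ and L2 $ on the Memory Bus; a Memory Bus Adaptor connects the Memory Bus to the I/O bus]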
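With memory-mapped I/O, ordinary loads and stores reach devices because the address decides where the access goes. The sketch below illustrates that decode step in Python; the address map (`ROM_BASE`, `RAM_BASE`, `IO_BASE`, `TOP`) and the `Bus` class are hypothetical values and names chosen for the example.

```python
# Hypothetical address map, for illustration only.
ROM_BASE, RAM_BASE, IO_BASE, TOP = 0x0000, 0x4000, 0x8000, 0x8100


class Bus:
    def __init__(self):
        self.rom = bytes(RAM_BASE - ROM_BASE)
        self.ram = bytearray(IO_BASE - RAM_BASE)
        self.io_regs = {}     # device registers, keyed by offset from IO_BASE
        self.io_writes = []   # log of (offset, value) pairs seen by devices

    def load(self, addr):
        if ROM_BASE <= addr < RAM_BASE:
            return self.rom[addr - ROM_BASE]
        if RAM_BASE <= addr < IO_BASE:
            return self.ram[addr - RAM_BASE]
        if IO_BASE <= addr < TOP:
            return self.io_regs.get(addr - IO_BASE, 0)
        raise ValueError("bus error")

    def store(self, addr, value):
        if RAM_BASE <= addr < IO_BASE:
            self.ram[addr - RAM_BASE] = value
        elif IO_BASE <= addr < TOP:
            # Same store operation, but the target is a device register.
            self.io_regs[addr - IO_BASE] = value
            self.io_writes.append((addr - IO_BASE, value))
        else:
            raise ValueError("bus error")  # ROM or unmapped address


bus = Bus()
bus.store(0x4000, 7)       # plain RAM write
bus.store(0x8004, 0x41)    # device register write
print(bus.load(0x4000), bus.load(0x8004), bus.io_writes)
```

No `in`/`out` instructions are needed: the same `load` and `store` paths serve memory and peripherals.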


Delegating I/O Responsibility from the CPU: DMA

Direct Memory Access (DMA):

- External to the CPU

- Acts as a master on the bus

- Transfers blocks of data to or from memory without CPU intervention

[Figure: CPU, Memory, DMAC, and IOC with its device on a shared bus]

CPU sends a starting address, direction, and length count to DMAC. Then issues "start".

DMAC provides handshake signals for Peripheral Controller, and Memory Addresses and handshake signals for Memory.


Direct Memory Access

[Figure: CPU, Memory, DMAC, and IOC with its device on a shared bus]

Time to do 1000 xfers at 1 msec each:

- 1 DMA set-up sequence @ 50 µsec

- 1 interrupt @ 2 µsec

- 1 interrupt service sequence @ 48 µsec

.0001 second of CPU time

CPU sends a starting address, direction, and length count to DMAC. Then issues "start".

DMAC provides handshake signals for Peripheral Controller, and Memory Addresses and handshake signals for Memory.

[Figure: Memory Mapped I/O address map from 0 to n: ROM, RAM, Peripherals, DMAC]
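As a quick check of the CPU-time figure, the sketch below adds up the three DMA costs from this slide and compares them with the interrupt-driven figure from the earlier "Interrupt Driven Data Transfer" slide. Variable names are illustrative.

```python
US = 1e-6  # one microsecond, in seconds

# DMA: one set-up sequence, one interrupt, one service sequence for the whole block
dma_cpu_seconds = (50 + 2 + 48) * US

# Interrupt-driven: an interrupt plus a service routine for each of 1000 transfers
interrupt_cpu_seconds = 1000 * (2 + 98) * US

print(f"DMA:              {dma_cpu_seconds:.4f} s of CPU time")
print(f"interrupt-driven: {interrupt_cpu_seconds:.4f} s of CPU time")
print(f"ratio:            {interrupt_cpu_seconds / dma_cpu_seconds:.0f}x")
```

Under these numbers the DMA approach costs the CPU about a thousandth of the interrupt-driven time, because the per-transfer overhead is paid once per block.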


68332 Family

68K was the most successful embedded controller in history

CISC instruction set - good code density

Table lookup for compressed tables

Time processing unit - breakthrough in modular peripheral handling!


MC68332 - Top level

[Figure: CPU32, RAM, serial I/O, IMB control, and the time processing unit (TPU) with I/O channels 0 to 15, connected by the inter module bus (IMB)]

Designed for automotive applications with a mixture of computation-intensive tasks and complex I/O functions

Idea: off-load CPU from frequent I/O interactions to make use of computation performance


68332 CPU Block Diagram


Addressing Modes in 68332

Seven modes

• Register direct

• Register indirect

• Register indirect with index

• Program counter indirect with displacement

• Program counter indirect with index

• Absolute

• Immediate

Why so many modes? Antiquated architectural feature?


Complex addressing modes allow for denser code … but … MCore, Motorola’s embedded microcontroller rewrite, uses simple DLX-like load/store instructions. Code size impact?


MC68332 Time Processing Unit

[Figure: TPU block diagram. Labels: IMB; Host Interface (System Configuration, Channel Control, Parameter RAM, Development Support and Test); Scheduler with Service Requests; Microengine (Control Store, Execution Unit); Timer Channels (Channel 0, Channel 1, ..., Channel 15) connected to Pins; Control and Data paths; time base]

TPU (time processing unit): peripheral coprocessor

Independent programmable timer channels: single-shot "capture & compare"

Channel coupling and sequence control with control processor


Time Processing Unit


Time Processing Unit

Semi-autonomous microcontroller

Operates concurrently with CPU

• Schedules tasks

• Processes ROM instructions

• Accesses shared data with CPU

• Performs Input/Output


Uses of Time Processing Unit

Programmable series of two operations

• Match

• Capture

Each operation is called an “event”

A pre-programmed series of events is called a “function”

Pre-programmed functions

• Input capture/input transition counter

• Output compare

• Period measurement with additional/missing transition detect

• Position synchronized pulse-generator

• Period/pulse-width accumulator


Time Bases

Two sixteen-bit counters provide time bases for all

Pre-scalers controlled by CPU via bit-fields in TPU module configuration register TPUMCR

Current values accessible via TCR1 and TCR2 registers

TCR1, TCR2 can be read/written by TPU microcode - not available to CPU

TCR1 qualified by system clock

TCR2 qualified by system clock or external clock


Timer Channels

Sixteen channels

- each one connects to an MCU pin

Each channel has symmetric hardware:

• Event register

  - 16-bit capture register

  - 16-bit compare/match register

  - 16-bit comparator

• Pin control logic - pin direction determined by TPU microengine
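The per-channel hardware listed above can be illustrated with a toy Python model: a capture register that latches the time base on a pin transition, and a match register checked by a 16-bit comparator. The class `TimerChannel` and its methods are hypothetical, and the comparator here tests plain equality, which is a simplifying assumption rather than a statement about the real TPU.

```python
MASK16 = 0xFFFF  # time bases and channel registers are 16 bits wide


class TimerChannel:
    def __init__(self):
        self.capture_reg = 0  # 16-bit capture register
        self.match_reg = 0    # 16-bit compare/match register

    def pin_transition(self, tcr):
        """Capture event: latch the current time-base value."""
        self.capture_reg = tcr & MASK16

    def matches(self, tcr):
        """Match event: comparator sees time base == match register."""
        return (tcr & MASK16) == self.match_reg


ch = TimerChannel()
ch.match_reg = 0x0003
tcr = 0xFFFE                      # free-running counter about to wrap
match_times = []
for step in range(8):
    if ch.matches(tcr):
        match_times.append(step)
    if step == 1:
        ch.pin_transition(tcr)    # pretend the pin toggles at step 1
    tcr = (tcr + 1) & MASK16      # 16-bit wrap-around
print(match_times, hex(ch.capture_reg))
```

The wrap-around in the loop shows why software built on 16-bit time bases has to reason modulo 65536.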


Scheduler

Determines which of sixteen channels is serviced by the microengine

A channel can request service for one of four reasons:

- host service

- link to another channel

- match event

- capture event

• Host system assigns each channel a priority:

- high

- middle

- low


Microengine



Another Motorola Microprocessor


Concepts so far ...

• Interrupts

• Memory Mapping of I/O

• Time Processing Unit / Peripheral Processor

Other configurable elements:

• Peripherals

• Instructions


Configurability in ARM Processor

ARM allows for configurability via the AMBA bus

Offers “PrimeCell” peripherals which hook into the AMBA Advanced Peripheral Bus (APB)

• UART

• Real Time Clock

• Audio Codec Interface

• Keyboard and mouse interface

• General purpose I/O

• Smart card interface

• Generic IR interface

http://www.arm.com/Pro+Peripherals/PrimeCell/index.html


ARM7 core


ARM’s AMBA open standard

Advanced System Bus (ASB) - high performance: CPU, DMA, external

Advanced Peripheral Bus (APB) - low speed, low power: parallel I/O, UARTs

External interface

http://www.arm.com/Documentation/Overviews/AMBA_Intro/#intro


Ex 1: ARM Infrared (IR) Interface


Ex 2: ARM Smart Card Interface


Ex 3: Audio Codec


Another Kind of Configurability

[Figure: synthesis flow: HDL -> RTL synthesis -> netlist -> logic optimization (with Library) -> netlist -> physical design -> layout]

Synthesis of a processor core from an RTL description allows for:

• full range of other types of configurability

• additional degrees of freedom in quality of implementation

Examples:

• ARM7

• Motorola Coldfire

• Tensilica Xtensa


Quality of Results Tradeoffs

[Figure: delay vs. area tradeoff curve]

A synthesizable implementation allows for exploration of a wide range of implementations


ARM Core7 Thumb Embedded


Ultimate configurability: the Tensilica solution

[Figure labels: uP Generator; uP Cores; Pre-verified function library; S/W development environment; DSP and peripheral blocks. Callouts: fast, safe tailoring of cores; extensibility with synchronization to the hardware; ultra small and efficient, new architectures]


Tensilica Viterbi Implementation

Niraj Shah

Scott Weber

290A Final Presentation


Tensilica Flow

[Figure: Tensilica design flow. Labels: .c sources, xt-gcc, .o, xt-run, gen, uArch Designer, TIE, Tensilica Processor Generator]


Xtensa Architecture

[Figure: Xtensa Core connected to a TIE unit through Rs, Rt, Rr, and I]

TIE Extensions:

- single cycle

- state free

- no new exceptions

- no stalls

- typeless data

Rs, Rt, Rr are 32-bit regs

I is the instruction controlling the TIE unit

Xtensa Core is a 32-bit configurable RISC processor
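Taken together, the constraints above mean an extension behaves like a pure combinational function: two 32-bit inputs (Rs, Rt), one 32-bit output (Rr), no memory of earlier calls. The Python sketch below illustrates that shape with a made-up instruction, `sat_add4x8`; it is an assumption for illustration and is not TIE syntax or an instruction from the slides.

```python
MASK32 = 0xFFFFFFFF


def sat_add4x8(rs, rt):
    """Hypothetical extension: four independent unsigned 8-bit saturating adds.

    The result depends only on rs and rt (state free), and the registers are
    treated as raw bits that the instruction interprets itself (typeless data).
    """
    rr = 0
    for lane in range(4):
        shift = 8 * lane
        a = (rs >> shift) & 0xFF
        b = (rt >> shift) & 0xFF
        rr |= min(a + b, 0xFF) << shift   # saturate each lane at 0xFF
    return rr & MASK32


print(hex(sat_add4x8(0x01FF7F10, 0x01018020)))
```

A base RISC instruction set would need several shifts, masks, adds, and compares per lane to do the same work, which is the kind of gap a tailored instruction is meant to close.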


Viterbi Architecture

[Figure: Viterbi decoder blocks: Init, ACS, TraceBack, RAM, ADC, I/O Device. Annotation: "Measured Performance Here"]


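TIE SetupBMreg (ACS)

[Figure: SetupBMreg control instruction datapath. Inputs: Rs with I in bits 7:0 and Rt with Q in bits 7:0; add/subtract units and the constant 0x7F; output Rr packs bm3 (bits 31:24), bm2 (23:16), bm1 (15:8), bm0 (7:0)]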


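ACS TIE Extension (ACS)

[Figure: ACS instruction datapath. Inputs Rs and Rt, output Rr. Branch metrics are packed as bm3 (bits 31:24), bm2 (23:16), bm1 (15:8), bm0 (7:0); two adders combine path metrics (pm) with branch metrics, the msb of a subtraction selects between them, and Rr receives the new pm and a decision bit. Instruction variants: ACS03, ACS12, ACS30, ACS21]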
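For readers unfamiliar with the operation, here is a generic add-compare-select step for one trellis state in Python. It is a textbook sketch, not a reconstruction of the packed 8-bit datapath in the figure; the function name `acs` and the convention that the smaller metric wins are assumptions.

```python
def acs(pm_a, bm_a, pm_b, bm_b):
    """Return (new path metric, decision bit) for one trellis state.

    decision = 0 keeps predecessor a, 1 keeps predecessor b.
    The smaller candidate metric survives in this sketch.
    """
    cand_a = pm_a + bm_a                       # add
    cand_b = pm_b + bm_b
    decision = 1 if cand_b < cand_a else 0     # compare
    new_pm = cand_b if decision else cand_a    # select
    return new_pm, decision


print(acs(10, 3, 8, 7))
print(acs(10, 3, 8, 2))
```

The decision bits are what a later traceback pass reads to recover the most likely input sequence.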


ACS TIE Extension with State (ACS)

[Figure: ACS control instruction datapath with state. Branch metrics are packed as bm3 (bits 31:24), bm2 (23:16), bm1 (15:8), bm0 (7:0); Rs and Rt each feed an add/compare/select path (two adders, a subtraction, and an msb test); Rr receives the new path metrics (pm) with their decision bits]


TIE Zmask (TraceBack)

[Figure: Zmask control instruction datapath. Inputs Rs and Rt, output Rr; operators &, |, and <<1; constants 0x7F and 0x3F]


Designs

All designs had a BER of 0.000095 after 10 million iterations

Design 1

- 100 MHz, 48 mW, 1K DCache, 1K ICache, TIE

Design 1+

- 222 MHz, 144 mW, 1K DCache, 1K ICache, TIE

Design 2-

- 100 MHz, 69 mW, 16K DCache, 16K ICache, TIE

Design 2

- 222 MHz, 191 mW, 16K DCache, 16K ICache, TIE

Design 3

- 222 MHz, 191 mW, 16K DCache, 16K ICache, TIE with state


Performance (Kb/s)

Design      Cache   Perfect Cache
Design1       118             409
Design1+      263             909
Design2-      357             409
Design2       793             909
Design3       966            1142


Energy Dissipation (uJ/bit)

Design      Cache   Perfect Cache
Design1      0.4             0.12
Design1+     0.54            0.16
Design2-     0.19            0.17
Design2      0.24            0.21
Design3      0.2             0.17
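The energy-per-bit figures follow from the power numbers on the "Designs" slide and the throughput numbers on the "Performance" slide. The sketch below redoes that division in Python; the pairing of each throughput value with a design is inferred from the chart data, so treat the table in the code as an assumption.

```python
# name: (power in mW, Kb/s with cache, Kb/s with perfect cache)
designs = {
    "Design 1":  (48, 118, 409),
    "Design 1+": (144, 263, 909),
    "Design 2-": (69, 357, 409),
    "Design 2":  (191, 793, 909),
    "Design 3":  (191, 966, 1142),
}


def uj_per_bit(power_mw, kbits_per_s):
    # mW / (Kb/s) = (1e-3 J/s) / (1e3 bit/s) = 1e-6 J/bit, i.e. uJ/bit
    return power_mw / kbits_per_s


for name, (mw, cache, perfect) in designs.items():
    print(f"{name}: {uj_per_bit(mw, cache):.2f} / {uj_per_bit(mw, perfect):.2f} uJ/bit")
```

The computed values agree with the charted ones to within rounding, which also shows why the faster designs can be more energy-efficient per bit despite drawing more power.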


n(s*J)/Bit

Design      Cache   Perfect Cache
Design1      3.39           0.293
Design1+     2.05           0.176
Design2-     0.532          0.416
Design2      0.315          0.231
Design3      0.207          0.148


Die Area (mm^2)

Design      Cache   Perfect Cache
Design1      2.1             2.1
Design1+     2.37            2.37
Design2-     6.14            6.14
Design2      6.7             6.7
Design3      6.7             6.7


Summary: Levels of Configurability

Configurability is highly desirable in embedded applications

There are many levels of configuration:

• Memory and Cache Sizes - get precisely the sizes your application needs

• Register file sizes

• Interrupt handling and addresses

• Peripherals

• Instructions

• Physical implementation