Microcontrollers: Introduction

Dott. Credits: Domenico Balsamo, Michele Magno, Luca Benini

2

Embedded Systems

Embedded computing system: any device that includes a programmable computer but is not itself a general‐purpose computer.

One or more microcontrollers (MCU) hidden in a variety of devices and objects:

The MCU has to control and enhance the functionalities of the device

The MCU is a secondary characteristic and must have a small impact on resource consumption and costs.

3

Digital Electronic Integrated circuit

4

What is a microcontroller (AKA MCU)?

A Microcontroller is a small CPU with many support devices built into the chip

Self Contained (CPU, Memory, I/O)

Application or Task Specific (Not a general-purpose computer)

Appropriately scaled for the job

Small power consumption

Low cost ($0.50 to $5.00)

5

MCU-Based System Architecture

Flexible sensor interface
Ultra-low power standby
Very fast wakeup
Watchdog and monitoring
Efficient wireless protocol primitives
Data SRAM is the critical limiting resource

[Block diagram: processor with data SRAM and program EPROM; timers; sensor interface for digital sensors and, through an ADC, analog sensors; wireless network interface (RF transceiver, antenna); wired network interface (serial link, USB, EN, …); power supply with standby and wakeup; flash storage for program images and data logs; watchdog (WD).]

6

Market & Families

Microcontroller unit sales are 15x higher than those of microprocessors, and MCUs are much cheaper.

Most manufacturers offer a wide range of devices, from low-end to higher-end applications.

Top MCU suppliers ($) – www.statista.com

NXP acquired Freescale

7

How do we compare and classify microcontrollers?

Performance metrics are NOT easy to define and are mostly application dependent.

Performance Metrics

Computation:
- Clock speed
- MIPS (million instructions per second)
- Latency: lateness of the response, i.e. the lag between the beginning and the end of the computation
- Throughput: tasks per second, bytes per second

Electrical:
- Power consumption
- Supply voltage
- Noise immunity
- Sensitivity

Goal: the best tradeoff between power consumption and performance
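The difference between latency and throughput can be made concrete with a small sketch. The `TaskRecord` type and the timestamps are hypothetical, chosen only for illustration; they do not come from the slides.

```python
from dataclasses import dataclass


@dataclass
class TaskRecord:
    start_s: float  # time the computation began, in seconds
    end_s: float    # time the computation finished, in seconds


def latency_s(rec):
    """Lag between the beginning and the end of one computation."""
    return rec.end_s - rec.start_s


def throughput_tasks_per_s(records):
    """Completed tasks divided by the length of the observation window."""
    window = max(r.end_s for r in records) - min(r.start_s for r in records)
    return len(records) / window


# Hypothetical trace: three overlapping tasks, each taking 0.5 s.
records = [TaskRecord(0.0, 0.5), TaskRecord(0.25, 0.75), TaskRecord(0.5, 1.0)]
latencies = [latency_s(r) for r in records]
throughput = throughput_tasks_per_s(records)
```

Because the tasks overlap, every task has the same 0.5 s latency while the system still completes 3 tasks per second, which is why the two metrics have to be reported separately.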

8

Example of MCU Architecture I/O PortADC ‐ DAC

USARTxTIMERsDMA

MemoryClock

BUS

CPU

9

The MCU CORE

An instruction processor

Instruction set
- CISC: Complex Instruction Set Computing (Intel x86 family; Motorola 680x0 family)
- RISC: Reduced Instruction Set Computing (AIM PowerPC, ARM family, ATMEL AVR family)

Architecture (by maximum integer operand width)
- 8 bit (Intel 8051, Motorola 6800, ATMEL AVR)
- 16 bit (Intel 8088, Motorola 68000, TI MSP430)
- 32 bit (ARM v7, x86 family, Motorola 680x0 family, PowerPC)
- 64 bit (ARM v8, x86-64 family, PowerPC)

Recent trend
- Proprietary ISAs (you cannot make your own processor) vs. open-source ISAs (e.g. RISC-V)

10

The CPU consists of a data section containing registers and an ALU, and a control section, which interprets instructions and effects register transfers. The data section is also known as the datapath.

Abstract View of a CPU

11

Datapath & Control

Datapath: storage, functional units (FU), and interconnect sufficient to perform the desired functions
- Inputs are control points
- Outputs are signals (such as overflow, negative, etc.)

Controller: state machine that orchestrates operations on the datapath
- Based on the desired function and the signals

[Figure: the controller drives the datapath through control points; the datapath returns signals to the controller.]

12

The datapath usually consists of a collection of registers known as the register file and the arithmetic and logic unit (ALU).

An Example Datapath

13

Microcontroller Architectures

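[Figure: von Neumann architecture: the CPU reaches a single memory holding program and data over one address bus and one data bus. Harvard architecture: the CPU reaches a program memory over an address bus and a fetch bus, and a separate data memory over an address bus and a data bus. Memory addresses run from 0 to 2^n.]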

14

von Neumann architecture

15

Harvard architecture

16

Harvard features

17

von Neumann vs. Harvard

Harvard can’t use self-modifying code.

Harvard allows two simultaneous memory fetches.

Most DSPs use Harvard architecture for streaming data:
- greater memory bandwidth;
- more predictable bandwidth.
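The bandwidth argument can be illustrated with a deliberately idealized bus model. It assumes one bus cycle per instruction fetch and one per data access, and it ignores caches, wait states, and prefetch buffers; the instruction names and the `bus_cycles` helper are hypothetical.

```python
def bus_cycles(instructions, shared_bus):
    """Count memory-bus cycles for a list of instruction mnemonics.

    Every instruction needs one fetch. 'load' and 'store' also need one
    data access. On a shared (von Neumann) bus the two accesses are
    serialized; on separate (Harvard) buses they can overlap.
    """
    cycles = 0
    for op in instructions:
        needs_data = op in ("load", "store")
        if shared_bus:
            cycles += 1 + (1 if needs_data else 0)
        else:
            cycles += 1  # fetch and data access happen in the same cycle
    return cycles


# Hypothetical streaming loop body: read a sample, process it, write it back.
program = ["load", "add", "store", "add"]
von_neumann_cycles = bus_cycles(program, shared_bus=True)
harvard_cycles = bus_cycles(program, shared_bus=False)
```

In this model the Harvard count depends only on the number of instructions, not on how many of them touch data, which is one way to read "more predictable bandwidth".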

18

von Neumann Architecture: an example

MSP430 (Texas Instruments)

von Neumann architecture

All program memory, data memory, and peripherals share a common bus structure.

Consistent CPU instructions and addressing modes are used.

19

Harvard Architecture Example

Cortex M3: The mainstream ARM processor for microcontroller applications.

20

Harvard Architecture Example: Cortex M3/4

Memory protection unit (MPU)

Prevents an application task from corrupting OS or other task data
▪ Improves system reliability

User-configurable regions
▪ Address
▪ Size
▪ Memory attributes
▪ Access permissions

[Figure: an ARM Cortex-M running a privileged OS kernel and tasks A, B, and C. The kernel sets the MPU configuration, and the MPU guards memory (data for the OS kernel and for tasks A, B, and C) and I/O #0 to I/O #n.]
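The region idea (address, size, access permissions) can be sketched as a simple lookup. This is a simplified model, not the Cortex-M register interface: the `Region` fields, the example addresses, and the rule that an unmatched access faults are all assumptions, and memory attributes and region priority are left out.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Region:
    base: int              # first address covered by the region
    size: int              # region length in bytes
    privileged_only: bool  # True: only the OS kernel may access it
    writable: bool         # False: read-only region


def access_allowed(regions, address, write, privileged):
    """Return True if the access is permitted by the first matching region."""
    for r in regions:
        if r.base <= address < r.base + r.size:
            if r.privileged_only and not privileged:
                return False
            if write and not r.writable:
                return False
            return True
    return False  # assumption: an access outside every region is a fault


# Hypothetical memory map.
regions = [
    Region(base=0x2000_0000, size=0x1000, privileged_only=True, writable=True),   # OS kernel data
    Region(base=0x2000_1000, size=0x0400, privileged_only=False, writable=True),  # task A data
]

task_writes_kernel = access_allowed(regions, 0x2000_0010, write=True, privileged=False)
kernel_writes_kernel = access_allowed(regions, 0x2000_0010, write=True, privileged=True)
task_writes_own_data = access_allowed(regions, 0x2000_1004, write=True, privileged=False)
```

The unprivileged write into the kernel region is rejected, which is the "task cannot corrupt OS data" guarantee in miniature.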

Universität Dortmund

Instruction Execution

• Multiple stages are involved in executing an instruction.
  – Example:
    1) Fetching the instruction code
    2) Decoding the instruction code
    3) Executing the instruction code
• Hence multiple processor clock cycles are needed to execute one single instruction.

[Figure: timeline without pipelining: the 1st instruction is fetched, decoded, and executed before the 2nd instruction is fetched, decoded, and executed.]

Instruction Pipeline

• The pipeline allows concurrent execution of multiple different instructions
  – execution of different stages of multiple instructions at the same time
• During normal operation
  – while one instruction is being executed
  – the next instruction is being decoded
  – and a third instruction is being fetched from memory
  – allows effective throughput to increase to one instruction per clock cycle

Simple 3-Stage Pipeline

• The ARM Cortex-M3 uses a 3-stage pipeline for instruction execution
  – Fetch → Decode → Execute
  – Pipeline design allows effective throughput to increase to one instruction per clock cycle
  – Allows the next instruction to be fetched while still decoding or executing the previous instructions

[Figure: timeline with pipelining: the 1st, 2nd, and 3rd instructions each pass through Fetch, Decode, and Execute, each starting one cycle after the previous one.]
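The "one instruction per clock cycle" claim follows from a simple cycle count. The sketch below assumes an ideal pipeline in which every stage takes one cycle and there are no stalls, branches, or memory wait states; the function names are illustrative.

```python
def cycles_sequential(n_instructions, n_stages=3):
    """Each instruction finishes all stages before the next one starts."""
    return n_instructions * n_stages


def cycles_pipelined(n_instructions, n_stages=3):
    """The first instruction fills the pipeline; each later one adds a cycle."""
    if n_instructions == 0:
        return 0
    return n_stages + (n_instructions - 1)


few_seq = cycles_sequential(3)       # 9 cycles
few_pipe = cycles_pipelined(3)       # 5 cycles
many_seq = cycles_sequential(1000)   # 3000 cycles
many_pipe = cycles_pipelined(1000)   # 1002 cycles
```

For long instruction streams the pipelined count approaches one cycle per instruction, while the latency of each individual instruction stays at three cycles.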

ARM Processors Families 

25

ARM Processors Architectures (2)

• Key attributes: implementation size, performance, and very low power.
• Architecture types:
  – ARMv4T architecture introduced the 16-bit Thumb® instruction set alongside the 32-bit ARM instruction set.
  – ARMv5TEJ architecture introduced arithmetic support for digital signal processing (DSP) algorithms.
  – ARMv6 architecture introduced an array of new features including the Single Instruction Multiple Data (SIMD) operations.
  – ARMv7 architecture implements Thumb-2 technology.
  – ARMv8 architecture implements a 64-bit instruction set.
• Cortex-A: implements a virtual memory system architecture based on an MMU, an optional NEON processing unit for multimedia applications, and advanced hardware floating point.
• Cortex-R: implements a protected memory system architecture based on an MPU (memory protection unit).
• Cortex-M: microcontroller profile designed for fast interrupt processing.

26

Alberto Macii - Politecnico di Torino

Cortex M family - Comparison

Embedded ARM Cortex Processors

• Cortex M0:
  – Ultra low gate count (less than 12 K gates).
  – Ultra low-power (3 µW/MHz).
  – 32-bit processor.

28

Embedded ARM Cortex Processors 

• Cortex M3:
  – The mainstream ARM processor for microcontroller applications.
  – High performance and energy efficiency.

29

Cortex M3 Central Core

• Harvard architecture:
  – Separate instruction and data buses enable parallel fetch and store.
• Advanced 3-stage pipeline:
  – Includes branch forwarding and speculation
• Additional write-back via bus matrix.

30

Embedded ARM Cortex Processors (4)

31

Cortex M4: embedded processor for DSP with FPU

Cortex M7

2x the performance of the M4

ARMv8 64-bit

• Premium smartphones
• Enterprise servers
• Home server
• Wireless infrastructure
• Digital TV

Cortex A57 Block Diagram

ARM Partnership Model

35

High-performance Cortex™-M4 MCU: STM32 F4 series

STM32 F4 series: most powerful Cortex-M

Key features

STM32 – leading Cortex-M portfolio

STM32 product series

4 product series

STM32 F4 portfolio

STM32 F4 series – applications served

Points of sale/inventory management

Industrial automation and solar panels

Transportation

Medical

Building

Security/fire/HVAC

Test and measurement

Consumer

Communication

STM32 F4 block diagram: feature highlights

168 MHz Cortex-M4 CPU
Floating point unit (FPU)
ART Accelerator™
Multi-level AHB bus matrix
1-Mbyte Flash, 192-Kbyte SRAM
1.7 to 3.6 V supply
RTC: <1 µA typ, sub-second accuracy
2x full duplex I²S
3x 12-bit ADC, 0.41 µs / 2.4 MSPS
168 MHz timers
51/82/114/140 I/Os
USB 2.0 OTG FS/HS
Encryption**
Camera interface
3x 12-bit ADC, 24 channels / 2 Msps
3x I2C
Up to 16 ext. ITs
Temp sensor
2x 6x 16-bit PWM, synchronized AC timer
2x watchdog (independent & window)
5x 16-bit timer
XTAL oscillators: 32 kHz + 8~25 MHz
Power supply reg. 1.2 V
POR/PDR/PVD
2x DAC + 2 timers
2x USART/LIN
1x SPI
1x SysTick timer
PLL
Clock control
RTC / AWU
4 KB backup RAM
Ethernet MAC 10/100, IEEE 1588
USB 2.0 OTG FS
4x USART/LIN
1x SDIO
Int. RC oscillators: 32 kHz + 16 MHz
3x 16-bit timer
2x 32-bit timer
2x CAN 2.0B
2x SPI / I2S

HS requires an external PHY connected to the ULPI interface.
** Encryption is only available on STM32F415 and STM32F417.

4

STM32F4xx block diagram with details

Cortex-M4 w/ FPU, MPU and ETM

Memory
- Up to 1 MB Flash memory
- 192 KB RAM (including 64 KB CCM data RAM)
- FSMC up to 60 MHz

New application-specific peripherals
- USB OTG HS w/ ULPI interface
- Camera interface
- HW encryption**: DES, 3DES, AES 256-bit, SHA-1 hash, RNG

Enhanced peripherals
- USB OTG full speed
- ADC: 0.416 µs conversion / 2.4 Msps, up to 7.2 Msps in interleaved triple mode
- ADC/DAC working down to 1.8 V
- Dedicated PLL for I²S precision
- Ethernet w/ HW IEEE 1588 v2.0
- 32-bit RTC with calendar
- 4 KB backup SRAM in VBAT domain
- 2 x 32-bit and 8 x 16-bit timers
- High-speed USART up to 10.5 Mb/s
- High-speed SPI up to 37.5 Mb/s

RDP (JTAG fuse)

More I/Os in UFBGA 176 package
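The ADC figures quoted above are consistent with each other, as a quick calculation shows. The sketch assumes that the sample rate is simply the inverse of the conversion time and that the three interleaved ADCs are evenly staggered so that their rates add; the helper name is illustrative.

```python
def sample_rate_msps(conversion_time_us, n_interleaved=1):
    """Aggregate sample rate in Msps for ADCs with the given conversion time.

    One conversion every `conversion_time_us` microseconds is
    1 / conversion_time_us million samples per second; evenly interleaved
    converters are assumed to multiply that rate.
    """
    return n_interleaved / conversion_time_us


single_adc = sample_rate_msps(0.416)                    # about 2.4 Msps
triple_mode = sample_rate_msps(0.416, n_interleaved=3)  # about 7.2 Msps
```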

[Block diagram labels: ARM® 32-bit multi-AHB bus matrix; Arbiter (max 168 MHz); Flash I/F; Cortex-M4 CPU + FPU + MPU, 168 MHz; 128 KB SRAM; DMA, 16 channels; bridges to APB1 (max 42 MHz) and APB2 (max 84 MHz); JTAG/SW debug; ETM; nested vectored interrupt controller; 512 KB to 1 MB Flash memory; external memory interface; AHB1 (max 168 MHz); AHB2 (max 168 MHz); 64 KB CCM data RAM; D-bus, I-bus, S-bus.]

Extensive tools and SW

Evaluation board for full product feature evaluation
- Hardware evaluation platform for all interfaces
- Possible connection to all I/Os and all peripherals

Discovery kit for cost-effective evaluation and prototyping

Large choice of development IDE solutions from the STM32 and ARM ecosystem

STM32F4DISCOVERY: $14.90

STM3240G-EVAL: $349

Why Is Low Power So Important for MCUs?

Longer battery life
Smaller products
Simpler power supplies
Less EMI simplifies PCB
Permanent battery
Reduced liability

Power as a Design Constraint

Why worry about power?
- Battery life in portable and mobile platforms
- Power consumption in desktops, server farms
- Cooling costs, packaging costs, reliability, timing
- Power density: 30 W/cm² in Alpha 21364 (3x that of a typical hot plate)

Where does power go in CMOS?

$$P = A C V^2 f + A V I_{short} f + V I_{leak}$$

- $A C V^2 f$: dynamic power consumption
- $A V I_{short} f$: power due to short-circuit current during transition
- $V I_{leak}$: power due to leakage current

Dynamic Power Consumption

$$A C V^2 f$$

A – Activity of gates. How often, on average, do wires switch?

f – Clock frequency. Trend: increasing ...

V – Supply voltage. Trend: has been dropping with each successive fab.

C – Total capacitance seen by the gate's outputs. Function of wire lengths, transistor sizes, ...

Reducing Dynamic Power
1) Reducing V has a quadratic effect; limits?
2) Lower C: shrink structures, shorten wires
3) Reduce switching activity: turn off unused parts or use design techniques to minimize the number of transitions
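The quadratic effect of the supply voltage is easy to check numerically. The sketch below evaluates the dynamic term A·C·V²·f; the activity factor, capacitance, and frequency are hypothetical values picked only to make the comparison concrete.

```python
def dynamic_power_w(activity, capacitance_f, voltage_v, frequency_hz):
    """Dynamic power A * C * V^2 * f, in watts."""
    return activity * capacitance_f * voltage_v ** 2 * frequency_hz


# Hypothetical operating point: A = 0.1, C = 1 nF, f = 16 MHz.
A, C, F = 0.1, 1e-9, 16e6

p_at_3v3 = dynamic_power_w(A, C, 3.3, F)
p_at_1v8 = dynamic_power_w(A, C, 1.8, F)
voltage_saving_ratio = p_at_1v8 / p_at_3v3        # (1.8 / 3.3)^2, about 0.30

p_half_freq = dynamic_power_w(A, C, 3.3, F / 2)
frequency_saving_ratio = p_half_freq / p_at_3v3   # 0.5
```

Dropping the supply from 3.3 V to 1.8 V cuts dynamic power to about 30% of its original value, whereas halving the clock only halves it and, for a fixed workload, also doubles the time the work takes.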

Short-circuit Power Consumption

Finite slope of the input signal causes a direct current path between VDD and GND for a short period of time during switching, when both the NMOS and PMOS transistors are conducting.

[Figure: CMOS inverter with input Vin, output Vout, load capacitance CL, and short-circuit current Ishort.]

$$A V I_{short} f$$

Reducing Short-circuit Power
1) Lower the supply voltage V
2) Slope engineering: match the rise/fall time of the input and output signals

Leakage Power

$$V I_{leak}$$

Sub-threshold current grows exponentially with increases in temperature and decreases in Vt.

[Figure: sub-threshold current.]

How can we reduce power consumption?

Dynamic power consumption
- Reduce the rate of charge/discharge of highly loaded nodes
- Reduce spurious switching (glitches)
- Reduce switching in idle states (clock gating)
- Decrease frequency
- Decrease voltage (and frequency)

Static power consumption
- Smaller area (!)
- Reduce device leakage through power gating
- Reduce device leakage through body biasing
- Use higher-threshold transistors when possible

Power performance tradeoffs!

Typical Ultra-Low Power MCU Architecture

[Figure: a system clock generator provides ACLK, SMCLK, and MCLK; MCLK feeds the CPU.]

Key Features

• MCLK: main clock provided to the CPU
• SMCLK: sub-main clock provided to the peripherals
• ACLK: auxiliary clock at low frequency provided to the peripherals
• Peripherals can work at high and low frequency
• Each clock can be disabled (clock gating, reducing dynamic power) by setting bits in the status register (SR).
• The CPU can be disabled (reducing leakage power) by setting bits in the SR.
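How status-register bits map to gated clocks can be sketched with a small software model. The bit names and positions below are modeled on the MSP430 status register (CPUOFF, OSCOFF, SCG0, SCG1) and are assumptions to be checked against the device user's guide; the mapping is simplified, and SCG0, which controls the clock generator itself, is not modeled.

```python
# Assumed bit positions, modeled on the MSP430 status register.
CPUOFF = 0x0010  # stops the CPU and MCLK
OSCOFF = 0x0020  # stops the low-frequency oscillator that sources ACLK
SCG0 = 0x0040    # clock generator control (not modeled here)
SCG1 = 0x0080    # stops SMCLK

# Low-power modes expressed as combinations of those bits.
ACTIVE = 0x0000
LPM0 = CPUOFF
LPM3 = SCG1 | SCG0 | CPUOFF
LPM4 = SCG1 | SCG0 | OSCOFF | CPUOFF


def active_clocks(sr):
    """Return which of the three clocks keep running for a given SR value."""
    return {
        "MCLK": not (sr & CPUOFF),
        "SMCLK": not (sr & SCG1),
        "ACLK": not (sr & OSCOFF),
    }


clocks_in_lpm0 = active_clocks(LPM0)
clocks_in_lpm3 = active_clocks(LPM3)
clocks_in_lpm4 = active_clocks(LPM4)
```

In this model LPM3 keeps only the low-frequency ACLK alive, so a timer clocked from ACLK can still wake the system while the CPU and the fast peripheral clock are gated.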

Typical application profile

• Application phases:
  • OFF – power is not applied to the MCU
  • STARTUP INITIALIZATION – the MCU performs configuration (peripherals, clocks, …)
  • Tperiod
    • INACTIVE – the MCU is in low power mode to reduce power consumption
    • ACTIVE – the MCU is in normal mode and performs tasks

[Figure: supply current IDD over time: OFF, then STARTUP INITIALIZATION, then repeating periods Tperiod in which an IRQ moves the MCU from INACTIVE to ACTIVE to process its tasks before it returns to INACTIVE.]

Microcontroller Power States

Typ @ VDD = 1.8 V @ 25 °C

VBAT: 4 nA / 300 nA*
SHUTDOWN: 30 nA / 330 nA*
STANDBY: 115 nA / 415 nA*
STANDBY + 32 KB RAM: 350 nA / 650 nA*
STOP 2 (full retention): 1.1 µA / 1.4 µA*
STOP 1 (full retention): 6.6 µA / 6.9 µA*
LPSLEEP at 2 MHz: 48 µA/MHz
LPRUN at 2 MHz: 112 µA/MHz**
SLEEP at 26 MHz: 35 µA/MHz
RUN (Range 2) at 26 MHz: 100 µA/MHz**
RUN (Range 1) at 80 MHz: 120 µA/MHz**

* : with RTC
** : from SRAM1

[Figure: wake-up times of 256 µs, 14 µs, 14 µs, 5 µs, 4 µs, 6 cycles, and 6 cycles are shown next to the low-power modes; their pairing with the modes is not recoverable from this transcript.]
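Combining the application profile (ACTIVE and INACTIVE phases within each Tperiod) with the power-state figures gives an estimate of the average supply current. The sketch uses the RUN (Range 1) and STOP 2 values from the slide; the 1 s period, the 10 ms active time, and the 220 mAh battery are hypothetical, and wake-up and startup energy are ignored.

```python
def average_current_ua(i_active_ua, i_inactive_ua, t_active_s, t_period_s):
    """Time-weighted average current over one period, in microamps."""
    if not 0 <= t_active_s <= t_period_s:
        raise ValueError("active time must lie within the period")
    t_inactive_s = t_period_s - t_active_s
    return (i_active_ua * t_active_s + i_inactive_ua * t_inactive_s) / t_period_s


# From the slide: RUN (Range 1) draws 120 uA/MHz at 80 MHz; STOP 2 draws 1.1 uA.
i_run_ua = 120 * 80
i_stop2_ua = 1.1

# Hypothetical duty cycle: active for 10 ms out of every second.
i_avg_ua = average_current_ua(i_run_ua, i_stop2_ua, t_active_s=0.01, t_period_s=1.0)

# Hypothetical 220 mAh battery, ignoring self-discharge and regulator losses.
battery_life_h = 220_000 / i_avg_ua
```

With a 1% duty cycle the average is about 97 µA, roughly a hundredth of the run current. The active phase still dominates the total, so shortening the active time matters more here than choosing an even deeper sleep mode.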

54

How to Read Datasheets

Manufacturers of electronic components provide datasheets containing the specifications detailing the part/device characteristics;

Datasheets give the electrical characteristics of the device and the pin-out functions, but without detailing the internal operation;

More complex devices are provided with documents that aid the development of applications, such as application notes, user's guides, designer's guides, package drawings, etc.

55

Datasheet example
