
© 2011 MIPS Technologies, Inc. All rights reserved.

Mobile Ecosystem: A Deep Focus on Modem Processing

Amit Rohatgi
Principal Architect, Mobile
MIPS Technologies

Digitimes Technical Forum


Agenda

• Defining the Mobile Ecosystem
• Basic Block Diagram of a Mobile SoC
• Radio Interface Layer (RIL)
• Current Look at Phone Segments
• Modem Technologies and Challenges
• Multi-threading Advantages in Modem Processing


Mobile Ecosystem

• The mobile ecosystem for feature phones or smartphones contains:
  • Architecture
  • Software (Apps + Modem)
  • Tools
  • OEMs/ODMs & Carriers
  • Power, multimedia features
• Connectivity to the internet defines a mobile device:
  • 3GPP/3GPP2
  • WiFi, BT, NFC, GPS

[Figure: the Mobile Ecosystem and its components: Architecture (CPU/GPU/DSP/IP); Modem S/W; Apps S/W; Tools/Dev; OS (e.g. Android); Performance (Power, Multimedia, Benchmarks); Partners (OEMs, ODMs, SoC, Carriers)]


Basic Mobile SoC Diagram

[Figure: block diagram of a mobile SoC. Labeled blocks: 3G/4G baseband modem, WiFi + BT, GPS, NFC, USIM, touch screen, power management, SPI, GPIO, memory controller, HDMI, MIPI camera, MIPI, speaker, mic, audio codec, GPU, video codec, USB, TV]


Android Radio Interface Layer (RIL)

• Android's Radio Interface Layer (RIL) provides an abstraction layer between Android telephony services (android.telephony) and radio hardware.

• The RIL is radio-agnostic; for example, it includes support for Global System for Mobile communication (GSM)-based radios.
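
The abstraction-layer idea can be illustrated with a minimal Python sketch: an upper telephony layer that depends only on a radio-agnostic interface, plus one GSM-style backend. All names here (`Radio`, `GsmRadio`, `TelephonyService`, `dial`, `call`) are hypothetical and chosen for illustration; this is not the actual Android RIL API.

```python
from abc import ABC, abstractmethod


class Radio(ABC):
    """Hypothetical radio-agnostic interface (not the real Android RIL API)."""

    @abstractmethod
    def dial(self, number: str) -> str:
        """Start a voice call and return the command sent to the radio."""


class GsmRadio(Radio):
    """Hypothetical GSM backend; a real one would talk to modem hardware."""

    def dial(self, number: str) -> str:
        # "ATD<number>;" is the standard AT command for dialing a voice call.
        return f"ATD{number};"


class TelephonyService:
    """Upper layer: depends only on the Radio interface, not on GSM details."""

    def __init__(self, radio: Radio) -> None:
        self.radio = radio

    def call(self, number: str) -> str:
        return self.radio.dial(number)


service = TelephonyService(GsmRadio())
command = service.call("5551234")
```

Swapping in a different `Radio` subclass (for example a CDMA backend) would leave `TelephonyService` unchanged, which is the point of keeping the layer radio-agnostic.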


Current Phone Segments

Phone Type | Low-tier feature phone | Mid-tier feature phone | Low-tier smartphone | Mid-tier smartphone | High-tier smartphone
Baseband/App Processor Chipset | M14Kc | M14Kc/24Ke | 24Ke/34K | 34K/34K family or 74K? or 1004K | 1004K family
# Cores | 1 | 1 | 1 | 1, 2 | 3+ (multiple VPEs)
Frequency | less than 200 MHz | 200-400 MHz | 400-600 MHz | 600-800 MHz | 1 GHz+
Cache (L1 I/D in KB, L2) | 0/0, 0 | 8/8, 0 | 8/8, 0 | 16/16, 128KB | 32/32, 256-640KB
GPU (e.g.) | N/A | Vivante GC-200 | Vivante GC-300 | SGX530+ | SGX545-SGXMP
OS | JAVA ME/BREW/proprietary | JAVA ME/BREW/proprietary | Android/BMP | Android/BMP/QNX/Winmobile7 | Android/BMP/QNX/Win7/Win8/iOS
RAM | 16-64MB | 128MB | 128MB | 256MB | 512MB
Screen | QQVGA/QCIF | WQVGA | HVGA-VGA | WVGA+ | QHD+
Retail Cost Targets | <$50 | $50-$100 | $100-$250 | $250-$500 | $500+


Modem Technologies

• 3GPP/3GPP2 (Cellular)
  • GSM, CDMA, UMTS, LTE
• Wireless LAN
  • 802.11b/g/n/ac
• Wireless PAN
  • Bluetooth, NFC, Zigbee
• Location
  • GPS, GLONASS, Compass, Galileo


Challenges

• Mobile phones have become rich in modem communication technologies

• Higher data rates (video) and lower latency (VoIP) have put a burden on processor architectures and software planning

• Time-to-market and stability are key elements to success

• Challenge: how to integrate multiple modem technologies, maintain standards compatibility, and offer low latency for upcoming technologies, while maintaining stability and fast time to market?

Hardware Multi-threading may provide the answer


Multi-threading Advantages in Modem Processing


Agenda

• General Requirements
• Multi-threading (MT) Benefits
• Task-level Separation
• MIPS Technologies’ Unique MT Cores
• Use Cases
• Summary


Mobile Modem Processing Requirements

• Architecture (HW and SW)
  • Small silicon → low cost
  • Low power → longer battery life
  • Efficient protocol stack → scalability and maintenance
• Functional
  • Low latency: going below 1 msec
  • Faster throughput: 3GPP/3GPP2 heading to 100 Mbps+
  • Multi-modem integration (cellular, GPS, WiFi, Bluetooth, NFC)
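
To put the two functional figures side by side, the short Python calculation below works out how much data arrives within one latency window. The function name `bytes_per_window` is made up for this illustration; the 100 Mbps and 1 msec inputs are the figures quoted above. At 100 Mbps, 100,000 bits (12,500 bytes) arrive every millisecond, which is the amount of data the modem processor has to move through the stack within each latency budget.

```python
def bytes_per_window(rate_bps: int, window_us: int) -> int:
    """Whole bytes that arrive during one latency window at a given bit rate."""
    bits = rate_bps * window_us // 1_000_000  # bits arriving in the window
    return bits // 8


# Figures from the slide: 100 Mbps throughput, 1 msec (1000 us) latency budget.
payload_bytes = bytes_per_window(100_000_000, 1_000)
print(f"{payload_bytes} bytes arrive per 1 msec window at 100 Mbps")
```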


Benefits of Hardware Multi-threading

• CPU stall cycles occur on:
  • L1 cache miss, waiting on fills
  • L2 cache miss, waiting on external memory
  • Branch misprediction
• Multi-threading fills the stalled pipeline with parallel program threads:
  • Fewer wasted stall cycles
  • Net-net = fewer total cycles required for the overall program to complete
  • Higher aggregate throughput
  • Much greater task-swapping efficiency
  • Supports multiple OSes concurrently
  • Add more software capabilities at lower integration cost
  • Fewer instructions fetched and discarded
  • Lower power consumption
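
The stall-filling effect can be sketched with a toy cycle-count model in Python. This is a simplified illustration under stated assumptions, not a model of any MIPS pipeline: each instruction issues in one cycle, each number in a thread's list is a hypothetical stall (in cycles) that follows that instruction, and the pipeline can issue from any ready thread in round-robin order.

```python
def sequential_cycles(threads):
    """Cycles when threads run one after another and every stall is wasted."""
    return sum(1 + stall for thread in threads for stall in thread)


def interleaved_cycles(threads):
    """Cycles when one pipeline issues from any ready thread each cycle."""
    n = len(threads)
    next_instr = [0] * n  # index of the next instruction for each thread
    ready_at = [0] * n    # first cycle at which each thread may issue again
    cycle = 0
    last = -1             # thread that issued most recently (for round-robin)
    while any(next_instr[i] < len(threads[i]) for i in range(n)):
        for k in range(1, n + 1):
            i = (last + k) % n
            if next_instr[i] < len(threads[i]) and ready_at[i] <= cycle:
                stall = threads[i][next_instr[i]]
                next_instr[i] += 1
                ready_at[i] = cycle + 1 + stall  # thread is blocked until then
                last = i
                break  # at most one instruction issues per cycle
        cycle += 1
    # The run ends once the last outstanding stall has drained.
    return max(ready_at, default=0)


# Two hypothetical threads; a 3-cycle stall follows every other instruction.
workload = [[3, 0, 3, 0], [3, 0, 3, 0]]
single = sequential_cycles(workload)
multi = interleaved_cycles(workload)
print(f"sequential: {single} cycles, interleaved: {multi} cycles")
```

In this model the interleaved run can never need more cycles than the sequential one, and it needs fewer whenever one thread has work ready while another is stalled.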


Task Level Partitioning for Modem Processing

• Separation of control plane and user plane functionality in L1-L3

• Separation of real-time tasks and non-real-time tasks

• Separation of modem technologies – 1x, UMTS, LTE, GPS, WiFi, Bluetooth, NFC

• Illustrative example follows


Modem Protocol Stack Partitioning

[Figure: protocol stack partitioned by plane, with one core/thread for the control plane, one core/thread for the user plane, and one core/thread for interrupt processing]


Modem Protocol Stack Partitioning

[Figure: protocol stack partitioned by layer, with one core/thread for L1 control/MAC, one core/thread for L3/NAS, and one core/thread for interrupt processing]


Modem Protocol Stack Partitioning

[Figure: protocol stack partitioned by modem technology, with one core/thread for 3GPP/3GPP2, one core/thread for GPS/WiFi, one core/thread for BT/NFC, and one core/thread for interrupt processing]


MT Benefits:

• Greater performance per mW and mm²
• Very low latency guaranteed through dedicated thread for ISRs
• Measured savings in s/w development and testing due to reduced freezes, stalls and glitches
• Finer granularity for power management


Key Capabilities in MIPS’ Multi-threaded Products

• Two-layer multi-threading framework:
  • Virtual Processing Elements (VPEs)
    • Complete copy of processor state as seen by software
    • Each VPE appears as a CPU resource to SMP operating systems
    • Analogous to Hyper-Threading in x86 architecture
  • Thread Contexts (TCs): support for light-weight multi-threading
    • Expand number of threads and performance without need for full VPE hardware
    • Thread Contexts (TCs) map to VPEs
• Multi-threading enhanced features:
  • Thread context switching on a per-cycle basis
  • Zero-overhead interrupt capability
    • Can implement through “parking” a thread; use external event to trigger execution
  • Quality of Service (QoS)
    • Through use of user-configurable thread scheduler
    • Eases management of real-time behavior
  • Efficient inter-thread communication
    • For implementing high-performance data flow
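
The two-layer framework can be pictured as a small data model. The Python sketch below is only an illustration of the relationship described above (TCs map to VPEs, and each VPE looks like a CPU to an SMP OS); the class and method names are hypothetical and do not correspond to any MIPS software interface.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class ThreadContext:
    """Light-weight per-thread state: program counter and registers."""
    pc: int = 0
    gprs: List[int] = field(default_factory=lambda: [0] * 32)  # MIPS32 has 32 GPRs


@dataclass
class VPE:
    """Virtual Processing Element: appears as a full CPU to an SMP OS."""
    name: str
    tcs: List[ThreadContext] = field(default_factory=list)


@dataclass
class Core:
    """One physical core whose VPEs share the same execution hardware."""
    vpes: List[VPE] = field(default_factory=list)

    def cpus_seen_by_os(self) -> int:
        return len(self.vpes)

    def hardware_threads(self) -> int:
        return sum(len(vpe.tcs) for vpe in self.vpes)


# Hypothetical configuration: two VPEs, three TCs in total.
core = Core([
    VPE("VPE0", [ThreadContext(), ThreadContext()]),
    VPE("VPE1", [ThreadContext()]),
])
print(core.cpus_seen_by_os(), "CPUs visible to the OS,",
      core.hardware_threads(), "hardware threads")
```

Adding a `ThreadContext` raises the hardware thread count without adding another OS-visible CPU, which mirrors the "without need for full VPE hardware" point.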


Virtual Processors & Thread Contexts

[Figure: two virtual processors drawn over one set of pipelines and caches. Each Virtual Processor Element is labeled with MMU/TLB, interrupts, and debug; each Thread Context is labeled with GPRs and a program counter. Three Thread Contexts are shown in total.]


MIPS Multi-threading Cores

• 34K – single core, multi-threaded performance
  • Focus on higher number of user-level H/W threads
  • 1-2 VPEs, up to 9 TCs
  • Customizable policy manager
• 1004K – multi-core, multi-thread
  • “Best mix” of CMP and MT
  • Cache coherence block for multicore and I/O transactions
• Prodigy – optimized for MT and ST performance
  • SMT = simultaneous MT execution
  • Execution of multiple instructions/cycle


Advantages & Challenges of MT

Advantages

• Multi-threaded applications can show a 20-30% performance boost or higher
• Efficient use of CPU resources
• Minimized pipeline stalls
• Ultra-low interrupt response time
• In combination with multi-core, provides finer-grained control of power and performance

Challenges

• Single-task, single-threaded applications show no benefit
• Parallelism within the same workload is an architectural design consideration
• Parallel threads share cache resources


Engineering Use Cases

Baseband Processing using Multi-threaded Cores


Key LTE (4G) PHY Parameters

Data throughput necessitates high compute performance, as well as very low latency!


Customer Example: LTE Feature Phone

[Figure: two VPEs (VPE 0 and VPE 1) on common hardware, each with its own OS, applications, and TCs, with QoS between them]

• Multi-tasking OS (Linux) running applications, such as a browser, on one VPE
• RTOS (ThreadX) handling Voice over LTE on the other VPE
• Dedicated TCs for each function (Enc, Dec, E.C.)
• Policy manager sets relative priority of the tasks
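
How a policy manager might turn relative priorities into issue slots can be sketched with a generic smooth weighted round-robin scheduler in Python. This is an assumption-laden illustration: the slides do not describe the policy manager's actual interface or algorithm, and the thread names and weights below are hypothetical.

```python
def weighted_schedule(weights, slots):
    """Smooth weighted round-robin: return the thread chosen for each slot."""
    credit = {name: 0 for name in weights}
    total = sum(weights.values())
    order = []
    for _ in range(slots):
        # Every thread earns credit in proportion to its weight.
        for name, weight in weights.items():
            credit[name] += weight
        # The thread with the most credit gets this slot and pays for it.
        chosen = max(credit, key=credit.get)
        credit[chosen] -= total
        order.append(chosen)
    return order


# Hypothetical priorities: the voice thread gets 3 slots for every browser slot.
priorities = {"voice": 3, "browser": 1}
order = weighted_schedule(priorities, 8)
print(order)
```

Over any run whose length is a multiple of the weight total, each thread receives slots in exact proportion to its weight, and the lower-priority thread is never starved.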


Customer Example: I/O Co-processor

• 1 VPE used as an I/O co-processor
  • Code runs in scratchpad RAM
  • No segmentation of cache needed => no cache thrashing
  • Zero-latency interrupt/event servicing
  • Inter-thread communication to signal events to another VPE for processing

• e.g. use to monitor F-QPCH in 1x (slotted reception for sleep mode): wake up the receiver for a 5 ms window, decode the quick-paging bit, optionally send an event via ITC, and park

• e.g. LTE UE tracking update – now conducted in idle and connected state (during user plane activity) to avoid ping-pong effects and minimize network signaling
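
The wake, decode, signal, park cycle from the quick-paging example can be mimicked in software. The Python sketch below uses ordinary OS threads and queues as stand-ins: a blocking `get()` plays the role of a parked hardware thread and a second queue plays the role of ITC. The function name, the slot tuples, and the `None` stop sentinel are all assumptions made for this demo, not part of any MIPS or 3GPP2 interface.

```python
import queue
import threading


def io_thread(events: queue.Queue, itc: queue.Queue) -> None:
    """Stay parked until a slot arrives; signal only when its paging bit is set."""
    while True:
        slot = events.get()      # blocks here: the thread is "parked"
        if slot is None:         # sentinel used only to end this demo
            break
        slot_id, paging_bit = slot
        if paging_bit:           # a page is pending: signal the other side
            itc.put(slot_id)
        # Otherwise loop around and park again without waking anyone.


events: queue.Queue = queue.Queue()
itc: queue.Queue = queue.Queue()
worker = threading.Thread(target=io_thread, args=(events, itc))
worker.start()

# Hypothetical paging slots as (slot id, quick-paging bit).
for slot in [(0, 0), (1, 1), (2, 0), (3, 1)]:
    events.put(slot)
events.put(None)
worker.join()

woken_for = []
while not itc.empty():
    woken_for.append(itc.get())
print("main processor signaled for slots:", woken_for)
```

Only the slots with the paging bit set reach the second queue, so the consumer of `itc` (standing in for the other VPE) is disturbed only when there is real work.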


Customer Example: Data Flow Handler

• Producer / Consumer Model
  • Handles asynchronous and parallel task of passing data between layers
  • Continuous processing, with single semaphore for inter-thread communication or FIFO (example below)

[Figure: Producer → Consumer]
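
A software analogue of the FIFO variant is sketched below in Python, with a bounded `queue.Queue` standing in for the inter-thread FIFO. The layer roles, the packet names, and the `None` end marker are hypothetical; on the hardware described in these slides the hand-off would use inter-thread communication rather than OS threads.

```python
import queue
import threading


def producer(fifo: queue.Queue, packets) -> None:
    """Lower layer: push packets into the FIFO as they become available."""
    for packet in packets:
        fifo.put(packet)   # blocks while the FIFO is full (back-pressure)
    fifo.put(None)         # end-of-stream marker for this demo


def consumer(fifo: queue.Queue, delivered: list) -> None:
    """Upper layer: pull packets from the FIFO and process them in order."""
    while True:
        packet = fifo.get()  # blocks while the FIFO is empty
        if packet is None:
            break
        delivered.append(packet.upper())  # stand-in for real processing


fifo: queue.Queue = queue.Queue(maxsize=2)  # small depth to exercise blocking
delivered: list = []
packets = ["pdu-a", "pdu-b", "pdu-c", "pdu-d"]

threads = [
    threading.Thread(target=producer, args=(fifo, packets)),
    threading.Thread(target=consumer, args=(fifo, delivered)),
]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(delivered)
```

With a single producer and a single consumer, the FIFO preserves packet order, and the bounded depth keeps a fast producer from running arbitrarily far ahead of the consumer.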


Summary: MT Advantages in Modem Processing

• MT better utilizes CPU resources
  • Get more work done with a lower clock
• MP and MT: flexible performance options
  • Finer-grained control of performance vs. gate count
  • Fewer gates means lower static and dynamic power – longer standby and talk time
• MT results in fewer cache misses

Hardware multi-threading can mean lower power requirements, smaller die size (vs. adding another core), lower costs, and faster TTM