www.imgtec.com
Jim Whittaker
EVP MIPS Business Unit
Multicore and MIPS: Creating the next generation of SoCs
© Imagination Technologies Multicore Keynote Sept 2014 2
Many new opportunities
Wearables
Home wireless for everything
Automation & Robotics
ADAS and intelligent transport
IoT/IoE
Health
Energy
Agriculture
Big data & analytics
Flexible CPU & heterogeneous processing key to catch the next wave
Imagination’s IP portfolio
Everything needed to create connected SoC solutions
Unified Memory
FlowCloud Connectivity
PowerVR Graphics & GPU Compute Processors
PowerVR Video & Vision Processors
Ensigma Communications Processors
MIPS General Processors
Each IP core is a class leader - when used with any other processors
Lowest power - Smallest silicon area
Open and customer-centric business model
Why Multicore
Number of transistors on a chip far exceeds the number we can use to increase single thread performance
Methods used to increase single thread performance result in reduced power efficiency
Workload dictates the optimum balance of compute resources
Optimised hardware for specific tasks improves performance/power
Momentum - MIPS CPUs
Already deployed across the spectrum
32-bit embedded microcontrollers
64-bit advanced networking processors
…and everything in-between!
MIPS is strong – and growing
Delivering the architecture
Delivering the IP cores
Building up the ecosystem
Revolutionising security
Delivering the most compelling alternative for 64/32-bit CPU IP
>5B MIPS CPUs shipped
Up to 40% smaller than competitors
Industry’s leading CoreMark performance
64-bit CPU IP shipping in volume for 20 years
And now… the next phase begins
I6400: not just the next MIPS CPU core – the next era of CPU IP
Aptiv: proAptiv, interAptiv, microAptiv
Warrior Series5P (MIPS r5, 32-bit): P5600
Warrior Series5M (MIPS r5, 32-bit): M5100 MCU, M5150 MPU with MMU
Now: Warrior Series6I (MIPS r6, 64/32-bit): I6400
I6400: Broad feature set for a wide range of applications
Segments: Automotive/Embedded, DTV/STB, Mobile, Enterprise
Feature sets highlighted:
64-bit, SMT, Virtualization, SIMD, Heterogeneous MC
SMT, Virtualization, SIMD, Heterogeneous MC, ECC
64-bit, SMT, MC, Multi-Cluster, Virtualization, ECC
SMT, Virtualization, SIMD, MC
Broadest set of applications ever addressed by a single MIPS core family
I6400 – A MIPS64 AND MIPS32 processor
[Diagram: MIPS64 shown as MIPS32 plus instructions dealing with 64-bit data]
MIPS64 is MIPS32, plus instructions for 64-bit data types
Runs MIPS32 software without mode switching
MIPS64/32 Release 6
Streamlining a highly efficient architecture
Modernization of architecture through:
Additional instructions for enhanced execution on modern software workloads:
JITs, VMs, PIC, etc., commonly found in JavaScript, browsers, and abstracted compiler technologies (e.g. LLVM)
MIPS: the ultimate 64/32-bit architecture
I6400 Multi-threading
Why MT?
A path to higher performance, and higher efficiency
30%-50% higher performance for 10% increase in cluster area*
Ex. CoreMark, DMIPS, SPECint2000
Decades of multi-threading expertise in MIPS and Imagination
Easy to use – programming model is same as multi-core
A thread looks like a core to standard SMP OSs
Simultaneous multi-threading (SMT) execution
Multiple threads execute in a given pipeline stage per cycle, or…
Superscalar execution on a single thread
Thread execution can switch dynamically per cycle
A powerful differentiator among IP cores
[Diagram: I6400 multi-threading. Instruction queues for Thread 1, Thread 2, Thread 3 and Thread 4 feed a hardware scheduler, which feeds the execution queues.]
* Preliminary performance benefit on popular benchmarks for adding a 2nd thread in I6400 processor, with silicon area cost
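The point that a hardware thread looks like a core to a standard SMP OS can be illustrated with a minimal Python sketch. It assumes only that the OS reports logical CPUs without distinguishing hardware threads from cores, so ordinary worker-pool code needs no SMT-specific changes. The helper names `logical_cpus` and `parallel_sum` are illustrative and do not come from the talk.

```python
import os
from concurrent.futures import ThreadPoolExecutor


def logical_cpus() -> int:
    """Logical CPUs reported by the OS (hardware threads are counted like cores)."""
    # os.cpu_count() may return None when the count cannot be determined.
    return os.cpu_count() or 1


def parallel_sum(values, workers=None):
    """Sum values using one worker per logical CPU, as on any multi-core system."""
    workers = workers or logical_cpus()
    # Strided slices give each worker a roughly equal share of the input.
    chunks = [values[i::workers] for i in range(workers)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return sum(pool.map(sum, chunks))


if __name__ == "__main__":
    print(logical_cpus(), parallel_sum(list(range(1000))))
```

The sketch shows program structure only: the same partitioning code runs whether the logical CPUs are separate cores or threads within one core.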
MIPS64 I6400 - hardware virtualization highlights
Secure:
Root is the secure hypervisor/kernel
Guest access rights controlled by Root
Full VZ using Root/Guest TLB
Scalable:
Supports up to 15 Guests (OS and/or Apps)
SoC virtualization support:
Virtualized GIC (interrupt controller) and IOMMU
Bus transactions to other IP include Guest ID
Benefits:
Ease of use - no modification required to Guest OS
Reliability – corrupted/crashed OS1 cannot affect OS2
Performance – intelligent resource allocation
Security – multi-domain support in hardware
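Rich set of Trusted Execution Environment features and benefits
[Diagram: a MIPS core runs the Hypervisor/Secure Kernel (Root), which hosts the Guests: OS1 and OS2 with their Apps, plus standalone Apps, as secure/non-secure OS/Apps.]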
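The Root/Guest relationship described above (Root controls guest access rights, and bus transactions carry a Guest ID) can be sketched as a toy permission table. This is a simplified model under stated assumptions, not the I6400 programming interface: the class `GuestAccessTable`, the region names, and the choice of ID 0 for Root are all hypothetical. Only the limit of 15 guests comes from the slide.

```python
MAX_GUESTS = 15  # from the slide: up to 15 Guests
ROOT_ID = 0      # assumption for this toy model: ID 0 denotes Root


class GuestAccessTable:
    """Toy model: Root grants regions to guests; transactions are checked by Guest ID."""

    def __init__(self):
        self._allowed = {}  # guest_id -> set of region names

    def grant(self, caller_id, guest_id, region):
        """Record that guest_id may access region. Only Root may call this."""
        if caller_id != ROOT_ID:
            raise PermissionError("only Root may change guest access rights")
        if not 1 <= guest_id <= MAX_GUESTS:
            raise ValueError(f"guest_id {guest_id} outside 1..{MAX_GUESTS}")
        self._allowed.setdefault(guest_id, set()).add(region)

    def check(self, guest_id, region):
        """Return True if a transaction tagged with guest_id may touch region."""
        if guest_id == ROOT_ID:
            return True  # Root is unrestricted in this model
        return region in self._allowed.get(guest_id, set())


if __name__ == "__main__":
    table = GuestAccessTable()
    table.grant(ROOT_ID, 1, "dram_os1")
    print(table.check(1, "dram_os1"), table.check(2, "dram_os1"))
```

In this model a crashed or compromised guest cannot reach another guest's region, because it cannot call `grant` and its transactions carry its own ID.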
MIPS64 I6400 base core microarchitecture
Dual-issue In-Order design with MT
Compact, balanced 9-stage pipeline
Dual issue 128b SIMD (Int, SP/DP FPU): IEEE 754-2008 compliant FPU
Instruction bonding on integer, FP ops: doubles throughput on memcopies
Instruction and Data L1 caches w/ ECC: 64 byte cache lines
Advanced Branch Prediction
Low latency 128b core:CM interface
Optimized for efficiency and maximizing pipeline utilization
[Block diagram: I6400 base core. Blocks shown: Bus Interface Unit with MCP I/F (128-bit to CM); L1 Instr Cache (32-64 KB, 4 way); Instruction Fetch Unit; Branch Predict (BHT, JRC, RPS); Instruction Issue Unit; Mem Mgmt Unit with 64/96 Entry VTLB, 512 Entry FTLB and 4-entry I & D uTLBs per VC; Execution Pipes (Memory Pipe, Branch Pipe, two ALU Pipes, MDU Pipe); Load/Store Address; Branch Resolution and Store Data Pipe; MT SIMD Integer and SP/DP FPU; Graduation Unit; L1 Data Cache (32-64 KB, 4 way); Power Mgmt Unit (PMU); Debug with EJTAG, Trace TAP, On Chip Trace I/F and Off-chip Trace I/F; Thread1 to Thread4. Labels also shown: Snoop, Optional.]
MIPS64 I6400 multi-core features
Coherent cluster, up to 6 cores
Directory-based coherency improves power, performance and scalability
PowerGearing for MIPS
Virtualized GIC and IOMMUs
Integrated L2 Cache (L2$):
512KB – 8MB (16-way) with ECC
Low L2$ hit latencies
HW prefetch lowers latency to memory
AXI4 -> ACE System Interface:
Multi-cluster, heterogeneous scalability
Leverages new coherency architecture
[Diagram: I6400 coherent cluster. Core 0 to Core 5 and IOCU 0 and IOCU 1 (IO Subsystem) connect to the Coherency Mgr. with L2$ (Directory-based), alongside the Global Interrupt Controller (GIC), Cluster Power Controller (CPC), Trace Funnel, GCRs and Custom GCRs. The cluster connects to the system over ACE/AXI4. Link widths shown: 128 bits, 128 bits, 256 bits. Other labels: Low Power, High Performance.]
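One way to see why directory-based coherency can help power and scalability is to count snoop messages for a single write. The sketch below is a deliberately simplified model, not the I6400 Coherency Manager protocol: it assumes a broadcast scheme probes every other core, while a directory probes only the cores it has recorded as sharers. The function names are illustrative.

```python
def broadcast_snoops(num_cores: int) -> int:
    """Snoop messages for one write when every other core is probed."""
    return num_cores - 1


def directory_snoops(sharers: set, writer: int) -> int:
    """Snoop messages for one write when a directory tracks the sharers of the line."""
    # The writer itself never needs to be probed.
    return len(sharers - {writer})


if __name__ == "__main__":
    cores = 6  # cluster size from the slide
    print(broadcast_snoops(cores), directory_snoops({0, 1}, writer=0))
```

With six cores and one other sharer, the broadcast model sends five probes and the directory model sends one. The gap widens as the core count grows while sharing stays sparse.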
Building systems: threads, cores and clusters…
1-thread core
2-4 thread core
2-6 core cluster
2-64 cluster node
Building systems: threads, cores and clusters…
1-thread core
2-4 thread core
2-6 core cluster
2-64 cluster node
SoC Fabric
•Wide range of CPU configurations
•Hardware virtualization based security
•PowerGearing™ power management
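The thread/core/cluster hierarchy can be turned into a small sizing calculation. The sketch below is illustrative only: the class `SystemConfig` is hypothetical, the upper limits (4 threads per core, 6 cores per cluster, 64 clusters) are taken from the slide, and the lower limit of 1 at each level is an assumption.

```python
from dataclasses import dataclass

# Upper limits as stated on the slide; a lower limit of 1 is assumed.
LIMITS = {"threads_per_core": 4, "cores_per_cluster": 6, "clusters": 64}


@dataclass(frozen=True)
class SystemConfig:
    threads_per_core: int
    cores_per_cluster: int
    clusters: int

    def __post_init__(self):
        # Reject configurations outside the ranges given on the slide.
        for name, upper in LIMITS.items():
            value = getattr(self, name)
            if not 1 <= value <= upper:
                raise ValueError(f"{name}={value} outside 1..{upper}")

    @property
    def hardware_threads(self) -> int:
        """Total hardware threads an SMP OS would see as CPUs."""
        return self.threads_per_core * self.cores_per_cluster * self.clusters


if __name__ == "__main__":
    print(SystemConfig(4, 6, 64).hardware_threads)  # largest configuration: 1536
```

Under these limits the largest configuration exposes 4 x 6 x 64 = 1536 hardware threads.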
Flexible configuration for flexible needs
Hardware multi-threading
30%-50% more performance for 10% more area
Multi-core
Mix of cores/configurations
Multi-cluster
Mix of heterogeneous CPU clusters
Embedded
Consumer/STB
Mobile
Server
Dataplane
Storage
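The multi-threading figure of 30%-50% more performance for 10% more area implies a gain in performance per unit area, which a short calculation makes explicit. The function name `perf_per_area_gain` is illustrative. The inputs are the preliminary figures quoted in the talk for adding a second thread.

```python
def perf_per_area_gain(perf_gain: float, area_gain: float) -> float:
    """Relative change in performance per unit area.

    Both arguments are fractional increases (0.30 means +30%).
    """
    return (1.0 + perf_gain) / (1.0 + area_gain) - 1.0


if __name__ == "__main__":
    for perf in (0.30, 0.50):
        gain = perf_per_area_gain(perf, 0.10)
        print(f"+{perf:.0%} performance, +10% area: {gain:+.1%} performance per area")
```

With these inputs, performance per area improves by roughly 18% at the low end (1.30 / 1.10) and roughly 36% at the high end (1.50 / 1.10).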
It’s not just CPUs – true heterogeneous processing
Single Thread
Multi-Thread Core
Multi-Core CPU Cluster
Multi-Core GPU Cluster
IP Platforms: Heterogeneous Network Processors
MIPS leads the way in security, hardware multi-threading, coherency – and efficiency
Up to 40% better processor area for multi-core
Comprehensive support for hardware multi-threading
Coherency across thread, core, cluster
[Diagram: a Terabit Coherent Fabric connects Unified Memory, two MIPS Coherent Multicore Clusters, an Ensigma NPU (10/40/100Gbps Offload), an Ensigma NPU (Crypto Offload), PowerVR Multicore GPU Compute, and Customer Differentiating System IP.]
IP Platforms: Heterogeneous IoT Device Processors
High end feature set for deeply embedded = scales perfectly from high end
Hardware Virtualization
Tightly integrated, low power communications
Class-leading single thread performance
[Diagram: an SoC Fabric connects a MIPS M-Class MCU, an Ensigma RPU (BT Smart, Low Power Wi-Fi), On-chip Flash, On-chip RAM, and Customer Differentiating System IP.]
Conclusions
Multicore is not just about multiple cores
Threads, cores and clusters – and not just CPUs
The application space is getting wider
Flexible cluster configuration for power management and burst performance needs many options
MIPS Series6 Warrior cores deliver a compelling alternative for multi thread/core/cluster CPU IP
Not just for CPUs, but for heterogeneous SoCs