A Breakthrough New CPU Architecture Revives IPC Scaling

Mohammad Abdallah, Founder, President and CTO
Linley Processor Conference, October 23, 2014

Upload: rusnano

Post on 05-Jul-2015


DESCRIPTION

Soft Machines unveils the revolutionary VISC™ microprocessor architecture, reviving growth in performance per watt. Soft Machines, a Silicon Valley semiconductor startup, has announced the Soft Machines VISC™ architecture. The company's investors include Samsung Ventures, AMD, Mubadala, RVC, KACST, RUSNANO and TAQNIA. The VISC architecture is a genuine breakthrough in raising microprocessor performance per watt of power consumed. It will make it possible to substantially improve energy efficiency across all segments of the computing ecosystem. The VISC architecture was developed as a solution to two problems: scaling the clock frequency of single-core processors and the difficulty of programming multi-core processors. http://www.rusnano.com/about/press-centre/news/20141024-soft-machines-predstavlyaet-revolyutsionnuyu-architekturu-mikroprotsessorov-visc

TRANSCRIPT

Page 1: A Breakthrough New CPU Architecture Revives IPC Scaling

A Breakthrough New CPU Architecture Revives IPC Scaling

Mohammad Abdallah

Founder, President and CTO

Linley Processor Conference October 23, 2014

Page 2: A Breakthrough New CPU Architecture Revives IPC Scaling

Introducing Soft Machines™

• Emerging from stealth mode
• Developed new VISC™ Architecture
• 7 years, $125M R&D
• ~250 employees, 75+ patents filed

©Copyright 2014, All Rights Reserved

Page 3: A Breakthrough New CPU Architecture Revives IPC Scaling

The Death of CPU Scaling


“The failure of CPU scaling after 30 years of continual improvements may have slammed the door on the easiest and most common type of performance scaling…”

“The Death of CPU Scaling”, ExtremeTech (2012)

2014

Microprocessor Scaling Realities after 2004

Transistor scaling continues

Clock speed flat

Power budget flat

Perf/clock flat

Source: “The Free Lunch is Over”, Herb Sutter

Page 4: A Breakthrough New CPU Architecture Revives IPC Scaling

Industry Response: Multi-Core

[Diagram: Thread1 on Core1, Thread2 on Core2]

Advantages:
- Utilizes growing transistor budget
- Performance scaling for parallel code
- Improves throughput

Challenges:
- Single-thread (ST) performance doesn’t scale
- Threading/multicore coding complexity
- Amdahl’s Law of diminishing returns
- Dark silicon
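
The diminishing returns named under Amdahl's Law can be made concrete with a short calculation. The sketch below is illustrative only: the function name and the 50% parallel fraction are assumptions chosen for the example, not figures from the talk.

```python
def amdahl_speedup(parallel_fraction: float, cores: int) -> float:
    """Upper bound on speedup when only `parallel_fraction` of the work
    can be spread across `cores` (Amdahl's Law)."""
    if not 0.0 <= parallel_fraction <= 1.0:
        raise ValueError("parallel_fraction must be between 0 and 1")
    if cores < 1:
        raise ValueError("cores must be at least 1")
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / cores)


if __name__ == "__main__":
    # Assumed workload: half of the run time is parallelizable.
    for cores in (1, 2, 4, 8, 16):
        print(f"{cores:>2} cores: {amdahl_speedup(0.5, cores):.2f}x")
```

With half of the work serial, the speedup stays below 2x no matter how many cores are added, which is why multicore alone does not restore single-thread scaling.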


Page 5: A Breakthrough New CPU Architecture Revives IPC Scaling

CPU Architecture Challenge

• Revive CPU performance scaling
• Utilize Moore’s Law transistor scaling
• Mitigate dark silicon
• Liberate ISA dependency

Page 6: A Breakthrough New CPU Architecture Revives IPC Scaling

VISC™ Architecture Wave

[Figure: three architecture waves]

CISC (IBM/Intel): 1970s – early 1980s
RISC (MIPS): late 1980s – 2010s
VISC (Soft Machines): 2010s

Software Scalability/Productivity labels: Assembly, Compilation, Concurrency Extraction
Device Physics Scalability labels: Code Memory size, Processor Speed, Processor Power
Microarchitecture labels: Short Pipeline, Deep OoO Pipeline, Virtual Cores/Threads

VISC Architecture scales on both physical and software productivity layers

Page 7: A Breakthrough New CPU Architecture Revives IPC Scaling

VISC™ Processor Block Diagram

[Block diagram labels]

Sequential Code (SW Single Thread)
Global Front End
Virtual HW Threads (HW threadlets)
Virtual Cores: Virtual Core1, Virtual Core2, Virtual Core3, Virtual Core4
Core1, Core2, Core3, Core4, each with an L1 D$
L2$ & Memory

Page 8: A Breakthrough New CPU Architecture Revives IPC Scaling

VISC™ CPU Usage Example

• VISC dynamically allocates resources across virtual cores based on individual application needs
• Performance/watt balanced for both single & multi-thread applications

[Diagram: two usage scenarios on Core1 and Core2, each with an L1 D$, via Virtual Cores and Virtual HW Threads/Threadlets]

Dual SW Threads: Heavy App and Light App on Virtual Core1 and Virtual Core2
or
Single SW Thread: Heavy App on Virtual Core1

Page 9: A Breakthrough New CPU Architecture Revives IPC Scaling

VISC™ Architecture Prototype Pipeline

[Pipeline diagram: Pipeline of Virtual Threads Across the Virtual Cores]

Stage labels: Fetch; Allocate/Dispatch; EXE; Mem/long latency Execution; RF read; Virtual Thread Formation

[Block diagram labels]

SW Single Thread
Global Front End
Virtual HW Threads (HW threadlets)
Virtual Cores: Virtual Core1, Virtual Core2
Core1, Core2, each with an L1 D$
L2$ & Memory

Page 10: A Breakthrough New CPU Architecture Revives IPC Scaling

VISC™ Revives IPC Curve

CPU                       Compiled Code   Cache    Pipeline   IPC (SPEC 2006)*
ARM A15 1C                32-bit          1M       Moderate   0.71
Intel Atom 1C             32-bit          2M       Moderate   0.69
Soft Machines 2VC Proto   32-bit          1M       Shallow    2.1
Apple A7 1C               32-bit          1M+4M    Moderate   1.0
ARM A57 1C                32-bit          2M       Moderate   0.87
Intel Haswell 1C          64-bit          2M       Deep       1.39

* Company-conducted benchmark tests and projections, using the industry-standard compiler GCC 4.6 or equivalent
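
As a quick check on the IPC figures quoted on this slide, the ratios of the Soft Machines prototype to each comparison CPU can be computed directly. This is a small sketch, assuming the IPC values are listed in the same order as the CPU headings; the dictionary and function names are made up for the example.

```python
# IPC (SPEC 2006) values as quoted on the slide (company-reported).
IPC_SPEC2006 = {
    "ARM A15 1C": 0.71,
    "Intel Atom 1C": 0.69,
    "Soft Machines 2VC Proto": 2.1,
    "Apple A7 1C": 1.0,
    "ARM A57 1C": 0.87,
    "Intel Haswell 1C": 1.39,
}


def ipc_ratios(table, reference="Soft Machines 2VC Proto"):
    """Return reference IPC divided by each other CPU's IPC."""
    ref_ipc = table[reference]
    return {cpu: ref_ipc / ipc for cpu, ipc in table.items() if cpu != reference}


if __name__ == "__main__":
    ratios = ipc_ratios(IPC_SPEC2006)
    for cpu, ratio in sorted(ratios.items(), key=lambda item: item[1]):
        print(f"{cpu:<18} {ratio:.2f}x")
```

On these numbers the prototype's advantage is about 3x over the A15 and Atom and about 1.5x over Haswell.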

Mobile CPU designs are pursuing higher ARCH/µARCH complexity

2006   The Basic      A8, 2-way
2009   The Simple     A9, 2-way OoO
2011   The Moderate   A15, 3-way
2013   The Big        Apple A7, 6-way
2014   The Ultimate   Haswell, 8-way

Page 11: A Breakthrough New CPU Architecture Revives IPC Scaling

VISC™ Concurrency Extraction: Linear vs. Quadratic Complexity

• Extracting ILP has significant complexity
• OoO complexity increases quadratically with machine width
• VISC complexity increases linearly with number of virtual cores
• VISC performance/watt benefits from this linear scaling
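
The linear-versus-quadratic claim can be illustrated with a toy cost model. Everything in the sketch below is an assumption made for illustration: the unit costs, the function names, and the fixed width of 2 per virtual core do not come from the talk and say nothing about the actual hardware.

```python
def ooo_relative_complexity(width: int) -> int:
    """Toy model: a monolithic out-of-order core whose cost grows with the
    square of its machine width (for example, all-pairs dependency checks)."""
    return width * width


def visc_relative_complexity(virtual_cores: int, width_per_core: int = 2) -> int:
    """Toy model: each virtual core is a fixed-width block, so the total cost
    grows linearly with the number of virtual cores."""
    return virtual_cores * ooo_relative_complexity(width_per_core)


if __name__ == "__main__":
    # Compare designs with the same aggregate width (assumed: 2 per virtual core).
    for cores in (1, 2, 4):
        width = cores * 2
        print(f"aggregate width {width}: "
              f"monolithic={ooo_relative_complexity(width)}, "
              f"virtual cores={visc_relative_complexity(cores)}")
```

Under these assumptions, doubling the number of virtual cores doubles the modeled cost, while doubling the width of a monolithic core quadruples it.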

Page 12: A Breakthrough New CPU Architecture Revives IPC Scaling

System Energy Approach: DRVFS

Virtual Cores – DRVFS
• DRVFS: linear increase in power
• P ∝ No. of virtual core resources
• Higher Perf/MHz enables DVFS scaling DOWN

Physical Cores – DVFS
• DVFS: quadratic increase in power
• P ∝ V² × F
• Lower Perf/MHz requires DVFS scaling UP

Use Case: Rush to low power mode (boosting performance or response time)

[Diagram label: Core1]
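
The two power relations on this slide, power proportional to V² × F for DVFS and power proportional to the number of virtual core resources for DRVFS, can be compared numerically. This is a minimal sketch; the function names and the example scale factors (1.2x voltage, 1.5x frequency, 2 to 4 resources) are assumptions, not measurements from the talk.

```python
def dvfs_relative_power(voltage_scale: float, freq_scale: float) -> float:
    """Power relative to baseline under P proportional to V^2 * F."""
    return voltage_scale ** 2 * freq_scale


def drvfs_relative_power(active_resources: int, baseline_resources: int) -> float:
    """Power relative to baseline under P proportional to the number of
    virtual core resources in use."""
    if baseline_resources <= 0:
        raise ValueError("baseline_resources must be positive")
    return active_resources / baseline_resources


if __name__ == "__main__":
    # Assumed DVFS step: frequency up 1.5x, which needs voltage up 1.2x.
    print(f"DVFS  1.2x V, 1.5x F : {dvfs_relative_power(1.2, 1.5):.2f}x power")
    # Assumed DRVFS step: resources doubled at unchanged voltage and frequency.
    print(f"DRVFS 2 -> 4 resources: {drvfs_relative_power(4, 2):.2f}x power")
```

The model covers power only. How much performance each step buys depends on the workload and is not captured here.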


Page 13: A Breakthrough New CPU Architecture Revives IPC Scaling

VISC™ Single Thread SPEC/Watt

[Chart: power vs. single-thread performance for mobile and server design points, comparing a 1C App CPU with 1VC (2C) and 1VC (4C) configurations. Annotations: 1.7x, 2.1x, 1.8x, 2.2x performance; 1/3, 1/4 power.]

Same performance in 1/4 to 1/3 of the power, or 1.7-2.2x performance at the same power*

* Company-conducted benchmark tests and projections for 28nm


Page 14: A Breakthrough New CPU Architecture Revives IPC Scaling

VISC™ Dual Thread SPEC/Watt

[Chart: power vs. dual-thread performance for mobile and server design points, comparing a 2C App CPU with 2VC (2C) and 2VC (4C) configurations. Annotations: 1.4x, 1.5x, 1.8x, 1.9x performance; 1/2, 0.4x power.]

Same performance in 0.4 to 0.5x of the power, or 1.4-1.9x performance at the same power*

* Company-conducted benchmark tests and projections for 28nm


Page 15: A Breakthrough New CPU Architecture Revives IPC Scaling

VISC™ Technology Prototype

Working Silicon

• VISC Processor Proof-of-Concept Prototype
  • IPC scalability
  • VISC architecture
  • Software efficiency
• Full Platform
  • VISC Dual Virtual Core Processor
  • SoC with 3D, Video, DRAM controller, HD video….
• Full System functionality
  • Linux OS
  • UEFI BIOS
  • Benchmarks running on Linux
  • Android ICS booting


Page 16: A Breakthrough New CPU Architecture Revives IPC Scaling

Silicon Results: Performance/MHz Dual Virtual Core/A15 IPC Ratio

Page 17: A Breakthrough New CPU Architecture Revives IPC Scaling

VISC™ Architecture

[Architecture diagram labels]

Guest Sequential Code; OS & Hypervisor; Single Thread
Guest ISA
Virtual SW layer
Virtual ISA
Global Front End
Virtual HW Threads/Threadlets
Virtual Cores: Virtual Core1, Virtual Core2, Virtual Core3, Virtual Core4
Core1, Core2, Core3, Core4, each with an L1 D$
L2$ & Memory

Page 18: A Breakthrough New CPU Architecture Revives IPC Scaling

VISC™ Run-time SW Architecture

[Diagram labels]

High level Virtual Machine
Guest Code (ARM, X86)
Low level Virtual Machine
Converter
Guest/VM to native mapping
Dynamic optimization
Hot Pass
Native Code
SMI API
VISC™ Processor

Page 19: A Breakthrough New CPU Architecture Revives IPC Scaling

Summary

• Silicon-proven VISC™ architecture delivers 3-4x IPC advantage on single and multi-threaded applications without software changes
• Resulting in ~2-4x performance/watt advantage
• VISC architecture is scalable from IoT to mobile to servers due to its modularity and symmetry
  • Number of virtual cores, virtual threads, and virtual instruction layer
• VISC virtual instruction layer provides ISA-agnostic and optimized run-time platform capabilities