code density concerns for new architecturesweb.eece.maine.edu/~vweaver/papers/iccd09/iccd09.pdfnew...

24
Code Density Concerns for New Architectures Vincent M. Weaver Cornell University Sally A. McKee Chalmers University of Technology 7 October 2009

Upload: others

Post on 08-Jul-2020

2 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: Code Density Concerns for New Architecturesweb.eece.maine.edu/~vweaver/papers/iccd09/iccd09.pdfNew ISAs are continually being developed; code density is still a concern Short instruction

Code Density Concerns for NewArchitectures

Vincent M. Weaver

Cornell University

Sally A. McKee

Chalmers University of

Technology

7 October 2009

Page 2: Code Density Concerns for New Architecturesweb.eece.maine.edu/~vweaver/papers/iccd09/iccd09.pdfNew ISAs are continually being developed; code density is still a concern Short instruction

Introduction

• Benchmark ported to 21 different assembly languages

• Hand-optimized for minimum size

• Code tested and works on all architectures

What ISA features lead to high code density?

Can this help designers of new ISAs?

1

Page 3: Code Density Concerns for New Architecturesweb.eece.maine.edu/~vweaver/papers/iccd09/iccd09.pdfNew ISAs are continually being developed; code density is still a concern Short instruction

New ISAs? Really?

• ISA design still a concern

• FPGAs make it easy

• Embedded architectures want dense code

• Linux has 12 embedded architectures and counting

2

Page 4: Code Density Concerns for New Architecturesweb.eece.maine.edu/~vweaver/papers/iccd09/iccd09.pdfNew ISAs are continually being developed; code density is still a concern Short instruction

Benefits of Code Density

• L1 iCache holds more instructions

• More data fits in unified L2 cache

• Less bandwidth required to memory and disk

• Fewer TLB misses

• Compact loops can be executed from instruction buffer

• Smaller cache footprint can lead to energy savings

3

Page 5: Code Density Concerns for New Architecturesweb.eece.maine.edu/~vweaver/papers/iccd09/iccd09.pdfNew ISAs are continually being developed; code density is still a concern Short instruction

What about Performance?

• Hard to optimize performance

• Varies across implementations

• Dense code often performs well

4

Page 6: Code Density Concerns for New Architecturesweb.eece.maine.edu/~vweaver/papers/iccd09/iccd09.pdfNew ISAs are continually being developed; code density is still a concern Short instruction

The BenchmarkÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛÛ

###############################################################################ּ###############################################################################ּ##################################################################O#O##########ּ###############################################################################ּ###############################################################################ּ###############################################################################ּ###############################################################################ּ###############################################################################ּ###############################################################################ּ###############################################################################ּ###############################################################################ּ###############################################################################ּּּּּּּּּּּּּּּּּּּּּּּּּּּּּּּּּּּּּּּּּּּּּּּּּּּּּּּּּּּּּּּּּּּּּּּּּּּּּּּּּּLinuxּVersionּ2.6.11.4-21.17-smp,ּCompiledּ#1ּSMPּFriּAprּ6ּ08:42:34ּUTCּ2007ּּּּTwoּ2791MHzּIntel(R)ּXeon(TM)ּProcessors,ּ2027MּRAM,ּ5521.40ּBogomipsּTotalּּּּּּּּּּּּּּּּּּּּּּּּּּּּּּּּּּּּּּּּsampakaּּּּּּּּּּּּּּּּּּּּּּּּּּּּּּּּּּּּּּּּּּּּּּּּּּּּּּּּּּּּּּּּּּּּּּּּּּּּּּּּּּּּּּּּּּּּּּּּּּּּּּּּּּּּּּּּּּּּּּּּּּּּּּּּּּּּּּּּּּּּּּּּּּּּּּּּּּּּּּּּּּּּּּּּּּּּּּּּּּּּּּּּּּּּּּּּּּּּ

LZSS decompression

System calls, including disk I/O

String manipulation and search

Integer to ASCII conversion

5

Page 7: Code Density Concerns for New Architecturesweb.eece.maine.edu/~vweaver/papers/iccd09/iccd09.pdfNew ISAs are continually being developed; code density is still a concern Short instruction

ISA Categories

• VLIW

• CISC

• RISC

• Embedded

• 8/16-bit

6

Page 8: Code Density Concerns for New Architecturesweb.eece.maine.edu/~vweaver/papers/iccd09/iccd09.pdfNew ISAs are continually being developed; code density is still a concern Short instruction

VLIW Processors

ia64

• 16-byte bundle holds 3 instructions

• Instruction has 3 arguments

• Hundreds of integer registers

• Predication

7

Page 9: Code Density Concerns for New Architecturesweb.eece.maine.edu/~vweaver/papers/iccd09/iccd09.pdfNew ISAs are continually being developed; code density is still a concern Short instruction

CISC Processors

m68k, s390, VAX, x86, x86 64

• Variable instruction length, 1-54 bytes

• Instruction has 2 arguments

• 16 integer registers (x86 has 8)

• Status flags

• Unaligned loads

• Complex addressing modes

8

Page 10: Code Density Concerns for New Architecturesweb.eece.maine.edu/~vweaver/papers/iccd09/iccd09.pdfNew ISAs are continually being developed; code density is still a concern Short instruction

RISC Processors

Alpha, ARM, m88k, microblaze, MIPS, PA-RISC,PPC, SPARC

• 4 byte instruction length

• Instruction has 3 arguments

• 32 integer registers (except ARM, SPARC)

• Most have a zero register

• Many have branch delay slot

9

Page 11: Code Density Concerns for New Architecturesweb.eece.maine.edu/~vweaver/papers/iccd09/iccd09.pdfNew ISAs are continually being developed; code density is still a concern Short instruction

Embedded Processors

avr32, crisv32, sh3, ARM Thumb

• 2 byte instruction length

• Instruction has 2 arguments

• Most have 16 integer registers

• Auto-incrementing loads

• Status flags

10

Page 12: Code Density Concerns for New Architecturesweb.eece.maine.edu/~vweaver/papers/iccd09/iccd09.pdfNew ISAs are continually being developed; code density is still a concern Short instruction

8/16-bit Processors

6502, PDP-11, z80

• Variable instruction length (1-6 bytes)

• Instruction has 1-2 arguments

• Status flags

11

Page 13: Code Density Concerns for New Architecturesweb.eece.maine.edu/~vweaver/papers/iccd09/iccd09.pdfNew ISAs are continually being developed; code density is still a concern Short instruction

Results – LZSS Decompression

ia64alp

ham

ips

paris

csp

arc

mbla

ze65

02m

88k

arm

s390 pp

c

pdp-

11 vax

z80m

68k

thum

bav

r32

sh3

x86_

64

crisv

32 i386

0

64

128

192

256

byte

s

VLIW

RISC

CISC

embedded

8/16-bit

12

Page 14: Code Density Concerns for New Architecturesweb.eece.maine.edu/~vweaver/papers/iccd09/iccd09.pdfNew ISAs are continually being developed; code density is still a concern Short instruction

Results – String Concatenation

ia64alp

hasp

arc

mips

paris

cm

88k

mbla

ze65

02 arm pp

cva

x

thum

bsh

3av

r32m

68k

s390 z8

0

crisv

32

x86_

64

pdp-

11 i386

0

16

32

48

64

byte

s

VLIW

RISC

CISC

embedded

8/16-bit

13

Page 15: Code Density Concerns for New Architecturesweb.eece.maine.edu/~vweaver/papers/iccd09/iccd09.pdfNew ISAs are continually being developed; code density is still a concern Short instruction

Results – String Search

ia64alp

hasp

arc

mbla

zem

88k

paris

cm

ips arm pp

c65

02

pdp-

11m

68k

sh3

thum

bva

xz8

0

x86_

64 i386

crisv

32av

r32

s390

0

64

128

192

256

byte

s

VLIW

RISC

CISC

embedded

8/16-bit

14

Page 16: Code Density Concerns for New Architecturesweb.eece.maine.edu/~vweaver/papers/iccd09/iccd09.pdfNew ISAs are continually being developed; code density is still a concern Short instruction

Results – Integer → ASCII

ia64

paris

c65

02alp

ha arm

m88

ksp

arc

z80

sh3

ppc

mips

thum

b

mbla

zem

68k

crisv

32

pdp-

11s3

90av

r32

vax

x86_

64 i386

0

64

128

192

256

byte

s

VLIW

RISC

CISC

embedded

8/16-bit

No HW divide

15

Page 17: Code Density Concerns for New Architecturesweb.eece.maine.edu/~vweaver/papers/iccd09/iccd09.pdfNew ISAs are continually being developed; code density is still a concern Short instruction

Results – Overall

ia64alp

ha

paris

csp

arc

mbla

zem

ipsm

88k

arm pp

c65

02s3

90

x86_

64 vax

sh3m

68k

i386

thum

bz8

0av

r32

crisv

32

pdp-

110

512

1024

1536

2048

2560

byte

s

VLIW

RISC

CISC

embedded

8/16-bit

16

Page 18: Code Density Concerns for New Architecturesweb.eece.maine.edu/~vweaver/papers/iccd09/iccd09.pdfNew ISAs are continually being developed; code density is still a concern Short instruction

Correlations

CorrelationArchitectural Parameter

Coefficient

0.9381 Smallest possible instruction length

0.9116 Low number of integer registers

0.7823 Low Virtual address of first instruction

0.6607 Architecture lacks a zero register

0.6159 Low Bit-width

0.4982 Few operands in each instruction

0.3854 Hardware divide in ALU

17

Page 19: Code Density Concerns for New Architecturesweb.eece.maine.edu/~vweaver/papers/iccd09/iccd09.pdfNew ISAs are continually being developed; code density is still a concern Short instruction

More Correlations

CorrelationArchitectural Parameter

Coefficient

0.3653 Unaligned load/store available

0.3129 Year the architecture was introduced

0.2521 Hardware status flags (zero/overflow/etc.)

0.2121 Auto-incrementing addressing scheme

0.0809 Machine is big-endian

0.0021 Branch delay slot

18

Page 20: Code Density Concerns for New Architecturesweb.eece.maine.edu/~vweaver/papers/iccd09/iccd09.pdfNew ISAs are continually being developed; code density is still a concern Short instruction

Results – C Comparison (x86/Linux)

-O3gcc

-Osgcc

-Osintel

-Ossun

-Osintel

-O3gcc

-Osgcc

-Ossun

-O3gcc

-Osgcc

-Osgcc

-Osintel

-O3gcc

-Osgcc

handopt.

1024

2048

4096

6144

8192

Siz

e (b

ytes

)

~500kB

GLIBC / STATIC GLIBC / DYNAMIC uCLIBC SYSCALL ONLY ASM

19

Page 21: Code Density Concerns for New Architecturesweb.eece.maine.edu/~vweaver/papers/iccd09/iccd09.pdfNew ISAs are continually being developed; code density is still a concern Short instruction

What is holding back the C version?

• Stack frame (Calling convention)

• Pointer aliasing

• Full program register allocation

• Constant loading optimizations

• String instructions

20

Page 22: Code Density Concerns for New Architecturesweb.eece.maine.edu/~vweaver/papers/iccd09/iccd09.pdfNew ISAs are continually being developed; code density is still a concern Short instruction

Related Work

• RISC Code Compression

• Kozuch and Wolfe – investigate VAX, MIPS, SPARC,

m68k, RS6000, PPC

• Hasegawa et al. – gcc generated code on m68k, x86,

i960, Sparclite, SPARC, MIPS, AMD29k, m88k, Alpha,

RS6000

• Flynn et al. – synthetic architectures

21

Page 23: Code Density Concerns for New Architecturesweb.eece.maine.edu/~vweaver/papers/iccd09/iccd09.pdfNew ISAs are continually being developed; code density is still a concern Short instruction

Conclusions / Future Work

• New ISAs are continually being developed; code density

is still a concern

• Short instruction codings are key

• High code density requires co-operation of ISA, operating

system, system libraries, and compiler

• More architectures should be investigated, as well as

more and larger benchmarks

22

Page 24: Code Density Concerns for New Architecturesweb.eece.maine.edu/~vweaver/papers/iccd09/iccd09.pdfNew ISAs are continually being developed; code density is still a concern Short instruction

Questions?

All code is available:

http://www.deater.net/weave/vmwprod/asm/ll

23