Automatic Floating-Point to Fixed-Point Transformations

Kyungtae Han, Alex G. Olson, Brian L. Evans

Dept. of Electrical and Computer Engineering
The University of Texas at Austin

October 30th, 2006

2006 Asilomar Conference on Signals, Systems, and Computers

Outline

• Introduction

• Background

• Fixed-point wordlength optimizations

• Automate transformations of systems

• Conclusion

Implementing Digital Signal Processing Algorithms

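[Figure: design flow. Digital signal processing algorithms are written as a floating-point program, turned into a fixed-point program with uniform wordlength by code conversion, and then into a fixed-point program with optimized wordlength by wordlength optimization. Implementation targets shown: floating-point processor, fixed-point processor, and fixed-point ASIC, compared on price, power consumption, and hardware with high (H) to low (L) scales.]

ASIC: Application Specific Integrated Circuit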

Transformations to Fixed Point

• Advantages
  - Lower hardware complexity
  - Lower power consumption
  - Faster processing

• Disadvantages
  - Introduces distortion due to quantization error
  - Search for optimum wordlength by trial and error is time-consuming

• Research goals
  - Automate transformations to fixed point
  - Control distortion vs. complexity tradeoffs

[Figure: transformation of a floating-point program into a fixed-point program (optimized wordlength) through code conversion]

Distortion vs. Complexity Tradeoffs

• Shorter wordlength may increase application distortion and decrease implementation complexity

• Minimize implementation cost

• Minimize application distortion

[Figure: feasible region and optimal tradeoff curve, with axes implementation complexity c(w) and application distortion d(w)]

c(w) Implementation cost function

d(w) Application distortion function

Search for Optimum Wordlength

• Complete search
  - Searches the whole space
  - Impractical in systems with many variables

• Gradient-based search
  - Utilizes gradient information to determine next candidates
  - Complexity measure (CM) [Sung and Kum, 1995]
  - Distortion measure (DM) [Han et al., 2001]
  - Complexity-and-distortion measure (CDM) [Han and Evans, 2004]

• Guided random search
  - Genetic algorithm for single objective [Leban and Tasic, 2000]
  - Multiple objective genetic algorithm

Complexity-and-Distortion Measure

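• Weighted combination of measures

\[ f_{cd}(\mathbf{w}) = \alpha_c \, c(\mathbf{w}) + \alpha_d \, d(\mathbf{w}), \quad \text{where } \alpha_c + \alpha_d = 1,\ 0 \le \alpha_c \le 1,\ 0 \le \alpha_d \le 1 \]

• Single objective function:

\[ \min_{\mathbf{w} \in \mathbb{I}^n} f_{cd}(\mathbf{w}) \quad \text{subject to} \quad d(\mathbf{w}) \le D_{\max},\ c(\mathbf{w}) \le C_{\max},\ \underline{\mathbf{w}} \le \mathbf{w} \le \overline{\mathbf{w}} \]

• Gradient-based search
  - Initialization
  - Iterative greedy search based on complexity and distortion gradient information

c(w) Complexity function

d(w) Distortion function

Dmax Constant for maximum distortion

Cmax Constant for maximum complexity

\underline{w} Wordlength lower bound

\overline{w} Wordlength upper bound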
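The greedy search named on this slide can be illustrated with a small Python sketch. It is not the authors' implementation: the cost model (total bit count), the distortion model (uniform quantization noise), the choice to start from the upper bound, and all function names are assumptions made only for illustration.

```python
import math

def cost(w):
    """Hypothetical complexity model: total number of bits."""
    return float(sum(w))

def distortion(w):
    """Hypothetical distortion model: RMS of independent uniform
    quantization noise sources with variance 2^(-2*w_i) / 12."""
    return math.sqrt(sum(2.0 ** (-2 * wi) / 12.0 for wi in w))

def f_cd(w, alpha_c, alpha_d):
    """Weighted complexity-and-distortion objective."""
    return alpha_c * cost(w) + alpha_d * distortion(w)

def greedy_cdm_search(w_lower, w_upper, d_max, c_max, alpha_c=0.5):
    alpha_d = 1.0 - alpha_c
    w = list(w_upper)          # assumed initialization: start at the upper bound
    simulations = 0            # counts distortion evaluations
    while True:
        best = None
        for i in range(len(w)):
            if w[i] - 1 < w_lower[i]:
                continue       # would violate the wordlength lower bound
            cand = w[:i] + [w[i] - 1] + w[i + 1:]
            simulations += 1
            if distortion(cand) > d_max:
                continue       # would violate the distortion constraint
            score = f_cd(cand, alpha_c, alpha_d)
            if best is None or score < best[0]:
                best = (score, cand)
        # Stop when no feasible one-bit reduction improves the objective.
        if best is None or best[0] >= f_cd(w, alpha_c, alpha_d):
            break
        w = best[1]
    feasible = distortion(w) <= d_max and cost(w) <= c_max
    return w, simulations, feasible

w_opt, n_sim, ok = greedy_cdm_search([2, 2, 2], [16, 16, 16], d_max=0.01, c_max=48)
print(w_opt, n_sim, ok)
```

Each pass removes one bit from the variable whose reduction gives the lowest objective value, so the search stops at a point where no single-bit reduction is both feasible and better. That point may be only a local optimum.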

Genetic Algorithm

• Evolutionary algorithm
  - Inspired by Holland 1975
  - Mimics processes of plant and animal evolution
  - Finds optimum of a complex function

[Figure: genetic algorithm cycle. Genes → function evaluation → genes with measure → selection → parental genes → mating → child genes → mutation → new gene pool]

[From Greg Rohling’s Ph.D Defense 2004]
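The evaluation, selection, mating, and mutation cycle can be sketched in Python as below. This is a simplified single-objective version with a penalty for violating the distortion bound, not the multiple objective genetic algorithm used in the case studies. The distortion model, penalty value, population size, mutation rate, and function names are all assumptions.

```python
import math
import random

def distortion(w):
    """Hypothetical distortion model (uniform quantization noise)."""
    return math.sqrt(sum(2.0 ** (-2 * wi) / 12.0 for wi in w))

def fitness(w, d_max):
    """Total bits, plus a large penalty when the distortion bound is violated."""
    penalty = 1000.0 if distortion(w) > d_max else 0.0
    return sum(w) + penalty

def genetic_search(n_vars, w_lower, w_upper, d_max,
                   pop_size=30, generations=60, seed=0):
    rng = random.Random(seed)
    pool = [[rng.randint(w_lower, w_upper) for _ in range(n_vars)]
            for _ in range(pop_size)]
    history = []
    for _ in range(generations):
        # Function evaluation: rank genes by their measure.
        scored = sorted(pool, key=lambda w: fitness(w, d_max))
        history.append(fitness(scored[0], d_max))
        # Selection: the better half become parents.
        parents = scored[: pop_size // 2]
        children = [scored[0][:]]          # elitism: keep the best gene
        while len(children) < pop_size:
            mom, dad = rng.sample(parents, 2)
            cut = rng.randint(1, n_vars - 1)
            child = mom[:cut] + dad[cut:]  # mating: one-point crossover
            if rng.random() < 0.3:         # mutation: nudge one wordlength
                i = rng.randrange(n_vars)
                step = rng.choice((-1, 1))
                child[i] = min(w_upper, max(w_lower, child[i] + step))
            children.append(child)
        pool = children                    # new gene pool
    best = min(pool, key=lambda w: fitness(w, d_max))
    return best, history

best, history = genetic_search(n_vars=3, w_lower=2, w_upper=16, d_max=0.01)
print(best, history[0], history[-1])
```

Because the best gene is copied into every new pool, the best fitness per generation never gets worse.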

Case Study: Filter Design

• Infinite impulse response (IIR) filter
  - Complexity measure: Area model of field-programmable gate array (FPGA) [Constantinides, Cheung, and Luk 2003]
  - Distortion measure: Root mean square (RMS) error
  - Seven fixed-point variables (indicated by slashes)

[Figure: IIR filter block diagram with input x[n], output y[n], a delay element, and coefficients b0, b1, and -a1]
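As a rough illustration of the RMS distortion measure, the Python sketch below runs a floating-point filter and a quantized copy and compares the outputs. It assumes the difference equation y[n] = b0 x[n] + b1 x[n-1] - a1 y[n-1]. The coefficient values and test input are hypothetical, and one shared fractional wordlength stands in for the seven separate wordlengths of the case study.

```python
import math

def quantize(x, fwl):
    """Round x to fwl fractional bits (overflow is not modeled)."""
    scale = 2 ** fwl
    return math.floor(x * scale + 0.5) / scale

def iir(x, b0, b1, a1, fwl=None):
    """y[n] = b0*x[n] + b1*x[n-1] - a1*y[n-1]; quantized when fwl is given."""
    q = (lambda v: v) if fwl is None else (lambda v: quantize(v, fwl))
    y = []
    x_prev = 0.0
    y_prev = 0.0
    for xn in x:
        xn = q(xn)                                   # quantized input
        yn = q(q(b0 * xn) + q(b1 * x_prev) - q(a1 * y_prev))
        y.append(yn)
        x_prev, y_prev = xn, yn
    return y

def rms_error(ref, test):
    return math.sqrt(sum((r - t) ** 2 for r, t in zip(ref, test)) / len(ref))

x = [math.sin(0.1 * n) for n in range(200)]          # hypothetical test input
coeffs = dict(b0=0.5, b1=0.3, a1=-0.4)               # hypothetical coefficients
y_float = iir(x, **coeffs)
errors = {fwl: rms_error(y_float, iir(x, **coeffs, fwl=fwl)) for fwl in (4, 8, 12)}
for fwl, err in errors.items():
    print(fwl, err)
```

A search procedure would call a routine like `rms_error` once per candidate wordlength vector, which is why the number of simulations matters.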

Case Study: Gradient-Based Search

• CDM could lead to lower complexity and fewer simulations than DM and CM

Search Method   Gradient Measure   Number of Simulations   Complexity Estimate (LUT)   Distortion (RMS)*
Gradient        DM                 316                     51.05                       0.0981
Gradient        CDM                145                     49.85                       0.0992
Gradient        CM                 417                     51.95                       0.0986
Complete        -                  16^7 **                 -                           -

* Maximum distortion measured by root mean square (RMS) error is 0.1
** 16^7 = 268,435,456 (8.5 years, if 1 second per 1 simulation)
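The complete-search figure in the footnote can be checked with a few lines of Python. The reading of 16 candidate wordlengths for each of the seven variables is inferred from the 268,435,456 count, and the variable names are illustrative.

```python
candidates_per_variable = 16      # inferred from the 268,435,456 count
num_variables = 7                 # seven fixed-point variables in the IIR case study
seconds_per_simulation = 1.0      # assumption stated in the footnote

total_simulations = candidates_per_variable ** num_variables
years = total_simulations * seconds_per_simulation / (365 * 24 * 3600)
ratio_to_cdm = total_simulations / 145   # CDM needed 145 simulations

print(total_simulations)
print(round(years, 1))
print(round(ratio_to_cdm))
```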

Case Study - IIR: Genetic Algorithm

• Search Pareto optimal set (nondominated)

• Handles multiple objectives: Error and Area

[Figure: three scatter plots of error (RMS, log scale from 10^-2 to 10^0) versus area (LUTs, 20 to 100) with the Pareto front marked. Panels: 100th generation (9,000 simulations), 250th generation (22,500 simulations), 500th generation (45,000 simulations). Legend counts across the panels: non-dom 90/90; non-dom 67/90, dom 23/90; non-dom 76/90, dom 14/90]

* Population for one generation: 90

LUT: Lookup table

Case Study: Comparison

• Superpose gradient-based search (GS) results on GA results

• GS methods can get stuck in a local minimum

• GS methods reduce running time (CDM: 145 simulations)

* Required RMSmax values for gradient-based search are Dmax = {0.12, 0.1, 0.08}

Contribution #1

[Figure: two scatter plots of error (RMS, log scale from 10^-2 to 10^0) versus area (LUTs, 20 to 100) with DM, CDM, and CM solutions superposed on the GA population. Panels: 500th generation (45,000 simulations) and 50th generation (4,500 simulations). Legend counts across the panels: non-dom 90/90; non-dom 35/90, dom 55/90]

Automating Transformations from Floating Point to Fixed Point

• Existing fixed-point tools
  - Support fixed-point simulation
  - Convert floating-point code to raw fixed-point code
  - Leave the optimum wordlength to be found manually by trial and error

• Automating transformations
  - Fully automate conversion and wordlength optimization process (Proposed)

[Figure: floating-point program → code conversion → wordlength-optimized fixed-point program]

Fixed-point tools
• SNU gFix, Autoscaler
• CoWare SPW HDS
• Synopsys CoCentric
• MATLAB Fixed-point toolbox
• MATLAB Fixed-point blockset
• AccelChip DSP synthesis
• Catalytic RMS, MCS

Code Generation for Fixed-Point Program

• Adder function in MATLAB

(a) Floating-point program for adder

    function [c] = adder(a, b)
    c = 0;
    c = a + b;

(b) Raw fixed-point program (S, WL, and FWL determined by designers with trial and error)

    function [c] = adder_fx(a, b)
    c = 0;
    a = fi(a, 1, 32, 16);   % S = 1, WL = 32, FWL = 16
    b = fi(b, 1, 32, 16);
    c = fi(c, 1, 32, 16);
    c(:) = a + b;           % assign into c so that c keeps its fixed-point type

(c) Converted fixed-point program for automating optimization (Proposed)

    function [c] = adder_fx(a, b, numtype)
    c = 0;
    a = fi(a, numtype.a);   % numeric types are supplied by the caller
    b = fi(b, numtype.b);
    c = fi(c, numtype.c);
    c(:) = a + b;

fi(a, S, WL, FWL) is a constructor function for a fixed-point object in the Fixed-Point Toolbox [S: Signed, WL: Wordlength, FWL: Fraction length]
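For readers without the Fixed-Point Toolbox, the parameterized adder in (c) can be approximated in Python. The helper `to_fixed` is a hypothetical stand-in for `fi`, not its real implementation. It assumes round-to-nearest and saturation on overflow, and the wordlengths in `numtype` are arbitrary.

```python
import math

def to_fixed(x, signed, wl, fwl):
    """Quantize x to a fixed-point grid with wl total bits and fwl fraction bits.
    Assumed behavior: round to nearest, saturate on overflow."""
    scale = 2 ** fwl
    lo = -(2 ** (wl - 1)) if signed else 0
    hi = (2 ** (wl - 1)) - 1 if signed else (2 ** wl) - 1
    code = math.floor(x * scale + 0.5)   # integer code, rounded to nearest
    code = max(lo, min(hi, code))        # saturate to the representable range
    return code / scale

def adder_fx(a, b, numtype):
    """Python analogue of the parameterized adder: types come from the caller."""
    a = to_fixed(a, *numtype["a"])
    b = to_fixed(b, *numtype["b"])
    return to_fixed(a + b, *numtype["c"])

# Hypothetical (S, WL, FWL) triples that a search engine might propose.
numtype = {"a": (1, 16, 8), "b": (1, 16, 8), "c": (1, 8, 4)}
result = adder_fx(1.2345, 2.5, numtype)
print(result)
```

Passing the numeric types as an argument lets an optimizer try a new wordlength vector without regenerating the program.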

Automating Transformation Environment for Wordlength Optimization

• Given floating-point program and options, auxiliary programs are automatically generated

• Given input data, optimum wordlength is searched

[Figure: automated environment. A top program runs a search engine (gradient-based or genetic algorithm) together with an evaluation program (objectives). Other blocks shown: fixed-point program, floating-point program, error estimation, complexity estimation, and range estimation. Input: input data. Output: optimum wordlength]

Demo of Released Software

Conclusion

• Search for optimum wordlength
  - Gradient-based search reduces execution time with complexity-and-distortion measure method, but solutions could be trapped in a local optimum
  - Genetic algorithm can find distortion vs. complexity tradeoff curve, but it requires longer execution time

• Automate transformations from floating-point programs to fixed-point programs
  - Free software release is available at www.ece.utexas.edu/~bevans/projects/wordlength/converter/

End

Backup Slides

Case Study - Receiver: Gradient-Based Search

Search Method   Gradient Measure   Number of Simulations   Complexity Estimate (LUT)   Distortion (RMS)*
Gradient        DM                 66                      40.65                       0.083
Gradient        CDM                65                      43.65                       0.085
Gradient        CM                 195                     41.95                       0.081
Complete        -                  16^4                    -                           -

* Maximum distortion measured by bit error rate (BER) is 0.1

[Figure: receiver block diagram with demodulate and integrate & dump blocks]

Case Study - Receiver: Genetic Algorithm

* Population for one generation: 90

[Figure: four plots of the genetic algorithm population at the 25th, 50th, 100th, and 200th generations]

Fixed-Point Data Format

• Integer wordlength (IWL)
  - Number of bits assigned to integer representation

• Fractional wordlength (FWL)
  - Number of bits assigned to fraction

• Wordlength (WL)
  - WL = IWL + FWL

SystemC format (www.systemc.org)

[Figure: bit layout with sign bit S followed by bits X. The wordlength spans all bits; the integer wordlength lies to the left of the binary point and the fractional wordlength to the right]

π = 3.14159…(10) [Floating Point]

3.140625(10) = 011.001001(2) [WL=9; IWL=3; FWL=6]

3.141479492(10) = 011.0010010000111(2) [WL=16; IWL=3; FWL=13]
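The two fixed-point values of π can be reproduced with a short Python sketch. The helper name `fixed_point_bits` is hypothetical. It truncates rather than rounds, because the FWL=13 value shown here matches truncation, and it handles non-negative inputs only.

```python
import math

def fixed_point_bits(x, iwl, fwl):
    """Truncate non-negative x to fwl fraction bits.
    Returns (quantized value, bit string with binary point);
    the leading bit of the integer part is the sign bit."""
    wl = iwl + fwl
    code = math.floor(x * 2 ** fwl)          # truncation toward zero for x >= 0
    if not 0 <= code < 2 ** (wl - 1):
        raise ValueError("value does not fit in the given format")
    bits = format(code, "0{}b".format(wl))   # zero-padded to wl bits
    return code / 2 ** fwl, bits[:iwl] + "." + bits[iwl:]

print(fixed_point_bits(math.pi, 3, 6))
print(fixed_point_bits(math.pi, 3, 13))
```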

Wordlength Optimization Constraints

• Distortion constraint: d(w) ≤ Dmax

• Complexity constraint: c(w) ≤ Cmax

[Figure: two plots of application-specific distortion d(w) and implementation complexity c(w), one marking the bound Dmax and one marking the bound Cmax]

Enforcing both constraints bounds the search to a finite region

Wordlength Optimization

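• Wordlengths of signals (variables) in digital system as vector

\[ \mathbf{w} = \{ w_0, w_1, w_2, \ldots, w_{N-1} \} \]

• Multiple objective optimization

\[ \min_{\mathbf{w} \in \mathbb{I}^n} \; [\, c(\mathbf{w}),\ d(\mathbf{w}) \,] \quad \text{subject to} \quad d(\mathbf{w}) \le D_{\max},\ c(\mathbf{w}) \le C_{\max},\ \underline{\mathbf{w}} \le \mathbf{w} \le \overline{\mathbf{w}} \]

• Single objective optimization

\[ \min_{\mathbf{w} \in \mathbb{I}^n} \; \alpha_c \, c(\mathbf{w}) + \alpha_d \, d(\mathbf{w}) \quad \text{subject to} \quad d(\mathbf{w}) \le D_{\max},\ c(\mathbf{w}) \le C_{\max},\ \underline{\mathbf{w}} \le \mathbf{w} \le \overline{\mathbf{w}} \]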

Pareto Optimality

• Pareto optimality: “best that could be achieved without disadvantaging at least one group” [Allan Schick 1970]

• Pareto optimal set is set of nondominated solutions
  - E is dominated by C as all objectives for C are less than corresponding objectives for E
  - Solutions A, B, C, D are nondominated (not dominated by any solution)

• Pareto front is boundary (tradeoff curve) that connects Pareto optimal set solutions

[Figure: solutions A through I plotted as Objective 2 versus Objective 1 and marked as nondominated or dominated, with the Pareto front drawn through the nondominated solutions]
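Extracting the nondominated set from a list of candidate solutions can be sketched in Python as follows. The (area, error) points are made up for illustration. The dominance test uses the common convention of no worse in every objective and strictly better in at least one, which is slightly weaker than the "all objectives are less" wording on this slide.

```python
def dominates(p, q):
    """True if p dominates q when every objective is minimized:
    p is no worse in all objectives and strictly better in at least one."""
    return (all(a <= b for a, b in zip(p, q))
            and any(a < b for a, b in zip(p, q)))

def pareto_set(points):
    """Return the points that no other point dominates."""
    return [p for p in points if not any(dominates(q, p) for q in points)]

# Hypothetical (area in LUTs, RMS error) pairs.
points = [(20, 0.5), (40, 0.2), (60, 0.1), (80, 0.05), (70, 0.3), (90, 0.2)]
front = pareto_set(points)
print(front)
```

The scan compares every pair of points, which is adequate for a population of 90 but scales quadratically with population size.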

Comparison of Proposed Methods

                        Gradient-based search   Genetic algorithm
Type of Solution        One point               Family of points
Tradeoff Curve Found    No                      Yes
Execution Time          Short                   Long
Amount of Computation   Low                     High
Parallelism             Low                     High

Automatic Transformation Flow

• Code generation
  - Parse floating-point program
  - Generate a raw fixed-point program and auxiliary programs (top, objective, cost, etc.)

• Range estimation
  - Estimate range to avoid overflow (Analytical/Simulation)
  - Determine integer wordlength (IWL)

• Wordlength optimization
  - Optimize wordlength according to given input and error specification (Analytical/Simulation)
  - Determine fractional wordlength (FWL)

[Figure: flow from code generation to range estimation to wordlength optimization]
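Simulation-based range estimation can be illustrated with a small Python sketch that picks an IWL from observed signal values. The formula assumes a signed two's complement format whose IWL includes the sign bit. The function name, the minimum IWL of 1, and the sample values are assumptions.

```python
import math

def integer_wordlength(samples):
    """Smallest IWL (sign bit included, at least 1) such that every observed
    sample lies strictly inside (-2^(IWL-1), 2^(IWL-1))."""
    peak = max(abs(s) for s in samples)
    if peak == 0:
        return 1
    # floor(log2(peak)) + 1 magnitude bits, plus one sign bit.
    return max(1, math.floor(math.log2(peak)) + 2)

observed = [0.25, -1.5, 3.14159, 2.0]   # hypothetical simulation trace
iwl = integer_wordlength(observed)
print(iwl)
```

The rule is slightly conservative for a negative peak that is exactly a power of two, which would also fit in one bit fewer.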

Code Generations

[Figure: screenshots of a floating-point program and of running code generation]
