meeting 13 - eecs.umich.edu

Meeting 13

Summer 2009 Doing DSP Workshop

Today:

◮ Admin comments.

◮ Decimation in time DFT.

◮ Other fast algorithms.

◮ An old friend.

One graphic from TI materials.

Learn all you can from the mistakes of others. You won’t have time to make them

all yourself. — Alfred Sheinwold

Doing DSP Workshop – Summer 2009, Meeting 13, Tuesday – June 16, 2009

Projects

Audio waveform synthesizer –

sine, square wave, triangle, etc.

◮ Darin Rajabian

OFDM.

◮ Yu Wang

Motor speed control lab demonstration.

◮ Zharori Cong

◮ B.K. Kim

Remote camera using ZigBee.

◮ James Kim

◮ Jordan Adams

Digital Filter Study.

◮ Vindhya Reddy

◮ Joanna Widjaja

Ultrasonic Vision Aide.

◮ Ronald Deang

Not cast in concrete.


Suggested Project Phases

◮ Start up.

◮ Basically define the task, locate useful resources, and verbalize

a possible plan of attack.

◮ Initial Start.

◮ Develop the initial proposal. If applicable, do MATLAB

simulation. Identify required parts and other resources needed

to be purchased. Should have a reasonably clear understanding

of what is to be done and how. Set up goals and time line.

◮ Work in earnest.

◮ Program, build, debug. Repeat.

◮ Completion. Sometime in August.

◮ Demonstration to the workshop.

◮ Poster.

Feel free to use ChihWei and myself as resources.


Updated tentative schedule

Week of June 15: Exercise 5, controlSTICK ADC, DAC, xfer meas.

Tuesday – Fast DFTs.

Thursday – Xilinx 8-bit PicoBlaze microcomputer (VHDL).

Week of June 22: Exercise 6, real-time FFT and waveform evaluation.

Tuesday – TBD. KM away.

Thursday – TBD. KM away.

Weeks following —

Lecture and lab complete, focus on projects.


Lab floor to be done this week

The floors in EECS 4341 are scheduled to be stripped and waxed this

week.

Except for the tables we got everything up off of the floor on Friday.

We have kept the lab functional by moving the computers onto the tables

to retain access.


Lab tables to be replaced next week.

Friday all of the computers will be shut down and placed in

temporary storage.

The current lab tables will be removed and be replaced with

“real” tables with built in shelving. This hopefully will be done

early in the week.

Once the new lab tables are in place the computers, scopes and

signal generators will be put onto the new benches. Hopefully

the lab will be fully operational by the end of the week.


Today

◮ Fast DFT algorithms.

◮ Some observations on the C28x FFT support.

◮ A useful C version of the FFT.


“The” Fast Fourier Transform (FFT) Algorithm

There are many fast algorithms (FFTs) that can be used to

compute the Discrete Fourier Transform (DFT). The DFT is

defined as

$$X[k] = \sum_{n=0}^{N-1} x[n]\, e^{-j2\pi kn/N}, \qquad k = 0, 1, \ldots, N-1.$$

The nominal computational cost is $N^2$ complex MACs.

Any algorithm that significantly reduces this number can be

considered as being fast.

There are many fast algorithms. Some algorithms are faster

than others.

The metric by which to judge algorithms is not always clear.


Many ways of computing the DFT

The paper “An Algorithm for the Machine Computation of Complex Fourier Series” by Cooley and Tukey in 1965 was the first “modern” (or should we say early computer period?) publication of a fast algorithm for computing the DFT. This paper triggered the development of a large number of alternative procedures.

The FFT was first discovered by Gauss in 1805. It was used to calculate the orbit of an asteroid. It was found in one of his workbooks written in Latin. But that’s another story.

Some DFT algorithms:

◮ brute force

◮ Singleton’s DFT speed-up procedure

◮ Goertzel algorithm

◮ decimation in time

◮ decimation in frequency

◮ other radix algorithms

  ◮ four

  ◮ eight

  ◮ split radix

◮ Winograd’s short-length convolution algorithm

◮ prime factor method (Good-Thomas)

◮ Winograd Fourier transform algorithm (WFTA)


Need and capability

Everything has its time.

Richard Garwin had a need (nuclear monitoring).

John Tukey had an idea how to solve it.

James Cooley coded it up and made it work.

Computers were just then coming into general use.

And, of course, Gauss did “it” first.

Good/Thomas published the Prime Factor Algorithm earlier.

The Chinese Remainder Theorem is very, very old.

Almost every implementor has a different view and shares it.

There are well over 3000 publications about the FFT.

More appear to be generated almost continuously.

Who knows how many publications use or mention it?


Concepts important to fast DFT algorithms

Roots of unity, powers of $W_N = e^{-j2\pi/N}$.

Symmetry of the sine and cosine.

Index mappings.

Matrix Kronecker products.


Performance characterization constantly changes

Early effort largely minimized multiplication.

This evolved into minimizing the number of arithmetic operations.

Using today’s processors the goal is largely to minimize data movement.

When implementing an FFT on an ASIC, arithmetic becomes important again.

One can almost always trade between memory and execution time.

How does one do a gigapoint FFT?

How to exploit parallelism?

Bit serial arithmetic versions exist.

N specific FFT code generators exist.

Can be pipelined.

Is there a lower bound on computational cost?


The decimation-in-time radix-2 FFT

◮ N is assumed to be an integer power of 2.

◮ Divide the x[n] into two N/2 value sets based on even/odd index values.

◮ Form the DFT of each set and combine results to form the N value DFT.

◮ Repeat the procedure on each of the N/2 value DFTs.

◮ And so on.

The resulting nominal complex MAC count is $(N/2) \times \log_2(N)$.

    N      log2(N)   (N/2) x log2(N)       N^2
    64        6            192            4096
    128       7            448           16384
    256       8           1024           65536
    512       9           2304          262144
    1024     10           5120         1048576
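The table entries follow directly from the two formulas. The sketch below recomputes them; the function names are illustrative only, and N is assumed to be a power of two.

```c
#include <assert.h>

/* log2 of a power of two, found by counting right shifts */
unsigned log2_pow2(unsigned long n)
{
    unsigned r = 0;

    assert(n != 0 && (n & (n - 1)) == 0);   /* must be a power of two */
    while (n > 1) {
        n >>= 1;
        r++;
    }
    return r;
}

/* nominal complex MACs for the radix-2 DIT FFT: (N/2) * log2(N) */
unsigned long dit_mac_count(unsigned long n)
{
    return (n / 2) * log2_pow2(n);
}

/* nominal complex MACs for the DFT definition: N^2 */
unsigned long direct_mac_count(unsigned long n)
{
    return n * n;
}
```

For N = 512 this gives 256 × 9 = 2304 MACs for the FFT against 262144 for the definition.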


Separating the even and odd indexed samples

Start with the forward transform equation

$$X[k] = \sum_{n=0}^{N-1} x[n]\, e^{-j2\pi kn/N}, \qquad k = 0, 1, \ldots, N-1.$$

Even numbers have the form 2p and odd numbers have the form 2q + 1 where p and q go from 0, 1, 2, ..., N/2 − 1.

$$\begin{aligned}
X[k] &= \sum_{p=0}^{N/2-1} x[2p]\, e^{-j2\pi k 2p/N} + \sum_{q=0}^{N/2-1} x[2q+1]\, e^{-j2\pi k(2q+1)/N} \\
     &= \sum_{p=0}^{N/2-1} x[2p]\, e^{-j2\pi kp/(N/2)} + e^{-j2\pi k/N} \sum_{q=0}^{N/2-1} x[2q+1]\, e^{-j2\pi kq/(N/2)} .
\end{aligned}$$

We now have a weighted sum of two N/2 value DFTs. Repeat the process.


The signal flow graph

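[Figure: signal flow graph of the first decimation-in-time step for N = 8. The even-indexed inputs x[0], x[2], x[4], x[6] feed one 4-point DFT producing Xe[0..3]; the odd-indexed inputs x[1], x[3], x[5], x[7] feed another producing Xo[0..3]. Each output X[k], k = 0..7, is the sum of an Xe value and an Xo value weighted by W_N^k.]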


Exploiting symmetry

[Figure: the same N = 8 flow graph with the twiddle factors reduced to W_N^0 through W_N^3. Because W_N^(k+N/2) = −W_N^k, the outputs X[4..7] reuse the products formed for X[0..3] with a subtraction in place of the addition.]


Repeat until done

[Figure: complete 8-point decimation-in-time flow graph with inputs x[0..7] and outputs X[0..7]. Three layers of add/subtract butterflies are shown. The output layer uses W_8^0, W_8^1, W_8^2, W_8^3; the middle layer uses W_8^0 and W_8^2; the input layer uses only W_8^0.]


Butterflies and bit reverse addresses

If one can do two butterflies simultaneously then an algorithm exists that allows in/out normal ordering and in-place computation.

    normal   bit reverse
    000      000
    001      100
    010      010
    011      110
    100      001
    101      101
    110      011
    111      111

From Wikipedia.
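The table can be generated by shifting bits out of the low end of the index and into the low end of the result. A minimal sketch follows; `bit_reverse` is an illustrative name, not a library function.

```c
#include <assert.h>

/* Reverse the low 'bits' bits of 'value'; higher bits are discarded. */
unsigned bit_reverse(unsigned value, unsigned bits)
{
    unsigned reversed = 0;

    assert(bits <= 8 * sizeof(unsigned));
    for (unsigned b = 0; b < bits; b++) {
        reversed = (reversed << 1) | (value & 1u);  /* take the lowest bit */
        value >>= 1;
    }
    return reversed;
}
```

With `bits` set to 3, index 1 (001) maps to 4 (100) and index 3 (011) maps to 6 (110), matching the table.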


Pseudo code

Can organize using three loops, one each for level (or layer), group, and butterfly.

    nFFTs = N/2; FFTsize = 2;                  // R = log2(N) layers
    for(r = 0; r < R; r++) {
        for(fft = 0; fft < nFFTs; fft++) {
            for(butterfly = 0; butterfly < (FFTsize/2); butterfly++) {
                top_index = fft*FFTsize+butterfly;
                bot_index = top_index+(FFTsize/2);
                w_index = butterfly*nFFTs;
                temp = W[w_index]*data[bot_index];
                data[bot_index] = data[top_index]-temp; // update bot first!
                data[top_index] = data[top_index]+temp; // now update top
            }
        }
        nFFTs = (nFFTs/2); FFTsize = (FFTsize*2);
    }

The input values are assumed to have been reordered.


Perhaps speeding up the indexing

    nFFTs = N/2; FFTsize = 2;                  // R = log2(N) layers
    for(r = 0; r < R; r++) {
        FFTstart = 0;
        for(fft = 0; fft < nFFTs; fft++) {
            w_index = 0;
            for(butterfly = 0; butterfly < (FFTsize/2); butterfly++) {
                top_index = FFTstart+butterfly;
                bot_index = top_index+(FFTsize/2);
                temp = W[w_index]*data[bot_index];
                data[bot_index] = data[top_index]-temp; // update bot first!
                data[top_index] = data[top_index]+temp; // now update top
                w_index = w_index+nFFTs;
            }
            FFTstart = FFTstart+FFTsize;
        }
        nFFTs = (nFFTs>>1); FFTsize = (FFTsize<<1); // shifts are easy to do
    }

The input values are assumed to have been reordered.


Reordering the input

[Figure: 8-point decimation-in-time flow graph with inputs x[0..7] and outputs X[0..7]. Twiddle factors are written per layer as W_8^0 through W_8^3, W_4^0 and W_4^1, and W_2^0. The first-layer butterfly outputs are labeled by the input indices they combine (0+4, 0−4, 2+6, 2−6, 1+5, 1−5, 3+7, 3−7) and are shown in two different orderings.]


Continuing the reordering

[Figure: the same 8-point flow graph with intermediate nodes labeled a through h, shown before and after a reordering in which nodes c and d swap places. Twiddle factors are W_8^0 through W_8^3, W_4^0 and W_4^1, and W_2^0.]


Reordered 8-point radix-2 DIT

[Figure: reordered 8-point decimation-in-time flow graph with inputs x[0..7] and outputs X[0..7]. Three layers of butterflies are shown, using W_8^0 through W_8^3 in the output layer, W_8^0 and W_8^2 in the middle layer, and W_8^0 in the input layer.]


Can start going the other way

[Figure: 8-point flow graph showing a single layer of butterflies with twiddle factors W_N^0 through W_N^3, inputs x[0..7], and outputs X[0..7].]


The 8-point radix-2 DIF FFT

[Figure: 8-point decimation-in-frequency flow graph with inputs x[0..7] and outputs X[0..7]. Three layers of butterflies are shown, using twiddle factors W_8^0 through W_8^3, then W_8^0 and W_8^2, then W_8^0.]


The flood gate was opened

Basically we exploited the way we wrote the indices of the values being

transformed and ended up with a fast algorithm.

We also got a “new” algorithm by manipulating the signal flow graph.

There are lots of ways to write indices and lots of ways to reorder the

data flow.

Which is “best”? What is meant by best?


How many ways are there to index?

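Numbers can be written as polynomials. For example we can write

$$1234_{10} = 1\times 10^3 + 2\times 10^2 + 3\times 10^1 + 4\times 10^0 .$$

We refer to 10 as being the radix. In general, a number N with D digits $d_k$ in radix r is

$$N = \sum_{k=0}^{D-1} d_k r^k .$$

Similarly we can write numbers in binary form as

$$\begin{aligned}
1234_{10} &= 1\times 2^{10} + 0\times 2^{9} + 0\times 2^{8} + 1\times 2^{7} + 1\times 2^{6} + 0\times 2^{5} \\
          &\quad + 1\times 2^{4} + 0\times 2^{3} + 0\times 2^{2} + 1\times 2^{1} + 0\times 2^{0} \\
          &= 10011010010_2 .
\end{aligned}$$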
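The digits of such an expansion come from repeated division by the radix. A small sketch, with the hypothetical name `radix_digits`, that stores the least significant digit first:

```c
#include <assert.h>

/* Write value as sum_k d_k * radix^k, storing d_0 first.
   Returns the number of digits D. */
unsigned radix_digits(unsigned long value, unsigned radix,
                      unsigned *digits, unsigned max_digits)
{
    unsigned count = 0;

    assert(radix >= 2);
    do {
        assert(count < max_digits);
        digits[count++] = (unsigned)(value % radix);  /* next digit d_k */
        value /= radix;
    } while (value != 0);
    return count;
}
```

Calling it with 1234 and radix 10 yields the digits 4, 3, 2, 1; with radix 2 it yields the eleven bits of 10011010010 in reverse order.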


A simple factoring of N

Numbers can also be written as the product of their factors.

For example

1234 = 2× 617.

Consider the number $N = N_1 N_2$ where $N_1$ and $N_2$ are relatively prime. It can be shown that we can uniquely write the integer values from 0 through N − 1 as

$$n = n_2 N_1 + n_1, \qquad n_1 = 0, 1, \ldots, N_1 - 1, \quad n_2 = 0, 1, \ldots, N_2 - 1$$

or alternatively as

$$k = k_1 N_2 + k_2, \qquad k_1 = 0, 1, \ldots, N_1 - 1, \quad k_2 = 0, 1, \ldots, N_2 - 1 .$$


FFT based only on simple factoring

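$$X[k] = \sum_{n=0}^{N-1} x[n]\, e^{-j2\pi kn/N},$$

$$\begin{aligned}
X[k_1 N_2 + k_2] &= \sum_{n_1=0}^{N_1-1} \sum_{n_2=0}^{N_2-1} x[n_2 N_1 + n_1]\, e^{-j2\pi (k_1 N_2 + k_2)(n_2 N_1 + n_1)/(N_1 N_2)} \\
&= \sum_{n_1=0}^{N_1-1} \sum_{n_2=0}^{N_2-1} x[n_2 N_1 + n_1]\, e^{-j2\pi (k_1 n_1 N_2 + k_2 n_2 N_1 + k_2 n_1)/(N_1 N_2)} \\
&= \sum_{n_1=0}^{N_1-1} e^{-j2\pi k_1 n_1/N_1} \left[ e^{-j2\pi k_2 n_1/N} \sum_{n_2=0}^{N_2-1} x[n_2 N_1 + n_1]\, e^{-j2\pi k_2 n_2/N_2} \right] .
\end{aligned}$$

The $k_1 n_2 N_1 N_2$ term drops out of the exponent because $e^{-j2\pi k_1 n_2} = 1$.

Procedure:

◮ Form $N_1$ $N_2$-point DFTs.

◮ Weight the results using twiddle factors.

◮ Form $N_2$ $N_1$-point DFTs.

The number of complex multiplications is

$$N_1 N_2^2 + N_1 N_2 + N_2 N_1^2 = N_1 N_2 (N_1 + 1 + N_2) .$$

For N = 15 = 3 × 5 compare $N^2 = 225$ to $N_1 N_2 (N_1 + 1 + N_2) = 135$.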


Prime Factor Algorithm index mapping

A more generalized mapping of the indices is

$$n = ((K_1 n_1 + K_2 n_2))_N \quad \text{where } 0 \le n_1 < N_1, \; 0 \le n_2 < N_2$$

and

$$k = ((K_3 k_1 + K_4 k_2))_N \quad \text{where } 0 \le k_1 < N_1, \; 0 \le k_2 < N_2 .$$

The $((\,\cdot\,))_N$ denotes using the quantity contained in the parentheses modulo N.


Prime factor decomposition

$$((kn))_N = ((K_1 K_3 n_1 k_1 + K_1 K_4 n_1 k_2 + K_2 K_3 n_2 k_1 + K_2 K_4 n_2 k_2))_N$$

If values of $K_1$, $K_2$, $K_3$, and $K_4$ can be determined such that

$$((K_1 K_4))_N = ((K_2 K_3))_N = 0$$

then the DFT becomes

$$X[k] = \sum_{n_1=0}^{N_1-1} e^{-j2\pi k_1 n_1 K_1 K_3/N} \sum_{n_2=0}^{N_2-1} x[n_1, n_2]\, e^{-j2\pi k_2 n_2 K_2 K_4/N} .$$

Both the condition for generating 1-to-1 index maps and the above modulo relationship can be satisfied. The result is a mapping of a one-dimensional DFT into a two-dimensional DFT. For this case the number of complex multiplications is

$$N_2 N_1^2 + N_1 N_2^2 .$$

For N = 15 = 3 × 5 this gives 120 complex multiplications.


PFA cost in terms of multiplications

For $N_f$ relatively prime factors of N the number of complex multiplications becomes

$$N \sum_{i=0}^{N_f-1} N_i .$$

For N = 8184 = 3 × 8 × 11 × 31 the number of complex multiplications is 53 × 8184 as compared to the unmodified DFT which uses 8184 × 8184.

The PFA uses a factor of about 154 fewer.

If it were possible to use an 8192 value transform instead, a DIT FFT would nominally use (8192/2) × 13 complex multiplications. This is a factor of about 1260 fewer multiplications than needed by the unmodified DFT definition.
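The figures quoted above are easy to recompute. A small sketch with the illustrative name `pfa_mult_count`:

```c
#include <assert.h>
#include <stddef.h>

/* Nominal PFA complex multiplication count: N * (N_0 + N_1 + ...),
   where N is the product of the supplied factors. */
unsigned long pfa_mult_count(const unsigned long *factors, size_t count)
{
    unsigned long n = 1, sum = 0;

    assert(factors != NULL && count > 0);
    for (size_t i = 0; i < count; i++) {
        n *= factors[i];
        sum += factors[i];
    }
    return n * sum;
}
```

For the factors 3, 8, 11, 31 this returns 53 × 8184 = 433752, and 8184² / 433752 is about 154. The 8192 value DIT count is (8192/2) × 13 = 53248, and 8192² / 53248 is about 1260.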


C28x FFT and related functions

Started out with sprc081.zip and eventually ended up in c:\tidcs\c28\dsp_tbox. Moved to the lab 6 directory.

Documented in the FFT Library Module User’s Guide, C28x Foundation Software, contained in fft_mdl.pdf. Essential reading.

Execution cycles:

    FFT size   Case 1: TF(Q31)   Case 2: TF(Q30)   Case 3: TF(Q30) & OTP

    32-bit Real FFT
    128              6509              6763              7017
    256             14756             15394             16032
    512             33081             34615             36149
    1024            73422             77004             80536

    32-bit Complex FFT
    128             11159             11671             12183
    256             25901             27181             28461
    512             59075             62146             65217
    1024           132823            139991            147159


Time and storage

1024 real using a 60 MHz clock takes about 1.2 ms.

32-bit 1024 real requires 2048 16-bit words for data.

1024 complex using a 60 MHz clock takes about 2.2 ms.

32-bit 1024 complex requires 4096 16-bit words for data.

The total RAM on the C28017 controlSTICK is 6K 16-bit words.

TI functions use DIT with input in bit-reverse addressing form.

The TI functions do not scale as part of the transform.

TI does not provide an inverse FFT for the C28x.


FFT input scaling

Consider a solitary sinusoidal input where B-bit sample values are placed into the low bits:

$$\cos(2\pi f_c t) = \frac{e^{j2\pi f_c t} + e^{-j2\pi f_c t}}{2}$$

For an N-value DFT the gain at the $f_c$ frequency (assuming it matches an analysis frequency) is N. If a 1024 point transform is taken then the result might require 10 + B − 1 bits.

Using the C28017’s 12-bit samples the maximum amplitude FFT value is 2047 × 1024/2 = 1,048,064. This will fit in 21 bits.

Actually, a maximum-amplitude complex DC input is the worst case input waveform.

Another “bad” waveform is a complex max amplitude square wave. The fundamental has amplitude 2/π instead of 1/2.
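The word-length arithmetic for the single sinusoid can be checked with a few lines of integer code. The function names below are illustrative; the bit count includes the sign bit of a two's complement word.

```c
#include <assert.h>

/* Two's complement bits needed to hold magnitudes up to max_magnitude,
   counting one bit for the sign. */
unsigned twos_complement_bits(unsigned long max_magnitude)
{
    unsigned bits = 1;                  /* the sign bit */

    while (max_magnitude != 0) {
        bits++;
        max_magnitude >>= 1;
    }
    return bits;
}

/* Peak DFT magnitude for a real sinusoid of the given amplitude that
   falls exactly on an analysis frequency: amplitude * N / 2. */
unsigned long sinusoid_peak(unsigned long amplitude, unsigned long n_points)
{
    assert(n_points % 2 == 0);
    return amplitude * (n_points / 2);
}
```

With amplitude 2047 and 1024 points the peak is 1,048,064, which needs 20 magnitude bits plus a sign bit, that is 21 bits, or 10 + B − 1 with B = 12.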


Scaling when taking the IDFT

For the 12-bit single sine wave, using the DFT to compute the IDFT will increase the word size by 10 bits. The result will fit using a 32-bit word size. Simply transform, then shift right by 10 bits.

Do we need 21 bits or 12 bits for the result? We started with 12.


What to do if there aren’t enough bits?

On a fixed point DSP computer floating point is not normally an option. When simulating floating point, performance takes a big hit.

One could scale the partial results down by a factor of 2 for each FFT layer. This is commonly done. It is conservative, often scaling values more than necessary, which costs noise performance.

A hybrid fixed point / floating point technique termed block floating point is often a viable option.

One can find code examples on the web both for TI and Motorola DSP devices for both scaling procedures. Block floating point scaling is well supported in the Motorola DSP devices.


Block floating point

I can write values as $m \times 2^c$ where m is a two’s complement fraction having magnitude less than one and c is a two’s complement integer.

0.25 can be written as $0.5 \times 2^{-1}$ or as $0.0625 \times 2^{2}$.

16 can be written as $0.5 \times 2^{5}$ or as $0.03125 \times 2^{9}$.

If we have an array of values all using the same value of c we have a set of values referred to as being in block floating point form.

In order to keep values fractional they must be scaled such that the magnitude of the largest value is less than 1.

FFTs formed using block floating point are generally more accurate than fixed point FFTs and less accurate than equivalent floating point ones.

The DSP56303 has hardware support that allows block floating point FFTs to be formed only slightly slower than fixed point ones.

I don’t know how well the C5510 supports block floating point.
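The core operation in block floating point is renormalizing a block so that its largest value uses the available word length while every value shares one exponent. The sketch below assumes 16-bit fractions in Q15 form, so each stored integer v represents (v / 32768) × 2^exponent. The name `block_normalize` is hypothetical and the code does not model the hardware support of any particular DSP.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Shift every value in the block left by the same amount so that the
   largest magnitude just fits in 16 bits, and return the shared
   exponent lowered by that shift so the represented values are unchanged. */
int block_normalize(int16_t *block, size_t count, int exponent)
{
    int32_t peak = 0;
    int shift = 0;

    assert(block != NULL);

    /* find the largest magnitude in the block */
    for (size_t i = 0; i < count; i++) {
        int32_t mag = block[i] < 0 ? -(int32_t)block[i] : (int32_t)block[i];
        if (mag > peak)
            peak = mag;
    }
    if (peak == 0)
        return exponent;                /* all zeros: nothing to do */

    /* largest shift that keeps the peak within the 16-bit range */
    while ((peak << (shift + 1)) <= INT16_MAX)
        shift++;

    /* multiply rather than shift so negative values stay well defined */
    for (size_t i = 0; i < count; i++)
        block[i] = (int16_t)(block[i] * ((int32_t)1 << shift));

    return exponent - shift;
}
```

In an FFT the opposite adjustment is applied between layers: if a layer could overflow, the whole block is shifted right and the shared exponent is raised.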


Singleton’s DFT speed-up procedure

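In 1969 Singleton published a simple algorithm that reduces the number of multiply operations for DFTs by a factor of four.

$$Z[k] = \sum_{n=0}^{N-1} W_N^{kn} z[n] = \sum_{n=0}^{N-1} \left( z_e[n] \cos(2\pi kn/N) - j z_o[n] \sin(2\pi kn/N) \right)$$

Here $z_e[n] = (z[n] + z[N-n])/2$ and $z_o[n] = (z[n] - z[N-n])/2$ are the even and odd parts of z[n], with indices taken modulo N.

Write Z[k] in terms of even and odd parts as $Z[k] = Z_e[k] + Z_o[k]$.

$$Z_e[k] = \sum_{n=0}^{N-1} z_e[n] \cos(2\pi kn/N)$$

$$Z_o[k] = -j \sum_{n=0}^{N-1} z_o[n] \sin(2\pi kn/N)$$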


Singleton’s procedure continued

For N odd we can write

$$Z_e[k] = z[0] + \sum_{n=1}^{(N-1)/2} (z[n] + z[N-n]) \cos(2\pi kn/N),$$

$$Z_o[k] = -j \sum_{n=1}^{(N-1)/2} (z[n] - z[N-n]) \sin(2\pi kn/N).$$

Because of symmetry the values of $Z_e[k]$ and $Z_o[k]$ need only be computed for 0 ≤ k ≤ (N − 1)/2. Note that there were pairs of two’s that cancelled out.

The above sums can be evaluated using

$$\text{multiplies} = 2\,\frac{N-1}{2}\,\frac{N-1}{2} + 2\,\frac{N-1}{2}\,\frac{N-1}{2} = (N-1)^2 .$$

Depending on whether or not there are two ALUs and how they are arranged, the multiplication of complex values by real values in a single instruction time may be possible. This would result in the reduction of the number of multiplies by another factor of two.


Singleton’s procedure completed

The even N case is left as an exercise (not assigned).

There is going to be a pass through the data to compute the even and odd parts at the start of the procedure and a similar pass at the end. This will add some additional overhead.

Depending on how the particular DSP architecture we are using does things, a speed up of perhaps as much as 4 to 8 times may be possible over brute force.

This works even if N is prime!


Is all sweetness and light?

Of course, not all is sweetness and light. There are many worries

associated with efficiently computing DFTs. Some of these are:

◮ It is not always possible to compute a DFT in place. Quite often it is necessary to swing between a pair of working areas as one moves between layers.

◮ Does there exist code or at least an algorithm for efficiently computing DFTs for the prime factors? There is always the possibility of Singleton’s procedure at least dangling the prospect of a four times speed up. However, better speedups may be possible.

◮ The transformed values generally need to be reordered. Permutation arrays are useful but these too consume memory resources.


Three multiplier complex multiplication

In general $(a + jb) \times (c + jd) = (ac - bd) + j(bc + ad)$.

This can be written using three multiplications as

$$(a + jb) \times (c + jd) = a(c - d) + (a - b)d + j\,[\,b(c + d) + (a - b)d\,] .$$

[Figure: block diagram of the three-multiplier complex multiplication.]

When multiplying by a constant, c + jd, the c + d and c − d can be table lookup.
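The identity translates into code with the shared product (a − b)d computed once. A minimal sketch; the type `cplx` and the function `cmul3` are illustrative names.

```c
#include <assert.h>

typedef struct { double re, im; } cplx;

/* (a + jb)(c + jd) using three real multiplications. */
cplx cmul3(cplx x, cplx y)
{
    double shared = (x.re - x.im) * y.im;       /* (a - b) d            */
    cplx p;

    p.re = x.re * (y.re - y.im) + shared;       /* a(c - d) + (a - b)d  */
    p.im = x.im * (y.re + y.im) + shared;       /* b(c + d) + (a - b)d  */
    return p;
}
```

For (1 + 2j)(3 + 4j) the shared term is −4, giving −1 − 4 = −5 for the real part and 14 − 4 = 10 for the imaginary part. When `y` is a fixed twiddle factor, `y.re - y.im` and `y.re + y.im` can be stored in the table in place of being computed each time.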


C FFT — An old friend!

/* Fast Fourier Transform Function (fft2)

   Adapted from:

   The Fast Fourier Transform and its Applications
   J. W. Cooley, P. A. Lewis, and P. D. Welch
   IEEE Transactions on Education, Vol. 12, No. 1,
   March 1969, pp 27-34.

   28Feb87 Converted to C .. K. Metzger
   06Feb91 High-C conversion .. K.Metzger

   Function forms the discrete Fourier transform of an array
   of double precision complex values. An integer power of
   two number of values is assumed to be contained in a huge
   array.

   void fft2(data, log2n, direction)

   data       huge pointer to double precision complex valued
              data stored re,im,re,im,...

   log2n      int log base 2 of number of points to
              transform. Allowed range is 1 thru NLIMIT.

   direction  int which is - if going from time to frequency
              (uses -sine and divides values by number of
              complex values). If >=0 goes from frequency to
              time.

*/


Bit reverse reorder the input

#include <math.h>

/* pi is not declared on the slide; it is assumed here to be a
   file-scope variable that starts at zero and is set on first use. */
static double pi = 0.0;

void fft2(double *data, int log2n, int direction)
{
    unsigned n, i, j, el, le, le_half, to_freq;
    register unsigned val_i, rev_i;
    double *ptr1, *ptr2, temp, dbl_n, arg, t_re, t_im, u_re, u_im, w_re, w_im;

    if (pi==0.0) pi=4.0*atan(1.0);
    to_freq=(direction<0) ? 1 : 0;
    dbl_n=(double)(n=1<<log2n);
    for (i=1; i<n-1; i++) {
        val_i=i; rev_i=0;
        for (j=0; j<(unsigned)log2n; j++) {     /* bit reverse i */
            rev_i=(rev_i<<1)|(val_i&0x0001);
            val_i>>=1;
        }
        if (rev_i>i) {                          /* swap each pair once */
            temp= *(ptr1=data+(i<<1));          /* real parts */
            *ptr1= *(ptr2=data+(rev_i<<1));
            *ptr2++=temp;
            temp= *(++ptr1);                    /* imaginary parts */
            *ptr1= *ptr2;
            *ptr2=temp;
        }
    }

The C5510 has hardware to (hopefully) simplify this task.


Compute the FFT and maybe normalize

    for (el=0; el<(unsigned)log2n; el++) {
        le=(le_half=1<<el)<<1;
        u_re=1.0; u_im=0.0;
        w_re=cos(arg=pi/le_half);
        w_im=(to_freq) ? -sin(arg) : sin(arg);
        for (j=0; j<le_half; j++) {
            for (i=j; i<n; i+=le) {
                ptr2=(ptr1=data+((i+le_half)<<1))+1;
                t_re= *ptr1*u_re-*ptr2*u_im;
                t_im= *ptr1*u_im+*ptr2*u_re;
                ptr2=data+(i<<1);
                *ptr1++= *ptr2++-t_re;
                *ptr1= *ptr2-t_im;
                *ptr2--+=t_im;
                *ptr2+=t_re;
            }
            t_re=u_re;
            u_re=u_re*w_re-u_im*w_im;
            u_im=t_re*w_im+u_im*w_re;
        }
    }
    if (to_freq) {
        for (i=0; i<n; i++) {
            *data++/=dbl_n;
            *data++/=dbl_n;
        }
    }
    return;
}

