Post on 21-Dec-2015
1
Thesis committee:
Roger Dannenberg (Chair)
Guy Blelloch
Robert Harper
Perry Cook, Princeton University
Temporal type constructors for computer music programming
2
Computer music programming
Subdomains: Digital signal processing. Response to asynchronous events. Representations of musical and sonic
structure. Example applications:
Synthesize audio from a musical score. Abstract features from audio; alter features. Transform audio to compress it.
3
Analysis of audio
[Figure: analysis and resynthesis of audio. Stage labels: analyze, abstract, (modify), resynthesize, render. Axis labels: amplitude (pp to f) and frequency.]
4
The goals
Computer music programming should be
expressive: programs are clear and concise.
general: programs fall within the expressive range.
5
The current tradeoff
[Figure: plot with axes “general” and “expressive”, locating unit-generator programming (Csound), low-level programming (C++), and “the promised land”.]
6
What we have now
“Unit generator” programming (Csound). User configures black-box audio processors. Can’t express new DSP or new kinds of data.
New kind of data: spectral frames, for example.
Low-level programming (C++). Cumbersome without a computer music library. Libraries don’t support new kinds of data,
and don’t give much benefit for new DSP.
7
What do we need?
Write arbitrary DSP in a high-level language. No more writing unit generators in C.
Types higher and lower than “audio stream”.
higher: analysis frames for a new representation.
lower: access to individual samples for new DSP.
8
My proposal
Temporal type constructors. Proposed set: event, vector, infinite vector. Enable a pure applicative programming
style.
Through temporal type constructors, computer music programming can be both expressive and general.
9
A taste of the results
Chronic is a prototype system using this idea.
The FOF synthesis algorithm can’t be written in Csound.
C implementation is 235 lines, and awkward.
Chronic implementation is 34 lines, and closer to our idea of the algorithm.
10
Outline
Temporal type constructors. Code examples in Chronic. Related work. Chronic internals. Future work. Conclusions.
11
Temporal type constructors
α event: timestamped α, e.g. as a pair (α, time).
α vec: finite vector of α, an array of elements.
α ivec: infinite vector of α, a time-indexed stream.
time: integer sample count.
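A minimal sketch of how these constructors might be written as plain OCaml types. The encodings (a pair for event, an array for vec, the standard lazy Seq.t for ivec) and the names at, chords and count_from are assumptions for illustration, not Chronic's definitions.

```ocaml
(* Hypothetical encodings; Chronic's real definitions are not shown here. *)
type time = int                 (* integer sample count *)
type 'a event = 'a * time       (* a value paired with its timestamp *)
type 'a vec = 'a array          (* finite vector *)
type 'a ivec = 'a Seq.t         (* infinite vector, modelled as a lazy sequence *)

(* Constructors compose: a chord sequence is a finite vector of timed chords. *)
type pitch = int
type chordseq = pitch vec event vec

(* Build an event; the slides write this as  x @@ t. *)
let at (x : 'a) (t : time) : 'a event = (x, t)

let chords : chordseq =
  [| at [| 60; 64; 67 |] 0; at [| 65; 69; 72 |] 44100 |]

(* The time-indexed stream n, n+1, n+2, ... *)
let rec count_from (n : int) : int ivec =
  fun () -> Seq.Cons (n, count_from (n + 1))
```

Because the constructors are ordinary type operators, a new representation such as chordseq is a one-line definition rather than a new atomic type.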
12
Digital audio stream
audio = sample ivec
(float might be chosen as the sample type.)
[Figure: a stream of samples S.]
13
Multi-channel audio stream
multi_audio = sample ivec vec
[Figure: four parallel streams of samples S, one per channel.]
14
Short-time spectrum data
spectra = complex vec ivec
[Figure: a stream of frames, each a vector of complex values C.]
15
A chord sequence
chordseq = pitch vec event vec
[Figure: a finite sequence of timestamped (@) chords, each a vector of pitches P.]
16
Musical-keyboard events
MIDI = (pitch * velocity) event ivec
[Figure: a stream of timestamped (@) pairs of pitch P and velocity V.]
17
Gestural musical events
violin = (pitch vec * bowing vec) event ivec
[Figure: a stream of timestamped (@) events, each pairing a vector of pitches P with a vector of bowing values B.]
18
Explicit vs. implicit time
Implicit (Csound): out = -in
Code runs in a context holding the current time:
for (t=0; ; ++t) out[t] = -in[t]
Looped unavoidably; hope it’s what you want.
Explicit (Chronic): out = map (λx. -x) in
out and in are of type float ivec, i.e. they are explicitly temporal data.
Explicit model, with map, subsumes implicit.
19
Explicit time
Time information is built into the data.
Code can stand outside of time, vs. operating within some implicit “now”.
Advantages:
A strictly more powerful model of time. Implicit time can do delay, but can’t do the inverse.
Types are more tractable than code.
The FOF example will show how this works.
20
Chronic
Built inside O’Caml as a set of libraries.
core:
E: α event
V: α vec
IV: α ivec
EV: α event vec
EIV: α event ivec
library:
L: float and other
LV: vec
…
21
A couple of IV functions
IV.iterate (fun x -> x +. 0.5) 1.
[| 1.; 1.5; 2.; 2.5; ... |]
(* y = IV.map succ (IV.delay 2 y) *)
IV.delay_rec 2 [| 0; 5 |] (IV.map succ)
[| 0; 5; 1; 6; ... |]
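A minimal sketch of these two functions over OCaml's standard lazy Seq.t, to show what the results above mean. It assumes an ivec can be modelled as a Seq.t. The helper names prepend and take are hypothetical, and this delay_rec takes the delay length from the initial array rather than as a separate argument. Chronic's own implementation is block-based and is not reproduced here.

```ocaml
(* Assumption: an infinite vector is modelled as OCaml's lazy 'a Seq.t. *)

(* iterate f x0 = x0, f x0, f (f x0), ... *)
let rec iterate (f : 'a -> 'a) (x0 : 'a) : 'a Seq.t =
  fun () -> Seq.Cons (x0, iterate f (f x0))

(* Prepend the elements of a finite array to a lazily evaluated stream. *)
let prepend (init : 'a array) (rest : 'a Seq.t) : 'a Seq.t =
  let rec go i () =
    if i < Array.length init then Seq.Cons (init.(i), go (i + 1)) else rest ()
  in
  go 0

(* delay_rec init f solves  y = init followed by (f y).
   init must be non-empty, so that the feedback has at least one sample of delay.
   Nothing is memoised: this is illustrative, not efficient. *)
let delay_rec (init : 'a array) (f : 'a Seq.t -> 'a Seq.t) : 'a Seq.t =
  let rec y () = prepend init (f y) () in
  y

(* First n elements of a stream, as a list. *)
let rec take n (s : 'a Seq.t) : 'a list =
  if n <= 0 then []
  else
    match s () with
    | Seq.Nil -> []
    | Seq.Cons (x, rest) -> x :: take (n - 1) rest

let halves = take 4 (iterate (fun x -> x +. 0.5) 1.)
let interleaved = take 6 (delay_rec [| 0; 5 |] (Seq.map succ))
```

Here halves is [1.; 1.5; 2.; 2.5] and interleaved is [0; 5; 1; 6; 2; 7], matching the first elements shown on the slide.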
22
A couple of library functions
let fs = 44100. (* sampling frequency *)
LV.osc_sine 1000 (220./.fs) 0.25
(* 1000 samples of 220-Hz cosine *)
LIV.para_eq (1200./.fs) 12. 0.5 x
(* filter x to boost a 0.5-octave band around 1200 Hz by 12 dB *)
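A minimal sketch of what a function like LV.osc_sine might compute. The units are assumptions inferred from the call above (frequency in cycles per sample, phase in cycles), and this osc_sine is a hypothetical stand-in rather than the library's code.

```ocaml
(* Hypothetical stand-in for LV.osc_sine: n samples of a sinusoid.
   Assumed units: freq in cycles per sample, phase in cycles. *)
let osc_sine (n : int) (freq : float) (phase : float) : float array =
  let two_pi = 8. *. atan 1. in
  Array.init n (fun i -> sin (two_pi *. (phase +. (freq *. float_of_int i))))

let fs = 44100. (* sampling frequency, as on the slide *)

(* A quarter-cycle phase offset turns the sine into a cosine. *)
let tone = osc_sine 1000 (220. /. fs) 0.25
```

Under these assumptions the first sample is sin(π/2) = 1, which is why the slide describes the result as a cosine.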
23
Examples built with Chronic
FOF synthesis. Computer-music scores. Two reverberators. An FFT-based pitch shifter.
24
FOF synthesis
Makes a sound with a peak in its spectrum.
[Figure: spectrum (level against frequency), marking the pitch and the peak.]
25
The FOF waveform
Series of enveloped sine-wave ‘grains’.
[Figure: the FOF waveform, with intervals labelled 1 / F_pitch and 1 / F_peak.]
26
A fitting data type
grains: float vec event ivec
[Figure: a stream of timestamped (@) grains, each a vector of floats F.]
27
Skeleton of FOF code
let fof (f_pitch: float ivec) phase0 f_peak bandwidth db rise fall dur (risefalltab: float vec) =
  let grain_times = LIV.phasor_wrap f_pitch phase0 in (* float ivec *)
  let fgrain t = ... (* miracle occurs *) in grain @@ (int t) in (* float -> float vec event *)
  let grains = IV.map fgrain grain_times in (* float vec event ivec *)
  EIV.vfold (+.) 0. grains (* float ivec *)
28
The missing piece
let fgrain t =
  let sine = LV.osc_sine dur f_peak (~-.(frac t) *. f_peak) in
  let kenv = exp(~-.pi*.bandwidth) in
  let env = V.iterate (fun x -> kenv *. x) 1.0 dur in
  let smooth_ph = EV.pwl_list 0.0 [(rise, 1.); (dur-1-fall, 1.); (dur-1, 0.)] in
  let smooth = LV.tablei risefalltab smooth_ph in
  let ampl = L.db_to_amp db in
  let grain = V.map3 (fun x y z -> ampl *. x*.y*.z) sine env smooth in
  grain @@ (int t)
29
FOF in Chronic vs. in C
What I showed you was slightly simplified. Less time-varying control, no
“octaviation”. This was 19 lines; full FOF is 34.
Csound’s FOF in C is 235. More importantly, it’s unintuitive.
30
FOF in C: what goes wrong?
#include "cs.h" /* UGENS7.C */
#include "ugens7.h"
#include <math.h>
/* loosely based on code of Michael Clarke, University of Huddersfield */
#define FZERO (0.0f)
#define FONE (1.0f)
static int newpulse(FOFS *, OVRLAP *, float *, float *, float *);
void fofset0(FOFS *p, int flag)
{
    if ((p->ftp1 = ftfind(p->ifna)) != NULL &&
        (p->ftp2 = ftfind(p->ifnb)) != NULL) {
        OVRLAP *ovp, *nxtovp;
        long olaps;
        p->durtogo = (long)(*p->itotdur * esr);
        if (*p->iphs == FZERO)                  /* if fundphs zero, */
            p->fundphs = MAXLEN;                /* trigger new FOF */
        else
            p->fundphs = (long)(*p->iphs * fmaxlen) & PMASK;
        if ((olaps = (long)*p->iolaps) <= 0) {
            initerror("illegal value for iolaps");
            return;
        }
        auxalloc((long)olaps * sizeof(OVRLAP), &p->auxch);
        ovp = &p->basovrlap;
        nxtovp = (OVRLAP *) p->auxch.auxp;
        do {
            ovp->nxtact = NULL;
            ovp->nxtfree = nxtovp;              /* link the ovlap spaces */
            ovp = nxtovp++;
        } while (--olaps);
        ovp->nxtact = NULL;
        ovp->nxtfree = NULL;
        p->fofcount = -1;
        p->prvband = FZERO;
        p->expamp = FONE;
        p->prvsmps = 0;
        p->preamp = FONE;
        p->xincod = (p->XINCODE & 0x7) ? 1 : 0;
        p->ampcod = (p->XINCODE & 0x2) ? 1 : 0;
        p->fundcod = (p->XINCODE & 0x1) ? 1 : 0;
        p->formcod = (p->XINCODE & 0x4) ? 1 : 0;
        if (flag)
            p->fmtmod = (*p->ifmode == FZERO) ? 0 : 1;
    }
    p->foftype = flag;
}
void fofset(FOFS *p)
{
    fofset0(p, 1);
}
void fofset2(FOFS *p)
{
    fofset0(p, 0);
}
void fof(FOFS *p)
{
    OVRLAP *ovp;
    FUNC *ftp1, *ftp2;
    float *ar, *amp, *fund, *form;
    long nsmps = ksmps, fund_inc, form_inc;
    float v1, fract, *ftab;
    if (p->auxch.auxp==NULL) {                  /* RWD fix */
        initerror("fof: not initialized");
        return;
    }
    ar = p->ar;
    amp = p->xamp;
    fund = p->xfund;
    form = p->xform;
    ftp1 = p->ftp1;
    ftp2 = p->ftp2;
    fund_inc = (long)(*fund * sicvt);
    form_inc = (long)(*form * sicvt);
    do {
        if (p->fundphs & MAXLEN) {              /* if phs has wrapped */
            p->fundphs &= PMASK;
            if ((ovp = p->basovrlap.nxtfree) == NULL)
                perferror("FOF needs more overlaps");
            if (newpulse(p, ovp, amp, fund, form)) {   /* init new fof */
                ovp->nxtact = p->basovrlap.nxtact;     /* & link into */
                p->basovrlap.nxtact = ovp;             /* actlist */
                p->basovrlap.nxtfree = ovp->nxtfree;
            }
        }
        *ar = FZERO;
        ovp = &p->basovrlap;
        while (ovp->nxtact != NULL) {           /* perform cur actlist: */
            float result;
            OVRLAP *prvact = ovp;
            ovp = ovp->nxtact;                  /* formant waveform */
            fract = PFRAC1(ovp->formphs);       /* from JMC Fog*/
            ftab = ftp1->ftable + (ovp->formphs >> ftp1->lobits);/*JMC Fog*/
/*          printf("\n ovp->formphs = %ld, ", ovp->formphs); */ /* TEMP JMC*/
            v1 = *ftab++;                       /*JMC Fog*/
            result = v1 + (*ftab - v1) * fract; /*JMC Fog*/
/*          result = *(ftp1->ftable + (ovp->formphs >> ftp1->lobits) ); */
            if (p->foftype) {
                if (p->fmtmod)
                    ovp->formphs += form_inc;   /* inc phs on mode */
                else
                    ovp->formphs += ovp->forminc;
            }
            else {
#define kgliss ifmode
                /* float ovp->glissbas = kgliss / grain length. ovp->sampct is
                   incremented each sample. We add glissbas * sampct to the
                   pitch of grain at each a-rate pass (ovp->formphs is the
                   index into ifna; ovp->forminc is the stepping factor that
                   decides pitch) */
                ovp->formphs += (long)(ovp->forminc + ovp->glissbas * ovp->sampct++);
            }
            ovp->formphs &= PMASK;
            if (ovp->risphs < MAXLEN) {         /* formant ris envlp */
                result *= *(ftp2->ftable + (ovp->risphs >> ftp2->lobits) );
                ovp->risphs += ovp->risinc;
            }
            if (ovp->timrem <= ovp->dectim) {   /* formant dec envlp */
                result *= *(ftp2->ftable + (ovp->decphs >> ftp2->lobits) );
                if ((ovp->decphs -= ovp->decinc) < 0)
                    ovp->decphs = 0;
            }
            *ar += (result * ovp->curamp);      /* add wavfrm to out */
            if (--ovp->timrem)                  /* if fof not expird */
                ovp->curamp *= ovp->expamp;     /* apply bw exp dec */
            else {
                prvact->nxtact = ovp->nxtact;   /* else rm frm activ */
                ovp->nxtfree = p->basovrlap.nxtfree;/* & ret spc to free */
                p->basovrlap.nxtfree = ovp;
                ovp = prvact;
            }
        }
        p->fundphs += fund_inc;
        if (p->xincod) {
            if (p->ampcod)
                amp++;
            if (p->fundcod)
                fund_inc = (long)(*++fund * sicvt);
            if (p->formcod)
                form_inc = (long)(*++form * sicvt);
        }
        p->durtogo--;
        ar++;
    } while (--nsmps);
}
static int newpulse(FOFS *p, OVRLAP *ovp, float *amp, float *fund, float *form)
{
    float octamp = *amp, oct;
    long rismps, newexp = 0;
    if ((ovp->timrem = (long)(*p->kdur * esr)) > p->durtogo) /* ringtime */
        return(0);
    if ((oct = *p->koct) > FZERO) {             /* octaviation */
        long ioct = (long)oct, bitpat = ~(-1L << ioct);
        if (bitpat & ++p->fofcount)
            return(0);
        if ((bitpat += 1) & p->fofcount)
            octamp *= (FONE + ioct - oct);
    }
    if (*fund == FZERO)                         /* formant phs */
        ovp->formphs = 0;
    else
        ovp->formphs = (long)(p->fundphs * *form / *fund) & PMASK;
    ovp->forminc = (long)(*form * sicvt);
    if (*p->kband != p->prvband) {              /* bw: exp dec */
        p->prvband = *p->kband;
        p->expamp = (float)exp((double)(*p->kband * mpidsr));
        newexp = 1;
    }
    /* Init grain rise ftable phase. Negative kform values make the kris
       (ifnb) initial index go negative and crash csound. So insert another
       if-test with compensating code. */
    if (*p->kris >= onedsr && *form != FZERO) { /* init fnb ris */
        if (*form < FZERO && ovp->formphs != 0)
            ovp->risphs = (long)((MAXLEN - ovp->formphs) / -*form / *p->kris);
        else
            ovp->risphs = (long)(ovp->formphs / *form / *p->kris);
        ovp->risinc = (long)(sicvt / *p->kris);
        rismps = MAXLEN / ovp->risinc;
    }
    else {
        ovp->risphs = MAXLEN;
        rismps = 0;
    }
    if (newexp || rismps != p->prvsmps) {       /* if new params */
        if (p->prvsmps = rismps)                /* redo preamp */
            p->preamp = (float)pow(p->expamp, -rismps);
        else
            p->preamp = FONE;
    }
    ovp->curamp = octamp * p->preamp;           /* set startamp */
    ovp->expamp = p->expamp;
    if ((ovp->dectim = (long)(*p->kdec * esr)) > 0) /* fnb dec */
        ovp->decinc = (long)(sicvt / *p->kdec);
    ovp->decphs = PMASK;
    if (!p->foftype) {
        /* Make fof take k-rate phase increment:
           Add current iphs to initial form phase */
        ovp->formphs += (long)(*p->iphs * fmaxlen); /* krate phs */
        ovp->formphs &= PMASK;
        /* Set up grain gliss increment: ovp->glissbas will be added to
           ovp->forminc at each pass in fof2. Thus glissbas must be equal to
           kgliss / grain playing time. Also make it harmonic, so integer
           kgliss can represent octaves (ie pow() call). */
        ovp->glissbas = ovp->forminc * (float)pow(2.0, (double)*p->kgliss);
        /* glissbas should be diff of start & end pitch*/
        ovp->glissbas -= ovp->forminc;
        ovp->glissbas /= ovp->timrem;
        ovp->sampct = 0;        /* Must be reset in case ovp was used before */
    }
    return(1);
}
static int rngflg=0;
Can’t represent grains.
Can’t stand outside of time.
Has to loop over output samples, and think “What is the set of active grains right now? Are some dying? Are new ones born? Which envelopes are in their rise phase? Entering fall phase? …”
You don’t want to think that way about FOF.
Want to loop over grains, not samples.
31
Computer music scores
Construct a score, and synthesize from it:
type note = float * (float vec) (* dB, Hz *)
score: note event vec
synth_beep: note -> float vec
let sound = EV.vfold (+.) 0. (V.map (E.lift synth_beep) score)
Hierarchical structure.
type 'a element = Note of 'a | Riff of 'a element event vec
Measure event timestamps in fractional beats.
Tempo-map from beats to samples.
32
The components of a pitch shifter
[Figure: the pitch shifter (float ivec -> float ivec) as a chain of components: overlapped FFT, correct frequencies, rescale frequencies, compute spectrum, overlapped IFFT. Data between stages is of type complex vec ivec or (float * float) vec ivec. The components are also grouped as a sinusoidal analyzer (float ivec -> (float * float) vec ivec) and a spectral modifier (float ivec -> float ivec, parameterized by f: complex vec ivec -> complex vec ivec).]
33
Related work
Languages with temporal type constructors.
Languages with atomic signals and events. Events with explicit time. Events in implicit time. Events not first-class.
Languages with signals only. Languages with events only.
34
Fran
Elliott and Hudak, 1997. “Functional reactive animation”
Used to define objects’ trajectories, etc.
Animation, not video: no frames or pixels.
α Behavior is Time -> α.
α Event is time-sorted stream of Time * α.
Time is continuous.
35
Continuous versus discrete time
Animation is continuous change. DSP is discrete.
Digital filters are based on unit delays.
The FFT relies on discrete time and frequency.
A delay line can’t hold a continuous-time signal.
So “delay x by 1” is λt. (x (t-1)).
Feedback delay involves x (t-1), x (t-2), x (t-3), …
Two different ways of programming.
36
ALDiSP
Freericks, 1996. For digital signal processing.
stream: demand-driven (like ivec). pipe: producer-driven. A pipe is a channel for asynchronous
events. Event timing is implicit. Representing temporal data is not the
goal.
37
Signals and events
Atomicity of signals precludes general DSP. Some languages have events with explicit
time: Arctic (Dannenberg et al., 1986):
applicative programming for reactive systems. SuperCollider (McCartney, 1996):
scores are lazy lists of particular events. Some have events in implicit time. Or events not first-class—score
sublanguage.
38
Inside Chronic
Everything besides ivecs is pretty easy.
The properties of a good ivec. Chronic’s ivec implementation. Phases: building and computation. A little benchmark on static
dataflow.
39
Desirable properties of an ivec
Correct asymptotic space and time use.
Block computation. Consumer control of block length. Efficient fan-out to multiple
consumers. In-place update.
40
Chronic’s ivec implementation
An ivec is a reference to an ivec_dat.
An ivec_dat is an object. Has method compute (upto: time) -> unit
Writes output into a shared buffer.
41
The building phase
let evens = IV.iterate (fun x -> x+2) 0
let powtwo = IV.iterate (fun x -> x*2) 1
let powfour = IV.peekiv evens powtwo (* index into powtwo by evens *)
[Figure: the object graph that results. evens refers to an iterate_dat (f: λx. x+2, x0: 0); powtwo refers to an iterate_dat (f: λx. x*2, x0: 1); powfour refers to a peekiv_dat with inputs t and x.]
42
The computation phase
Demand-driven dataflow:
[Figure: a request for elements [0], [1] arrives at peekiv. peekiv asks the evens iterate ([0, 2, 4, 6, …]) for [0], [1] and receives 0, 2; it asks the powtwo iterate ([1, 2, 4, 8, …]) for [0], [2] and receives 1, 4; it returns 1, 4.]
43
Function calls are expensive
function call: IV.map2 (+.) x y
inlined: IV.add2 x y
C++ inlined: for (i=0; i<len; ++i) z[i] = x[i]+y[i];
Relative times for optimal block length (256):
map2 9.3
add2 1.0
C++ 0.36
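A minimal sketch of the difference between the first two rows, on one block of 256 samples held in a plain array. Both functions are hypothetical stand-ins for IV.map2 and IV.add2. The sketch shows only where the per-sample function call sits and makes no claim about timings.

```ocaml
(* Generic version: one call to the closure f per sample. *)
let map2 (f : 'a -> 'b -> 'c) (x : 'a array) (y : 'b array) : 'c array =
  Array.init (min (Array.length x) (Array.length y)) (fun i -> f x.(i) y.(i))

(* Specialised version: the addition is written directly in the loop body. *)
let add2 (x : float array) (y : float array) : float array =
  let n = min (Array.length x) (Array.length y) in
  let z = Array.make n 0. in
  for i = 0 to n - 1 do
    z.(i) <- x.(i) +. y.(i)
  done;
  z

(* One block of 256 samples, the block length quoted above. *)
let block_x = Array.init 256 float_of_int
let block_y = Array.make 256 1.

(* Both versions compute the same block. *)
let same = map2 ( +. ) block_x block_y = add2 block_x block_y
```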
44
Future work
Comprehensions. Sampling rates. Arbitrary feedback delay. Lazy vectors. Real-time?
45
vec and ivec comprehensions
Instead of IV.map2 (fun xi yi -> xi + 2*yi) x y,
write { xi + 2*yi: xi in x, yi in y }
or just { x + 2*y }
More readable. Can generate specialized code. Accomplish with camlp4 preprocessor?
or with C++ template tricks?
46
Signals with sampling rates
Now you can use signals of differing rates, but you get no checking of rate mismatches.
Audio signal: 44100 Hz. Control signal: 1000 Hz.
Incorporate sampling rate into sig, isig types.
47
Conclusions
Unify computer-music sublanguages.
Think and program outside of time.
If we construct types, we can take them apart.
48
Unify sublanguages
Csound has three separate languages: event placement, signal routing, and DSP (“score”, “orchestra”, and C).
The divisions cut across useful interaction.
Nyquist unifies the event and routing levels.
Chronic unifies all three.
49
Stand outside of time
Program in time: logical time advances as the program runs. An event’s time is “now”.
Out of time: all time is explicitly in the data. The program’s execution is atemporal.
Allows vfold (in FOF code) to be factored out. Out-of-time is often the way we think about
an algorithm.
50
Deconstructing constructed types
Traditional computer music languages make the audio signal an atomic type—a black box.
Then there is no notion of an audio sample. Other types: spectra, LPC frames, …? Add them as more atomic types?
Add corresponding suite of unit generators, too. A constructed type is no longer a black box.
51
Questions we can now address
How can computer music languages support writing new DSP and new representations?
How can libraries for low-level languages support new DSP and new representations?
How can we build better tools for researching music and DSP algorithms?
52
Summary
Temporal type constructors lead to a better way of doing computer music programming.
53
54
Synthesis from a score
let motif = λ base bend . [Figure: a motif built from base and bend.]
let bleep = λ pitch . [Figure: an osc driven by pitch, feeding a filter.]
let shorten = λ x . timescale 0.5 x
let double = sequence [motif, motif]
let zeno = sequence (iterate 10 shorten motif)
let score = [Figure: an arrangement of zeno and double.]
let audio = apply bleep score
55
What’s wrong?
Csound aims to be expressive, high-level: audio signal, not audio sample, is atomic.
Csound types go no higher and no lower. Higher: stream of frames of analysis data.
Can’t express a new analysis system. Lower: access to individual samples.
Can’t express new DSP.
56
Music languages (Csound)
New DSP generally cannot be expressed. No access to individual elements of audio
data. Recursive delay is restricted.
Code is really scalar, mapped over time. Time is factored out, unavailable.
Can’t construct new types.
57
Low-level languages (C++)
It is hard to write a good support library. Most assume all data is synchronous
signals. Infinite data is awkward.
Libraries don’t help new DSP code much. Fine-grained primitives are hard to
identify.
58
Implementations
Chronic is a prototype implementation of this style of programming.
One possible future use: a framework for developing computer music algorithms and for analyzing and manipulating sonic data. Similar niche to Matlab.
59
Something implicit time can’t do
The implicit model can represent delay: out = temp; temp = in
It cannot represent the inverse operation. Undesired delay leaks out, breaking
modularity.
Explicit time supports the inverse:
out = drop 1 in
drop 2 [1 0 6 6] = [6 6]
Explicit time, with map, subsumes implicit.
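A minimal sketch of delay and its inverse over OCaml's standard lazy Seq.t. It assumes a signal can be modelled as a Seq.t. The function names and the fill argument of delay are illustrative, not Chronic's API.

```ocaml
(* drop n s: discard the first n elements of s. *)
let rec drop n (s : 'a Seq.t) : 'a Seq.t =
  if n <= 0 then s
  else fun () ->
    match s () with
    | Seq.Nil -> Seq.Nil
    | Seq.Cons (_, rest) -> drop (n - 1) rest ()

(* delay n fill s: put n copies of fill in front of s. *)
let rec delay n (fill : 'a) (s : 'a Seq.t) : 'a Seq.t =
  if n <= 0 then s else fun () -> Seq.Cons (fill, delay (n - 1) fill s)

(* The slide's example: drop 2 [1 0 6 6] = [6 6]. *)
let dropped = List.of_seq (drop 2 (List.to_seq [ 1; 0; 6; 6 ]))

(* drop undoes delay: the original signal comes back unchanged. *)
let round_trip =
  List.of_seq (drop 3 (delay 3 0 (List.to_seq [ 1; 0; 6; 6 ])))
```

Because the whole signal is a value, drop can look ahead in it; code that only ever sees the sample at the current "now" has nothing to look ahead into.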
60
Why this matters
A DSP operation may add undesired delay.
In explicit time, this can be removed. In implicit time, the delay leaks out.
Must delay other signals to keep them aligned.
A signal’s delayedness is not part of its type.
61
A couple of EIV functions
EIV.pwl 3. [| 4.@@2; 1.@@5 |]
[| 3.; 3.5; 4.; 3.; 2.; 1.; ... |] (indices 0 1 2 3 4 5)
EIV.vfold (+) 0 [| [| 1; 1; 1; 1 |] @@ 2; [| 2; 2; 2; 2 |] @@ 4 |]
[| 0; 0; 1; 1; 3; 3; 2; 2; ... |] (indices 0 1 2 3 4 5 6 7)
[Figure: the vectors [ 1 1 1 1 ] and [ 2 2 2 2 ] placed at times 2 and 4 and summed where they overlap.]
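A minimal sketch of the overlap-add that vfold performs, written over finite arrays so that it terminates. This vfold is a hypothetical stand-in: the explicit output length n and the (vector, time) pair encoding of events are assumptions made for illustration.

```ocaml
(* Hypothetical finite stand-in for EIV.vfold: overlap-add every timestamped
   vector into an output of length n, starting from the value zero. *)
let vfold (add : 'a -> 'a -> 'a) (zero : 'a) (n : int)
    (events : ('a array * int) array) : 'a array =
  let out = Array.make n zero in
  Array.iter
    (fun (v, t) ->
      Array.iteri
        (fun i x ->
          let k = t + i in
          (* Samples that fall outside the output window are ignored. *)
          if k >= 0 && k < n then out.(k) <- add out.(k) x)
        v)
    events;
  out

(* The slide's example: two 4-sample vectors placed at times 2 and 4. *)
let mixed = vfold ( + ) 0 8 [| ([| 1; 1; 1; 1 |], 2); ([| 2; 2; 2; 2 |], 4) |]
```

The loop runs over events and then over each event's own samples, not over output time. The FOF code relies on the same shape of computation to loop over grains.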
62
Two reverberators
Based on feedback-delay structures.
Moorer: filtered comb. Gardner: nested allpass.
Feedback delay, y = delay (f y):
[Figure: two block diagrams of a feedback loop from x to y around a delay D, one with a gain g and an adder, one with a general function f.]
let f x y = IV.map2 (+.) x (IV.map (fun y -> g *. y) y)
let echo length x = IV.delayz_rec2 length f x
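A minimal sketch of the echo as a recurrence over a finite array. It assumes the feedback equation reads y = delay length (x + g * y) with zeros before the first delayed sample. That is one reading of the slide, not a statement of how IV.delayz_rec2 is implemented.

```ocaml
(* Hypothetical finite stand-in for the echo above:
   y[t] = 0 for t < length, else x[t - length] + g * y[t - length]. *)
let echo (length : int) (g : float) (x : float array) : float array =
  if length < 1 then invalid_arg "echo: length must be at least 1";
  let n = Array.length x in
  let y = Array.make n 0. in
  for t = length to n - 1 do
    y.(t) <- x.(t - length) +. (g *. y.(t - length))
  done;
  y

(* Feed in a unit impulse and watch it recirculate every 2 samples. *)
let impulse = Array.init 8 (fun i -> if i = 0 then 1. else 0.)
let response = echo 2 0.5 impulse
```

Under that reading, response is [| 0.; 0.; 1.; 0.; 0.5; 0.; 0.25; 0. |]: the impulse reappears every 2 samples, scaled by g each time around the loop.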
63
A complication
Can’t access the inside of a feedback delay.
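[Figure: a reverberator flow graph from X to Y containing delays N1 and N2, a lowpass filter, a gain g, and two gains of 0.5; the loop is expressed with IV.delayz_rec2.]
Kludge: duplicate part of it instead.
[Figure: the same graph with the N1, N2 delay section duplicated outside the feedback delay D.]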
64
Feedback delay: a comparison
In low-level languages: you have to maintain grungy delay queues.
In computer music languages: you often can’t represent feedback delay.
In Chronic: high-level representation of feedback loops, but not arbitrary flow graphs.
65
Why not just a stream?
type ’a ivec = Ivec of (unit -> ’a * ’a ivec)
Has
66
An ivec is an ivec_dat ref
class [’a] ivec_dat
  method get_buf () -> ’a vec
  method compute (upto: time) -> unit (* side effect: writes to buf *)
  method seek (upto: time)
Callouts: mutable, in-place; fan-out from buf; control of block length.
type ’a ivec = ’a ivec_dat ref
compute 10; use buf.(0) to buf.(9);
compute 20; use buf.(0) to buf.(9); …
67
Subclassing ivec_dat
class [’a, ’b] map_dat (f: ’a -> ’b) (x: ’a ivec) =
object inherit [’b] ivec_dat
method compute_hook
(* call !x#compute; use !x#get_buf (); write to buf *)
let map (f: ’a -> ’b) x =
ref ((new map_dat f x) :> (’b ivec_dat))
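A minimal runnable sketch of this object structure. Several things are assumptions made to keep it short: the virtual method is called block rather than compute_hook, the buffer keeps every sample computed so far (which ignores the space goal above), and time is a plain int. None of this is Chronic's actual code.

```ocaml
(* Base class: a buffer plus a demand-driven compute method.
   Assumption: buf retains every sample computed so far. *)
class virtual ['a] ivec_dat =
  object (self)
    val mutable buf : 'a array = [||]
    method get_buf = buf

    (* Subclasses produce the samples with indices from .. upto-1. *)
    method virtual block : int -> int -> 'a array

    (* Ensure samples 0 .. upto-1 are present; the consumer picks upto. *)
    method compute (upto : int) : unit =
      let have = Array.length buf in
      if upto > have then buf <- Array.append buf (self#block have upto)
  end

type 'a ivec = 'a ivec_dat ref

class ['a] iterate_dat (f : 'a -> 'a) (x0 : 'a) =
  object
    inherit ['a] ivec_dat
    val mutable state = x0

    method block from upto =
      Array.init (upto - from) (fun _ ->
          let v = state in
          state <- f state;
          v)
  end

class ['a, 'b] map_dat (f : 'a -> 'b) (x : 'a ivec) =
  object
    inherit ['b] ivec_dat

    method block from upto =
      (* Demand the input first, then read it from the shared buffer. *)
      (!x)#compute upto;
      let xb = (!x)#get_buf in
      Array.init (upto - from) (fun i -> f xb.(from + i))
  end

let iterate (f : 'a -> 'a) (x0 : 'a) : 'a ivec = ref (new iterate_dat f x0)
let map (f : 'a -> 'b) (x : 'a ivec) : 'b ivec = ref (new map_dat f x)

let evens = iterate (fun x -> x + 2) 0
let squares = map (fun x -> x * x) evens
let () = (!squares)#compute 4
let first_four = Array.copy (!squares)#get_buf
```

A second consumer of evens would read the same buffer, which is the fan-out the slide refers to. A later (!squares)#compute 6 computes only the two missing samples.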
68
69
The components of a pitch shifter
[Figure: the component chain overlapped FFT, correct frequencies, rescale frequencies, compute spectrum, overlapped IFFT, from float ivec to float ivec, with intermediate data of type complex vec ivec and (float * float) vec ivec.]
70
[Figure: pitch shifter, float ivec -> float ivec.]
71
[Figure: sinusoidal analyzer, float ivec -> (float * float) vec ivec.]
72
[Figure: spectral modifier, float ivec -> float ivec, with internal data of type complex vec ivec and parameter f: complex vec ivec -> complex vec ivec.]
73
Reusing the components
Sinusoidal analyser: overlapped FFT, then correct frequencies.
output: (float * float) vec ivec
Spectral manipulator: overlapped FFT, then apply function f, then overlapped IFFT.
f: complex vec ivec -> complex vec ivec