spectral learning for expressive interactive ensemble ...gxia/pdf/ismir2015gusandrogertalk.pdf · 3...

Carnegie Mellon

Spectral Learning for Expressive Interactive Ensemble Music Performance

Gus (Guangyu) Xia Yun Wang Roger Dannenberg Geoffrey Gordon School of Computer Science Carnegie Mellon University

Carnegie Mellon

Introduction: Musical background

2

Interaction

Expression Rehearsal

Gus Xia et al. Carnegie Mellon

Carnegie Mellon

Introduction: Technical background

3

Interaction Expression Score Following and Automatic

Accompaniment

Expressive Performance Rendering

Expressive Interactive Performance


Carnegie Mellon

Introduction: Problem Overview

4

¢  We start from piano duets ¢  Focusing on expressive timing and dynamics ¢  Model expressions as co-evolving time series ¢  Use Spectral Learning

For interactive ensemble music performance, how can we build artificial performers that automatically improve their ability to sense and respond to human musicians’ expression with rehearsal experience?


Carnegie Mellon

5

Outline

¢  Introduction ¢  Data Collection ¢  Method ¢  Demos ¢  Future Work & Conclusion


Carnegie Mellon

Data Collection

¢  Musicians: §  10 music graduate students play duet pieces in 5 pairs.

¢  Music pieces: §  3 pieces of music are selected, Danny boy, Serenade

(by Schubert), and Ashokan Farewell. §  Each pair performs every piece of music 7 times.

¢  Recording settings: §  Recorded by electronic pianos with MIDI output.

6 Gus Xia et al. Carnegie Mellon

Carnegie Mellon

7

Base line ¢  Linear extrapolation based on previous notes :

Real time

Ref

eren

ce ti

me

Score Following Automatic Accompaniment

Real time

next note’s reference time

predicted next note’s performance time

Ref

eren

ce ti

me


Carnegie Mellon

Spectral Learning for Linear Dynamic System

8

¢  Model: !! = !!!!! + !!! + !! !!!!!!!~!(0,!)!!! = !!! + !!! + !! !!!!!!!!~!(0,!)!

High-dim feature: performance and score information of the past p beats

Performance feature of 2nd piano: timing + dynamics

Lower-dim feature: hidden (mental) states at time t


Carnegie Mellon

Spectral Learning(1): Oblique projections

9

!(!!) = [!!! !!!! !!!!] !!!!!!!

!

¢  We don’t know the future. ¢  Partially explain future observations based on the

history !! ≝ [!!! !!!! !0] !

!!!!0!


Carnegie Mellon

Spectral Learning(2): State estimation

10

!! =!!"!!!!!⋮! !!!!!!

!! = !Σ!! = (!Σ!!)(Σ

!!!!!)!

¢  States estimation by SVD

¢  Moreover, enforce a bottleneck by throwing out near-zero singular values and corresponding columns in U and V.

!! = !!!!! + !!! + !! !!!! = !!! + !!! + !! !!


Carnegie Mellon

Spectral Learning(3): Estimate parameter

11

¢  Based on estimated hidden states, the parameters could be estimated from the following equation:

zt

zt+1

zt-‐1

ut-‐1

ut

ut+1

yt-‐1

yt

yt+1

…

…

!! = !!! + !!! + !! !!!!!~!(0,!)!

!!!!!! =

! !! !

!!!! +

!!!! !

!! = !!!!! + !!! + !! !!!!!!!!!~!(0,!)!


Carnegie Mellon

Result: An example

12

BL: Base-line Linear extrapolation LDS: Spectral Learning （ONLY 4 rehearsals!）

25 30 35 40 450

0.05

0.1

0.15

Score time (sec)

Tim

e re

sidu

al (s

ec)

BLLDS


Carnegie Mellon

Result: Overall

13

LR: Linear regression NN: Neural network

training set size: 4 training set size: 8 training set size: 16


Carnegie Mellon

Audio Demo ¢  Base Line:

¢  Spectral Learning: 4 training examples


Carnegie Mellon

15

Conclusion

¢  An artificial performer for interactive performance using spectral learning

¢  A combination of expressive performance and automatic accompaniment

¢  Generate more human-like interactive performance just based on 4 rehearsals


Carnegie Mellon

16

Future Work

¢  Cross-piece models (in progress) ¢  Plugin with music robots (in progress) ¢  Online learning and decoding


Carnegie Mellon

17

Thanks! Gus Xia et al. Carnegie Mellon

Carnegie Mellon

Result: Performers effect

18

LDS−4 LDS−4same0

0.02

0.04

0.06

0.08

0.1

0.12

Tim

ing

resi

dual

(sec

)

LDS−4 LDS−4same0

5

10

15

Dyn

amic

s re

sidu

al (M

IDI v

eloc

ity)

Danny BoySerenadeAshokan Farewell

Danny BoySerenadeAshokan Farewell


Carnegie Mellon

Neural Network

¢  Model:

¢  10-dimensional hidden layer ¢  Objective function: mean absolute error ¢  30 epochs of SGD training ¢  Learning rate decays from 0.1 to 0.05

exponentially

zt

ut

yt

1 1

2 2

( )t tt t

f W u by Wz

bz= += +

( 0)0,,

)( 0)

(x

f xxx

⎧=

>≤⎨⎩

Gus Xia et al. Carnegie Mellon 19

Carnegie Mellon

Audio Demo ¢  Base Line:

¢  Note-specific approach: 34 training examples

¢  General feature approach: 4 training examples

¢  LDS approach: 4 training examples


spectral learning for expressive interactive ensemble ...gxia/pdf/ismir2015gusandrogertalk.pdf · 3...

Documents