

Time-scale and pitch modification

Algorithms review

Alexey Lukin

“Time-scale and pitch modification algorithms” 2/19

The problem

Goal: change the duration or tonality of a musical piece

Naïve approach:
► (analog) record on tape and change the playback speed
► (digital) resample the waveform

Alas: pitch and duration change synchronously!

Audio example: Celine Dion, sped up by 20%
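The coupling of pitch and duration can be checked numerically. The sketch below is only an illustration: it assumes NumPy, and the helper names `resample_linear` and `dominant_freq`, the 8 kHz sample rate, and the 440 Hz test tone are all hypothetical choices, not part of the presentation. It reads a tone 1.2 times faster and reports duration and dominant frequency before and after.

```python
import numpy as np

def resample_linear(x, speed):
    """Naive speed change: read the waveform `speed` times faster (linear interpolation)."""
    n_out = int(len(x) / speed)
    read_pos = np.arange(n_out) * speed            # fractional read positions in the input
    return np.interp(read_pos, np.arange(len(x)), x)

def dominant_freq(x, sr):
    """Frequency in Hz of the strongest FFT bin of a Hann-windowed signal."""
    spectrum = np.abs(np.fft.rfft(x * np.hanning(len(x))))
    return np.argmax(spectrum) * sr / len(x)

sr = 8000                                          # assumed sample rate
t = np.arange(sr) / sr                             # 1 second of time stamps
tone = np.sin(2 * np.pi * 440.0 * t)               # 440 Hz test tone
fast = resample_linear(tone, 1.2)                  # "speed up by 20%"

print("original :", len(tone) / sr, "s,", dominant_freq(tone, sr), "Hz")
print("resampled:", len(fast) / sr, "s,", dominant_freq(fast, sr), "Hz")
```

The resampled tone is shorter by a factor of 1.2 and its frequency is higher by the same factor (about 528 Hz), which is the effect the slide complains about.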


The problem

Goal: independent control of time-scale and pitch; timbre should sound natural!

Applications:
► Samplers and virtual instruments
► Production: synchronization of audio and video
► Post-production: pull-up, pull-down
► Entertainment: karaoke (changing key)
► Education: sonic microscope
► More?


Time domain

Time-domain algorithms operate on the waveform, not the spectrum:
1. Break the signal into short granules
2. Repeat or discard (or shift) some granules to change duration
3. Resample to change pitch
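Steps 1 and 2 can be sketched as a plain overlap-add with a fixed granule size. This is a minimal illustration under stated assumptions, not the method of any particular product: it assumes NumPy, a Hann window with 50% overlap, and a hypothetical function name `ola_stretch`. It makes no attempt to keep granules in phase or to protect transients.

```python
import numpy as np

def ola_stretch(x, stretch, grain=1024):
    """Fixed-granule overlap-add: stretch > 1 lengthens, stretch < 1 shortens."""
    hop_out = grain // 2                        # output hop (50% overlap)
    hop_in = hop_out / stretch                  # input hop: smaller hop re-reads material
    window = np.hanning(grain + 1)[:grain]      # periodic Hann, sums to 1 at 50% overlap
    n_grains = int((len(x) - grain) / hop_in) + 1
    y = np.zeros((n_grains - 1) * hop_out + grain)
    for i in range(n_grains):
        start_in = int(round(i * hop_in))       # where this granule is read from
        start_out = i * hop_out                 # where it is pasted
        y[start_out:start_out + grain] += window * x[start_in:start_in + grain]
    return y

sr = 8000                                       # assumed sample rate
x = np.sin(2 * np.pi * 220.0 * np.arange(2 * sr) / sr)
y = ola_stretch(x, 2.0)                         # roughly twice as long, same pitch
print(len(x), len(y))
```

Step 3 would follow by resampling the stretched signal: stretching by a factor and then resampling back to the original length changes pitch while keeping duration.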

Some pictures in this presentation are taken from the Ph.D. thesis of J. Bonada


Time domain


Problems:
► Granules can add in-phase (good) or out-of-phase (bad)
► Transients are duplicated or discarded

Audio example: guitar + castanets, slowed down to 220% length


Time domain

Solutions:
► Ensure that pasted granules are in phase by selecting the granule size to be a multiple of the pitch period (requires autocorrelation or pitch analysis)
► Prohibit duplicating and skipping of transient granules (requires detection of transients and advanced scheduling of granule duplication)
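The autocorrelation analysis mentioned in the first bullet can be sketched as follows. This is a simplified illustration, assuming NumPy and a single clearly pitched source; the function name `pitch_period` and the 60 to 1000 Hz search range are hypothetical choices. It picks the lag with the strongest autocorrelation inside the search range.

```python
import numpy as np

def pitch_period(x, sr, fmin=60.0, fmax=1000.0):
    """Estimate the pitch period in samples from the autocorrelation peak."""
    x = x - np.mean(x)
    ac = np.correlate(x, x, mode="full")[len(x) - 1:]    # lags 0 .. len(x)-1
    lag_min = int(sr / fmax)                             # shortest period considered
    lag_max = min(int(sr / fmin), len(x) - 1)            # longest period considered
    return lag_min + int(np.argmax(ac[lag_min:lag_max + 1]))

sr = 8000                                                # assumed sample rate
n = np.arange(2048)
voiced = np.sin(2 * np.pi * 200 * n / sr) + 0.5 * np.sin(2 * np.pi * 400 * n / sr)
period = pitch_period(voiced, sr)
print(period, "samples per period; a granule could span", 2 * period, "samples")
```

For non-pitched or polyphonic input the autocorrelation has no single dominant peak, so an estimate like this becomes unreliable.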

Fixed granule size

Pitch-synchronous granule size (“PSOLA”)

Pitch-synchronous granule size, transients detection


Time domain

Pitch-synchronous overlap-add (PSOLA)
► Granules are 2 pitch periods long
► Granules are repeated or discarded
► Requires pitch detection → unstable results for non-pitched or polyphonic material


Time domain

Summary
► Very fast (1…5% CPU)
► Good quality for pitched signals (solo instruments, vocals)
► Poor quality for non-pitched and polyphonic material:
    ► Amplitude modulation (out-of-phase overlapping of granules for some parts/instruments)
    ► Repeated or discarded transients (unless special care is taken)

Implementations
► Editors, samplers: Audition, Cubase, Logic, Ableton, ACID
► Vocal correctors: Melodyne, Autotune


Vocoders

Frequency-domain algorithms operate on a short-time spectrum of the signal

Idea: build a spectrogram of the signal (using a short-time Fourier transform) and re-synthesize a signal from the spectrogram with a different time stride (hop)

Problem: during synthesis, signal granules can overlap out-of-phase

Solution: phase modification at each frequency channel, called phase unwrapping


Vocoders

Traditional vocoder algorithm:
1. Calculate the short-time Fourier transform (STFT) of the signal
2. Unwrap the phases of each frequency channel (to compensate for the change of synthesis stride at step 3); don’t modify magnitudes
3. Synthesize a signal using the inverse STFT with a different time stride


Vocoders

Magnitudes do not change

Phase unwrapping equations should provide in-phase overlapping of shifted granules at each frequency channel – “horizontal phase coherence”

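$$|Y(t_s^u, \Omega_k)| = |X(t_a^u, \Omega_k)|$$

$$\Delta\Phi_k^u = \angle X(t_a^u, \Omega_k) - \angle X(t_a^{u-1}, \Omega_k) - R_a \Omega_k \quad \text{(phase increment)}$$

$$\hat{\omega}_k(t_a^u) = \Omega_k + \frac{1}{R_a}\,\Delta_p \Phi_k^u \quad \text{(phase unwrapping)}$$

$$\angle Y(t_s^u, \Omega_k) = \angle Y(t_s^{u-1}, \Omega_k) + R_s\,\hat{\omega}_k(t_a^u) \quad \text{(synthesis phase)}$$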
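The three steps of the traditional vocoder can be sketched in a few lines. This is a minimal illustration, not a production implementation. It assumes NumPy, and the name `phase_vocoder`, the 2048-sample Hann window, and the analysis hop of 512 are hypothetical choices. In the code, `hop_a` and `hop_s` play the roles of the analysis and synthesis hops (commonly written R_a and R_s), `omega` holds the bin centre frequencies in radians per sample, and the principal value of the phase increment is taken in [-π, π).

```python
import numpy as np

def phase_vocoder(x, stretch, n_fft=2048, hop_a=512):
    """Time-stretch x by `stretch` with a basic phase vocoder (no phase locking)."""
    hop_s = int(round(hop_a * stretch))                       # synthesis hop
    window = np.hanning(n_fft + 1)[:n_fft]
    omega = 2 * np.pi * np.arange(n_fft // 2 + 1) / n_fft     # bin centre frequencies
    n_frames = (len(x) - n_fft) // hop_a + 1
    y = np.zeros((n_frames - 1) * hop_s + n_fft)
    norm = np.zeros_like(y)
    prev_phase = None
    for u in range(n_frames):
        X = np.fft.rfft(window * x[u * hop_a:u * hop_a + n_fft])
        mag, phase = np.abs(X), np.angle(X)                   # magnitudes stay untouched
        if prev_phase is None:
            syn_phase = phase.copy()                          # first frame: keep analysis phase
        else:
            dphi = phase - prev_phase - hop_a * omega         # phase increment
            dphi = np.mod(dphi + np.pi, 2 * np.pi) - np.pi    # principal value
            inst_freq = omega + dphi / hop_a                  # instantaneous frequency
            syn_phase = syn_phase + hop_s * inst_freq         # synthesis phase
        prev_phase = phase
        frame = np.fft.irfft(mag * np.exp(1j * syn_phase), n_fft)
        y[u * hop_s:u * hop_s + n_fft] += window * frame
        norm[u * hop_s:u * hop_s + n_fft] += window ** 2
    # Compensate for the window overlap; leave the near-silent edges as they are.
    return y / np.where(norm > 1e-3, norm, 1.0)

sr = 8000                                                     # assumed sample rate
x = np.sin(2 * np.pi * 440.0 * np.arange(2 * sr) / sr)
y = phase_vocoder(x, 1.5)
print(len(x), len(y))                                         # longer output, same pitch
```

Each bin is processed independently here, so only the frame-to-frame ("horizontal") continuity of every channel is preserved.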


Vocoders

Phase coherence problem
► Horizontal phase coherence is ensured by phase unwrapping
► How about vertical phase coherence (coherence of phases between different frequency bins)? It is lost! (except in cases of integer stretching ratios) This leads to:
    ► “Phasiness” due to out-of-phase signals in frequency bins within every signal harmonic
    ► Transients are time-smeared along the whole granule

Audio example: guitar + castanets, vocoder, 220% length


Vocoders

Vertical phase coherence improvement: the “phase locking” algorithm locks phases within each spectral peak
1. Divide the frequency spectrum into intervals, one around each harmonic
2. Unwrap the phase of the central (peak) frequency channel
3. Modify the phases of the other bins according to the phase of the central channel

This reduces phasiness, but still doesn’t help transients
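The three locking steps can be sketched for a single STFT frame. This follows one common variant (often called identity phase locking) and is an assumption about details the slide leaves open: peaks are taken as local maxima of the magnitude spectrum, interval boundaries sit halfway between neighbouring peaks, and the function name `lock_to_peaks` is hypothetical. `X` is the complex analysis spectrum of the frame and `syn_phase` holds the unwrapped synthesis phases, of which only the values at the peaks are used.

```python
import numpy as np

def lock_to_peaks(X, syn_phase):
    """Rotate every bin by the phase rotation of the nearest spectral peak."""
    mag = np.abs(X)
    peaks = [k for k in range(1, len(X) - 1)
             if mag[k] > mag[k - 1] and mag[k] >= mag[k + 1]]
    if not peaks:
        return mag * np.exp(1j * syn_phase)          # nothing to lock to
    peaks = np.array(peaks)
    # Step 1: one interval per peak, split halfway between neighbouring peaks.
    edges = np.concatenate(([0], (peaks[:-1] + peaks[1:] + 1) // 2, [len(X)]))
    Y = np.empty_like(X)
    for p, lo, hi in zip(peaks, edges[:-1], edges[1:]):
        # Step 2: the peak channel takes its unwrapped synthesis phase.
        rotation = syn_phase[p] - np.angle(X[p])
        # Step 3: neighbours get the same rotation, keeping their phase offsets to the peak.
        Y[lo:hi] = X[lo:hi] * np.exp(1j * rotation)
    return Y

k = np.arange(33)
X = np.exp(-0.5 * ((k - 10) / 2.0) ** 2) * np.exp(1j * 0.3 * k)   # one synthetic peak at bin 10
Y = lock_to_peaks(X, np.full(33, 1.234))
print(np.angle(Y[10]))
```

Because all bins in an interval share one rotation, the phase relations inside each peak are the same as in the analysis frame, which is what reduces phasiness.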

No phase locking

Phase locking


Vocoders

How to improve sharpness of transients?
► Frequency resolution of human hearing is not uniform: it is better at low frequencies and worse at high frequencies
► So, we can use longer STFT windows for bass (for better frequency resolution) and shorter windows for treble (for better time resolution)
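The trade-off behind this choice is simple arithmetic: an N-sample window gives a bin spacing of sr/N Hz and spans N/sr seconds. The sketch below only illustrates that relation; the window lengths per band and the 44.1 kHz sample rate are hypothetical values, not those used by any particular product.

```python
def stft_resolution(window_len, sr):
    """Return (bin spacing in Hz, window duration in ms) for an STFT window."""
    return sr / window_len, 1000.0 * window_len / sr

sr = 44100                                   # assumed sample rate
for band, n in [("bass", 4096), ("mid", 1024), ("treble", 256)]:
    df, dt = stft_resolution(n, sr)
    print(f"{band:6s} N={n:4d}  bin spacing {df:6.1f} Hz  window {dt:5.1f} ms")
```

A long window separates closely spaced low harmonics but smears a transient over its whole duration, while a short window does the opposite.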

Just phase locking

Phase locking and multiple window sizes


Vocoders

How to improve sharpness of transients?
► We can directly paste transients to the output without stretching (and without phase modification)
► Continue unwrapping the phases of steady harmonics through the transients
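Pasting transients presupposes that they have been located. The slides do not say which detector is used; the sketch below swaps in a simple spectral-flux detector as one possible choice. It assumes NumPy, and the name `transient_frames`, the frame sizes, and the threshold relative to the average flux are all hypothetical.

```python
import numpy as np

def transient_frames(x, n_fft=1024, hop=256, threshold=3.0):
    """Flag STFT frames whose spectral flux is well above the average flux."""
    window = np.hanning(n_fft)
    n_frames = (len(x) - n_fft) // hop + 1
    mags = np.array([np.abs(np.fft.rfft(window * x[i * hop:i * hop + n_fft]))
                     for i in range(n_frames)])
    rise = np.maximum(mags[1:] - mags[:-1], 0.0)        # count only magnitude increases
    flux = np.concatenate(([0.0], rise.sum(axis=1)))    # one flux value per frame
    return flux > threshold * np.mean(flux)

sr = 8000                                               # assumed sample rate
sig = 0.1 * np.sin(2 * np.pi * 220.0 * np.arange(sr) / sr)
sig[4000] += 1.0                                        # a click in the middle
print(np.flatnonzero(transient_frames(sig)))            # indices of flagged frames
```

Frames flagged this way could be copied to the output unstretched, with the remaining frames stretched slightly more to keep the overall duration.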

Phase locking and multiple window sizes

+ transients pasted


Vocoders

Summary
► Good quality for complex, polyphonic signals
► Some phasiness (even with phase locking)
► Smearing of transients (unless special care is taken)
► Noise sometimes sounds unnatural
► CPU-intensive (but still faster than realtime)

Implementations
► Specialized software: SlowGold, Serato Pitch ’n Time, iZotope Radius
