Post on 18-Dec-2015
ELEC484 Phase Vocoder
Kelley Fea
Overview
Analysis Phase
Synthesis Phase
Transformation Phase
  Time Stretching
  Pitch Shifting
  Robotization
  Whisperization
To Do
  Denoising
  Stable/Transient Components Separation
Analysis Phase
Based on Bernardini’s document
pv_analyze.m
Inputs: inx, w, Ra
Uses hanningz.m to create window
Modulates signal with window
Performs fftshift and FFT on each windowed frame
Outputs: Mod_y, Ph_y (Moduli and Phase)
pv_analyze.m
function [Mod_y, Ph_y] = pv_analyze(inx, w, Ra)
% pv_analyze.m for ELEC484 Project Phase 1
% Analysis phase... based on Bernardini
% inx = original signal (column vector)
% w = desired window size
% Ra = analysis hop size

% Get size of inx; store rows and columns separately
[xrow, xcolumn] = size(inx);

% Create Hanning window
% using the hanningz code found in Bernardini
win = hanningz(w);

% Figure out the number of windows required
num_win = ceil( (xrow - w + Ra) / Ra );

% Matrix for storing the spectrum of each time slice (ts)
Y = zeros(w, num_win);

% Modulation of the signal with the window happens here
count = 1;
for i = 1:num_win
    % the frame ends w - 1 samples after count,
    % unless that goes outside of the size limitations of inx
    frame_end = min( count + w - 1, xrow );
    % number of samples available for this frame
    win_end = frame_end - count + 1;
    % Set value of the time slice to match the windowed segment
    % (a short final frame is zero-padded to w samples)
    ts = zeros(w, 1);
    ts( 1 : win_end ) = inx( count : frame_end ) .* win( 1 : win_end );
    % FFT of ts; fftshift swaps the two halves of the frame first,
    % so the window centre sits at the first sample
    Y( :, i ) = fft( fftshift(ts) );
    % Increment count by hop size
    count = count + Ra;
end % End for loop

% Set output values for Moduli and Phase and return the matrices
Mod_y = abs(Y);
Ph_y = angle(Y);
end % End pv_analyze.m
Synthesis Phase
Also based on Bernardini’s document
pv_synthesize.m
Inputs: Mod_y, Ph_y, w, Rs, Ra
Uses hanningz.m to create window
Calculates difference between actual and target phases (delta phi)
Recombines Moduli and Phase into array of complex numbers
Performs IFFT and overlap-add
Scales the sum by the squared tapering window summed over the frame, divided by Rs
Final result is normalized by its peak value
Output: outx
pv_synthesize.m
function outx = pv_synthesize( Mod_y, Ph_y, w, Rs, Ra )
% pv_synthesize.m for ELEC484 Project Phase 1

% Set number of bins and frames based on the size of the phase matrix
[ num_bins, num_frames ] = size(Ph_y);
% delta_phi and phase_inc hold one column per pair of consecutive frames
delta_phi = zeros( num_bins, num_frames-1 );
phase_inc = zeros( num_bins, num_frames-1 );
% Ph_x (synthesis phase) is the same size as Ph_y
Ph_x = zeros( num_bins, num_frames );
% Create tapering window
win = hanningz(w);

% Phase unwrapping to recover precise phase value of each bin
% omega is the normal phase increment for Ra for each bin
omega = 2 * pi * Ra * (0 : num_bins - 1)' / num_bins;
for idx = 2 : num_frames
    ddx = idx - 1;
    % delta_phi is the difference between the actual and target phases
    % princarg is a separate function
    delta_phi(:,ddx) = princarg( Ph_y(:,idx) - Ph_y(:,ddx) - omega );
    % phase_inc = the phase increment per sample for each bin
    phase_inc(:,ddx) = ( omega + delta_phi(:,ddx) ) / Ra;
end % End for loop

% Recombining the moduli and phase...
% the initial phase is the same
Ph_x(:,1) = Ph_y(:,1);
for idx = 2 : num_frames
    ddx = idx - 1;
    Ph_x(:,idx) = Ph_x(:,ddx) + Rs * phase_inc(:,ddx);
end
% Recombine into array of complex numbers
Z = Mod_y .* exp( 1i * Ph_x );

% IFFT and overlap add
% Create X of specified size
X = zeros( ( num_frames * Rs ) + w, 1 );
count = 1;
for idx = 1 : num_frames
    endx = count + w - 1;
    real_ifft = fftshift( real( ifft( Z(:,idx) ) ) );
    X(count:endx) = X(count:endx) + real_ifft .* win;
    count = count + Rs;
end
% sum of all samples multiplied by tapering window
k = sum( hanningz(w) .* win ) / Rs;
X = X / k;
% Dividing by the peak absolute value keeps the output within [-1, 1]
outx = X / max(abs(X));
end % end pv_synthesize.m
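As a rough check of the unwrapping step in pv_synthesize.m, the sketch below applies the same delta_phi and phase_inc formulas to a single bin. The sinusoid frequency (5.3 bins), the starting phase 0.4 and the inline princarg are illustrative assumptions, not values from the project.

```matlab
% One-bin sketch of the phase unwrapping step (illustrative values only)
w  = 64;                         % window size = number of bins
Ra = 16;                         % analysis hop size
k  = 5;                          % bin index, counted from 0
f_true = 2*pi*5.3/w;             % assumed sinusoid frequency in rad/sample

% Inline stand-in for princarg.m
princarg = @(p) p - 2*pi*round(p/(2*pi));

% Wrapped phases of that sinusoid in two consecutive frames
ph1 = princarg(0.4);
ph2 = princarg(0.4 + f_true*Ra);

% Same formulas as in the synthesis listing
omega     = 2*pi*Ra*k/w;                     % nominal phase increment over Ra
delta_phi = princarg(ph2 - ph1 - omega);     % deviation from the nominal increment
phase_inc = (omega + delta_phi)/Ra;          % estimated frequency in rad/sample

fprintf('true %.6f, estimated %.6f rad/sample\n', f_true, phase_inc);
```

The estimate matches the assumed frequency even though it lies between bins 5 and 6, which is what lets the synthesis stage advance each bin by Rs * phase_inc instead of by the nominal bin frequency.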
hanningz.m
Used because hann(n) returns a symmetric window by default (period n-1); overlap-add needs the periodic form (period n):
w = .5*(1 - cos(2*pi*(0:n-1)'/(n)));
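The effect of the periodicity can be seen in a small sketch. It writes both window forms out explicitly (no toolbox call) and overlaps two copies at a hop of n/2; the length n = 8 and the variable names are arbitrary choices for illustration.

```matlab
n = 8;
% Periodic Hann window, as computed by hanningz.m
w_per = .5*(1 - cos(2*pi*(0:n-1)'/n));
% Symmetric Hann window (period n-1), written out explicitly
w_sym = .5*(1 - cos(2*pi*(0:n-1)'/(n-1)));
% Sum of two copies offset by n/2 samples
ola_per = w_per(1:n/2) + w_per(n/2+1:n);
ola_sym = w_sym(1:n/2) + w_sym(n/2+1:n);
% Left column stays constant, right column ripples
disp([ola_per ola_sym]);
```

With the periodic form the overlapped windows add up to a constant, so overlap-add does not impose an amplitude ripple at the hop rate.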
princarg.m
Returns the principal argument of the nominal initial phase of each frame
a=Phasein/(2*pi);
k=round(a);
Phase=Phasein-k*2*pi;
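A few hand-checkable values, using an inline anonymous function as a stand-in for princarg.m (the test phases are arbitrary):

```matlab
% Inline version of the three lines above
princarg = @(Phasein) Phasein - round(Phasein/(2*pi))*2*pi;

p1 = princarg(3*pi/2);     % wraps to -pi/2
p2 = princarg(-5*pi/2);    % wraps to -pi/2
p3 = princarg(0.3);        % already in range, unchanged
```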
Cosine Wave Test 1 (w = Ra = Rs)
[Figure: "Spectrum of Waveforms For Circular Convolution". Panels: input, Output. Sample axis: 0 to 600.]
Cosine Wave (Ra = Rs = w/8)
[Figure: "Waveforms For Circular Convolution". Panels: input, Output. Sample axis: 0 to 600.]
Cosine Wave – Zoom
[Figure: "Waveforms For Circular Convolution". Panels: input, Output. Sample axis: 300 to 500.]
Toms Diner
[Figure: "Waveforms For Circular Convolution". Panels: input, Output. Sample axis: 0 to 2.5 x 10^5.]
Piano
[Figure: "Waveforms For Circular Convolution". Panels: input, Output. Sample axis: 0 to 7 x 10^4.]
Figure 8.1 (DAFX)
Time Stretching
Modify hop size ratio between analysis (Ra) and synthesis (Rs)
% Analysis function
[Mod_y, Ph_y] = pv_analyze(inx, w, Ra);
% Do Time Stretching here %
% Modify hop size ratio (hop_ratio = Rs / Ra)
hop_ratio = 2;
Rs = hop_ratio * Ra;
% Synthesis function
X2 = pv_synthesize( Mod_y, Ph_y, w, Rs, Ra );
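To see how the hop ratio sets the output duration, the sketch below counts frames the way pv_analyze.m does and works out where the last overlap-added frame ends. The signal length N and the values of w and Ra are assumed for illustration; out_len is a hypothetical name.

```matlab
w  = 512;                 % window size (assumed)
Ra = 64;                  % analysis hop size (assumed)
hop_ratio = 2;
Rs = hop_ratio * Ra;      % synthesis hop size

N = 44100;                                   % input length in samples (assumed)
num_frames = ceil( (N - w + Ra) / Ra );      % same frame count as pv_analyze.m
out_len = (num_frames - 1) * Rs + w;         % last sample written by overlap-add

fprintf('frames: %d, output: %d samples, ratio: %.3f\n', ...
        num_frames, out_len, out_len / N);
```

The length ratio comes out close to hop_ratio, not exactly equal to it, because the final window always contributes w samples regardless of Rs.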
Ratio = Rs/Ra = 0.5
[Figure: "Waveforms For Time Stretching - 0.5". Panels: input (samples 0 to 600), Output (samples 0 to 300).]
Toms Diner
[Figure: "Waveforms For Time Stretching - 0.5". Panels: input (samples 0 to 2.5 x 10^5), Output (samples 0 to 12 x 10^4).]
Piano
[Figure: "Waveforms For Time Stretching - 0.5". Panels: input (samples 0 to 7 x 10^4), Output (samples 0 to 4 x 10^4).]
Ratio = Rs/Ra = 2
[Figure: "Waveforms For Time Stretching - 2". Panels: input (samples 0 to 600), Output (samples 0 to 1000).]
Toms Diner
[Figure: "Waveforms For Time Stretching - 2". Panels: input (samples 0 to 2.5 x 10^5), Output (samples 0 to 4.5 x 10^5).]
Piano
[Figure: "Waveforms For Time Stretching - 2". Panels: input (samples 0 to 7 x 10^4), Output (samples 0 to 14 x 10^4).]
Pitch Shifting
Attempted by multiplying the phase by a constant factor
% Analysis function
[Mod_y, Ph_y] = pv_analyze(inx, w, Ra);
% Do Pitch Shifting here %
Ph_y = princarg(Ph_y*1.5);
% Synthesis function
X4 = pv_synthesize( Mod_y, Ph_y, w, Rs, Ra );
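The slides show only the phase-scaling attempt. A commonly used alternative, which is not what the code above does, is to time-stretch by the pitch ratio and then resample back to the original length. The sketch below covers only the resampling half, on a plain cosine and with linear interpolation via interp1; the ratio, tone frequency and lengths are made-up values.

```matlab
ratio = 1.5;                       % assumed pitch ratio
N_out = 1024;                      % length after resampling
N_in  = ratio * N_out;             % length of the (time-stretched) input
n     = (0:N_in-1)';
x     = cos(2*pi*(20/1024)*n);     % tone at 20 cycles per 1024 samples

% Read the signal at positions spaced 'ratio' samples apart
t = (0:N_out-1)' * ratio;
y = interp1(n, x, t, 'linear');

% Locate the spectral peak of the resampled tone (bins counted from 0)
Y = abs(fft(y));
[~, idx] = max(Y(1:N_out/2));
peak_bin = idx - 1;
fprintf('peak moved from bin 20 to bin %d\n', peak_bin);
```

Reading the samples 1.5 times faster raises the tone from 20 to 30 cycles per 1024 samples, while the preceding time stretch is what would keep the overall duration unchanged.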
Pitch Shifting – Cosine
[Figure: "Waveforms For Pitch Shifting - 0.5". Panels: input, Output. Sample axis: 0 to 600.]
Pitch Shifting – Toms Diner
[Figure: "Waveforms For Pitch Shifting - 0.5". Panels: input, Output. Sample axis: 0 to 2.5 x 10^5.]
Pitch Shifting – Piano
[Figure: "Waveforms For Pitch Shifting - 0.5". Panels: input, Output. Sample axis: 0 to 7 x 10^4.]
Robotization
Set phase (Ph_y) to zero
% Analysis function
[Mod_y, Ph_y] = pv_analyze(inx, w, Ra);
% Do Robotization here %
Ph_y = zeros(size(Ph_y));
% Synthesis function
xout = pv_synthesize( Mod_y, Ph_y, w, Rs, Ra );
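For a single frame, zeroing the phase turns the resynthesized frame into a symmetric pulse with its peak at the first sample (pv_synthesize.m then applies fftshift to the frame). A minimal sketch, with an arbitrary test frame as an assumption:

```matlab
n = (0:63)';
x = (n + 1) .* sin(0.3 * n);         % arbitrary test frame
Mod = abs(fft(x));                   % moduli of the frame
g = real(ifft(Mod));                 % frame rebuilt with zero phase

% A zero-phase frame is even under circular indexing:
% g(2:end) reads the same forwards and backwards
asym = max(abs(g(2:end) - flipud(g(2:end))));

% Its largest value sits at the first sample
peak_gap = abs(max(abs(g)) - g(1));
```

Every frame becomes a pulse of this kind, repeated at the hop size, which is one way to read the constant-pitch "robot" character of the output.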
Robotization – Cosine
[Figure: "Waveforms For Robotization". Panels: input, Output. Sample axis: 0 to 600.]
Robotization – Toms Diner
[Figure: "Waveforms For Robotization". Panels: input, Output. Sample axis: 0 to 2.5 x 10^5.]
Robotization – Piano
[Figure: "Waveforms For Robotization". Panels: input, Output. Sample axis: 0 to 7 x 10^4.]
Whisperization
Deliberately impose a random phase on the time-frequency representation
% Analysis function
[Mod_y, Ph_y] = pv_analyze(inx, w, Ra);
% Do Whisperization here %
Ph_y = ( 2*pi * rand(size(Ph_y, 1), size(Ph_y, 2)) );
% Synthesis function
xout = pv_synthesize( Mod_y, Ph_y, w, Rs, Ra );
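A single-frame sketch of the same idea, with an assumed cosine test frame. It checks that only the phase changes and the moduli are left alone. Since a random phase is generally not conjugate-symmetric, ifft returns a complex frame and the real part is taken, as pv_synthesize.m does.

```matlab
n = (0:63)';
x = cos(2*pi*5*n/64);               % assumed test frame
Mod = abs(fft(x));                  % moduli of the frame
Ph  = 2*pi * rand(size(Mod));       % random phase in [0, 2*pi)
Z   = Mod .* exp(1i * Ph);          % recombine moduli with the random phase

mod_err = max(abs(abs(Z) - Mod));   % moduli are untouched
frame = real(ifft(Z));              % real part of the resynthesized frame
```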
Whisperization – Cosine
[Figure: "Waveforms For Whisperization". Panels: input, Output. Sample axis: 0 to 600.]
Whisperization – Toms Diner
[Figure: "Waveforms For Whisperization". Panels: input, Output. Sample axis: 0 to 2.5 x 10^5.]
Whisperization – Piano
[Figure: "Waveforms For Whisperization". Panels: input, Output. Sample axis: 0 to 7 x 10^4.]
Denoising
Emphasize some specific areas of a spectrum
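One possible reading, sketched under assumptions since this effect is still on the to-do list: scale each bin by a gain that stays near 1 for strong bins and falls toward 0 for weak ones. The gain r ./ (r + c), the constant c and the example moduli are illustrative choices, not taken from the slides.

```matlab
r = [10; 1; 0.1; 0.01];     % example bin moduli (made-up values)
c = 0.1;                    % hypothetical noise-floor constant
gain = r ./ (r + c);        % near 1 for r >> c, near 0 for r << c
r_dn = r .* gain;           % denoised moduli; phases would be left unchanged
disp([r gain r_dn]);
```

In the vocoder this would be applied to Mod_y between pv_analyze and pv_synthesize, with c tuned by ear.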
Stable Components Separation
Calculate the instantaneous frequency by taking the derivative of the phase along the time axis.
Check if this frequency is within its “stable range”.
Use the frequency bin or not for the reconstruction.
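The three steps above can be sketched on a synthetic phase matrix. This is a rough illustration under stated assumptions, not the project's implementation: "stable range" is taken to mean that a bin's instantaneous frequency changes by less than a threshold tol between consecutive frames, and tol, the test frequency and the placeholder moduli are all hypothetical.

```matlab
w = 64; Ra = 16; num_frames = 5;
princarg = @(p) p - 2*pi*round(p/(2*pi));    % inline stand-in for princarg.m
omega = 2*pi*Ra*(0:w-1)'/w;                  % nominal phase increment per bin

% Synthetic phase matrix: random phases everywhere, except bin 5 (row 6),
% which follows a steady sinusoid at an assumed 5.3 bins
Ph_y = 2*pi*rand(w, num_frames) - pi;
f_true = 2*pi*5.3/w;
Ph_y(6,:) = princarg(0.4 + f_true*Ra*(0:num_frames-1));

% Step 1: instantaneous frequency from the phase difference along time
Om = repmat(omega, 1, num_frames-1);
inst_freq = (Om + princarg(diff(Ph_y, 1, 2) - Om)) / Ra;

% Step 2: "stable" here means the frequency moved by less than tol
tol = 1e-3;                                  % hypothetical threshold, rad/sample
stable = abs(diff(inst_freq, 1, 2)) < tol;   % w x (num_frames-2) logical mask

% Step 3: keep or drop each bin for the reconstruction
Mod_y = ones(w, num_frames);                 % placeholder moduli
Mod_stable = Mod_y(:, 3:end) .* stable;
```

The transient part would use the complementary mask (~stable) on the same moduli.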
Transient Components Separation
Conclusion
Rest of effects need to be properly implemented:
Stable/Transient Components Separation
Denoising
Questions?
Thank you!