TRANSCRIPT
Soundfield Navigation using an Array of Higher-Order Ambisonics Microphones
AES International Conference on Audio for Virtual and Augmented Reality
September 30th, 2016
Joseph G. Tylka (presenter), Edgar Y. Choueiri
3D Audio and Applied Acoustics (3D3A) Laboratory, Princeton University
www.princeton.edu/3D3A
Soundfield Navigation
[Figure: labels: HOA microphone, HOA mic. 2, HOA mic. 3, HOA mic. 4, listening position, sound source, accurate region, valid region.]
See: [1] Poletti (2005). “Three-Dimensional Surround Sound Systems Based on Spherical Harmonics.”
Overview
• Previous work
• Proposed method for soundfield navigation
• Evaluation - numerical simulations and metrics
• Results
• Conclusions and future work
Previous Work
• Collaborative blind source separation [5]
  • Ideal for soundfields with discrete sources
  • Degradation of sound quality due to artifacts
• Weighted average of ambisonics signals [6]
  • Comb-filtering and skewed localization
[5] Zheng (2013). Soundfield navigation: Separation, compression and transmission.
[6] Southern, Wells, and Murphy (2009). “Rendering walk-through auralisations using wave-based acoustical models.”
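The comb-filtering noted for the weighted-average approach can be illustrated with a small sketch. It is not taken from [6]: it assumes a single plane wave travelling along the axis between two microphones, considers only the zeroth-order (omnidirectional) signals, uses equal weights, and assumes a speed of sound of 343 m/s. The function name `averaged_magnitude` is hypothetical.

```python
import numpy as np

SPEED_OF_SOUND = 343.0  # m/s, assumed value


def averaged_magnitude(freq_hz, path_difference_m, w1=0.5, w2=0.5,
                       c=SPEED_OF_SOUND):
    """Magnitude response of a weighted average of two omni signals.

    The second signal is a copy of the first, delayed by the extra
    propagation time tau = path_difference_m / c.
    """
    tau = path_difference_m / c
    freq_hz = np.asarray(freq_hz, dtype=float)
    return np.abs(w1 + w2 * np.exp(-2j * np.pi * freq_hz * tau))


# With a 0.5 m path difference, the first notch falls at c / (2 * 0.5 m) = 343 Hz
# and the response returns to unity at 686 Hz.
mags = averaged_magnitude([343.0, 686.0], path_difference_m=0.5)
```

Under these assumptions the notches repeat at odd multiples of 343 Hz, which is the comb-filter pattern; unequal weights make the notches shallower but do not remove them.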
Proposed Method
Basic Principle
[Figure: labels: HOA mic. 1, HOA mic. 2, HOA mic. 3, sound source, listening position, valid region.]
Ambisonics Translation
[Figure: x, y, z coordinate axes with translation vector \vec{d}, relating the HOA signals a(k) and b(k).]
\[ \mathbf{b}(k) = \mathbf{T}(k; \vec{d}) \cdot \mathbf{a}(k) \]
See:
[7] Zotter (2009). Analysis and Synthesis of Sound-Radiation with Spherical Arrays.
[9] Gumerov and Duraiswami (2005). Fast Multipole Methods for the Helmholtz Equation in Three Dimensions.
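Proposed Method
• Pose as frequency-dependent inverse problem
• Write translation matrix from listening position to each of P microphones
• When multiplied by x, should give measured signals
• Compute regularized pseudoinverse via singular value decomposition of M

\[ \mathbf{M} \cdot \mathbf{x} = \mathbf{y} \]

Unknown HOA signals: \mathbf{x}
Measured HOA signals:
\[ \mathbf{y} = \begin{bmatrix} \sqrt{w_1}\,\mathbf{b}_1 \\ \sqrt{w_2}\,\mathbf{b}_2 \\ \vdots \\ \sqrt{w_P}\,\mathbf{b}_P \end{bmatrix} \]
Translation matrices:
\[ \mathbf{M} = \begin{bmatrix} \sqrt{w_1}\,\mathbf{T}(-\vec{d}_1) \\ \sqrt{w_2}\,\mathbf{T}(-\vec{d}_2) \\ \vdots \\ \sqrt{w_P}\,\mathbf{T}(-\vec{d}_P) \end{bmatrix} \]
Least-squares estimate:
\[ \tilde{\mathbf{x}} = \mathbf{V} \boldsymbol{\Theta} \boldsymbol{\Sigma}^{+} \mathbf{U}^{*} \cdot \mathbf{y} \]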
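The regularized least-squares step on this slide can be sketched in NumPy as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function name `regularized_ls_estimate` and the parameter `beta` are hypothetical, the diagonal regularization filter is assumed to take a Tikhonov form (the slide does not define it), and random matrices stand in for the true translation matrices.

```python
import numpy as np


def regularized_ls_estimate(T_list, b_list, weights, beta=1e-3):
    """Estimate the HOA signals x at the listening position (one frequency).

    T_list  : P translation matrices (listening position -> microphone p)
    b_list  : P measured HOA signal vectors
    weights : P non-negative interpolation weights
    beta    : hypothetical Tikhonov regularization parameter (assumption)
    """
    sqrt_w = np.sqrt(np.asarray(weights, dtype=float))
    # Stack the weighted translation matrices and weighted measurements.
    M = np.vstack([s * T for s, T in zip(sqrt_w, T_list)])
    y = np.concatenate([s * b for s, b in zip(sqrt_w, b_list)])
    # Thin SVD: M = U @ diag(sigma) @ Vh
    U, sigma, Vh = np.linalg.svd(M, full_matrices=False)
    # Assumed filter factors sigma^2 / (sigma^2 + beta), combined with the
    # pseudoinverse 1 / sigma, give sigma / (sigma^2 + beta).
    filt = sigma / (sigma**2 + beta)
    return Vh.conj().T @ (filt * (U.conj().T @ y))


# Synthetic check with placeholder matrices (sizes are illustrative only).
rng = np.random.default_rng(0)
n_mic, n_listen, P = 25, 9, 3
T_list = [rng.standard_normal((n_mic, n_listen))
          + 1j * rng.standard_normal((n_mic, n_listen)) for _ in range(P)]
x_true = rng.standard_normal(n_listen) + 1j * rng.standard_normal(n_listen)
b_list = [T @ x_true for T in T_list]
x_est = regularized_ls_estimate(T_list, b_list, [0.5, 0.3, 0.2], beta=1e-12)
```

With noiseless synthetic data and a very small `beta`, `x_est` recovers `x_true`; a larger `beta` trades accuracy for robustness when M is poorly conditioned.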
Microphone Validity
[Flowchart]
• Inputs: HOA signals from mic. 1 through mic. P, microphone positions, listening position, interpolation weights
• Processing blocks: detect and locate near-field sources [5]; determine valid microphones; re-normalize weights; compute HOA signals at listening position
• Output: interpolated HOA signals
[5] Zheng (2013). Soundfield navigation: Separation, compression and transmission.
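The validity check and weight re-normalization can be sketched as below. The validity criterion used here is an assumption, since the slide does not state it: a microphone is treated as valid when the listening position is closer to it than its nearest source is. The function name `valid_mic_weights` and the example geometry are hypothetical, and source positions are taken as given rather than detected.

```python
import numpy as np


def valid_mic_weights(mic_pos, listen_pos, source_pos, weights):
    """Zero the weights of invalid microphones and re-normalize the rest.

    Assumed criterion: microphone p is valid if
    |listen_pos - mic_p| < min over sources of |source - mic_p|.
    """
    mic_pos = np.asarray(mic_pos, dtype=float)        # shape (P, 3)
    listen_pos = np.asarray(listen_pos, dtype=float)  # shape (3,)
    source_pos = np.asarray(source_pos, dtype=float)  # shape (S, 3)
    weights = np.asarray(weights, dtype=float)        # shape (P,)

    d_listen = np.linalg.norm(mic_pos - listen_pos, axis=1)
    # Distance from each microphone to its nearest source.
    d_source = np.linalg.norm(
        mic_pos[:, None, :] - source_pos[None, :, :], axis=2).min(axis=1)

    valid = d_listen < d_source
    w = np.where(valid, weights, 0.0)
    total = w.sum()
    if total == 0.0:
        raise ValueError("no valid microphones for this listening position")
    return valid, w / total


mics = [[0.0, 0.0, 0.0], [1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
sources = [[1.2, 0.0, 0.0]]
valid, w = valid_mic_weights(mics, [0.4, 0.3, 0.0], sources, [1/3, 1/3, 1/3])
```

In this example the second microphone sits 0.2 m from the source but about 0.67 m from the listening position, so it is excluded and the remaining two weights are rescaled to sum to one.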
Evaluation
Numerical Simulations
[Figure: geometry of Simulation #1 and Simulation #2 (panels (a) and (b)), with spacing Δ and azimuth ϕ marked. Key: point source, HOA microphone, listening position.]
Localization Prediction
• Using precedence-effect-based localization model [11]
1. Transform to plane-wave impulse responses (IRs)
2. Split each IR into wavelets
3. Threshold to find onset times
4. FFT to find frequency-dependent source gains
[Flowchart blocks: plane-wave IR, high-pass, find peaks, window, wavelets.]
[11] Stitt, Bertet, and van Walstijn (2016). “Extended Energy Vector Prediction of Ambisonically Reproduced Image Direction at Off-Center Listening Positions.”
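Steps 2 to 4 can be illustrated for a single plane-wave IR with the rough sketch below. It is not the model of [11]: the peak-picking rule, the relative threshold, the Hann window, and the name `split_into_wavelets` are all assumptions chosen for illustration, and the high-pass stage shown in the flowchart is omitted.

```python
import numpy as np


def split_into_wavelets(ir, rel_threshold=0.1, half_width=8):
    """Find onsets in one impulse response and return per-wavelet gains.

    A sample counts as an onset if it is a local maximum of |ir| and is at
    least rel_threshold times the largest peak (assumed rule). Each onset
    is windowed with a Hann window of length 2 * half_width + 1 and
    transformed with a real FFT. half_width must be at least 1.
    """
    ir = np.asarray(ir, dtype=float)
    floor = rel_threshold * np.abs(ir).max()
    # Zero-pad so that peaks near either end can still be windowed.
    padded = np.pad(ir, half_width)
    env = np.abs(padded)
    window = np.hanning(2 * half_width + 1)

    onsets, gains = [], []
    for n in range(half_width, half_width + ir.size):
        is_peak = (env[n] >= floor and env[n] > env[n - 1]
                   and env[n] >= env[n + 1])
        if is_peak:
            wavelet = padded[n - half_width:n + half_width + 1] * window
            onsets.append(n - half_width)              # index in original IR
            gains.append(np.abs(np.fft.rfft(wavelet)))  # gain per frequency bin
    return np.array(onsets), np.array(gains)


# Synthetic IR: a direct arrival and a weaker, later arrival.
ir = np.zeros(64)
ir[10] = 1.0
ir[40] = 0.5
onsets, gains = split_into_wavelets(ir)
```

For this synthetic IR the two arrivals are found at samples 10 and 40, and each wavelet has a flat magnitude spectrum equal to its amplitude, since the Hann window is unity at its center.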
Results
Recall: Numerical Simulation #1
[Figure: geometry of Simulation #1, with spacing Δ and azimuth ϕ marked. Key: point source, HOA microphone, listening position.]
Coloration: Simulation #1
Distance: rS = 1 m; Input order: Lin = 4; Spacing: Δ = 0.5 m
[Figure: two panels, Weighted Average Method and Proposed Method, each plotted against frequency (Hz) with kΔ on the top axis; one curve per azimuth ϕ = 0°, 15°, 30°, 45°, 60°, 75°, 90°.]
Coloration: Simulation #1 continued
[Figure: plotted against frequency (Hz) with kΔ on the top axis; one curve for each of five input orders Lin.]
Result: the proposed method achieves negligible coloration for kΔ ≤ 2Lin
Proposed method only; Distance: rS = 1 m; Azimuth: ϕ = 45°; Spacing: Δ = 0.5 m
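To make the condition kΔ ≤ 2Lin concrete, the short sketch below converts it into a frequency limit for a given spacing, and a spacing limit for a given frequency, using k = 2πf/c. The speed of sound (343 m/s) and the helper names `max_frequency` and `max_spacing` are assumptions, not values from the slides.

```python
import math

SPEED_OF_SOUND = 343.0  # m/s, assumed value


def max_frequency(spacing_m, order_in, c=SPEED_OF_SOUND):
    """Highest frequency (Hz) for which k * spacing <= 2 * order_in."""
    return 2 * order_in * c / (2 * math.pi * spacing_m)


def max_spacing(freq_hz, order_in, c=SPEED_OF_SOUND):
    """Largest spacing (m) for which k * spacing <= 2 * order_in."""
    return 2 * order_in * c / (2 * math.pi * freq_hz)


f_max = max_frequency(0.5, 4)    # spacing 0.5 m, Lin = 4: about 873 Hz
d_max = max_spacing(1000.0, 4)   # f = 1 kHz, Lin = 4: about 0.44 m
```

Under these assumptions, a fourth-order pair spaced 0.5 m apart satisfies the condition up to roughly 873 Hz, and at 1 kHz the spacing would need to be below roughly 0.44 m.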
Localization: Simulation #1
[Figure: localization error ϵ (°) versus array spacing Δ (m), with kΔ on the top axis, for the Weighted Avg. and Reg-LS methods; annotated values: 7.7° and 3.9°.]
Result: for small spacings (Δ < 0.5 m), the proposed method (“Reg-LS”) achieves improved localization
Distance: rS = 1 m; Input order: Lin = 4; Frequency: f = 1 kHz; Averaged over azimuth
Localization: Simulation #1 continued
[Figure: two plots of localization error ϵ (°) versus array spacing Δ (m), with kΔ on the top axis; one is labeled Weighted Avg.]
Result: the proposed method achieves accurate localization for kΔ ≤ 2Lin
Proposed method only; Distance: rS = 1 m; Frequency: f = 1 kHz; Averaged over azimuth
Recall: Numerical Simulation #2
[Figure: geometry of Simulation #2, panels (a) and (b), with spacing Δ marked. Key: point source, HOA microphone, listening position.]
Localization: Simulation #2
Result: inclusion of invalid microphones can significantly degrade localization
Proposed method only; Input order: Lin = 4; Frequency: f = 1 kHz; Averaged over azimuth
[Figure: two panels of localization error ϵ (°) versus array spacing Δ (m), with kΔ on the top axis.]
(a) Source position rS = (0.75Δ, 0, 0)
(b) Source position rS = (0.75Δ, 0.75Δ, 0)
Summary and Conclusions
• Presented a method of soundfield navigation:
  • Regularized least squares using an array of HOA microphones
• Explored coloration and localization errors
  • For a pair of microphones: kΔ ≤ 2Lin
• Demonstrated error introduced by “invalid” microphones
• Future work:
  • Validate objective predictions
  • Minimize spectral coloration
References
[1] M. A. Poletti, “Three-Dimensional Surround Sound Systems Based on Spherical Harmonics,” J. Audio Eng. Soc., vol. 53, no. 11, pp. 1004–1025 (2005).
[2] N. Hahn and S. Spors, “Physical Properties of Modal Beamforming in the Context of Data-Based Sound Reproduction,” presented at the 139th Convention of the Audio Engineering Society, (2015 Oct.) convention paper 9468.
[3] F. Winter, F. Schultz, and S. Spors, “Localization Properties of Data-based Binaural Synthesis including Translatory Head-Movements,” presented at the 7th Forum Acusticum, (2014 Sept.).
[4] J. G. Tylka and E. Y. Choueiri, “Comparison of Techniques for Binaural Navigation of Higher-Order Ambisonic Soundfields,” presented at the 139th Convention of the Audio Engineering Society, (2015 Oct.) convention paper 9421.
[5] X. Zheng, Soundfield navigation: Separation, compression and transmission, Ph.D. thesis, University of Wollongong (2013).
[6] A. Southern, J. Wells, and D. Murphy, “Rendering walk-through auralisations using wave-based acoustical models,” presented at the 17th European Signal Processing Conference (2009).
[7] F. Zotter, Analysis and Synthesis of Sound-Radiation with Spherical Arrays, Ph.D. thesis, University of Music and Performing Arts Graz (2009).
[8] C. Nachbar, F. Zotter, E. Deleflie, and A. Sontacchi, “ambiX - A Suggested Ambisonics Format,” presented at the 3rd Ambisonics Symposium (2011 June).
[9] N. A. Gumerov and R. Duraiswami, Fast Multipole Methods for the Helmholtz Equation in Three Dimensions, Elsevier Science (2005).
[10] M. A. Gerzon, “General Metatheory of Auditory Localisation,” presented at the 92nd Convention of the Audio Engineering Society, (1992) convention paper 3306.
[11] P. Stitt, S. Bertet, and M. van Walstijn, “Extended Energy Vector Prediction of Ambisonically Reproduced Image Direction at Off-Center Listening Positions,” J. Audio Eng. Soc., vol. 64, no. 5, pp. 299–310 (2016).
[12] J. Fliege and U. Maier, “The distribution of points on the sphere and corresponding cubature formulae,” IMA Journal of Numerical Analysis, vol. 19, no. 2, pp. 317–334 (1999).