AES 134th Convention Program
May 4 – 7, 2013
Fontana di Trevi Conference Centre, Rome, Italy



Audio Engineering Society 134th Convention Program, 2013 Spring

At recent AES conventions, authors have had the option of submitting complete 4- to 10-page manuscripts for peer review by subject-matter experts. The following paper has been recognized as winner of the AES 134th Convention Peer-Reviewed Paper Award.

Loudness Measurement of Multitrack Audio Content Using Modifications of ITU-R BS.1770—Pedro Duarte Pestana, Catholic University of Oporto (CITAR-UCP), Oporto, Portugal; Lusíada University of Portugal (ILID), Oporto, Portugal; and Centro de Estatística e Aplicações (CEAUL-FCUL), Lisbon, Portugal; Joshua D. Reiss, Queen Mary University of London, London, UK; Alvaro Barbosa, Catholic University of Oporto (CITAR-UCP), Oporto, Portugal
Convention Paper 8813

To be presented on Saturday, May 4, in Session P2—Audio Signal Processing—Part 1

* * * *

The AES has launched an opportunity to recognize student members who author technical papers. The Student Paper Award Competition is based on the preprint manuscripts accepted for the AES convention. A number of student-authored papers were nominated.

The excellent quality of the submissions has made the selection process both challenging and exhilarating.
The award-winning student paper will be honored during the Convention, and the student-authored manuscript will be considered for publication in a timely manner in the Journal of the Audio Engineering Society.
Nominees for the Student Paper Award were required to meet the following qualifications:

(a) The paper was accepted for presentation at the AES 134th Convention.
(b) The first author was a student when the work was conducted and the manuscript prepared.
(c) The student author’s affiliation listed in the manuscript is an accredited educational institution.
(d) The student will deliver the lecture or poster presentation at the Convention.

* * * * *

The Winner of the 134th AES Convention Student Paper Award is:

Parametric Spatial Audio Coding for Spaced Microphone Array Recordings—Archontis Politis, Mikko-Ville Laitinen, Aalto University, Espoo, Finland; Jukka Ahonen, Akukon Ltd., Helsinki, Finland; Ville Pulkki, Aalto University, Espoo, Finland

Convention Paper 8905

To be presented on Tuesday, May 7, in Session 16—Spatial Audio—Part 2: 3-D Microphone and Loudspeaker Systems

* * * * *

Student/Career Event
STUDENT DELEGATE ASSEMBLY MEETING PLACE
Saturday 10:00 – 18:30
Sunday – Tuesday 9:00 – 18:30
Foyer

Come visit the SDA Booth to find out about AES student events at chapters around the world. This is also where you will see postings about finalists in the recording and design competitions.

Saturday, May 4 10:00 Sala Saba

Technical Committee Meeting on Perception and Subjective Evaluation of Audio Signals

Session P1 Saturday, May 4
10:30 – 12:30 Sala Carducci

EDUCATION AND SEMANTIC AUDIO

Chair: Jörn Loviscach, Bielefeld University of Applied Sciences, Bielefeld, Germany

10:30

P1-1 Alternate Software and Pedagogical Strategies for Teaching Audio Technology to Classically Trained Music Students—Christopher J. Keyes, Hong Kong Baptist University, Kowloon, Hong Kong

Despite the many benefits, teaching audio and audio technology to music students brings as many challenges as rewards. Their vastly different scientific and technical backgrounds often require fundamentally different pedagogical strategies from other subjects in a music curriculum. The dissemination of information via lectures and/or lengthy technical readings is especially problematic. This paper addresses these challenges by presenting a new pedagogical approach centered on a free pedagogical software package: the Interactive Workshop for Audio Intelligence and Literacy (i-WAIL). When combined with inquiry-based learning approaches and projects based on the student’s area of interest, it has so far proven highly effective with music students of a wide range of previous abilities.
Convention Paper 8809

11:00

P1-2 Auditory-Visual Attention Stimulator—Adam Kupryjanow, Lukasz Kosikowski, Piotr Odya, Andrzej Czyzewski, Gdansk University of Technology, Gdansk, Poland

A new approach to the formation of lateralization irregularities is proposed. The emphasis is put on the relationship between visual and auditory attention. In this approach, hearing is stimulated using time-scale modified speech, and sight is stimulated by rendering the text of the currently heard speech. Moreover, the displayed text is modified using several techniques, i.e., zooming, highlighting, etc. In the experimental part of the paper, results obtained for the reading comprehension training are presented. It was shown that usage of the proposed method could improve this skill in a group of children between the ages of 7 and 8 years.
Convention Paper 8810

11:30

P1-3 Evaluation of Acoustic Features for Music Emotion Recognition—Chris Baume, BBC Research and Development, London, UK

Classification of music by mood is a growing area of research with interesting applications, including navigation of large music collections. Mood classifiers are usually based on acoustic features extracted from the music, but often they are used without knowing which ones are most effective. This paper describes how 63 acoustic features were evaluated using 2,389 music tracks to determine their individual usefulness in mood classification, before using feature selection algorithms to find the optimum combination.
Convention Paper 8811

12:00

P1-4 Investigating Auditory Human-Machine Interaction: Analysis and Classification of Sounds Commonly Used by Consumer Devices—Konstantinos Drossos,1 Rigas Kotsakis,2 Panos Pappas,3 George M. Kalliris,2 Andreas Floros1
1 Ionian University, Corfu, Greece
2 Aristotle University of Thessaloniki, Thessaloniki, Greece
3 Technological Educational Institute of Ionian Islands, Lixouri, Greece

Many common consumer devices use a short sound indication for declaring various modes of their functionality, such as the start and the end of their operation. This is likely to result in an intuitive auditory human-machine interaction, imputing a semantic content to the sounds used. In this paper we investigate sound patterns mapped to “Start” and “End” of operation manifestations and explore whether the perception of such semantics is based on users’ prior auditory training or on sound patterns that naturally convey the appropriate information. To this aim, listening and machine learning tests were conducted. The obtained results indicate a strong relation between acoustic cues and semantics, along with no need of prior knowledge for message conveyance.
Convention Paper 8812

Session P2 Saturday, May 4
10:30 – 12:30 Sala Foscolo

AUDIO SIGNAL PROCESSING—PART 1

Chair: Danilo Comminiello, Sapienza University of Rome, Rome, Italy

10:30

P2-1 Loudness Measurement of Multitrack Audio Content Using Modifications of ITU-R BS.1770—Pedro Duarte Pestana,1,2 Joshua D. Reiss,3 Alvaro Barbosa1
1 Catholic University of Oporto (CITAR-UCP), Oporto, Portugal
2 Lusíada University of Portugal (ILID), Oporto, Portugal; and Centro de Estatística e Aplicações (CEAUL-FCUL), Lisbon, Portugal
3 Queen Mary University of London, London, UK

The recent loudness measurement recommendations by the ITU and the EBU have gained widespread recognition in the broadcast community. The material it deals with is usually full-range mastered audio content, and its applicability to multitrack material is not yet clear. In the present work we investigate how well the evaluated perception of single-track loudness agrees with the measured value as defined by ITU-R BS.1770. We analyze the underlying features that may be the cause for this disparity and propose some parameter alterations that might yield better results for multitrack material with minimal modification to their rating of broadcast content. The best parameter sets are then evaluated by a panel of experts in terms of how well they produce an equal-loudness multitrack mix and are shown to be significantly more successful.
Convention Paper 8813
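For context on what is being modified: the core of the BS.1770 measure is a channel-weighted mean-square energy mapped to a decibel scale. The sketch below is ours, not the authors'; the mandatory K pre-filter and gating stages are assumed to have been applied to the input already.

```python
import math

def bs1770_loudness(channels, weights=None):
    """Loudness (LKFS) of K-weighted channel signals via the ITU-R
    BS.1770 mean-square formula. The K pre-filter and gating stages
    are assumed to have been applied upstream."""
    if weights is None:
        weights = [1.0] * len(channels)  # BS.1770 uses 1.41 for surrounds
    total = 0.0
    for w, ch in zip(weights, channels):
        mean_square = sum(s * s for s in ch) / len(ch)
        total += w * mean_square
    return -0.691 + 10.0 * math.log10(total)

# A full-scale 997 Hz sine (one channel, 1 s at 48 kHz) has mean
# square 0.5, i.e., about -3.70 LKFS.
n = 48000
sine = [math.sin(2 * math.pi * 997 * t / n) for t in range(n)]
print(round(bs1770_loudness([sine]), 2))
```

A full implementation would add the two-stage K-weighting filter and the absolute/relative gating blocks defined in the recommendation; the parameter alterations the paper studies act on exactly these components.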

11:00

P2-2 Delayless Robust DPCM Audio Transmission for Digital Wireless Microphones—Florian Pflug, Tim Fingscheidt, Technische Universität Braunschweig, Braunschweig, Germany

The employment of digital wireless microphones in professional contexts requires ultra-low delay, strong robustness, and high audio quality. On the other hand, audio coding is required in order to comply with the restrictions on the bandwidth of the radio channel, making the resulting source-coded audio signal more vulnerable to channel distortions. Therefore, in this paper we present a transmission system for differential pulse-code modulated (DPCM) audio with receiver-sided soft-decision error concealment exploiting channel reliability information, explicit redundancy from simple delayless parity-check codes, and residual redundancy within the source-coded audio signal. Simulations on frequency-shift keying (FSK)-modulated channels with additive white Gaussian noise show considerable gains in audio quality compared to hard-decision decoding and to soft-decision decoding that only exploits reliability information and 0th-order a priori knowledge.
Convention Paper 8814

11:30

P2-3 Violin Sound Computer Classification Based on Expert Knowledge—Adam Robak, Ewa Lukasik, Poznan University of Technology, Poznan, Poland

The paper presents results of the analysis of violins recorded during the final stage of the international violin-makers competition held in Poznan in 2011. In the quest for attributes that are both efficient for machine learning and interpretable for human experts, we referred to the research of the violin acousticians Duennwald, Buen, and Fritz and calculated violin sound power in the frequency bands recommended by these researchers. The resulting features, obtained for the averaged spectra of the musical pieces played at the competition, were used for clustering and classification experiments. Results are discussed, and a notable experiment is presented where the classifier assigns each analyzed violin to an instrument from the preceding violin-makers’ competition (2001) and compares their ranking.
Convention Paper 8815

12:00

P2-4 A Finite Difference Method for the Excitation of a Digital Waveguide String Model—Leonardo Gabrielli,1 Luca Remaggi,1 Stefano Squartini,1 Vesa Välimäki2
1 Università Politecnica delle Marche, Ancona, Italy
2 Aalto University, Espoo, Finland

With Digital Waveguide (DWG) modeling, a number of excitation methods have been proposed to feed the delay line properly. Generally speaking, these may be based on signal models fitting recorded samples, excitation signals extracted from recorded samples, or digital filter networks. While allowing for a stable, computationally efficient sound emulation, they may be unable to emulate secondary effects generated by physical interaction, e.g., the distributed interaction of string and hammer. On the other hand, FDTD (Finite Difference Time Domain) models are more accurate in the emulation of the physical excitation mechanism, at the expense of a higher computational cost and a complex coefficient design to ensure numerical stability. In this paper a mixed model is proposed, composed of a two-step FDTD model, a commuted DWG, and an adaptor block to join the two sections. Properties of the model are provided, and computer results are given for the case of the Clavinet tangent-string mechanism as an example application.
Convention Paper 8816

Workshop 1 Saturday, May 4
10:30 – 12:30 Auditorium Loyola

CAPTURING THE ACOUSTICS OF CONCERT HALLS WITH A LOUDSPEAKER ORCHESTRA

Chair: Tapio Lokki, Aalto University, Aalto, Finland

Panelists: Jukka Pätynen
Sakari Tervo

The sensory evaluation process to assess the quality of concert hall acoustics requires that the spatial sound in a concert hall can be reproduced accurately with a multichannel reproduction system. This workshop introduces methods to create a listening system with which a subject can switch between the acoustics of different rooms, e.g., concert halls, in real time. The acoustics is captured by measuring spatial impulse responses from 34 loudspeakers on stage with a microphone array. The impulse responses are spatially encoded for a 24-loudspeaker 3-D setup. Finally, a convolution of anechoic symphony orchestra recordings with spatially decomposed convolution reverbs reproduces the spatial sound from the concert hall for listening and for comparison of acoustics. The workshop includes several sound samples from famous European concert halls.

Session EB1 Saturday, May 4
11:00 – 12:30 Foyer

ENGINEERING BRIEFS—POSTERS: PART 1

11:00

EB1-1 Timecode-Aware Loudness Monitoring: Accelerate Engineers’ Everyday Workflow—Arnaud Laborie, Samuel Tracol, Arnaud Destinay, Jacques Di Giovanni, Trinnov Audio, Bry sur Marne, France

While EBU R-128 loudness normalization is in the process of being adopted by a majority of European countries, most real-time loudness meters are still not completely adapted to mixing engineers’ workflows, as continuous project measurements are always required to keep consistent loudness values. By slaving the loudness measurement to an incoming timecode, all loudness and true-peak values are constantly recorded and time-stamped, allowing their calculation at any time. Engineers no longer need to manually pause, resume, or even start a measurement over to keep relevant loudness monitoring.
Engineering Brief 78

11:00

EB1-2 Control of the Audio Signal Using Thermal Parameter for Protection of the Voice Coil—Oanjin Kim, Keeyeong Cho, Jongwoo Kim, Samsung Electronics Co., Ltd., Suwon, Korea


This engineering brief presents a procedure for a signal processing method to protect a voice coil from overheating. The basic concept is that of estimating the temperature of the voice coil with a heat transfer model and controlling the output level before the voice coil is burnt. This paper mainly focuses on the calculation method and various considerations in level control. A basic heat transfer model and a commonly used method for calculating the thermal parameters are introduced.
Engineering Brief 79
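The estimate-then-limit idea can be illustrated with a first-order heat-transfer model; all numeric values below are illustrative assumptions, not figures from the brief:

```python
def step_voice_coil_temp(temp_c, power_w, dt_s,
                         r_th=40.0, tau_s=20.0, ambient_c=25.0):
    """One step of a first-order thermal model: the coil temperature
    relaxes toward ambient + P * Rth with time constant tau.
    r_th (K/W) and tau_s (s) are illustrative values."""
    target = ambient_c + power_w * r_th
    return temp_c + (dt_s / tau_s) * (target - temp_c)

def protection_gain(temp_c, knee_c=100.0, limit_c=120.0):
    """Ramp the output gain from 1 down to 0 between knee and limit."""
    if temp_c <= knee_c:
        return 1.0
    if temp_c >= limit_c:
        return 0.0
    return (limit_c - temp_c) / (limit_c - knee_c)

# Two minutes of 2 W input power, simulated at 10 ms steps: the coil
# settles near 25 + 2 * 40 = 105 degrees C and the limiter engages.
temp = 25.0
for _ in range(12000):
    temp = step_voice_coil_temp(temp, power_w=2.0, dt_s=0.01)
print(round(temp, 1), round(protection_gain(temp), 2))
```

A production design would estimate power from the drive signal and voice-coil resistance, and usually stacks two such stages (fast coil, slow magnet assembly).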

11:00

EB1-3 Automatic Segmentation of Concert Recordings via a Heuristic Approach—Andrew Ayers, University of Miami, Coral Gables, FL, USA

In the age of digital recordings, many institutions maintain large databases of concert recordings. While segmentation of these concert recordings for mastering and production is a time-consuming task for humans, this paper presents a novel heuristic algorithm to automate that process. Building on other work in audio segmentation, techniques from the music informatics community are used to detect events, classify them, and segment entire concert recordings unsupervised. A brief review of previous work and the methodology used in this approach are provided, as well as the results obtained on a corpus of sixteen concerts.
Engineering Brief 80

11:00

EB1-4 Simulation of a Near Field Loudspeaker System on Headphones—Erich Meier, amoenus audio by Erich Meier, Bern, Switzerland

The vast majority of recorded music is produced for reproduction via loudspeakers positioned at the standard 60° stereo triangle. To achieve the same sound impression on headphones, with their near-field 180° transducer positions, the sound of the standard 60° stereo triangle has to be simulated. We describe a circuit using different serial and parallel delay-and-filter paths for the direct and cross-feed channels, with the goal of achieving accurate near-field loudspeaker sound and good externalized localization.
Engineering Brief 81
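A minimal sketch of such a cross-feed topology, assuming a one-pole low-pass as the head-shadow filter and an interaural delay of about 270 µs (both our assumptions, not the circuit described in the brief):

```python
import math

def one_pole_lowpass(x, fc=700.0, fs=48000.0):
    """One-pole low-pass standing in for the head-shadow filter."""
    a = math.exp(-2.0 * math.pi * fc / fs)
    out, state = [], 0.0
    for s in x:
        state = (1.0 - a) * s + a * state
        out.append(state)
    return out

def crossfeed(left, right, delay_samples=13, gain=0.35):
    """Direct path plus a delayed, low-passed, attenuated copy of the
    opposite channel; 13 samples at 48 kHz is roughly the ~270 us
    interaural delay of a source at 30 degrees off-center."""
    def feed(src):
        delayed = [0.0] * delay_samples + src[:len(src) - delay_samples]
        return [gain * v for v in one_pole_lowpass(delayed)]
    feed_l, feed_r = feed(right), feed(left)
    out_l = [d + c for d, c in zip(left, feed_l)]
    out_r = [d + c for d, c in zip(right, feed_r)]
    return out_l, out_r

# A left-only impulse leaks into the right ear delayed and attenuated.
out_l, out_r = crossfeed([1.0] + [0.0] * 99, [0.0] * 100)
```

The brief's serial and parallel path structure would refine both the delay and the filter shaping beyond this single-path approximation.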

11:00

EB1-5 Influence of First Reflections in Listening Room on Subjective Listener Impression of Reproduced Sound—Hidetaka Imamura,1 Atsushi Marui,1 Toru Kamekawa,1 Masataka Nakahara2
1 Tokyo University of the Arts, Tokyo, Japan
2 SONA Corp., Tokyo, Japan

This study investigated the perceptual factors regarding room acoustics, such as spatial impression and timbre preferences, with focus on the arrival direction and pattern of the early reflections. Impulse responses were recorded with varying wall reflections and evaluated in a subjective test. Although no significant difference in timbre preferences and some evaluation terms was found in the subjective listening test, the variation of early reflections did significantly influence listeners’ spatial impression. The method and results of the analysis are reported in the presentation.
Engineering Brief 82

11:00

EB1-6 Workload Estimation for Low-Delay Segmented Convolution—Malte Spiegelberg, HAW Hamburg, Hamburg, Germany

Zero-delay convolution usually follows a hybrid approach with convolution processing steps in both the time and the frequency domain [Gardner, J. Audio Eng. Soc., vol. 43, pp. 127–136 (1995 Mar.)]. Implementations are likely to ask for dynamic coding, and related workload estimations are focused on efficiency and are limited to the hybrid approach. This paper considers simpler implementations of segmented convolution that work in the frequency domain only and that achieve acceptably low delay for real-time applications when processing several seconds of impulse response in FIR mode. Workload and memory demand are estimated for this approach in the context of likely application parameters.
Engineering Brief 88
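For a rough sense of such an estimate (our back-of-the-envelope model, not the paper's), a uniformly partitioned overlap-save convolver with block size B and impulse-response length N costs one FFT/IFFT pair plus ⌈N/B⌉ spectral multiply-accumulate passes per block:

```python
import math

def partitioned_conv_workload(ir_len, block, fs=48000):
    """Rough real multiply-add count per second for uniformly
    partitioned overlap-save FFT convolution (FFT size 2 * block).
    The per-FFT cost constant is a textbook estimate."""
    parts = math.ceil(ir_len / block)
    fft_size = 2 * block
    fft_ops = 2 * 5 * fft_size * math.log2(fft_size)   # FFT + IFFT
    mac_ops = parts * 8 * (fft_size // 2 + 1)          # spectral MACs
    return (fft_ops + mac_ops) * (fs / block)

# A 2 s impulse response at 48 kHz with 128-sample blocks (~2.7 ms
# latency) lands in the hundreds of Mops/s.
print(f"{partitioned_conv_workload(2 * 48000, 128) / 1e6:.0f} Mops/s")
```

The model makes the paper's trade-off visible: shrinking the block lowers delay but multiplies both the partition count and the block rate, which is exactly why the hybrid schemes exist.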

Tutorial 1 Saturday, May 4
11:00 – 12:30 Sala Alighieri

MICROPHONE TECHNOLOGY

Presenter: Ron Streicher, Pacific Audio-Visual Enterprises, Pasadena, CA, USA

How do microphones work? What differentiates one operating type of transducer from another? How and why do they sound different? What are polar patterns, and how do they affect the way a microphone responds to sound? What is “proximity effect,” and why do some mics exhibit more of it than others? What’s the truth about capsule size—does it really matter? These are just a few of the topics covered in this in-depth tutorial on microphone technology.

Workshop 2 Saturday, May 4
11:00 – 12:30 Sala Manzoni

APPLICATIONS OF 3-D AUDIO IN AUTOMOTIVE

Chair: Alan Trevena, Jaguar Land Rover, Gaydon, UK

Panelists: Oliver Hellmuth, Fraunhofer IIS, Erlangen, Germany
Michael Kelly, DTS, London, UK
Adam Sulowski, Audi AG, Ingolstadt, Germany
Bert Van Daele, Auro-3D, Mol, Belgium

While there are a number of technologies aimed at improving the spatial rendering of recorded sounds in automobiles, few offer the advantages and challenges of 3-D surround. This workshop will explore theoretical applications and system configurations, as well as limitations, of 3-D surround applications in automotive audio. Questions such as what the reference experience is and how a system is evaluated will be addressed.



Saturday, May 4 11:00 Sala Saba

Technical Committee Meeting on Human Factors in Audio Systems

Saturday, May 4 12:00 Sala Saba

Technical Committee Meeting on Loudspeakers and Headphones

Special Event
AWARDS PRESENTATION AND KEYNOTE ADDRESS
Saturday, May 4, 13:00 – 14:00
Auditorium Loyola

Opening Remarks:
• Executive Director Bob Moses
• President Frank Wells
• Convention Chair Umberto Zanghieri

Program:
• AES Awards Presentation
• Introduction of Keynote Speaker by Convention Chair Umberto Zanghieri
• Keynote Address by Stephen Webber

Awards Presentation
Please join us as the AES presents Special Awards to those who have made outstanding contributions to the Society in such areas as research, scholarship, and publications, as well as other accomplishments that have contributed to the enhancement of our industry. The awardees are:

BRONZE MEDAL AWARD• Bob Schulein

FELLOWSHIP AWARD• Zbigniew Kulka• Lauri Savioja

CITATION AWARD• Umberto Zanghieri

Keynote Speaker
This year’s Keynote Speaker is Stephen Webber, internationally recognized author, professor of music production and engineering, and star hip-hop DJ. Author of such successful books as Turntable Technique: The Art of the DJ, Webber is an international clinician who presents workshops, master classes, and performance seminars on production, mixing techniques, and songwriting. An Emmy-winning composer whose works include the “Stylus Symphony,” Webber was recently appointed Director of Music Technology Innovation at Berklee’s new campus in Valencia, Spain. He addressed the Berklee graduating class of 2012 and has performed and lectured extensively in the US, China, Australia, Central America, and Europe. Webber is also an accomplished studio designer whose clients include actors Jack Black and Ben Stiller, and a producer/engineer who has worked with NAS, Mark O’Connor, and DJ Premier.
Entitled “Inventing the Album of 2025,” Webber’s AES Keynote will seek to consider what is technologically, economically, and artistically possible, to decipher what the record album should look, sound, and feel like come 2025.

Saturday, May 4 14:00 Sala Saba

Technical Committee Meeting on Automotive Audio

Saturday, May 4 14:00 Sala Montale

Standards Committee Meeting on SC-02-02, DigitalInput/Output Interfacing

Session P3 Saturday, May 4
14:15 – 17:15 Sala Carducci

PERCEPTION

Chair: Frank Melchior, BBC R&D, London, UK

14:30

P3-1 The Relation between Preferred TV Program Loudness, Screen Size, and Display Format—Ian Dash,1 Todd Power,2 Densil A. Cabrera2
1 Consultant, Marrickville, NSW, Australia
2 University of Sydney, Sydney, NSW, Australia

The effect of television screen size and display format on preferred TV program loudness was investigated by listening tests using typical program material. While no significant influence on preferred program loudness was observed from screen size or color level, other preference patterns related to soundtrack content type were observed that are of interest.
Convention Paper 8817

15:00

P3-2 Vibration in Music Perception—Sebastian Merchel, M. Ercan Altinsoy, Dresden University of Technology, Dresden, Germany

The coupled perception of sound and vibration is a well-known phenomenon during live pop or organ concerts. However, even during a symphonic concert in a classical hall, sound can excite perceivable vibrations on the surface of the body. This study analyzes the influence of audio-induced vibrations on the perceived quality of the concert experience. Therefore, sound and seat vibrations are controlled separately in an audio reproduction scenario. Because the correlation between sound and vibration is naturally strong, vibrations are generated from audio recordings using various approaches. Different parameters during this process (frequency and intensity modifications) are examined in relation to their perceptual consequences using psychophysical experiments. It can be concluded that vibrations play a significant role during the perception of music.
Convention Paper 8818

15:30

P3-3 An Assessment of Virtual Surround Sound Systems for Headphone Listening of 5.1 Multichannel Audio—Chris Pike, Frank Melchior, BBC Research and Development, Salford, UK

It is now common for broadcast signals to feature 5.1 surround sound. It is also increasingly common that audiences access broadcast content on portable devices using headphones. Binaural techniques can be applied to create a spatially enhanced headphone experience from surround sound content. This paper presents a subjective assessment of the sound quality of 12 state-of-the-art systems for converting 5.1 surround sound to a 2-channel signal for headphone listening. A multiple stimulus test was used with hidden reference and anchors; the reference stimulus was an ITU stereo down-mix. Dynamic binaural synthesis, based on individualized binaural room impulse response measurements and head-orientation tracking, was also incorporated into the test. The experimental design and detailed analysis of the results are presented in this paper.
Convention Paper 8819

16:00

P3-4 Effect of Target Signal Envelope on Direction Discrimination in Spatially Complex Sound Scenarios—Olli Santala, Marko Takanen, Ville Pulkki, Aalto University, Aalto, Finland

The temporal envelope of a sound signal has been found to have an effect on localization. Whether this is valid for spatially complex scenarios was addressed by conducting a listening experiment in which a spatially distributed sound source consisted of a target between two interfering noise-like sound sources, all emitting sound simultaneously. All the signals were harmonic complex tones with components within 2 kHz–8.2 kHz and were presented using loudspeaker reproduction in an anechoic chamber. The phases of the harmonic tones of the target signal were altered, causing the envelope to change. The results indicated that prominent peaks in the envelope of the target signal aided in the discrimination of its direction inside the widely distributed sound source.
Convention Paper 8820

16:30

P3-5 A Framework for Adaptive Real-Time Loudness Control—Andrea Alemanno,1 Alessandro Travaglini,2 Simone Scardapane,1 Danilo Comminiello,1 Aurelio Uncini1
1 Sapienza University of Rome, Rome, Italy
2 Fox International Channels Italy, Guidonia Montecelio (RM), Italy

Over the last few years, loudness control has been one of the most frequently investigated topics in audio signal processing. In this paper we describe a framework designed to provide adaptive real-time loudness measurement and processing of audio files and streamed content being reproduced by mobile players hosted in laptops, tablets, and mobile phones. The proposed method aims to improve the users’ listening experience by normalizing the loudness level of the content in real time, while preserving the creative intent of the original soundtrack. The loudness measurement and adaptation are based on a customization of the High Efficiency Loudness Model algorithm described in AES Convention Paper 8612 (“HELM: High Efficiency Loudness Model for Broadcast Content,” presented at the 132nd Convention, April 2012). Technical and subjective tests were performed in order to evaluate the performance of the proposed method. In addition, the way the subjective test was arranged offered the opportunity to gather information on the preferred target level of streamed and media files reproduced on portable devices.
Convention Paper 8821

17:00

P3-6 The Perception of Masked Sounds and Reverberation in 3-D vs. 2-D Playback Systems—Giulio Cengarle,1 Alexandre Pereda2
1 Imm Sound S.A., a Dolby company, Barcelona, Spain
2 Fundació Barcelona Media, Barcelona, Spain

This paper presents studies on perceptual aspects of spatial audio and their dependency on the playback format. The first study regards the perception of sound in the presence of a masker in stereo, 5.1, and 3-D. Psychoacoustic tests show that the detection threshold improves with the spread of the masker, which justifies the claim that individual elements of dense soundtracks are more audible when they are distributed in a wider panorama. The second study indicates that the perception of the reverberation level does not depend on the playback format. The joint interpretation of these results justifies the claim that in 3-D, sound engineers can use higher levels of reverberation without compromising the intelligibility of the sound sources.
Convention Paper 8822

Workshop 3 Saturday, May 4, 14:15 – 16:15 Sala Alighieri

SOUND MIXERS DISCUSS THEIR CRAFT

Chair: Brian McCarty, Coral Sea Studios Pty. Ltd., Clifton Beach, QLD, Australia

Panelists: Marco Coppolecchia, Simone Corelli, Carlos Zarattini

Cinema sound has traditionally been limited in fidelity because optical soundtracks, used until recently, were incapable of delivering full-bandwidth audio to the theaters. As digital cinema replaces film in distribution, sound mixers are now delivering uncompressed lossless tracks to the audience. Leading sound mixers from both Hollywood and Europe will discuss their approach and methodology in producing award-winning soundtracks.

Workshop 4 Saturday, May 4, 14:15 – 15:45 Sala Manzoni

SEMANTIC ANALYSIS FOR SPEECH SIGNALS

Co-chairs: Jörn Loviscach, University of Applied Sciences, Bielefeld, Germany; Christian Uhle, Fraunhofer Institute for Integrated Circuits IIS, Erlangen, Germany

Panelist: Yves Raimond, BBC R&D, London, UK

Semantic audio analysis is the automated extraction of meaning from audio signals, with applications in, e.g., music information retrieval, intelligent audio editing, and automated music transcription. The workshop focuses on the semantic analysis of signals containing speech. It discusses speech detection methods and their application to speech enhancement, broadcast monitoring, loudness control, and audio coding, presents recent work on automatic tagging of speech in the BBC World Service


radio archive, and gives an overview of possible future developments, e.g., speaker identification, emotion recognition, and automated editing of speech recordings. The panelists will discuss the state of the art in semantic audio analysis for speech and its challenges.

Session P4 Saturday, May 4, 14:30 – 16:30 Sala Foscolo

AUDIO SIGNAL PROCESSING—PART 2

Chair: Leonardo Gabrielli, Università Politecnica delle Marche, Ancona, Italy

14:30

P4-1 User-Driven Quality Enhancement for Audio Signal—Danilo Comminiello, Simone Scardapane, Michele Scarpiniti, Aurelio Uncini, Sapienza University of Rome, Rome, Italy

Classical methods for audio and speech enhancement are often based on error-driven optimization strategies, such as mean-square error minimization. However, these approaches do not always satisfy the quality requirements demanded by users of the system. In order to meet subjective specifications, we put forward the idea of a user-driven approach to audio enhancement through the inclusion of an interactive evolutionary algorithm (IEA) in the optimization stage. In this way, the performance of the system can be adapted to any user in a principled and systematic way, thus reflecting the desired subjective quality. Experiments in the context of echo cancellation support the proposed methodology, showing a statistically significant advantage of the proposed framework with respect to classical approaches.
Convention Paper 8823

15:00

P4-2 Partial Spectral Flatness Measure for Tonality Estimation in a Filter Bank-Based Psychoacoustic Model for Perceptual Audio Coding—Armin Taghipour,1 Maneesh Chandra Jaikumar,1,2 Bernd Edler,1 Holger Stahl2; 1International Audio Laboratories Erlangen, Erlangen, Germany; 2Hochschule Rosenheim, University of Applied Sciences, Rosenheim, Germany

Perceptual audio codecs use psychoacoustic models for irrelevancy reduction by exploiting masking effects in the human auditory system. In masking, the tonality of the masker plays an important role and therefore should be evaluated in the psychoacoustic model. In this study a partial Spectral Flatness Measure (SFM) is applied to a filter bank-based psychoacoustic model to estimate tonality. The Infinite Impulse Response (IIR) band-pass filters are designed to take into account the spreading in simultaneous masking. Tonality estimation is adapted to the temporal and spectral resolution of the auditory system. Employing subjective audio coding preference tests, the partial SFM is compared with prediction-based tonality estimation.
Convention Paper 8824
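The spectral flatness measure underlying the method is the ratio of the geometric to the arithmetic mean of the power spectrum over a band: near 1 for noise-like (flat) bands, near 0 for tonal (peaky) ones. A minimal sketch of a generic band SFM (not the authors' filter-bank implementation):

```python
import math

def spectral_flatness(power_band):
    """SFM of a band of power-spectrum values: geometric mean over
    arithmetic mean. Returns ~1.0 for a flat (noise-like) band and
    values near 0.0 for a peaky (tonal) band."""
    eps = 1e-12  # guards the log against zero-valued bins
    n = len(power_band)
    log_mean = sum(math.log(p + eps) for p in power_band) / n
    geometric = math.exp(log_mean)
    arithmetic = sum(power_band) / n
    return geometric / arithmetic
```

In a "partial" SFM the band limits would follow the IIR band-pass filters mentioned in the abstract rather than fixed FFT bins.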

15:30

P4-3 A New Approach to Model-Based Development for Audio Signal Processing—Carsten Tradowsky,1,2 Peter Figuli,1 Erik Seidenspinner,1 Felix Held,1 Jürgen Becker1; 1Karlsruhe Institute of Technology (KIT), Karlsruhe, Germany; 2CTmusic, Karlsruhe, Germany

Today, digital audio systems are restricted in their functionality. For example, a digital audio player still has a resolution of 16 bits and a sample rate of 44.1 kHz. This relatively low quality does not exhaust the possibilities offered by modern hardware for music production. In most cases, the functionality is described in software. This abstraction is very common these days, as only few engineers understand the potential of their target hardware. The design time needed to develop efficiently for the target hardware increases significantly. Because common compiler tool chains are used, the software is statically mapped onto the hardware. This restricts the number of channels per processing core to a minimum when targeting high-quality audio. One possibility to close the productivity gap described above is to use a high-level model-based development approach. The audio signal processing flow is described at a more abstract, higher level using the model-based development approach. This model is then compiled platform-independently, including automatically generated simulation and verification input. Platform-dependent code can be generated automatically from the verified model. This enables the evaluation of different target architectures and their trade-offs using the same model description. This paper presents a concept that uses a model-based approach to describe audio signal processing algorithms. The concept is used to compile C and HDL code from the same model description in order to evaluate different target platforms. The goal of this paper is to compare trade-offs for audio signal processing algorithms using a multicore Digital Signal Processor (DSP) target platform. Measurements using data parallelism inside the generated code show a significant speedup on the multicore DSP platform. A conclusion is drawn regarding the usability of the proposed model-based tool flow as well as its applicability to the multicore DSP platform.
Convention Paper 8825

16:00

P4-4 Accordion Music and its Automatic Transcription to MIDI Format—Tomasz Maciejewski, Ewa Lukasik, Poznan University of Technology, Poznan, Poland

The paper is devoted to problems related to the automatic transcription of the accordion sound. The accordion is a musical instrument from the free-reed aerophone family that is able to produce polyphonic, multi-chord music. First the playing modes are briefly characterized and problems related to the polyphonic nature of the sound are discussed. Then the results of the analysis and MIDI transcription of the right-side monophonic and polyphonic melodies are presented. Finally, an attempt to transcribe music


generated by both sides of the instrument, recorded in two channels, is presented, giving a foundation for further research.
Convention Paper 8827

Student/Career Event
OPENING AND STUDENT DELEGATE ASSEMBLY MEETING – PART 1
Saturday, May 4, 14:30 – 16:00
Auditorium Loyola

Chair: Philip Waldenberger

Vice Chair: Marija Kovacina

The first Student Delegate Assembly (SDA) meeting is the official opening of the Convention's student program and a great opportunity to meet with fellow students from all corners of the world. This opening meeting of the Student Delegate Assembly will introduce new events and election proceedings, announce candidates for the coming year's election for the European and International Regions, announce the finalists in the four new recording competition categories, and announce any upcoming events of the Convention. Students and student sections will be given the opportunity to introduce themselves and their activities, in order to stimulate international contacts. Students will also have the opportunity to hear from various AES officers about opportunities and scholarships available through the society. All students and educators are invited to participate in

this meeting. Election results and Recording Competition Awards will be given at the Student Delegate Assembly Meeting–Part 2 on the final afternoon of the convention.

Session P5 Saturday, May 4, 15:00 – 16:30 Foyer

POSTERS: SPEECH PROCESSING

15:00

P5-1 A Speech-Based System for In-Home Emergency Detection and Remote Assistance—Emanuele Principi,1 Danilo Fuselli,2 Stefano Squartini,1 Maurizio Bonifazi,2 Francesco Piazza1; 1Università Politecnica delle Marche, Ancona, Italy; 2FBT Elettronica Spa, Recanati (MC), Italy

This paper describes a system for the detection of emergency states and for the remote assistance of people in their own homes. Emergencies are detected by recognizing distress calls by means of a speech recognition engine. When an emergency is detected, a phone call is automatically established with a relative or friend by means of a VoIP stack and an Acoustic Echo Canceller. Several low-consumption embedded units are distributed throughout the house to monitor the acoustic environment, and one central unit coordinates the system operation. This unit also integrates multimedia content delivery services and home automation functionalities. As this is an ongoing project, the paper describes the entire system and then focuses on the algorithms implemented for the acoustic monitoring and hands-free communication services. Preliminary experiments have been conducted to assess the performance of the recognition

module in noisy and reverberant environments and its out-of-grammar rejection capabilities. Results showed that the implemented Power Normalized Cepstral Coefficients extraction pipeline improves word recognition accuracy in noisy and reverberant conditions, and that introducing a “garbage phone” in the acoustic model allows out-of-grammar words and sentences to be rejected effectively.
Convention Paper 8828

15:00

P5-2 Assessment of Speech Quality in the Digital Audio Broadcasting (DAB+) System—Stefan Brachmanski, Maurycy Kin, Wroclaw University of Technology, Wroclaw, Poland

The methods for assessment of speech quality fall into two classes: subjective and objective. This paper includes an overview of selected methods of subjective listening measurement (ACR, DCR) recommended by ITU-T. The influence of bit rate on sound quality was a subject of the research presented in this paper. The influence of the Spectral Band Replication (SBR) process on speech quality was also investigated. The tested samples were taken from the Digital Audio Broadcasting experimental emission in Poland as well as from an internet network. The subjective assessment of DAB speech signals was performed using both the ACR and DCR methods. It turned out that the SBR process significantly improves speech quality at the lower bit rates, making it as good as at higher bit rates. It was also found that for higher bit rates (96 kbit/s or higher) the two methods yield different results.
Convention Paper 8829

15:00

P5-3 Investigation on Objective Quality Evaluationfor Heavily Distorted Speech—Mitsunori Mizumachi, Kyushu Institute of Technology, Kitakyushu, Fukuoka, Japan

Demand for evaluating speech quality is on the increase. It is advisable to employ a common objective measure for the wide variety of adverse speech signals. Unfortunately, current speech quality measures are not suited to heavily distorted speech signals. In this paper both the applicability and the limits of the perceptual evaluation of speech quality (PESQ) are investigated in comparison with the subjective mean opinion score (MOS) for noise-added and noise-reduced speech signals. It is found that the PESQ scores are compatible with the MOS values for the noise-reduced speech signals in non-stationary noise conditions.
Convention Paper 8830

15:00

P5-4 Novel 5.1 Downmix Algorithm with Improved Dialogue Intelligibility—Kuba Lopatka, Bartosz Kunka, Andrzej Czyzewski, Gdansk University of Technology, Gdansk, Poland

A new algorithm for 5.1 to stereo downmix is


introduced that addresses the problem of dialogue intelligibility. The algorithm utilizes proposed signal processing methods to enhance the intelligibility of movie dialogue, especially in difficult listening conditions or with a compromised speaker setup. To account for the latter, a playback configuration utilizing a portable device, i.e., an ultrabook, is examined. Experiments are presented that confirm the efficiency of the introduced method. Both objective measurements and subjective listening tests were conducted. The new downmix algorithm is compared to the output of a standard downmix matrix method. The results of the subjective tests prove that improved dialogue intelligibility is achieved.
Convention Paper 8831
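For context, the "standard downmix matrix method" used as the baseline is conventionally of the ITU-R BS.775 form, folding the center and surround channels into the front pair at reduced gain. A generic sketch with the common −3 dB coefficients (not the authors' proposed algorithm):

```python
def downmix_51_to_stereo(L, R, C, LFE, Ls, Rs, c_gain=0.7071, s_gain=0.7071):
    """One sample frame of a conventional 5.1-to-stereo matrix downmix:
    Lo = L + c*C + s*Ls and Ro = R + c*C + s*Rs, with c and s at
    -3 dB (0.7071). The LFE channel is conventionally dropped."""
    lo = L + c_gain * C + s_gain * Ls
    ro = R + c_gain * C + s_gain * Rs
    return lo, ro
```

Dialogue-enhancement downmixes typically act on the center (dialogue) channel before this fold-down, e.g., by raising its gain relative to the other channels.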

15:00

P5-5 Monaural Speech Source Separation by Estimating the Power Spectrum Using Multi-Frequency Harmonic Product Spectrum—David Ayllon, Roberto Gil-Pita, Manuel Rosa-Zurera, University of Alcala, Alcalá de Henares, Spain

This paper proposes an algorithm to perform monaural speech source separation by means of time-frequency masking. The algorithm is based on the estimation of the power spectrum of the original speech signals as a combination of a carrier signal multiplied by an envelope. A Multi-Frequency Harmonic Product Spectrum (MF-HPS) algorithm is used to estimate the fundamental frequency of the signals in the mixture. These frequencies are used to estimate both the carrier and the envelope from the mixture. Binary masks are generated by comparing the estimated spectra of the signals. Results show an important improvement in separation in comparison to the original algorithm that only uses the information from the HPS.
Convention Paper 8832
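The single-band harmonic product spectrum on which MF-HPS builds multiplies the magnitude spectrum by copies of itself compressed by integer factors, so that only a bin shared by all harmonics survives as a peak. A brute-force sketch (a plain HPS with a naive DFT for self-containment; the authors' multi-frequency extension is not reproduced here):

```python
import math

def dft_mags(x, kmax):
    """Magnitudes of the first kmax DFT bins (naive O(N*kmax) DFT)."""
    N = len(x)
    mags = []
    for k in range(kmax):
        re = sum(x[n] * math.cos(2 * math.pi * k * n / N) for n in range(N))
        im = sum(x[n] * math.sin(2 * math.pi * k * n / N) for n in range(N))
        mags.append(math.hypot(re, im))
    return mags

def hps_pitch(x, fs, harmonics=3):
    """Estimate f0 (Hz) as the bin k maximizing the product of the
    magnitude spectrum sampled at k, 2k, ..., harmonics*k."""
    N = len(x)
    mags = dft_mags(x, N // 2)
    best_k, best_p = 1, -1.0
    for k in range(1, (N // 2) // harmonics):
        p = 1.0
        for r in range(1, harmonics + 1):
            p *= mags[k * r]
        if p > best_p:
            best_k, best_p = k, p
    return best_k * fs / N
```

A production version would use an FFT and interpolate between bins; the product structure is the essential idea.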

15:00

P5-6 The Effectiveness of Speech Transmission Index (STI) in Accounting for the Effects of Multiple Arrivals—Timothy J. Ryan,1 Richard King,2 Jonas Braasch,3 William L. Martens4; 1Webster University, St. Louis, MO, USA; 2McGill University, Montreal, Quebec, Canada; 3Rensselaer Polytechnic Institute, Troy, NY, USA; 4University of Sydney, Sydney, NSW, Australia

The authors conducted concurrent experiments employing subjective evaluation methods to examine the effects of the manipulation of several sound system design and optimization parameters on the intelligibility of reinforced speech. During the course of these experiments, objective testing methods were also employed to measure the Speech Transmission Index (STI) associated with each of the variable treatments used. Included in this paper is a comparison of the results of these two testing methods. The results indicate that, while STI is capable of detecting many effects of multiple arrivals, it appears to overestimate the degradation of intelligibility caused by multiple arrivals with short delay times.
Convention Paper 8833

15:00

P5-7 Introducing Synchronization of Speech Mixtures in Blind Sparse Separation Problems—Cosme Llerena, Lorena Álvarez, Roberto Gil-Pita, Manuel Rosa-Zurera, University of Alcala, Alcalá de Henares, Spain

This paper explores the feasibility of synchronizing speech mixtures prior to blind sparse source separation in order to improve its results. Broadly, methods that assume sparse sources use level and phase differences between mixtures as their features, and they separate signals from them. If each mixture is considerably delayed with respect to the rest, the information extracted from these differences can be wrong. With this idea in mind, this paper focuses on using Time Delay Estimation algorithms to synchronize the mixtures and on observing the improvement this brings to a Blind Sparse Source Separation algorithm. The results obtained show the feasibility of synchronizing the speech mixtures.
Convention Paper 8834
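The synchronization step rests on time delay estimation; in its most basic form one picks the integer lag that maximizes the cross-correlation between two mixtures and shifts one of them accordingly. A naive correlation-based sketch (for illustration only; the abstract does not specify which TDE algorithms the authors use):

```python
def estimate_delay(x, y, max_lag):
    """Return the integer lag d in [-max_lag, max_lag] maximizing the
    cross-correlation sum(x[n] * y[n + d]). A positive d means y is a
    delayed copy of x (y lags x by d samples)."""
    best_d, best_c = 0, float("-inf")
    for d in range(-max_lag, max_lag + 1):
        c = sum(x[n] * y[n + d]
                for n in range(len(x)) if 0 <= n + d < len(y))
        if c > best_c:
            best_d, best_c = d, c
    return best_d
```

Once d is known, trimming d samples from the lagging mixture aligns the two before the level and phase differences are computed.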

15:00

P5-8 An Embedded-Processor Driven Test Bench for Acoustic Feedback Cancellation in Real Environments—Francesco Faccenda, Stefano Squartini, Emanuele Principi, Leonardo Gabrielli, Francesco Piazza, Università Politecnica delle Marche, Ancona (AN), Italy

In order to facilitate communication among speakers, speech reinforcement systems equipped with microphones and loudspeakers are employed. Due to the acoustic coupling between them, speech intelligibility may be degraded and, moreover, high channel gains could drive the system to instability. Acoustic Feedback Cancellation (AFC) methods need to be applied to keep the system stable. In this paper a new test bench for evaluating AFC algorithms in real environments is proposed. It is based on the TMS320C6748 processor running the Suppressor-PEM algorithm, a recent technique based on the PEM-AFROW paradigm. The partitioned block frequency domain adaptive filter (PB-FDAF) paradigm has been adopted to keep the computational complexity low. A professional sound card and a PC, on which an automatic gain controller has been implemented to prevent signal clipping, complete the framework. Several experimental tests confirmed the framework's suitability to operate under diverse acoustic conditions.
Convention Paper 8835

Saturday, May 4 15:00 Sala Saba

Technical Committee Meeting on Hearing and Hearing Loss Prevention

Saturday, May 4 16:00 Sala Saba

Technical Committee Meeting on Network Audio Signals

Saturday, May 4 16:00 Sala Montale

Standards Committee Meeting on SC-04-01, Acoustics and Sound Source Modeling

Page 10: AES134 th Conven tion Pro gram - AES | Audio Engineering ...using time scale modified speech, and sight is stimulated rendering the text of the currently heard speech. Moreover, displayed

10 Audio Engineering Society 134th Convention Program, 2013 Spring

Tutorial 2 Saturday, May 4, 16:30 – 18:30 Auditorium Loyola

ARE WE THERE YET? THE ULTIMATE ULTRA-PORTABLE PRODUCTION/RECORDING STUDIO. FROM IDEA TO FINAL MASTER: HOW TO WRITE, SEQUENCE, PRODUCE YOUR MUSIC, AND CONTROL YOUR STUDIO USING ONLY YOUR IPAD

Presenter: Andrea Pejrolo

In this highly interactive and hands-on presentation you will learn the tools, techniques, tips, and tricks required to write, produce, and mix a song using only your iPad. Through practical examples and scenarios you will learn how to:

• Pick the best software for sequencing, producing, and mixing your music
• Pick the best iPad-compatible hardware tools (microphones, audio interfaces, MIDI interfaces, controllers, etc.)
• Set up your mobile production studio
• Sketch your musical ideas
• Use your iPad as a creative inspirational tool for music composition and sound design
• Sequence and arrange your music ideas on your iPad
• Add real instruments and vocals
• Do a final professional mix
• Master your final mix

In addition, through practical examples and scenarios, you will learn: how to set up your studio in order to use your iOS device as a controller; configuration of the iOS device for Logic Pro, Pro Tools, Digital Performer, Live, and Cubase/Nuendo; wireless (MIDI over Wi-Fi) and wired MIDI connections; proprietary versus open source (OSC, etc.) options; designing your own graphic interfaces and controllers; and ergonomic aspects of using your iOS device in the studio.

Who should attend? Anyone who wants to create music with their iPad, from beginners to advanced: musicians, producers, recording engineers, and home studio owners from intermediate to advanced.

Tutorial 3 Saturday, May 4, 16:30 – 18:30 Sala Alighieri

MUSIC PRODUCTION FOR FILM

Presenters: Simon Franglen, Brian McCarty

Music production is the last piece of the production puzzle to be undertaken in a film production. It occurs after the film is edited and just prior to sound mixing. Composers have developed new tools to prepare the music, including temp synth scores and other techniques. Simon Franglen is a Golden Globe nominee and has worked on Avatar, Titanic, and Skyfall, among many others.

Tutorial 4 Saturday, May 4, 17:00 – 18:30 Sala Foscolo

UNDERSTANDING MICROPHONE SPECIFICATIONS

Presenter: Eddy B. Brixen, EBB-consult, Smorum, Denmark

There are lots and lots of microphones available to the audio engineer. The final choice is often made on the basis of experience or perhaps just habit. (Sometimes the mic is chosen because of its looks … .) Nevertheless, there is good information to be found in microphone specifications. This tutorial will present the most important microphone specs and provide the attendee with up-to-date information on how these specs are obtained and understood. It also takes a critical look at how specs are presented to the user, what to look for, and what to expect.

This tutorial is a follow-up on project X85, which has been taking place in the AES standards committee (SC-04-04).

Professional Training Session 1 Saturday, May 4, 17:00 – 19:00 Sala Manzoni

THE USE OF ANALOG GEAR IN THE DIGITAL ERA

Presenter: Alan Hyatt

The history of PMI Audio Group and connected brands. Why we do what we do: a tour through the technical and philosophical differences that make our brands special and what it is that we do differently from other companies.

Saturday, May 4 17:00 Sala Saba

Technical Committee Meeting on Semantic AudioAnalysis

Saturday, May 4 17:00 Sala Montale

Standards Committee Meeting on SC-02-08, Audio File Transfer and Exchange

Professional Training Session 2 Saturday, May 4, 17:30 – 18:30 Sala Carducci

WIDE BANDWIDTH EQUALIZATION IN THE ANALOG DOMAIN

Based on research into the sensitivity of our ears throughout the full bandwidth of our hearing, the CharterOak PEQ-1 provides extremely musical equalization in a stereo tone control of unparalleled musicality. Utilizing a unique circuit and power distribution, along with a unique configuration with respect to bandwidth and cut and boost around the center frequency choices, the PEQ-1 achieves extremely powerful and musical equalization with minimal phase shift at high frequencies, and extremely high headroom and low distortion.

Special Event
OPEN HOUSE OF THE TECHNICAL COUNCIL AND THE RICHARD C. HEYSER MEMORIAL LECTURE
Saturday, May 4, 19:00 – 20:00
Auditorium Loyola

Lecturer: Wolfgang Klippel

The Heyser Series is an endowment for lectures by eminent individuals with outstanding reputations in audio engineering and its related fields. The series is featured twice annually at both the United States and European AES Conventions. Established in May 1999, The Richard C. Heyser Memorial Lecture honors the memory of Richard Heyser, a scientist at the Jet Propulsion Laboratory, who was awarded nine patents in audio and communication techniques and was widely known for his ability to clearly present new and complex technical ideas. Heyser was also an AES governor and AES Silver Medal recipient. The Richard C. Heyser distinguished lecturer for the


134th AES Convention is Wolfgang Klippel. Klippel studied electrical engineering at the University of Technology of Dresden. After graduating in speech recognition, he joined a loudspeaker company in the eastern part of Germany, where he was engaged in research on transducer modeling, acoustic measurement, and psychoacoustics. In 1987 he received a Ph.D. in technical acoustics. After spending a post-doctoral year at the Audio Research Group in Waterloo, Canada, and working at Harman International, Northridge, CA, he went back to Dresden in 1997 and founded Klippel GmbH, which develops novel kinds of control and measurement systems dedicated to loudspeakers and other transducers. In 2007 he became professor of electro-acoustics at the University in Dresden. He is an AES Fellow, holds the AES publication award of 1993, and the ALMA Titanium Driver Award. The title of his lecture is “Small, Loud-Speakers: Taking Physics to the Limit.”

The loudspeaker is the weakest part in the audio chain.

This statement might be true because the electro-acoustical transducer is one of the remains of the analog era, still using a moving coil and a diaphragm much as it did a century ago. A more important argument is the low efficiency of the loudspeaker, generating more heat than sound power output while adding undesired distortion to the output signal. The transducer, enclosure, and other acoustical elements increasingly determine the size, weight, and cost of the audio system because electronic parts become smaller or are replaced by digital signal processing.

At the moment there is no alternative transduction principle that is mature enough to compete with the conventional techniques. Therefore the moving coil as the best practice is preserved and interlaced with new ideas provided by research, design, and manufacturing. This evolutionary process also results in significant progress in the absence of a revolutionary change. The development of smaller loudspeakers for personal audio equipment, automotive, and professional applications is a visible example of this process. The generation of sufficient acoustical output at acceptable quality requires higher amplitudes in the mechanical system and pushes the working range to the physical limits.

The cultivation of large signal transducer performance

is the central topic of this lecture. In the eighties the nonlinear and thermal modeling of the transducer attracted the interest of more and more researchers, eventually resulting in a reliable theory. The theory has been evaluated by fitting the models to real transducers and measuring the large signal parameters dynamically. This identification technique requires only the electrical signal at the loudspeaker terminals and uses the back EMF for sensing the velocity of the voice coil. Adaptive measurement techniques using the loudspeaker itself as a sensor have been developed to monitor the behavior of the loudspeaker in normal applications (e.g., in a car) while reproducing music or other audio signals. Recorded parameter variations and state variables (e.g., voice coil displacement) reveal the aging of the suspension and the impact of the climate.

The nonlinear and thermal parameters open a new way

for loudspeaker diagnostics because these parameters describe the properties of the loudspeaker itself without the interaction with a stimulus (e.g., test signal, music). Numerical simulation tools (e.g., FEM) applied to magnetic and suspension systems show the relationship between material, geometry, and the nonlinear parameters. However, only a measurement can reveal an offset in the rest position of the voice coil caused by suspension parts made out of fabric, rubber, foam, and other visco-elastic materials.

Operating loudspeakers at high amplitudes increases

the risk of damaging the loudspeaker. The large signal

model provides all state information (e.g., displacement and temperature of the voice coil) indicating a mechanical and thermal overload of the transducer. In woofers the motor and suspension nonlinearity is used to limit the maximum peak displacement and to prevent the voice coil from bottoming. New measurement techniques have been developed to detect a rubbing voice coil, loose particles, and other loudspeaker defects that produce impulsive distortion unacceptable to the human ear. The measurement of harmonic and intermodulation distortion cannot provide a comprehensive description of the large signal behavior, as it reveals only symptoms of the nonlinearities depending on the choice of the stimulus. In contrast, the extended large signal model can be used to predict all state variables and the output signal for an arbitrary input signal. The effect of each nonlinearity and cooling mechanism can be analyzed and design choices can be evaluated before the first prototype is finished. Novel auralization techniques make it possible to enhance or attenuate different kinds of signal distortion by a user-defined scaling factor and to assess the audibility and impact on sound quality by listening tests or perceptual models. The closer link between physical and perceptual assessment opens new ways of defining the target performance of an audio product more accurately in marketing, developing products at an optimal performance-cost ratio, and ensuring sufficient quality in manufacturing. This optimization process leads to more or less nonlinear transducers having a clearly defined (regular) nonlinear characteristic. Distortion may become audible for a critical stimulus but is accepted in a trade-off between maximum output, efficiency, size, weight, and cost. The main objective in taming the transducer nonlinearities is motor stability, keeping the voice coil in the gap, and gaining maximum peak-to-peak displacement.
Asymmetrical curve shapes of the nonlinearities are reduced to avoid a DC displacement of the coil generated by a nonlinear rectification process, which reduces efficiency and may cause bottoming of the voice coil.

Portable applications, where power consumption and battery capacity are an issue, will require transducers with the highest efficiency and minimal use of resources (e.g., neodymium magnets). Such a “green” speaker is a highly nonlinear speaker using a motor topology where the voice coil exploits the magnetic field in the gap and gives the highest force factor value Bl(x) at the rest position x=0. Unfortunately, the varying force factor Bl(x) generates significant intermodulation distortion throughout the audio band if the voice coil is displaced and windings leave the gap. This undesired side effect of an efficient motor structure can be compensated by an inverse nonlinear preprocessing of the electrical input signal. The control algorithms use the large signal transducer model and identify the parameters adaptively. By merging electro-acoustics and signal processing the loudspeaker becomes a self-learning system providing optimum performance over the lifetime of the audio product. Certainly, this is not the last step in the evolution of the loudspeaker ...

Student/Career Event
AES STUDENT MEET-UP!
Saturday, May 4, 22:00 – 24:00
Spanish Steps

Audio Students! We’re all going to “meet up” at the Spanish Steps for a late evening get-together at one of Rome’s most iconic locations. We’ll hang out, talk, play some ukulele, and meet new audio student friends from around the globe. Wear your AES 134th Convention badge so you’ll be easily identifiable.


Session P6 Sunday, May 5, 09:00 – 13:00 Sala Carducci

RECORDING AND PRODUCTION

Chair: Alex Case, University of Massachusetts Lowell, Lowell, MA, USA

09:00

P6-1 Automated Tonal Balance Enhancement for Audio Mastering Applications—Stylianos-Ioannis Mimilakis,1 Konstantinos Drossos,2 Andreas Floros,2 Dionysios Katerelos1; 1Technological Educational Institute of Ionian Islands, Lixouri, Greece; 2Ionian University, Corfu, Greece

Modern audio mastering procedures involve the selective enhancement or attenuation of specific frequency bands, mainly for the tonal enhancement of the original, unmastered audio material. The process is mostly based on the musical information and the mode of the audio material, which can be retrieved from a listening procedure of the original stimuli or from the corresponding musical key notes. The current work presents an adaptive and automated equalization system that performs the aforementioned mastering procedure, based on a novel method of fundamental frequency tracking. In addition, the overall system is evaluated with objective PEAQ analysis and subjective listening tests in real mastering audio conditions.
Convention Paper 8836

09:30

P6-2 A Pairwise and Multiple Stimuli Approach to Perceptual Evaluation of Microphone Types—Brecht De Man, Joshua D. Reiss, Queen Mary University of London, London, UK

An essential but complicated task in the audio production process is the selection of microphones that are suitable for a particular source. A microphone is often chosen based on price or common practices, rather than whether the microphone actually sounds best in that particular situation. In this paper we perceptually assess six microphone types for recording a female singer. Listening tests using a pairwise and a multiple stimuli approach are conducted to identify the order of preference of these microphone types. The results of this comparison are discussed, and the performance of each approach is assessed.
Convention Paper 8837

10:00

P6-3 Comparison of Power Supply Pumping of Switch-Mode Audio Power Amplifiers with Resistive Loads and Loudspeakers as Loads—Arnold Knott, Lars Press Petersen, Technical University of Denmark, Kgs. Lyngby, Denmark

Power supply pumping is generated by switch-mode audio power amplifiers in half-bridge configuration, when they are driving energy back into their source. This leads in most designs to a rising rail voltage and can be destructive for either the decoupling capacitors, the rectifier diodes in the power supply, or the power stage of the amplifier. Therefore precautions are taken by the amplifier and power supply designer to avoid those effects. Existing power supply pumping models are based on an ohmic load attached to the amplifier. This paper shows the analytical derivation of the resulting waveforms and extends the model to loudspeaker loads. Measurements verify that the amount of supply pumping is reduced by a factor of four when comparing the nominal resistive load to a loudspeaker. A simplified and more accurate model is proposed and the influence of supply pumping on the audio performance is proven to be marginal.
Convention Paper 8838

10:30

P6-4 The Psychoacoustic Testing of the 3-D Multiformat Microphone Array Design, and the Basic Isosceles Triangle Structure of the Array and the Loudspeaker Reproduction Configuration—Michael Williams, Sounds of Scotland, Le Perreux sur Marne, France

Optimizing the loudspeaker configuration for 3-D microphone array design can only be achieved with a clear knowledge of the psychoacoustic parameters of reproduction of height localization. Unfortunately HRTF characteristics do not seem to give us useful information when applied to loudspeaker reproduction. A set of psychoacoustic parameters has to be measured for different configurations in order to design an efficient microphone array recording system, even more so if a minimalistic approach to array design is going to be a prime objective. In particular, the position of a second layer of loudspeakers with respect to the primary horizontal layer is of fundamental importance and can only be based on the psychoacoustics of height perception. What are the localization characteristics between two loudspeakers situated in each of the two layers? Is time difference, as against level difference, a better approach to interlayer localization? This paper will try to answer these questions and also justify the basic isosceles triangle loudspeaker structure that will help to optimize the reproduction of height information.
Convention Paper 8839

11:00

P6-5 A Perceptual Audio Mixing Device—Michael J. Terrell, Andrew J. R. Simpson, Mark B. Sandler, Queen Mary University of London, London, UK

A method and device is presented that allows novice and expert audio engineers to perform mixing using perceptual controls. In this paper we use Auditory Scene Analysis [Bregman, 1990, MIT Press, Cambridge] to relate the multitrack component signals of a mix to the perception of that mix. We define the multitrack components of a mix as a group of audio streams, which are transformed into sound streams by the act of reproduction, and which are ultimately perceived as auditory streams by the listener. The perceptual controls provide direct manipulation of loudness balance within a mixture of sound streams, as well as the overall mix loudness. The system employs a computational optimization strategy to perform automatic signal gain adjustments to component audio streams, such that the intended loudness balance of the associated sound streams is produced. Perceptual mixing is performed using a complete auditory model, based on a model of loudness for time-varying sound streams [Glasberg and Moore, J. Audio Eng. Soc., vol. 50, pp. 331–342 (2002 May)]. The use of the auditory model enables the loudness balance to be automatically maintained regardless of the listening level. Thus, a perceptual definition of the mix is presented that is listening-level independent, and a method of realizing the mix practically is given.
Convention Paper 8840

11:30

P6-6 On the Use of a Haptic Feedback Device for Sound Source Control in Spatial Audio Systems—Frank Melchior, Chris Pike, Matthew Brooks, Stuart Grace, BBC Research and Development, Salford, UK

Next generation spatial audio systems are likely to be capable of 3-D sound reproduction. Systems currently under discussion require the sound designer to position and manipulate sound sources in three dimensions. New intuitive tools, designed to meet the requirements of audio production environments, are needed to make efficient use of this new technology. This paper investigates a haptic feedback controller as a user interface for spatial audio systems. The paper will give an overview of conventional tools and controllers. A prototype has been developed based on the requirements of different tasks and reproduction methods. The implementation will be described in detail and the results of a user evaluation will be given.
Convention Paper 8841

12:00

P6-7 Audio Level Alignment—Evaluation Method and Performance of EBU R 128 by Analyzing Fader Movements—Jon Allan, Jan Berg, Luleå University of Technology, Piteå, Sweden

A method is proposed for evaluating audio meters in terms of how well the engineer conforms to a level alignment recommendation and succeeds in achieving evenly perceived audio levels. The proposed method is used to evaluate different meter implementations, three conforming to the recommendation EBU R 128 and one conforming to EBU Tech 3205-E. In an experiment, engineers participated in a simulated live broadcast show and the resulting fader movements were recorded. The movements were analyzed in terms of different characteristics: fader mean level, fader variability, and fader movement. Significant effects were found showing that engineers do act differently depending on the meter and recommendation at hand.
Convention Paper 8842

12:30

P6-8 Balance Preference Testing Utilizing a System of Variable Acoustic Condition—Richard King,1,2 Brett Leonard,1,2 Scott Levine,1,2 Grzegorz Sikora3
1McGill University, Montreal, Quebec, Canada
2The Centre for Interdisciplinary Research in Music Media and Technology, Montreal, Quebec, Canada
3Bang & Olufsen Deutschland GmbH, Pullach, Germany

In the modern world of audio production, there exists a significant disconnect between the music mixing control room of the audio professional and the listening space of the consumer or end user. The goal of this research is to evaluate a series of varying acoustic conditions commonly used in such listening environments. Expert listeners are asked to perform basic balancing tasks under varying acoustic conditions. The listener can remain in position while motorized panels rotate behind a screen, exposing a different acoustic condition for each trial. Results show that listener fatigue as a variable is thereby eliminated, and the subject’s aural memory is quickly cleared, so that the subject is unable to adapt to the newly presented condition for each trial.
Convention Paper 8843

Session P7 Sunday, May 5 09:00 – 11:30 Sala Foscolo

TRANSDUCERS—PART 1: LOUDSPEAKERS

Chair: Balazs Bank, Budapest University of Technology and Economics, Budapest, Hungary

09:00

P7-1 Distortion Improvement in the Current Coil of a Loudspeaker—Gaël Pillonnet,1 Eric Sturtzer,1 Timothé Rossignol,2 Pascal Tournier,2 Guy Lemarquand3
1University of Lyon, Lyon, France
2ON Semiconductor, Toulouse, France
3Université du Mans, Le Mans Cedex 9, France

This paper deals with the comparison of voltage and current driving units in an active audio system. The effect of the audio amplifier control on the current coil of an electrodynamic loudspeaker is presented. In a voltage control topology, the electromagnetic force linked to the coil current is controlled through the load impedance. Thus, the electromechanical conversion linearity is degraded by the impedance variation, which implies a reduction of the overall audio quality. A current driving method could reduce the effect of the nonlinear impedance by controlling the coil current, and thereby the acceleration, directly. Large signal impedance modeling is given in this paper to underline the nonlinear effects of electrodynamic loudspeaker parameters on the coupling. As a result, the practical comparison of voltage and current driven methods proves that current control reduces the voice coil current distortions in the three different loudspeakers under test.
Convention Paper 8844



09:30

P7-2 Driving Electrostatic Transducers—Dennis Nielsen, Arnold Knott, Michael A. E. Andersen, Technical University of Denmark, Kgs. Lyngby, Denmark

Electrostatic transducers represent a very interesting alternative to the traditional inefficient electrodynamic transducers. In order to establish the full potential of these transducers, power amplifiers that fulfill the strict requirements imposed by such loads (high impedance, frequency-dependent load, and high bias voltage for linearization) must be developed. This paper analyzes power stages and bias configurations suitable for driving an electrostatic transducer. Measurement results of a ±300 V prototype amplifier are shown. Measuring THD across a high impedance source is discussed and a high voltage attenuation interface for an audio analyzer is presented. THD below 0.1% is reported.
Convention Paper 8845

10:00

P7-3 Boundary Element Simulation of an Arrayable Loudspeaker Horn—Tommaso Nizzoli,1 Stefano Prati2
1Acoustic Vibration Consultant, Reggio Emilia, Italy
2RCF S.p.A., Reggio Emilia, Italy

The Boundary Element Method implemented in a commercial code is used to verify the acoustic directional characteristic in the far field of an arrayable loudspeaker’s horn in comparison with the full space, far field, measured acoustic balloon. A simple model of the full arrayable loudspeaker horn’s splay idealizes each source only by the calculated emission on a flat plane at each horn’s mouth. This approach reduces significantly the BEM calculation time with regard to having to model each source by its full geometry. Comparison with the full space, far field, predicted pressure from test data shows good agreement in all the frequency range of interest.
Convention Paper 8846

10:30

P7-4 Single Permanent Magnet Co-Axial Loudspeakers—Dimitar Dimitrov, BMS Production, Sofia, Bulgaria

Co-axial loudspeakers are designed with a single ring permanent magnetic structure assuring a dual flux path for the two voice coil gaps. An internal magnet is used in two realizations with parallel flux division at both its diameters. These two varieties are convenient for Nd magnets, and one of them has its local internal flux path crossing two gaps in series for a dual membrane compression driver implementation. Another realization uses an external permanent magnetic structure with series flux through the gaps. The proposed co-axial loudspeaker types are very compact, simple, and lightweight. They all can use “Stepped Gap” designs for their low frequency voice coils. Comparative measurements with conventional co-axial loudspeakers reveal competitive performance with much reduced weight and production cost.
Convention Paper 8847
[Paper presented by Plamen Valtchev]

11:00

P7-5 Multiple Low-Frequency Corner Folded Horn Designs—Rumen Artarski,1 Plamen Valtchev,2 Dimitar Dimitrov,3 Yovko Handzhiev3
1AVC, Sofia, Bulgaria
2Univox, Sofia, Bulgaria
3BMS Production, Sofia, Bulgaria

Low-frequency horn loaded systems for pi/2 radiation are designed employing push-push operated dual membrane loudspeakers, closely mounted in parallel against each other. Three-axis symmetrical and efficient horn mouth loading is transformed to a symmetrical and uniform membrane cone loading. By doubling the loudspeaker diaphragm, and respectively the horn throat, a lower horn cut-off frequency was achieved with the same extension rate, besides acoustic power doubling. Two such corner horn systems could be stacked together for quarter space pi-loading, with important usage in front-of-stage subwoofer applications. Four horn systems are grouped together over the floor or on the ceiling for 2 pi-radiation. Finally, eight such systems could be united for full space low-frequency radiation.
Convention Paper 8848

Loudness 1 Sunday, May 5 09:00 – 10:00 Sala Alighieri

LOUDNESS 101—A HITCHHIKER'S GUIDE TO AUDIO NIRVANA

Presenter: Florian Camerer, ORF - Austrian TV; Vienna, Austria; EBU - European Broadcasting Union

This session will bring participants up to speed regarding most aspects of loudness control and metering. It is targeted at sound engineers in general, giving a brief intro to the algorithm and the metering paradigms, and then expanding to common misunderstandings and dangers as well as chances and challenges. Some new concepts like “gating” and “true peak level” will be explained in detail. As chairman of the European loudness group PLOUD and senior post-pro mixing engineer at ORF (Austrian TV), Florian Camerer has enough experience under his belt to provide a thorough workout!
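The “gating” concept mentioned here can be illustrated with a minimal sketch of the two-stage gate defined in ITU-R BS.1770 / EBU R 128. This is a simplification that assumes K-weighted 400 ms block loudness values have already been computed; the `gated_loudness` helper is purely illustrative and omits K-weighting, block overlap, and true-peak measurement.

```python
import numpy as np

def gated_loudness(block_lufs):
    """Integrated loudness from 400 ms block loudness values, using the
    two-stage gate of ITU-R BS.1770 / EBU R 128 (K-weighting omitted)."""
    blocks = np.asarray(block_lufs, dtype=float)
    # Loudness values average in the power domain, not in dB
    power_mean = lambda x: 10 * np.log10(np.mean(10 ** (x / 10)))
    kept = blocks[blocks > -70.0]          # stage 1: absolute gate at -70 LUFS
    threshold = power_mean(kept) - 10.0    # stage 2: 10 LU below gated average
    return power_mean(kept[kept > threshold])

# Silence is gated out, so it no longer drags the integrated value down:
print(gated_loudness([-23.0] * 10 + [-90.0] * 5))   # ≈ -23 LUFS
```

The point of the two stages is that pauses and very quiet passages do not lower the integrated loudness, which is what makes loudness normalization robust for program material with dialog gaps.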

Student/Career Event
STUDENT DESIGN COMPETITION
Sunday, May 5, 09:00 – 10:30
Auditorium Loyola

Moderator: Colin Pfund, Chair, AES SDA, North and Latin American Regions

The three graduate level and three undergraduate level finalists of the AES Student Design Competition will present and defend their designs in front of a panel of expert judges. This is an opportunity for aspiring student hardware and software engineers to have their projects reviewed by the best minds in the business. It’s also an invaluable career-building event and a great place for companies to identify their next employees. Students from both audio and non-audio backgrounds are encouraged to submit entries. Few restrictions are placed on the nature of the projects, but designs must be for use with audio. Examples include loudspeaker design, DSP plug-ins, analog hardware, signal analysis tools, mobile applications, and sound synthesis devices. Products should represent new, original ideas implemented in working-model prototypes.
The Student Design Competition judges are: Aurelio Uncini, Sapienza University of Rome; Stefano Daino, DSP-Quattro; Patrizio Pisani, ZP Engineering; Roberto Magalotti, B&C Speakers.

Sunday, May 5 09:00 Sala Saba

Technical Committee Meeting on Acoustics and Sound Reinforcement

Sunday, May 5 09:00 Sala Montale

Standards Committee Meeting on SC-02-01, Digital Audio Measurement Techniques

Session P8 Sunday, May 5 09:30 – 11:00 Foyer

POSTERS: AUDIO PROCESSING AND SEMANTICS

09:30

P8-1 Combination of Growing and Pruning Algorithms for Multilayer Perceptrons for Speech/Music/Noise Classification in Digital Hearing Aids—Lorena Álvarez, Enrique Alexandre, Cosme Llerena, Roberto Gil-Pita, Manuel Rosa-Zurera, University of Alcalá, Alcalá de Henares, Spain

This paper explores the feasibility of combining growing and pruning algorithms in such a way that the global approach results in a smaller multilayer perceptron (MLP) in terms of network size, which enhances the speech/music/noise classification performance in digital hearing aids, with the added bonus of demanding a lower number of hidden neurons and, consequently, lower computational cost. With this in mind, the paper focuses on the design of an approach that starts by adding neurons to an initial small MLP until the stopping criterion for the growing stage is reached. Then, the MLP size is reduced by successively pruning the least significant hidden neurons while maintaining a continuously decreasing error function. The results obtained with the proposed approach are compared with those obtained when using the growing and pruning algorithms separately.
Convention Paper 8850

09:30

P8-2 Automatic Sample Recognition in Hip-Hop Music Based on Non-Negative Matrix Factorization—Jordan L. Whitney, Colby N. Leider, University of Miami, Coral Gables, FL, USA

We present a method for automatic detection of samples in hip-hop music. A sample is defined as a short extraction from a source audio corpus that may have been embedded into another audio mixture. A series of non-negative matrix factorizations are applied to spectrograms of hip-hop music and the source material from a master corpus. The factorizations result in matrices of base spectra and amplitude envelopes for the original and mixed audio. Each window of the mixed audio is compared to the original audio clip by examining the extracted amplitude envelopes. Several image-similarity metrics are employed to determine how closely the sample and mixed amplitude envelopes match. Preliminary testing indicates that, distinct from existing audio fingerprinting algorithms, the algorithm we describe is able to confirm instances of sampling in a hip-hop music mixture that the untrained listener is frequently unable to detect.
Convention Paper 8851
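The factorization step this abstract describes can be sketched with a minimal multiplicative-update NMF (Euclidean cost) in plain NumPy. This is an illustration of the general technique only: the envelope comparison below uses simple normalized correlation rather than the paper's image-similarity metrics, and the function names are hypothetical.

```python
import numpy as np

def nmf(V, rank, n_iter=300, rng=None):
    """Factor a non-negative spectrogram V (freq x time) as V ≈ W @ H,
    where W holds base spectra and H holds amplitude envelopes."""
    rng = rng or np.random.default_rng(0)
    W = rng.random((V.shape[0], rank)) + 1e-3
    H = rng.random((rank, V.shape[1])) + 1e-3
    for _ in range(n_iter):                       # multiplicative updates
        H *= (W.T @ V) / (W.T @ W @ H + 1e-9)
        W *= (V @ H.T) / (W @ H @ H.T + 1e-9)
    return W, H

def envelope_similarity(h1, h2):
    """Normalized correlation of two amplitude envelopes (1 = identical shape)."""
    a = (h1 - h1.mean()) / (h1.std() + 1e-9)
    b = (h2 - h2.mean()) / (h2.std() + 1e-9)
    return float(a @ b) / len(a)

# Toy check: a spectrogram built from 4 "sources" is well-approximated at rank 4
rng = np.random.default_rng(1)
V = rng.random((16, 4)) @ rng.random((4, 32))
W, H = nmf(V, rank=4)
print(np.linalg.norm(V - W @ H) / np.linalg.norm(V))  # small relative residual
```

Sample detection would then slide a window over the mixture's envelopes and flag windows whose similarity to the source-corpus envelopes exceeds a threshold.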

09:30

P8-3 Performance Optimization of GCC-PHAT for Delay and Polarity Correction under Real-World Conditions—Nicholas Jillings, Alice Clifford, Joshua D. Reiss, Queen Mary University of London, London, UK

When coherent audio streams are summed, delays can cause comb filtering and polarity inversion can result in cancellation. The GCC-PHAT algorithm is a popular method for detecting (and hence correcting) the delay. This paper explores the performance of the Generalized Cross Correlation with Phase Transform (GCC-PHAT) for delay and polarity correction, under a variety of different conditions and parameter settings, and offers various optimizations for those conditions. In particular, we investigated the performance for moving sources, background noise, and reverberation. We consider the effect of varying the size of the Fourier transform when performing GCC-PHAT. In addition to accuracy, computational efficiency and latency were also used as metrics of performance.
Convention Paper 8852
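The core of GCC-PHAT is compact enough to sketch directly: cross-correlate via the FFT, but whiten the cross-spectrum to unit magnitude so only phase (i.e., delay) information remains. This is a generic textbook sketch, not the paper's optimized implementation; the polarity convention is an assumption.

```python
import numpy as np

def gcc_phat(sig, ref, fft_size=4096):
    """Delay and polarity of `sig` relative to `ref` via GCC-PHAT.

    Returns (delay_in_samples, polarity); polarity is -1 when the
    dominant correlation peak is negative (polarity inversion).
    """
    SIG = np.fft.rfft(sig, fft_size)
    REF = np.fft.rfft(ref, fft_size)
    cross = SIG * np.conj(REF)
    cross /= np.abs(cross) + 1e-12        # phase transform: discard magnitude
    cc = np.fft.irfft(cross, fft_size)
    # Shift so that index fft_size//2 corresponds to zero lag
    cc = np.concatenate((cc[-fft_size // 2:], cc[:fft_size // 2]))
    peak = int(np.argmax(np.abs(cc)))
    return peak - fft_size // 2, (1 if cc[peak] >= 0 else -1)

# Example: a noise burst delayed by 40 samples and polarity-inverted
ref = np.random.default_rng(0).standard_normal(1024)
sig = -np.concatenate((np.zeros(40), ref))
print(gcc_phat(sig, ref))                 # → (40, -1)
```

The `fft_size` parameter is exactly the trade-off the abstract mentions: larger transforms improve delay resolution and noise robustness at the cost of computation and latency.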

09:30

P8-4 Reducing Binary Masking Artifacts in Blind Audio Source Separation—Toby Stokes, Christopher Hummersone, Tim Brookes, University of Surrey, Guildford, Surrey, UK

Binary masking is a common technique for separating target audio from an interferer. Its use is often justified by the high signal-to-noise ratio achieved. The mask can introduce musical noise artifacts, limiting its perceptual performance and that of techniques that use it. Three mask-processing techniques, involving adding noise or cepstral smoothing, are tested and the processed masks are compared to the ideal binary mask using the Perceptual Evaluation methods for Audio Source Separation (PEASS) toolkit. Each processing technique’s parameters are optimized before the comparison is made. Each technique is found to improve the overall perceptual score of the separation. Results show a trade-off between interferer suppression and artifact reduction.
Convention Paper 8853
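The ideal binary mask that the processed masks are compared against can be sketched as follows. This is a generic illustration, not the authors' implementation: `soften_mask` shows just one noise-adding idea of the kind the abstract mentions, and both function names are hypothetical.

```python
import numpy as np

def ideal_binary_mask(target_mag, interferer_mag, lc_db=0.0):
    """Time-frequency mask: 1 where target exceeds interferer by lc_db, else 0."""
    ratio_db = 20 * np.log10((target_mag + 1e-12) / (interferer_mag + 1e-12))
    return (ratio_db > lc_db).astype(float)

def soften_mask(mask, noise_level=0.1, rng=None):
    """Replace hard zeros with low-level random gains: one way to trade
    interferer suppression for fewer 'musical noise' artifacts."""
    rng = rng or np.random.default_rng(0)
    return np.where(mask > 0, mask, noise_level * rng.random(mask.shape))

# Applied to STFT magnitudes of target/interferer; the resulting mask is
# multiplied onto the mixture STFT before resynthesis.
rng = np.random.default_rng(2)
t_mag, i_mag = rng.random((8, 8)), rng.random((8, 8))
mask = ideal_binary_mask(t_mag, i_mag)
soft = soften_mask(mask)
```

The hard 0/1 switching between neighboring time-frequency cells is what produces the isolated tonal blips heard as musical noise; any smoothing or noise-filling of the zeros softens those transitions, which is the trade-off the paper quantifies.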



09:30

P8-5 Detection of Sinusoids Using a Statistical Goodness-of-Fit Test—Pushkar P. Patwardhan, Ravi R. Shenoy, Nokia India Pvt. Ltd., Bangalore, India

Detection of tonal components from the magnitude spectrum is an important initial step in several speech and audio processing applications. In this paper we present an approach for detecting sinusoidal components from the magnitude spectrum using a “goodness-of-fit” test. The key idea is to test the null hypothesis that the region of spectrum under observation is drawn from the magnitude spectrum of an ideal windowed sinusoid. This hypothesis is tested with a chi-square “goodness-of-fit” test. The outcome of this hypothesis test is a decision about the presence of a sinusoid in the observed region of the magnitude spectrum. We have evaluated the performance of the proposed approach using synthetically generated samples containing steady and modulated harmonics in clean and noisy conditions.
Convention Paper 8854
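The hypothesis test this abstract describes might look like the following sketch. It is illustrative only: the template construction, region handling, and the plain Pearson chi-square statistic are assumptions, not the authors' exact formulation.

```python
import numpy as np

def template(n_fft, win, k, half_width):
    """Normalized magnitude-spectrum shape of an ideal windowed sinusoid at bin k."""
    t = np.arange(len(win))
    spec = np.abs(np.fft.rfft(win * np.cos(2 * np.pi * k * t / n_fft), n_fft))
    region = spec[k - half_width:k + half_width + 1]
    return region / region.sum()

def chi2_stat(observed_region, expected_shape):
    """Pearson chi-square statistic between the normalized observed spectral
    region and the ideal-sinusoid shape; small values suggest a sinusoid."""
    obs = observed_region / observed_region.sum()
    return float(np.sum((obs - expected_shape) ** 2 / (expected_shape + 1e-12)))

n_fft, hw, k = 1024, 3, 100
win = np.hanning(n_fft)
t = np.arange(n_fft)
tmpl = template(n_fft, win, k, hw)

# Spectral regions around bin k for a true sinusoid vs. white noise
tone = np.abs(np.fft.rfft(win * np.cos(2 * np.pi * k * t / n_fft), n_fft))[k - hw:k + hw + 1]
noise = np.abs(np.fft.rfft(win * np.random.default_rng(1).standard_normal(n_fft), n_fft))[k - hw:k + hw + 1]
print(chi2_stat(tone, tmpl), chi2_stat(noise, tmpl))  # near zero vs. clearly larger
```

In a full detector the statistic would be compared against a chi-square critical value at the chosen significance level to accept or reject the null hypothesis at each candidate peak.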

09:30

P8-6 Novel Designs for the Parametric Peaking EQ User Interface for Single Channel Corrective EQ Tasks—Christopher Dewey, Jonathan Wakefield, University of Huddersfield, Huddersfield, West Yorkshire, UK

This paper evaluates the suitability of existing parametric peaking EQ interfaces of analog and digital mixing desks and audio plugins for single channel corrective EQ tasks. It proposes novel alternatives based upon displaying FFT bin maximums for the full audio duration behind the EQ curve, automatically detecting and displaying the top five FFT bin maximum peaks to assist the engineer, an alternative numerical list display of the top five FFT bin maximum peaks, and an interface that allows direct manipulation of the displayed FFT bin maximums. All interfaces were evaluated based on the time taken to perform a corrective EQ task, preference ranking, and qualitative comments. Results indicate that the novel EQ interfaces presented have potential advantages over existing EQ interfaces.
Convention Paper 8855

09:30

P8-7 Drum Replacement Using Wavelet Filtering—Robert Barański,1 Szymon Piotrowski,1 Magdalena Plewa2
1AGH University of Science and Technology, Kraków, Poland
2Gdansk University of Technology, Gdansk, Poland

The paper presents a solution that can be used to unify the snare drum sound within a chosen fragment. The algorithm is based on the wavelet transformation and allows replacement of sub-bands of particular sounds that are outside a certain range. Five experienced sound engineers put the algorithm to the test using samples of five different snare drums. Wavelet filtering seems to be useful in terms of drum replacement, and the sound engineers’ response was, in most cases, positive.
Convention Paper 8856

09:30

P8-8 Collaborative Annotation Platform for Audio Semantics—Nikolaos Tsipas, Charalampos A. Dimoulas, George M. Kalliris, George Papanikolaou, Aristotle University of Thessaloniki, Thessaloniki, Greece

In the majority of audio classification tasks that involve supervised machine learning, ground truth samples are regularly required as training inputs. Most researchers in this field annotate audio content by hand and for their individual requirements. This practice has resulted in the absence of solid datasets, and consequently research conducted by different researchers on the same topic cannot be effectively pulled together and elaborated on. A collaborative audio annotation platform is proposed for both scientific and application-oriented audio-semantic tasks. Innovation points include easy operation and interoperability, on-the-fly annotation while playing audio content online, efficient collaboration with feature engines and machine learning algorithms, enhanced interaction, and personalization via state-of-the-art Web 2.0/3.0 services.
Convention Paper 8857

09:30

P8-9 Investigation of Wavelet Approaches for Joint Temporal, Spectral, and Cepstral Features in Audio Semantics—Charalampos A. Dimoulas, George M. Kalliris, Aristotle University of Thessaloniki, Thessaloniki, Greece

The current paper focuses on the investigation of wavelet approaches for joint time, frequency, and cepstral audio feature extraction. Wavelets have been thoroughly studied over the last decades as an alternative signal analysis approach. Wavelet features have also been successfully implemented in a variety of pattern recognition applications, including audio semantics. Recently, wavelet-adapted mel-frequency cepstral coefficients have been proposed as applicable features in speech recognition and general audio classification, incorporating perceptual attributes. In this context, various wavelet configuration schemes are examined for wavelet-cepstral audio feature extraction. Additional wavelet parameters are utilized in the formation of wavelet feature vectors and evaluated in terms of salient feature ranking. Comparisons with classical time-frequency and cepstral audio features are conducted in typical audio-semantics scenarios.
Convention Paper 8858

Tutorial 5 Sunday, May 5 10:00 – 11:00 Sala Manzoni

3-D AUDIO—PRODUCE THE NEW DIMENSION

Presenter: Tom Ammermann, New Audio Technology GmbH, Hamburg, Germany

This workshop will have two parts. The first will be a presentation where new experiences and different ways to handle the different approaches like film, music, and radio plays will be shown. Complete sessions will be opened, and production experiences, editing, workflows, and possibilities to deliver 3-D/spatial content into the common market will be shown and discussed. The second part of this workshop will be a Q&A session in which a group of max. 10 participants can listen to and work with 3-D content via a sophisticated 3-D speaker virtualization on headphones.

Loudness 2 Sunday, May 5 10:00 – 10:30 Sala Alighieri

ALL LOUDNESS RECOMMENDATIONS ARE EQUAL—BUT SOME ARE MORE EQUAL THAN OTHERS

Presenter: Andrew Mason, BBC Research and Development, London, UK

This tutorial will give an explanation of the different loudness standards used in Europe (EBU R 128), the US (ATSC A/85), and in other countries such as Australia and Japan.

Sunday, May 5 10:00 Sala Saba

Technical Committee Meeting on Microphones and Applications

Loudness 3 Sunday, May 5 10:45 – 11:30 Sala Alighieri

LOUDNESS FOR COMMERCIALS—HOW ESTHETICS CHANGE(D)

Presenters: Matteo Milani, Freelance Sound Designer, Producer, Music Composer
Alessandro Travaglini, Sound Designer/Producer, Fox Italy
Rubens Zambelli, Technical Service Manager, ADSTREAM Italia
Carlos Zarattini, Sound Designer/Music Composer, Discovery Channels Italy

Commercials have often been the number one complaint when it came to loudness problems and level jumps. The fear of being softer than the competition has led to over-compression of the spots and to an extremely narrow loudness range, as well as the perception of the audio signal being constantly “smashed.” Free transients and the liberation from this loudness competition have now finally come with the transition to loudness normalization, and especially for commercials this is more than welcome. Four sound designers who have a lot of experience in producing detailed sound tracks for commercials will demonstrate how this new paradigm has changed and is still changing their approach, and show how the new dynamic possibilities can be used to great effect. Among the points discussed will be:

• Use of headroom and dynamic processors
• Use of sound effects (in particular low-frequency sounds)
• Speech definition and sound artifacts, and
• Short-term loudness limitations

Examples of their work will be played and explained.

Tutorial 6 Sunday, May 5 11:00 – 13:00 Auditorium Loyola

DRUM PROGRAMMING

Presenter: Justin Paterson, London College of Music, University of West London, London, UK

Drum programming has been evolving at the heart of many studio productions for some 30 years. Over this period, technological opportunities for enhanced creativity have multiplied in numerous directions. This tutorial will demonstrate a number of these directions as they are often implemented in contemporary professional practice, showing contrasting techniques used in the creation of both human emulation and the unashamedly synthetic. Alongside this, many of the studio production techniques often used to enhance such work will be discussed, ranging from dynamic processing to intricate automation. The session will include numerous live demonstrations covering a range of approaches. Although introducing all key concepts from scratch, its range and hybridization should provide inspiration even for experienced practitioners.

Sunday, May 5 11:00 Sala Saba

Technical Committee Meeting on Audio Forensics

Sunday, May 5 11:00 Sala Montale

Standards Committee Meeting on SC-07-01, Metadata for Audio

Student/Career Event
STUDENT DESIGN EXHIBITION
Sunday, May 5, 11:15 – 12:45
Foyer

All accepted student entries to the AES Student Design Competition will have the opportunity to show off their designs at this poster/tabletop exhibition. This audio science fair is free and open to all convention attendees and is an opportunity for aspiring student hardware and software engineers to have their projects seen by the AES design community. It’s also an invaluable career-building event and a great place for companies to identify their next employees. Students from both audio and non-audio backgrounds are encouraged to submit entries. Few restrictions are placed on the nature of the projects, but designs must be for use with audio. Examples include loudspeaker design, DSP plug-ins, analog hardware, signal analysis tools, mobile applications, and sound synthesis devices. Products should represent new, original ideas implemented in working-model prototypes.

Tutorial 7 Sunday, May 5 11:30 – 13:00 Sala Manzoni

ACOUSTIC ENHANCEMENT SYSTEMS—THE BASICS

Presenter: Ben Kok, SCENA acoustic consultants, Uden, The Netherlands

In the last decades multiple systems for electronic acoustic enhancement have been introduced. Some have disappeared over time, but others appear to be settled firmly. The claim for these systems is that the acoustics of a venue can be changed at the press of a button, at a cost significantly lower than variable acoustics by structural means. Also, in situations where the natural acoustics did not work out properly, these systems are used to correct the flaws, again at lower cost than the structural alternatives. The question, of course, is: do these systems really work, and if so, what are the differences between these systems? And what would be the structural alternatives?
This tutorial will identify what acoustic properties can or should be influenced by an acoustic enhancement system. In relation to this, the working principles and philosophies of some of the most popular systems are analyzed, and similarities and differences are identified and related to specific acoustic situations.

Loudness 4 Sunday, May 5, 11:45 – 13:15 Sala Alighieri

ARE MOVIES TOO LOUD? THE LOUDNESS RACE REACHES THE CINEMA

Presenters: Florian Camerer; Eelco Grimm

Cinema operators are on the receiving end of growing numbers of complaints from audiences about soundtracks that are “too loud.” Just turning them down results in lowered dialog levels, which leaves the movie quieter but unintelligible. The EU has loudness standards that are now starting to be applied to cinemas. There is a possible need for standards to ensure that theaters are given soundtracks that meet EU laws. This workshop will investigate the issues involved and the work necessary to resolve them.

Tutorial 8 Sunday, May 5, 12:00 – 13:00 Sala Foscolo

PERCEPTUAL TESTING OF SPEECH IN MODERN AUDIO PRODUCTS

Presenter: Dan Foley, Audio Precision

We are witnessing a convergence of audio technologies in modern compact consumer products, with small computing devices processing voice and general audio through a range of options that include analog, electroacoustic, Bluetooth, HDMI, and more. Whereas in the past sine- and noise-based test signals have been used to verify performance, these new devices require both a mixture of connectivity options and the ability to assess the perceptual needs of voice communication.

Tests such as Perceptual Evaluation of Speech Quality (PESQ) and Perceptual Objective Listening Quality Analysis (POLQA) are now being used to fully assess device performance where codecs and other nonlinear processing are used. This presentation will focus on how these perceptual methods work and when to use them in conjunction with, or in lieu of, classic audio test methods and techniques.

Sunday, May 5 12:00 Sala Saba

Technical Committee Meeting on Studio Practices and Production

Student/Career Event
EDUCATION FAIR
Sunday, May 5, 13:00 – 14:30
Foyer

Institutions offering studies in audio (from short courses to graduate degrees) will be represented in a “table top” session. Information on each school’s respective programs will be made available through displays and academic guidance. There is no charge for schools to participate.

Sunday, May 5 13:00 Sala Saba

Technical Committee Meeting on Audio for Telecommunications

Sunday, May 5 14:00 Sala Saba

Technical Committee Meeting on Spatial Audio

Sunday, May 5 14:00 Sala Montale

Standards Committee Meeting on SC-05-02, Audio Connectors

Loudness 5 Sunday, May 5, 14:15 – 16:15 Sala Alighieri

MAKE LUFS NOT WAR

Presenter: Thomas Lund, TC Electronic A/S, Risskov, Denmark

Panelists: Florian Camerer, ORF - Austrian TV, Vienna, Austria; George Massenburg, McGill University, Montreal, Quebec, Canada

Newly produced pop/rock music rarely sounds good on fine loudspeakers, commercials on TV are annoyingly loud, and a visit to the cinema may be a deafening experience. This is audio’s dark middle ages, from which there will be little content for future generations to enjoy. However, 2013 could be the year when a renaissance again spreads from Italy. Transparent loudness normalization has arrived to radio, TV, and iPod, and the panel sets out to describe the far-reaching implications this will have on audio production at large. Hear about new quality-defining criteria, and save your next album for generations to come.
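The loudness normalization discussed here is built on the gated measurement of ITU-R BS.1770. As a rough pure-Python sketch, not a compliant meter: the two-stage gating applied to per-block mean-square powers, assuming K-weighting, channel summation, and the 400 ms / 75% overlap blocking have already been applied upstream:

```python
import math

def gated_loudness(block_powers):
    """Two-stage gated loudness per the BS.1770-2 scheme, from per-block
    mean-square powers (K-weighted and channel-summed upstream).
    Returns loudness in LUFS, or -inf if every block is gated out."""
    to_lufs = lambda p: -0.691 + 10.0 * math.log10(p)
    # Stage 1: absolute gate at -70 LUFS.
    absolute = [p for p in block_powers if to_lufs(p) > -70.0]
    if not absolute:
        return float("-inf")
    # Stage 2: relative gate 10 LU below the absolute-gated mean power.
    threshold = to_lufs(sum(absolute) / len(absolute)) - 10.0
    relative = [p for p in absolute if to_lufs(p) > threshold]
    if not relative:
        return float("-inf")
    return to_lufs(sum(relative) / len(relative))
```

A steady signal whose K-weighted block power is 0.01 measures about −20.7 LUFS; silent blocks below the absolute gate do not drag the figure down, which is the point of gating.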

Session P9 Sunday, May 5, 14:30 – 18:00 Sala Carducci

ROOM ACOUSTICS

Chair: Chris Baume, BBC Research & Development, UK

14:30

P9-1 Various Applications of Active Field Control—Takayuki Watanabe, Masahiro Ikeda, Yamaha Corporation, Hamamatsu, Shizuoka, Japan

The Active Field Control system is an acoustic enhancement system that was developed to improve the acoustic conditions of a space so as to match the acoustic conditions required for a variety of different types of performance programs. This system is unique in that it uses FIR filtering to ensure freedom of control and the concept of spatial averaging to achieve stability with a lower number of channels than comparative systems. This system has been used in over 70 projects in both the U.S. and Japan. This paper will provide an overview of the characteristics of the system and examples of how the system has been applied.
Convention Paper 8859

15:00

P9-2 Comparative Acoustic Measurements: Spherical Sound Source vs. Dodecahedron—Plamen Valtchev,1 Denise Gerganova2; 1Univox, Sofia, Bulgaria; 2Spherovox, Sofia, Bulgaria

A spherical sound source, consisting of a pair of coaxial loudspeakers and a pair of compression drivers radiating into a common radially expanding horn, is used for acoustic measurements of rooms for speech and music. For exactly the same source–microphone pair positions, comparative measurements are made with a typical dodecahedron, keeping the same microphone technique, identical signals, and recording hardware under the same measuring conditions. Several software programs were used for evaluation of the acoustical parameters extracted from impulse responses. Parameters are presented in tables and graphics for better sound source comparison. The spherical sound source reveals higher dynamic range and perfectly repeatable parameters with source rotation, in contrast to the dodecahedron, where rotation steps resulted in some parameters’ deviation.
Convention Paper 8860

15:30

P9-3 Archaeoacoustics: An Introduction—A New Take on an Old Science—Lise-Lotte Tjellesen, Karen Colligan, CLARP, London, UK

What is archaeoacoustics and how is it defined? This paper will discuss the history and varying aspects of the discipline of archaeoacoustics, i.e., sound that has been measured, modeled, and analyzed with modern techniques in and around ancient sites, temple complexes, and standing stones. By piecing together sound environments from a long-lost past, it is brought to life as a tool for archaeologists and historians. This paper will provide a general overview of some of the most prolific studies to date, discuss some measurement and modeling methods, and discuss where archaeoacoustics may be headed in the future and what purpose it serves in academia.
Convention Paper 8861

16:00

P9-4 Scattering Effects in Small Rooms: From Time and Frequency Analysis to Psychoacoustic Investigation—Lorenzo Rizzi, Gabriele Ghelfi, Suono e Vita - Acoustic Engineering, Lecco, Italy

This work continues the authors’ effort to optimize a DSP tool for extrapolating, from room impulse response (RIR) information, mixing time and sound scattering effects with in-situ measurements. Confirming past theses, a new specific experiment allowed the authors to scrutinize the effects of QRD scattering panels in non-Sabinian environments, both in the frequency and time domains. Listening tests have been performed to investigate the perception of scattering panels affecting small-room acoustic quality. The sound diffusion properties were examined through dedicated headphone auralization interviews, convolving known RIRs with anechoic musical samples and correlating calculated data with psychoacoustic responses. The results validate the known effect on close-field recording in small rooms for music and recording, giving new insights.
Convention Paper 8862

16:30

P9-5 The Effects of Temporal Alignment of Loudspeaker Array Elements on Speech Intelligibility—Timothy J. Ryan,1 Richard King,2 Jonas Braasch,3 William L. Martens4; 1Webster University, St. Louis, MO, USA; 2McGill University, Montreal, Quebec, Canada; 3Rensselaer Polytechnic Institute, Troy, NY, USA; 4University of Sydney, Sydney, NSW, Australia

The effects of multiple arrivals on the intelligibility of speech produced by live-sound reinforcement systems are examined. Investigated variables include the delay time between arrivals from multiple loudspeakers within an array and the geometry and type of array. Subjective testing, using captured binaural recordings of the Modified Rhyme Test under various treatment conditions, was carried out to determine the first- and second-order effects of the two experimental variables. Results indicate that different interaction effects exist for different amounts of delay offset.
Convention Paper 8863

17:00

P9-6 Some Practical Aspects of STI Measurement and Prediction—Peter Mapp, Peter Mapp Associates, Colchester, Essex, UK

The Speech Transmission Index (STI) has become the internationally accepted method of testing and assessing the potential intelligibility of sound systems. The technique is standardized in IEC 60268-16; however, it is not a flawless technique. The paper discusses a number of common mechanisms that can affect the accuracy of STI measurements and predictions. In particular, it is shown that RaSTI is a poor predictor of STI in many sound system applications, and that the standard speech spectrum assumed by STI often does not replicate the speech spectrum of real announcements and is not in good agreement with other speech spectrum studies. The effects on STI measurements of common signal processing techniques such as equalization, compression, and AGC are also demonstrated and the implications discussed. The simplified STI derivative STIPA is shown to be a more robust method of assessing sound systems than RaSTI, and when applied as a direct measurement method it can have significant advantages over impulse-response-based STI measurement techniques.
Convention Paper 8864
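For context on the measurement issues discussed: at its core, the IEC 60268-16 scheme maps each measured modulation-transfer value m to a transmission index, and the full STI then weights such indices across seven octave bands and 14 modulation frequencies. A minimal sketch of the per-value mapping only:

```python
import math

def transmission_index(m):
    """Map one modulation transfer value m (0 < m < 1) to a transmission
    index: apparent SNR in dB, clipped to +/-15 dB, scaled to 0..1,
    following the IEC 60268-16 scheme."""
    snr = 10.0 * math.log10(m / (1.0 - m))  # apparent signal-to-noise ratio
    snr = max(-15.0, min(15.0, snr))        # clip to the +/-15 dB range
    return (snr + 15.0) / 30.0
```

A modulation transfer of 0.5 (apparent SNR of 0 dB) maps to the midpoint index 0.5; the clipping pins very clean or very degraded channels to indices 1 and 0.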


17:30

P9-7 Combined Quasi-Anechoic and In-Room Equalization of Loudspeaker Responses—Balazs Bank, Budapest University of Technology and Economics, Budapest, Hungary

This paper presents a combined approach to loudspeaker/room response equalization based on simple in-room measurements. In the first step, the anechoic response of the loudspeaker, which mostly determines localization and timbre perception, is equalized with a low-order non-minimum-phase equalizer. This is actually done using the gated in-room response, which of course means that the equalization is incorrect at low frequencies, where the gate time is shorter than the anechoic impulse response. In the second step, a standard fractional-octave-resolution minimum-phase equalizer is designed based on the in-room response pre-equalized with the quasi-anechoic equalizer. This second step, in addition to correcting the room response, automatically compensates for the low-frequency errors made in the quasi-anechoic equalizer design due to the use of gated responses.
Convention Paper 8826
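The gating in the first step amounts to windowing the in-room impulse response before the first strong reflection arrives; a minimal illustration, not the paper's implementation (the half-Hann taper length is an assumption):

```python
import math

def gate_impulse_response(ir, fs, gate_ms, taper_ms=1.0):
    """Quasi-anechoic gating: keep only the part of an in-room impulse
    response arriving before the first strong reflection, with a short
    half-Hann fade-out to reduce truncation ripple. Below roughly
    1000/gate_ms Hz the gated response is unreliable, which the second,
    in-room equalization step then compensates."""
    n = int(fs * gate_ms / 1000.0)
    t = int(fs * taper_ms / 1000.0)
    gated = list(ir[:n])
    for k in range(t):  # fade the last taper_ms to zero
        w = 0.5 * (1.0 + math.cos(math.pi * (k + 1) / t))
        gated[n - t + k] *= w
    return gated
```

With a 5 ms gate, the quasi-anechoic response is only trustworthy above roughly 200 Hz, which is exactly the low-frequency error the second step corrects.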

Session P10 Sunday, May 5, 14:15 – 17:15 Sala Foscolo

TRANSDUCERS—PART 2: ARRAYS, MICROPHONES

Chair: TBD

14:30

P10-1 The Radiation Characteristics of an Array of Horizontally Asymmetrical Waveguides that Utilize Continuous Arc Diffraction Slots—Soichiro Hayashi, Akira Mochimaru, Paul F. Fidlin, Bose Corporation, Framingham, MA, USA

Previous work presented the radiation characteristics of a horizontally asymmetrical waveguide that utilizes a continuous arc diffraction slot. It showed good coverage control above 1 kHz, as long as the waveguide center-line axis angle stays below a certain angle limit. This paper examines the radiation characteristics of an array of horizontally asymmetrical waveguides. Waveguides with different angular variations are developed, and several vertical arrays are constructed consisting of those waveguides. The radiation characteristics of the arrays are measured. Horizontally symmetrical and asymmetrical vertical arrays are compared, and the consistency inside the coverage and limitations are discussed.
Convention Paper 8865

15:00

P10-2 Numerical Simulation of Microphone Wind Noise, Part 1: External Flow—Juha Backman, Nokia Corporation, Espoo, Finland

This paper discusses the use of computational fluid dynamics (CFD) for computational analysis of microphone wind noise. The first part of the work, presented in this paper, discusses the behavior of the flow around the microphone. One of the practical questions answered in this work is the well-known difference between “pop noise,” i.e., noise caused by transient flows, and wind noise generated by more stationary flows. It appears that boundary layer separation and related modification of the flow field near the boundary is a significant factor in transient flow noise, while vortex shedding, emerging at higher flow velocities, is significant for steady-state flow. The paper also discusses the effects of geometrical shape and surface details on wind noise.
Convention Paper 8866

15:30

P10-3 Listener Preferences for Different Headphone Target Response Curves—Sean Olive, Todd Welti, Elisabeth McMullin, Harman International, Northridge, CA, USA

There is little consensus among headphone manufacturers on the preferred headphone target frequency response required to produce optimal sound quality for reproduction of stereo recordings. To explore this topic further we conducted two double-blind listening tests in which trained listeners rated their preferences for eight different headphone target frequency responses reproduced using two different models of headphones. The target curves included the diffuse-field and free-field curves in ISO 11904-2, a modified diffuse-field target recommended by Lorho, the unequalized headphone, and a new target response based on acoustical measurements of a calibrated loudspeaker system in a listening room. For both headphones the new target based on an in-room loudspeaker response was the most preferred target response curve.
Convention Paper 8867

16:00

P10-4 Optimal Condition of Receiving Transducer in Wireless Power Transfer Based on Ultrasonic Resonance Technology—Woo Sub Youm, Gunn Hwang, Sung Q Lee, Electronics and Telecommunications Research Institute (ETRI), Daejeon, Korea

Recently, wireless power transfer technology has drawn much attention because of wire reduction and charging convenience. Previous technologies such as magnetic resonance and induction coupling have the drawbacks of short transfer distance and potential harm to human health. As an alternative, ultrasonic resonance wireless power transfer technology is proposed. A pair of commercial ultrasonic transducer arrays for sensor applications is used for wireless power transfer. There are two parameters that have to be decided to maximize the efficiency of power transfer. The first is the load resistance connected to the receiving transducer; it should be matched to the impedance of the receiving transducer at the resonance frequency. The second is the operating frequency of the transmitting transducer; it should be matched to the optimal frequency of the receiving transducer. In this paper the optimal load resistance and frequency of the receiving transducer are analyzed based on circuit theory and verified through experiment.
Convention Paper 8868
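The load-matching condition in this abstract is the classical maximum-power-transfer result; a minimal sketch for the purely resistive case (values hypothetical — the paper works with the full transducer impedance at resonance):

```python
def delivered_power(v_source, r_source, r_load):
    """Power delivered to a resistive load from a source with internal
    resistance r_source; maximized when r_load == r_source."""
    i = v_source / (r_source + r_load)   # series-loop current
    return i * i * r_load                # power dissipated in the load

# Sweeping the load shows the peak at the matched condition.
loads = [10.0, 25.0, 50.0, 100.0, 200.0]
powers = [delivered_power(10.0, 50.0, r) for r in loads]
```

For a 10 V source with 50 Ω internal resistance, the sweep peaks at the 50 Ω load, where half the total power reaches the load.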



16:30

P10-5 Design of a Headphone Equalizer Control Based on Principal Component Analysis—Felix Fleischmann, Jan Plogsties, Bernhard Neugebauer, Fraunhofer Institute for Integrated Circuits IIS, Erlangen, Germany

Unlike for loudspeakers, the optimal frequency response for headphones is not flat. A perceptually optimal target equalization curve for headphones was identified in a previous study. Moreover, strong variability in the frequency characteristics of 13 popular headphone models was observed. Model-specific equalization filters can be implemented but would require the headphone to be known. For most consumer applications this seems impractical. Principal component analysis was applied to the measured headphone equalization data, followed by a reduction of the degrees of freedom. The remaining error is identified objectively. It can be shown that, using only one principal component, the sum of a fixed and a weighted filter curve can replace model-specific equalization.
Convention Paper 8869
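The idea of replacing per-model filters with one fixed curve plus one weighted basis curve can be sketched with a toy power-iteration PCA (illustrative pure Python, not the authors' implementation; real inputs would be sampled magnitude responses in dB):

```python
import math

def first_principal_component(curves, iters=200):
    """Mean curve and unit-norm first principal component of a set of
    equal-length response curves, via naive power iteration."""
    n = len(curves[0])
    mean = [sum(c[i] for c in curves) / len(curves) for i in range(n)]
    centered = [[c[i] - mean[i] for i in range(n)] for c in curves]
    v = [1.0] * n
    for _ in range(iters):
        # Apply the covariance implicitly: C v = X^T (X v).
        proj = [sum(row[i] * v[i] for i in range(n)) for row in centered]
        v = [sum(proj[k] * centered[k][i] for k in range(len(centered)))
             for i in range(n)]
        norm = math.sqrt(sum(x * x for x in v))
        v = [x / norm for x in v]
    return mean, v

def approximate(curve, mean, pc):
    """One-component reconstruction: fixed mean curve plus a single
    weighted copy of the principal component."""
    w = sum((curve[i] - mean[i]) * pc[i] for i in range(len(pc)))
    return [mean[i] + w * pc[i] for i in range(len(pc))]
```

If the measured curves really do vary mostly along one direction, the one-component reconstruction is close to exact, which is the paper's observation for the headphone data.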

17:00

P10-6 Effect of Elastic Gel Layer on Energy Transfer from the Source to Moving Part of Sound Transducer—Minsung Cho, Elena Prokofieva, Mike Barker, Edinburgh Napier University, Edinburgh, UK

The use of porous materials in sound is quite well known. The materials can be used for absorption of unwanted radiation, for dissipation and redistribution of waves traveling through the thickness of the material, and for enhancement of the overall sound and therefore energy transfer within the material. A three-layer system, comprising two rigid layers of material with a soft gel middle layer between them, was investigated in this research to establish the effect of the gel material on the system’s energy performance. The role of the gel layer in transferring energy to the panel was investigated. It was demonstrated by experiment that the gel layer between the top layer and the last layer enhances the performance of the overall construction and also minimizes mechanical distortion by absorbing its bending waves. This effect enables the radiating frequency range of the construction to be extended both upward and downward.
Convention Paper 8849
[Paper presented by Elena Prokofieva]

Workshop 5 Sunday, May 5, 14:30 – 16:30 Sala Manzoni

CINEMA SOUND—ACOUSTICAL AND CALIBRATION ISSUES

Chair: Brian McCarty, Coral Sea Studios Pty. Ltd., Clifton Beach, QLD, Australia

Panelists: David Murphy; Philip Newell

First, we will take a theoretical look at the problems of cinema sound and shortcomings in the current SMPTE standards and testing methodology. Then we will take a practical look at the perils of on-site measurements and the application of new standards to the calibration of sound systems in cinemas.

Tutorial 9 Sunday, May 5, 14:45 – 16:15 Auditorium Loyola

ELECTRIC GUITAR—WHAT A RECORDIST OUGHT TO KNOW

Presenter: Alex Case, University of Massachusetts—Lowell, Lowell, MA, USA

Musicians obsess about every detail of their instrument. Engineers do the same with every detail of their studio. This tutorial merges those obsessions, so that a recording engineer can be more fully informed about the key drivers of tone for the electric guitar. Know the instrument first and let that drive your decisions for recording and mixing the instrument.

Sunday, May 5 15:00 Sala Saba

Technical Committee Meeting on Signal Processing

Session P11 Sunday, May 5, 15:30 – 17:00 Foyer

POSTERS: PERCEPTION AND EDUCATION

15:30

P11-1 Improve the Listening Ability Using E-Learning Methods—Bartlomiej Kruk, Bartosz Zawieja, Wroclaw University of Technology, Wroclaw, Poland

The main aim of this paper is to show the possibility of listening training using e-learning methods combined with classical teaching. The described technique uses all available electronic media as well as traditional teaching methods. Theory and practice examples are discussed. E-learning allows students to work independently and take tests at a convenient place and time. The e-learning exercises have been designed to develop skills including memorization, timbre description, and improving hearing sensitivity to changes in sound.
Convention Paper 8870

15:30

P11-2 Optimizing Teaching Room Acoustics: Investigating the Exclusive Use of a Distributed Electroacoustic Installation to Improve the Speech Intelligibility—Panagiotis Hatziantoniou,1 Nicolas Tatlas,2 Stelios M. Potirakis2; 1University of Patras, Patras, Greece; 2Technological Educational Institute of Piraeus, Athens, Greece

The possibility of improving speech intelligibility in classrooms of inadequate acoustic design, exclusively by using an electroacoustic installation, is investigated in this paper. Measurement results derived from an overall six-loudspeaker arrangement response at different locations inside a test classroom are compared to those derived from a one-loudspeaker electroacoustic response at the speaker’s location. Preliminary results indicate that well-established parameters such as C50 and D50, automatically calculated, show no significant improvement. Extended investigation of the responses, as well as other criteria such as the direct-to-reverberant ratio (DRR), showed that there is noteworthy enhancement, in addition to the subjective acoustic perception. Moreover, the DRR is shown to improve further with the employment of a room correction technique based on smoothed responses.
Convention Paper 8871

15:30

P11-3 Software Techniques for Good Practice in Audio and Music Research—Luis Figueira, Chris Cannam, Mark Plumbley, Queen Mary University of London, London, UK

In this paper we discuss how software development can be improved in the audio and music research community by implementing tighter and more effective development feedback loops. We suggest first that researchers in an academic environment can benefit from the straightforward application of peer code review, even for ad-hoc research software; and second, that researchers should adopt automated software unit testing from the start of research projects. We discuss and illustrate how to adopt both code reviews and unit testing in a research environment. Finally, we observe that the use of a software version control system provides support for the foundations of both code reviews and automated unit tests. We therefore also propose that researchers should use version control with all their projects from the earliest stage.
Convention Paper 8872
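As a concrete illustration of the second suggestion, a self-contained test in Python's built-in unittest framework; the rms helper is hypothetical, standing in for any small piece of research code:

```python
import math
import unittest

def rms(samples):
    """Root-mean-square level of a block of samples (hypothetical helper)."""
    return math.sqrt(sum(s * s for s in samples) / len(samples))

class TestRms(unittest.TestCase):
    def test_full_scale_sine(self):
        # A full-scale sine over a whole number of periods has RMS 1/sqrt(2).
        sine = [math.sin(2 * math.pi * k / 64) for k in range(64)]
        self.assertAlmostEqual(rms(sine), 1 / math.sqrt(2), places=9)

    def test_dc(self):
        self.assertEqual(rms([0.5] * 10), 0.5)
```

Run with `python -m unittest` from the project directory; kept under version control, such tests document expected behavior from the first commit onward.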

15:30

P11-4 A Practical Step-by-Step Guide to the Time-Varying Loudness Model of Moore, Glasberg, and Baer (1997; 2002)—Andrew J. R. Simpson, Michael J. Terrell, Joshua D. Reiss, Queen Mary University of London, London, UK

In this tutorial article we provide a condensed, practical step-by-step guide to the excitation-pattern loudness model of Moore, Glasberg, and Baer [J. Audio Eng. Soc., vol. 45, pp. 224–240 (1997 Apr.); J. Audio Eng. Soc., vol. 50, pp. 331–342 (2002 May)]. The various components of this model have been separately described in the well-known publications of Patterson et al. [J. Acoust. Soc. Am., vol. 72, pp. 1788–1803 (1982)], Moore [Hearing, pp. 161–205 (Academic Press, 1995)], Moore et al. (1997), and Glasberg and Moore (2002). This paper provides a consolidated and concise introduction to the complete model for those who find the disparate and complex references intimidating and who wish to understand the function of each of the component parts. Furthermore, we provide consolidated notation and integral forms. This introduction may be useful to the loudness-theory beginner and to those who wish to adapt and apply the model for novel, practical purposes.
Convention Paper 8873
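For orientation, the auditory-filter scale underlying the excitation-pattern model is the ERB scale of Glasberg and Moore (1990); its two standard formulas are compact enough to state directly:

```python
import math

def erb_bandwidth(f_hz):
    """Equivalent rectangular bandwidth (Hz) of the auditory filter
    centered at f_hz, per Glasberg and Moore (1990):
    ERB = 24.7 (4.37 f/1000 + 1)."""
    return 24.7 * (4.37 * f_hz / 1000.0 + 1.0)

def erb_number(f_hz):
    """ERB-rate scale (in Cams): the number of ERBs below f_hz,
    obtained by integrating 1/ERB(f):
    E = 21.4 log10(4.37 f/1000 + 1)."""
    return 21.4 * math.log10(4.37 * f_hz / 1000.0 + 1.0)
```

At 1 kHz the auditory filter is about 133 Hz wide and sits about 15.6 Cams up the ERB-rate scale.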

15:30

P11-5 Subjective Evaluation of Sound Quality of Musical Recordings Transmitted via the DAB+ System—Maurycy Kin, Wroclaw University of Technology, Wroclaw, Poland

The results of research on the sound quality of various kinds of music transmitted via Digital Audio Broadcasting, using the Absolute Category Rating and Comparison Category Rating methods of scaling, are presented. The results showed that bit-rate values significantly influence the outcomes. Spectral Band Replication processing increases the sound quality more at low bit-rates than at higher ones, depending on the kind of music. The spatial attributes of sound, such as perspective, spaciousness, localization stability, and the accuracy of phantom sources, also depend on the bit-rate, but these relations are different. It was also found that the method of evaluation gives different results, with the CCR method being more accurate for sound assessment at higher bit-rates.
Convention Paper 8874

15:30

P11-6 Music and Emotions: A Comparison of Measurement Methods—Judith Liebetrau,1,2 Sebastian Schneider2; 1Fraunhofer IDMT, Ilmenau, Germany; 2Ilmenau University of Technology, Ilmenau, Germany

Music emotion recognition (MER), as a part of music information retrieval (MIR), examines the questions of which parts of music evoke what emotions and how they can be automatically classified. Classification systems need to be trained in terms of feature selection and prediction. Due to the subjectivity of emotions, the generation of appropriate ground-truth data poses challenges for MER. This paper describes the obstacles of defining and measuring emotions evoked by music. Two methods, in principle able to overcome problems in measuring affective states induced by music, are outlined and their results compared. Although the results of both methods are in line with psychological theories of emotion, the question remains how well the perceived emotions are captured by either method and whether these methods are sufficient for ground-truth generation.
Convention Paper 8875

15:30

P11-7 Multidimensional Scaling Analysis Applied to Music Mood Recognition—Magdalena Plewa, Bozena Kostek, Gdansk University of Technology, Gdansk, Poland

The paper presents two experiments aimed at categorizing mood associated with music. Two parts of a listening test were designed and carried out with a group of students, most of whom were users of online social music services. The initial experiment was designed to evaluate the extent to which a given label describes the mood of a particular music excerpt. The second subjective test was conducted to collect similarity data for Multidimensional Scaling (MDS) analysis. Results were subjected to various MDS and correlation analyses. The obtained MDS representation is relevant and remains coherent with the acclaimed two-dimensional Thayer model as well as with the evaluation using six mood labels.
Convention Paper 8876

15:30

P11-8 Artificial Stereo Extension Based on Gaussian Mixture Model—Nam In Park, Kwang Myung Jeon, Chan Jun Chun, Hong Kook Kim, Gwangju Institute of Science and Technology (GIST), Gwangju, Korea

In this paper an artificial stereo extension method is proposed to provide stereophonic sound. The proposed method employs a minimum mean squared error (MMSE) estimator based on a Gaussian mixture model (GMM) to produce stereo signals from a mono signal. The performance of the proposed stereo extension method is evaluated using a multiple stimuli with hidden reference and anchor (MUSHRA) test and compared with that of the parametric stereo method. It is shown from the test that the mean opinion score of the signals extended by the proposed method is around 5% higher than that of the conventional stereo extension method based on inter-channel coherence (ICC).
Convention Paper 8877

Sunday, May 5 15:30 Sala Montale

Standards Committee Meeting on SC-05-05, EMC

Workshop 6 Sunday, May 5, 16:30 – 18:30 Sala Manzoni

DESIGN AND BUILD OF FIRST ATMOS-EQUIPPED DUBBING THEATER

Chair: Brian McCarty, Coral Sea Studios Pty. Ltd., Clifton Beach, QLD, Australia

Panelist: Philip Newell

With immersive sound being talked about as the “next generation sound” for cinema and TV, this workshop will discuss in detail the design, construction, and commissioning of the world’s first Atmos-equipped dubbing stage in Moscow, Russia.

Loudness 6 Sunday, May 5, 16:30 – 17:30 Sala Alighieri

GIVE PEAKS A CHANCE

Presenter: Thomas Lund, TC Electronic A/S, Risskov, Denmark

Hearing is our most acute temporal sense by far, but the terms we have for describing dynamic changes in audio are few and not well defined. This session is about micro-dynamics and macro-dynamics in music and in speech, what effect they have, and what it takes to actually register them as a listener. Engineers be warned: though audio examples are given, the presentation will primarily be based on anatomy, physiology, and psychology.

[A follow-up session to the Loudness War panel.]

Student/Career Event
RECORDING COMPETITION—PART 1
Sunday, May 5, 16:30 – 18:30
Auditorium Loyola

The Student Recording Competition is a highlight at each convention. A distinguished panel of judges participates in critiquing finalists of each category in an interactive presentation and discussion. This event presents stereo and surround recordings in these categories:

• Traditional Studio Recording

• Modern Recording

The top three finalists in each category, as identified by our judges, present a short summary of their production intentions and the key recording and mix techniques used to realize their goals. They then play their projects for all who attend. Meritorious awards are determined here and will be presented at the closing Student Delegate Assembly Meeting (SDA-2). The competition is a great chance to hear the work of your fellow students at other educational institutions. Everyone learns from the judges’ comments even if your project isn’t one of the finalists, and it’s a great chance to meet other students and faculty members.

Sunday, May 5 17:00 Sala Saba

Technical Committee Meeting on Electromagnetic Compatibility

Professional Training Session 3 Sunday, May 5, 17:30 – 18:30 Sala Foscolo

WIDE BANDWIDTH EQUALIZATION IN THE ANALOG DOMAIN

Based on research into the sensitivity of our ears throughout the full bandwidth of human hearing, the CharterOak PEQ-1 provides extremely musical equalization in a stereo tone control of unparalleled musicality. Utilizing a unique circuit and power distribution, along with a unique configuration with respect to bandwidth and cut and boost around the center frequency choices, the PEQ-1 achieves extremely powerful and musical equalization with minimal phase shift at high frequencies, and extremely high headroom and low distortion.

Loudness 7 Sunday, May 5, 17:45 – 18:15 Sala Alighieri

LOUDNESS IN RADIO—THE NEXT STEP

Presenter: Florian Camerer, ORF - Austrian TV, Vienna, Austria

After the start of the switchover from peak normalization to loudness leveling in TV, the logical progression is the move into radio. One could argue that, due to the hyper-compression used in music production and by pop/rock stations over the last years, loudness differences are not an issue in radio... but that somewhat cynical view is definitely not an excuse to leave out the vast world of radio programming. On the contrary! Albeit from a different angle and with other strategies, loudness production has many benefits and advantages also for super-compressed radio stations. In this session those differences and challenges will be examined, and an outlook on the forthcoming work of the EBU loudness group PLOUD in that area will be given.

Special Event
BANQUET
Sunday, May 5, 20:30 – 23:00
Marriott Grand Hotel Flora
via Veneto, Rome

This year, the Banquet will be held on the Roof Garden of the charming and prestigious Marriott Grand Hotel Flora, located on via Veneto, at the very center of Rome and within walking distance of the convention center. The view is breathtaking, covering Villa Borghese and several noteworthy places of interest. A relaxed atmosphere and a carefully selected menu will provide the proper context to meet with AES friends after an intense day of scientific contributions (or tourist walking). And if you arrive early, you can have a seat at Harry’s Bar, which has recently opened in Rome and is located on the opposite side of the street.

60 Euros per person, AES members and nonmembers. Tickets will be available at the Registration desk.

Session P12 Monday, May 6 09:00 – 13:00 Sala Carducci

SPATIAL AUDIO—PART 1: BINAURAL, HEAD-RELATED TRANSFER FUNCTIONS

Chair: Michele Geronazzo, University of Padova, Padova, Italy

09:00

P12-1 Binaural Ambisonic Decoding with Enhanced Lateral Localization—Tim Collins, University of Birmingham, Birmingham, UK

When rendering an ambisonic recording, a uniform speaker array is often preferred, with the number of speakers chosen to suit the ambisonic order. Using this arrangement, localization in the lateral regions can be poor but can be improved by increasing the number of speakers. However, in practice this can lead to undesirable spectral impairment. In this paper a time-domain analysis of the ambisonic decoding problem is presented that highlights how a non-uniform speaker distribution can be used to improve localization without incurring perceptual spectral impairment. This is especially relevant to binaural decoders, where the locations of the virtual speakers are fixed with respect to the head, meaning that the interaction between speakers can be reliably predicted. Convention Paper 8878

09:30

P12-2 A Cluster and Subjective Selection-Based HRTF Customization Scheme for Improving Binaural Reproduction of 5.1 Channel Surround Sound—Bosun Xie, Chengyun Zhang, Xiaoli Zhong, South China University of Technology, Guangzhou, China

This work proposes a cluster and subjective selection-based HRTF customization scheme for improving binaural reproduction of 5.1 channel surround sound. Based on the similarity of HRTFs from an HRTF database with 52 subjects, a cluster analysis of HRTF magnitudes is applied. Results indicate that the HRTFs of most subjects can be classified into seven clusters and represented by the corresponding cluster centers. Subsequently, HRTFs used in binaural 5.1 channel reproduction are customized from the seven cluster centers by means of subjective selection, i.e., a subjective selection-based customization scheme. Psychoacoustic experiments indicate that the proposed scheme partly improves the localization performance in binaural 5.1 channel surround sound. Convention Paper 8879

10:00

P12-3 Spatially Oriented Format for Acoustics: A Data Exchange Format Representing Head-Related Transfer Functions—Piotr Majdak,1 Yukio Iwaya,2 Thibaut Carpentier,3 Rozenn Nicol,4 Matthieu Parmentier,5 Agnieszka Roginska,6 Yoiti Suzuki,7 Kanji Watanabe,8 Hagen Wierstorf,9 Harald Ziegelwanger,1 Markus Noisternig3
1Austrian Academy of Sciences, Vienna, Austria
2Tohoku Gakuin University, Tagajo, Japan
3UMR STMS IRCAM-CNRS-UPMC, Paris, France
4Orange Labs, France Telecom, Lannion, France
5France Television, Paris, France
6New York University, New York, NY, USA
7Tohoku University, Sendai, Japan
8Akita Prefectural University, Yuri-Honjo, Japan
9Technische Universität Berlin, Berlin, Germany

Head-related transfer functions (HRTFs) describe the spatial filtering of the incoming sound. So far, available HRTFs are stored in various formats, making an exchange of HRTFs difficult because of incompatibilities between the formats. We propose a format for storing HRTFs with a focus on interchangeability and extendability. The spatially oriented format for acoustics (SOFA) aims at representing HRTFs in a general way, thus allowing storage of data such as directional room impulse responses (DRIRs) measured with a microphone array excited by a loudspeaker array. SOFA specifications consider data compression, network transfer, a link to complex room geometries, and aim at simplifying the development of programming interfaces for Matlab, Octave, and C++. SOFA conventions for a consistent description of measurement setups are provided for future HRTF and DRIR databases. Convention Paper 8880

10:30

P12-4 Head Movements in Three-Dimensional Localization—Tommy Ashby, Russell Mason, Tim Brookes, University of Surrey, Guildford, Surrey, UK

Previous studies give contradicting evidence as to the importance of head movements in localization. In this study head movements were shown to increase localization response accuracy in elevation and azimuth. For elevation, it was found that head movement improved localization accuracy in some cases and that when pinna cues were impeded the significance of head movement cues was increased. For azimuth localization, head movement reduced front-back confusions. There was also evidence that head movement can be used to enhance static cues for azimuth localization. Finally, it appears that head movement can increase the accuracy of listeners' responses by enabling an interaction between auditory and visual cues. Convention Paper 8881

11:00

P12-5 A Modular Framework for the Analysis and Synthesis of Head-Related Transfer Functions—Michele Geronazzo, Simone Spagnol, Federico Avanzini, University of Padova, Padova, Italy

The paper gives an overview of a number of tools for the analysis and synthesis of head-related transfer functions (HRTFs) that we have developed in the past four years at the Department of Information Engineering, University of Padova, Italy. The main objective of our study in this context is the progressive development of a collection of algorithms for the construction of a totally synthetic personal HRTF set, replacing both cumbersome and tedious individual HRTF measurements and the use of inaccurate non-individual HRTF sets. Our research methodology is highlighted, along with the multiple possibilities for present and future research offered by such tools. Convention Paper 8882

11:30

P12-6 Measuring Directional Characteristics of In-Ear Recording Devices—Flemming Christensen, Pablo F. Hoffmann, Dorte Hammershøi, Aalborg University, Aalborg, Denmark

With the availability of small in-ear headphones and miniature microphones it is possible to construct combined in-ear devices for binaural recording and playback. When mounting a microphone on the outside of an insert earphone, the microphone position deviates from ideal positions in the ear canal. The pinna, and thereby also the natural sound transmission, is altered by the inserted device. This paper presents a methodology for accurately measuring the direction-dependent transfer functions of such in-ear devices. Pilot measurements on a commercially available device are presented and possibilities for electronic compensation of the non-ideal characteristics are considered. Convention Paper 8883

12:00

P12-7 Modeling the Broadband Time-of-Arrival of the Head-Related Transfer Functions for Binaural Audio—Harald Ziegelwanger, Piotr Majdak, Austrian Academy of Sciences, Vienna, Austria

Binaural audio is based on the head-related transfer functions (HRTFs) that provide directional cues for the localization of virtual sound sources. HRTFs incorporate the time-of-arrival (TOA), the monaural timing information yielding the interaural time difference, essential for the rendering of multiple virtual sound sources. In this study we propose a method to robustly estimate spatially continuous TOA from an HRTF set. The method is based on a directional outlier remover and a geometrical model of the HRTF measurement setup. We show results for HRTFs of human listeners from three HRTF databases. The benefits of calculating the TOA in the light of HRTF analysis, modification, and synthesis are discussed. Convention Paper 8884

12:30

P12-8 Multichannel Ring Upmix—Christof Faller,1 Lutz Altmann,2 Jeff Levison,2 Markus Schmidt1
1Illusonic GmbH, Uster, Switzerland
2IOSONO GmbH, Erfurt, Germany

Multichannel spatial decompositions and upmixes have been proposed, but these are usually based on an unrealistically simple direct/ambient sound model, not capturing the full diversity offered by N discrete audio channels, where in an extreme case each channel can contain an independent sound source. While it has been argued that a simple direct/ambient model is sufficient, in practice such a model limits the achievable audio quality. To circumvent the problem of capturing multichannel signals with a single model, the proposed “ring upmix” applies a cascade of 2-channel upmixes to surround signals to generate channels for setups with more loudspeakers, featuring full support for 360-degree panning with high channel separation. Convention Paper 8908

Session EB2 Monday, May 6 09:00 – 10:45 Sala Foscolo

ENGINEERING BRIEFS—PAPERS: PART 1

Chair: Etienne Corteel, Sonic Emotion Labs, Paris, France

09:00

EB2-1 Measurement of Sound Quality Differences in Individual CD Media Using Residual Waveform Comparison—Akira Nishimura, Tokyo University of Information Sciences, Chiba-shi, Japan

To measure minuscule differences in sound quality that might exist between different CD media, we compared the residuals of two DA- and AD-converted waveforms from different discs on which the same data were recorded, under clock synchronization between the DA and AD converters. This method reveals any differences in sound quality other than sampling clock fluctuation. The results showed no media-dependent difference in sound quality. The main source of the residual waveform was change in the audio circuit transfer function associated with time since turning on the CD player. Engineering Brief 83


09:15

EB2-2 Binaural Room Simulation for Acoustic Testing—Scott Levine,1,2 Brett Leonard,1,2 Richard King1,2
1McGill University, Montreal, Quebec, Canada
2The Centre for Interdisciplinary Research in Music Media and Technology, Montreal, Quebec, Canada

Often, in testing with acoustic conditions as the independent variable, challenges arise with the ease and speed of altering acoustic conditions. This study compares two possibilities for testing different acoustic conditions: physically varied acoustic treatment is compared to binaural room simulation. Explorations of these two methods are conducted employing an in situ, task-based paradigm presented to highly trained listeners. Results indicate significant differences between acoustic conditions within the binaural simulations; however, they do not provide data corresponding to actual acoustic alteration. Engineering Brief 84

09:30

EB2-3 The Advantages of Using Active Crossovers in High-End Wireless Speakers—David Jones, CSR Limited, Manchester, UK

With the availability of standardized wireless interfaces and high-performance codecs, wireless loudspeakers can be designed that suit the consumer demands of compactness and ease of use. This paper will examine the performance benefits of using active crossovers and digital equalization in an amplification subsystem based on a high-performance digital-input switching amplifier. Measurements of distortion and damping factors will be compared in an example signal chain, and the influence these parameters have on the perceived audio quality of the speaker system will be discussed. Engineering Brief 85

09:45

EB2-4 Comparative Analysis of Different Loudness Meters Based on Voice Detection and Gating—Alessandro Travaglini, Fox International Channels Italy, Guidonia Montecelio (RM), Italy

After decades of extensive investigation, the international broadcasting community, represented by technical associations and bodies, has set precise standards aimed at objectively assessing the loudness levels of programs. Although all standards rely on the same algorithm, as described in ITU-R BS.1770, there are still two possible ways to implement such metering: voice detection and gating. These two different implementations might, in some cases, provide measurements that significantly differ from each other. Furthermore, while the gating feature is uniquely defined in the updated version, BS.1770-3, voice detection is not currently specified in any standard and its implementation is the independent choice of manufacturers. This paper analyses this scenario by comparing the results and robustness provided by three different loudness meters based on voice detection. In addition, those values are compared with measurements obtained using BS.1770-3-compliant loudness meters, including tables, comments, and conclusions. Engineering Brief 86
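The gating discussed above is a small, well-defined algorithm. As a minimal sketch (assuming per-block loudness values have already been computed from K-weighted 400 ms blocks with 75% overlap, which the fragment below does not implement), the two-stage gate of BS.1770-3 can be written as:

```python
import math

def gated_loudness(block_loudness, abs_gate=-70.0, rel_offset=-10.0):
    """Two-stage gated average over per-block loudness values (LUFS).

    Sketch of the BS.1770-3 gating scheme only: K-weighting and
    blocking are assumed to have happened upstream.
    """
    def mean_power(blocks):
        # Average in the power domain, then convert back to loudness.
        return -0.691 + 10.0 * math.log10(
            sum(10.0 ** ((l + 0.691) / 10.0) for l in blocks) / len(blocks)
        )

    # Stage 1: absolute gate at -70 LUFS drops near-silent blocks.
    stage1 = [l for l in block_loudness if l > abs_gate]
    # Stage 2: relative gate 10 LU below the stage-1 average.
    rel_gate = mean_power(stage1) + rel_offset
    return mean_power([l for l in stage1 if l > rel_gate])
```

For a program whose blocks sit at -23 LUFS apart from silent passages, the gate returns -23 LUFS, ignoring the silence.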

10:00

EB2-5 Assessing the Standardization of an Existing iOS Control Application to the AES64-2012 Network Protocol—Joan Amate, Master Audio, Barcelona, Spain

The recent publication of the AES64-2012 standard has motivated the comparison of Master Audio's own IP network control protocol against the new standard, in order to assess its interoperability or adaptability. This brief analyzes what AES64 means for manufacturers with existing control protocols who are willing to seek standardization. The protocol used for this assessment was developed for controlling self-amplified PA systems (built-in amplifier and processing), and is fully functional under Windows and iOS (iPad). Finally, a brief guide on how to approach standardization is given from the manufacturer's point of view. Engineering Brief 87

10:15

EB2-6 Innovation in Audio: Update on Patent Activity in the Audio Field—Elliot Cook, Joseph E. Palys, Finnegan, Henderson, Farabow, Garrett & Dunner, LLP, Reston, VA, USA

This paper provides a sampling of recent patents relating to innovations in the audio field. The innovations come from a range of AES member companies and cover a diverse spectrum of technologies, such as music composition software, loudspeaker design, headphones, digital signal processing, microphones, and musical comprehension. In addition, unique statistical information regarding patent litigation in the audio field is provided. This information is based on original research regarding litigation involving audio patents. AES members may find this information helpful to better understand how audio innovations play a role in their industry. Engineering Brief 89

10:30

EB2-7 An Examination of Early Analog and Digital Sampling—The Robb Wave Organ circa 1927—Michael Murphy, Eric Kupp, Ryerson University, Toronto, ON, Canada

This paper examines Frank Morse Robb's work in the late 1920s and early 1930s on his Wave Organ, the first successful electronic organ. The Robb Wave Organ originally functioned by creating a visual representation of an analog pipe organ waveform by means of an oscilloscope and engraving that representation onto metal tone wheels. Later versions of the organ featured a digital, almost PCM-style, waveform representation on the tone wheels. This predates the theoretical description of PCM by Alec Reeves, as well as the PCM patent filed by Oliver and Shannon in 1946. These sample-based methods of tone generation were unique to the Robb Wave Organ, and this paper serves to place the organ among its contemporaries, most notably its primary competitor, the Hammond organ, launched in 1935. Engineering Brief 90

Tutorial 10 Monday, May 6 09:00 – 10:30 Auditorium Loyola

CREATIVE DISTORTION—YOU ARE IN THE OVER-DRIVER'S SEAT

Presenter: Alex Case, University of Massachusetts—Lowell, Lowell, MA, USA

Distortion, strategically applied to elements of your mix, is a source of energy that lifts tracks up out of a crowded arrangement and adds excitement to the performance. Accidental distortion, on the other hand, is a certain sign that the production is unprofessional, dragging down its chance for success. Amps, stomp boxes, tubes, transformers, tape machines, the plug-ins that emulate them, and the plug-ins that create wholly new forms of distortion all offer a rich palette of distortion colors. Mix engineers must know how to choose among them, and how to tailor them to the music. This tutorial takes a close look at distortion, detailing the technical goings-on when things break up, and defining the production potential of this, the caffeine of effects.

Workshop 7 Monday, May 6 09:00 – 12:00 Sala Manzoni

MULTICHANNEL IMMERSIVE AUDIO FORMATS FOR 3-D CINEMA AND HOME THEATER

Chair: Christof Faller, Illusonic GmbH, Uster, Switzerland

Panelists: Brian Claypool, Kimio Hamasaki, Charles Robinson, Wilfried Van Baelen

Several new immersive sound formats are under active consideration for cinema soundtrack production. Each was developed to create realistic sound “motion” and “immerse” the audience in a more realistic soundfield. This workshop is a repeat of the program presented at AES 133 in San Francisco, bringing together the proponents of four of the leading immersive sound systems to discuss their specific technologies.

Monday, May 6 09:00 Sala Saba

Technical Committee Meeting on High Resolution Audio

Monday, May 6 09:00 Sala Montale

Standards Committee Meeting on SC-04-08, Sound Systems in Rooms

Session P13 Monday, May 6 10:00 – 11:30 Foyer

POSTERS: ROOM ACOUSTICS

10:00

P13-1 The Effect of Playback System on Reverberation Level Preference—Brett Leonard,1,2 Richard King,1,2 Grzegorz Sikora3
1McGill University, Montreal, Quebec, Canada
2The Centre for Interdisciplinary Research in Music Media and Technology, Montreal, Quebec, Canada
3Bang & Olufsen Deutschland GmbH, Pullach, Germany

The critical role of reverberation in modern acoustic music production is undeniable. Unlike many other effects, reverberation's spatial nature makes it extremely dependent upon the playback system over which it is experienced. While this characteristic of reverberation has been widely acknowledged among recording engineers for years, the increase in headphone listening prompts further exploration of these effects. In this study listeners are asked to add reverberation to a dry signal as presented over two different playback systems: headphones and loudspeakers. The final reverberation levels set by each subject are compared for the two monitoring systems. The resulting data show significant level differences across the two monitoring systems. Convention Paper 8886

10:00

P13-2 Adaptation of a Large Exhibition Hall as a Concert Hall Using Simulation and Measurement Tools—Marco Facondini,1 Daniele Ponteggia2
1TanAcoustics Studio, Pesaro (PU), Italy
2Studio Ing. Ponteggia, Terni (TR), Italy

Due to the growing demand for multifunctional performing spaces, there is a strong interest in the adaptation of non-dedicated spaces to host musical performances. This leads to new challenges for acousticians, with new design constraints and very tight time frames. This paper shows a practical example of the adaptation of the “Sala della Piazza” of the Palacongressi of Rimini. Thanks to the combined use of prediction and measurement tools it has been possible to design the acoustical treatments with a high degree of accuracy, reaching all targets while respecting the tight deadlines. Convention Paper 8887

10:00

P13-3 Digital Filter for Modeling Air Absorption in Real Time—Carlo Petruzzellis, Umberto Zanghieri, ZP Engineering S.r.l., Rome, Italy

Atmospheric sound attenuation is a relevant aspect of realistic space modeling in 3-D audio simulation systems. A digital filter has been developed on commercial DSP processors to match air absorption curves. This paper focuses on the algorithm implementation of a digital filter with continuous roll-off control, to simulate high-frequency damping of audio signals in various atmospheric conditions, along with rules that allow a precise approximation of the behavior described by analytical formulas. Convention Paper 8888
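The abstract does not give the filter design itself; as an illustrative stand-in (not the authors' algorithm), a simple one-pole lowpass with a continuously adjustable cutoff shows the kind of roll-off control described. In a real system the cutoff would be derived from distance and atmospheric conditions; here it is just a hypothetical parameter:

```python
import math

def onepole_lowpass(x, fc, fs):
    """One-pole lowpass: crude continuous roll-off control for HF damping.

    x: input samples; fc: cutoff in Hz (a real air-absorption model
    would map distance and humidity to this value); fs: sample rate in Hz.
    """
    b1 = math.exp(-2.0 * math.pi * fc / fs)  # pole location
    a0 = 1.0 - b1                            # unity gain at DC
    y, state = [], 0.0
    for s in x:
        state = a0 * s + b1 * state
        y.append(state)
    return y
```

Lowering `fc` as simulated source distance grows increases the high-frequency damping while leaving low frequencies, and the DC gain, untouched.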


10:00

P13-4 Development of Multipoint Mixed-Phase Equalization System for Multiple Environments—Stefania Cecchi,1 Marco Virgulti,1 Stefano Doria,2 Ferruccio Bettarelli,2 Francesco Piazza1
1Università Politecnica delle Marche, Ancona, Italy
2Leaff Engineering, Ancona, Italy

The development of a mixed-phase equalizer is still an open problem in the field of room response equalization. In this context, a multipoint mixed-phase impulse response equalization system is presented, taking into consideration a magnitude equalization procedure based on a time-frequency segmentation of the impulse responses and a phase equalization technique based on group delay analysis. Furthermore, an automatic software tool for the measurement of the environment impulse responses and for the design of a suitable equalizer is presented. Taking advantage of this tool, several tests have been performed, considering objective and subjective analysis applied in a real environment and comparing the obtained results with different approaches. Convention Paper 8889 [Paper presented by Marco Virgulti]

10:00

P13-5 Acoustics Modernization of the Recording Studio at Wroclaw University of Technology—Magdalena Kaminska, Patryk Kobylt, Bartlomiej Kruk, Jan Sokolnicki, Wroclaw University of Technology, Wroclaw, Poland

The aim of this paper is to present the results of the acoustic modernization of the Wroclaw University of Technology recording studio. The project focuses on a problem arising in one part of the recording studio: the so-called flutter echo phenomenon. To minimize this effect we present a several-stage process in which the studio is adapted to counter this occurrence. The first step was to measure the acoustic properties of the room, concentrating on the previously mentioned effect. Next, a one-dimensional diffuser was designed and placed where the phenomenon occurs. The last stage of the research was an acoustic measurement after the modification and a comparison with the properties before the changes. Convention Paper 8890

10:00

P13-6 Accurate Acoustic Echo Reduction with Residual Echo Power Estimation for Long Reverberation—Masahiro Fukui,1 Suehiro Shimauchi,1 Yusuke Hioka,2 Hitoshi Ohmuro,1 Yoichi Haneda3
1NTT Corporation, Musashino-shi, Tokyo, Japan
2University of Canterbury, Christchurch, New Zealand
3The University of Electro-Communications, Chofu-shi, Tokyo, Japan

This paper deals with the problem of estimating and reducing residual echo components that result from reverberant components beyond the length of the FFT block. The residual echo reduction process suppresses the residual echo by applying a multiplicative gain calculated from the estimated echo power spectrum. However, the estimated power spectrum reproduces only a fraction of the echo-path impulse response, and so not all of the reverberant components are considered. To address this problem we introduce a finite nonnegative convolution method by which each segment of the echo impulse response is convolved with the received signal in the power spectral domain. With the proposed method, the power spectra of each segment of the echo impulse response are collectively estimated by solving the least-mean-squares problem between the microphone and the estimated residual echo power spectra. The performance of this method was demonstrated by simulation results in which speech distortion was decreased compared with the conventional method. Convention Paper 8891

Monday, May 6 11:00 Sala Montale

Standards Committee Meeting on SC-04-03, Loudspeaker Modeling and Measurement

Tutorial 11 Monday, May 6 11:30 – 13:00 Sala Foscolo

OPTIMIZING AND MEASURING THE SPEECH INTELLIGIBILITY OF SOUND SYSTEMS

Presenters: Ben Kok, BEN KOK acoustic consulting, Uden, The Netherlands; Peter Mapp, Peter Mapp Associates, Colchester, Essex, UK

Intelligibility is the most important parameter of any sound system, whether a high-quality theater sound reinforcement system, a church system, an emergency PA system operating deep underground in a road tunnel, or an information system at an airport or rail station. The tutorial will discuss how to design intelligible sound systems for both sound reinforcement and emergency systems use. It will show how reverberation, noise, and other acoustic characteristics of a building or space can affect speech and sound system intelligibility. We will also show how the choice and positioning of loudspeakers can have a significant impact on the potential intelligibility. The techniques will be illustrated with case histories and practical examples. Measurement of the potential intelligibility performance of a sound system will also be covered, along with the equipment needed and the practical issues that need to be dealt with.

Monday, May 6 12:00 Sala Saba

Technical Committee Meeting on Coding of Audio Signals

Tutorial 12 Monday, May 6 12:15 – 14:15 Auditorium Loyola

TURNTABLE TECHNIQUE: THE ART OF THE DJ

Presenter: Stephen Webber, Berklee College of Music, Valencia, Spain

Mastering faders, pots, and switches as musical instruments? Turntable technique is where the audio engineer and the artist become one. Stephen Webber is the Director of Music Technology Innovation at Berklee College of Music in Valencia, Spain, the composer of the “Stylus Symphony,” and the author of Turntable Technique: The Art of the DJ. His tutorial, which has been featured around the world, challenges traditional notions of music and technology.

Tutorial 13 Monday, May 6 12:30 – 13:30 Sala Alighieri

DESIGNING FOR ULTRA-LOW THD+N IN ANALOG CIRCUITS

Presenter: Bruce E. Hofer, Audio Precision

A number of factors influence the distortion performance of analog circuits. Among the most important is the quality of components. Models of non-linearity and techniques for estimating distortion and noise will be discussed, along with some tips on minimizing their contributions. This presentation is an updated subset of the design track seminar titled “Designing Audio in the 2010s” that was presented at the 2011 New York AES convention. It contains some significant new material that should be of interest to the serious analog design engineer.

Monday, May 6 12:30 Sala Montale

Standards Committee Meeting on SC-04-04, Microphone Measurement and Characterization

Monday, May 6 13:00 Sala Saba

Technical Committee Meeting on Fiber Optics for Audio

Monday, May 6 14:00 Sala Saba

Technical Committee Meeting on Transmission and Broadcasting

Tutorial 14 Monday, May 6 14:15 – 16:15 Sala Alighieri

RUB & BUZZ AND OTHER IRREGULAR LOUDSPEAKER DISTORTION

Presenter: Wolfgang Klippel, Klippel GmbH, Dresden, Germany

Loudspeaker defects caused by manufacturing, aging, overload, or climate impact generate a special kind of irregular distortion commonly known as rub & buzz, which is highly audible and intolerable to the human ear. Contrary to regular loudspeaker distortions defined in the design process, the irregular distortions are hardly predictable and are generated by an independent process triggered by the input signal. Traditional distortion measurements such as THD fail to reliably detect those defects. This tutorial discusses the most important defect classes, new measurement techniques, audibility, and the impact on perceived sound quality.

Session P14 Monday, May 6 14:30 – 17:30 Sala Carducci

APPLICATIONS IN AUDIO

Chair: Juha Backman, Nokia Corporation, Espoo, Finland

14:30

P14-1 Implementation of an Intelligent Equalization Tool Using Yule-Walker for Music Mixing and Mastering—Zheng Ma, Joshua D. Reiss, Dawn A. A. Black, Queen Mary University of London, London, UK

A new approach for automatically equalizing an audio signal toward a target frequency spectrum is presented. The algorithm is based on the Yule-Walker method and designs recursive IIR digital filters using a least-squares fit to any desired frequency response. The target equalization curve is obtained from the spectral distribution analysis of a large dataset of popular commercial recordings. A real-time C++ VST plug-in and an off-line Matlab implementation have been created. A straightforward objective evaluation is also provided, in which the output frequency spectra are compared against the target equalization curve and those produced by an alternative equalization method. Convention Paper 8892
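At the core of the Yule-Walker method named in the title is a Toeplitz system that maps an autocorrelation sequence to all-pole filter coefficients. A minimal sketch of that step via the Levinson-Durbin recursion (illustrative only; the paper's actual design fits IIR filters to a measured target curve, which is not reproduced here):

```python
def yule_walker_ar(r, order):
    """Levinson-Durbin: AR (all-pole) coefficients from autocorrelation r.

    Returns (a, err) with a[0] = 1, so that the model is
    x[n] + a[1] x[n-1] + ... + a[order] x[n-order] = e[n].
    """
    a = [1.0] + [0.0] * order
    err = r[0]
    for i in range(1, order + 1):
        acc = r[i] + sum(a[j] * r[i - j] for j in range(1, i))
        k = -acc / err                      # reflection coefficient
        new_a = a[:]
        for j in range(1, i):
            new_a[j] = a[j] + k * a[i - j]
        new_a[i] = k
        a = new_a
        err *= 1.0 - k * k                  # updated prediction error
    return a, err
```

For the autocorrelation of a first-order process x[n] = 0.5 x[n-1] + e[n], the recursion recovers a[1] = -0.5, i.e., x[n] - 0.5 x[n-1] = e[n].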

15:00

P14-2 On the Informed Source Separation Approach for Interactive Remixing in Stereo—Stanislaw Gorlow,1 Sylvain Marchand2
1University of Bordeaux, Talence, France
2Université de Brest, Brest, France

Informed source separation (ISS) has become a popular trend in the audio signal processing community over the past few years. Its purpose is to decompose a mixture signal into its constituent parts at the desired or the best possible quality level given some metadata. In this paper we present a comparison between two ISS systems and relate the ISS approach in various configurations to conventional coding of separate tracks for interactive remixing in stereo. The compared systems are Underdetermined Source Signal Recovery (USSR) and Enhanced Audio Object Separation (EAOS). The latter forms a part of MPEG's Spatial Audio Object Coding technology. The performance is evaluated using objective difference grades computed with PEMO-Q. The results suggest that USSR performs perceptually better than EAOS and has a lower computational complexity. Convention Paper 8893

15:30

P14-3 Scene Inference from Audio—Daniel Arteaga,1,2 David García-Garzón,2 Toni Mateos,3 John Usher4
1Fundacio Barcelona Media, Barcelona, Spain
2Universitat Pompeu Fabra, Barcelona, Spain
3imm sound, Barcelona, Spain
4Hearium Labs, San Francisco, CA, USA


We report on the development of a system to characterize the geometric and acoustic properties of a space from an acoustic impulse response measured within it. This can be thought of as the inverse problem to the common practice of obtaining impulse responses from either real-world or virtual spaces. Starting from an impulse response recorded in an original scene, the method described here uses a non-linear search strategy to select a scene that is perceptually as close as possible to the original one. Potential applications of this method include audio production, non-intrusive acquisition of room geometry, and audio forensics. Convention Paper 8894

16:00

P14-4 Continuous Mobile Communication with Acoustic Co-Location Detection—Robert Albrecht,1 Sampo Vesa,2 Jussi Virolainen,3 Jussi Mutanen,4 Tapio Lokki1
1Aalto University, Espoo, Finland
2Nokia Research Center, Nokia Group, Finland
3Nokia Lumia Engineering, Nokia Group, Finland
4JMutanen Software, Jyväskylä, Finland

In a continuous mobile communication scenario, e.g., between co-workers, participants may occasionally be located in the same space and thus hear each other naturally. To avoid hearing echoes, the audio transmission between these participants should be cut off. In this paper an acoustic co-location detection algorithm is proposed for the task, grouping participants together based solely on their microphone signals and mel-frequency cepstral coefficients thereof. The algorithm is tested both on recordings of different communication situations and in real time, integrated into a voice-over-IP communication system. Tests on the recordings show that the algorithm works as intended, and the evaluation using the voice-over-IP conferencing system concludes that the algorithm improves the overall clarity of communication compared with not using the algorithm. The acoustic co-location detection algorithm thus proves a useful aid in continuous mobile communication systems. Convention Paper 8895

16:30

P14-5 Advancements and Performance Analysis on the Wireless Music Studio (WeMUST) Framework—Leonardo Gabrielli, Stefano Squartini, Francesco Piazza, Università Politecnica delle Marche, Ancona (AN), Italy

Music production devices and musical instruments can take advantage of IEEE 802.11 wireless networks for interconnection and audio data sharing. In previous works such networks have been proven able to support high-quality audio streaming between devices at acceptable latencies in several application scenarios. In this work a prototype device discovery mechanism is described to improve ease of use and flexibility. A diagnostic tool that characterizes average network latency and packet loss is also described and provided to the community. Lower latencies are reported after software optimization, and the sustainability of multiple audio channels is also proven by means of experimental tests.
Convention Paper 8896

17:00

P14-6 Acoustical Characteristics of Vocal Modes in Singing—Eddy B. Brixen,1 Cathrine Sadolin,2 Henrik Kjelin2
1EBB-consult, Smorum, Denmark
2Complete Vocal Institute, Copenhagen, Denmark

The Complete Vocal Technique defines four vocal modes: Neutral, Curbing, Overdrive, and Edge. These modes are valid for both the singing voice and the speaking voice. The modes are clearly identified both from listening and from visual laryngograph inspection of the vocal cords and the surrounding area of the vocal tract. In recent work a model was described to distinguish between the modes based on acoustical analysis. This paper looks further into the characteristics of the vocal modes in singing in order to test the model already provided. The conclusion is that the model is too simple to cover the full range. The work has also provided information on singers’ SPL and formant repositioning as a function of pitch. Further work is recommended.
Convention Paper 8897

Workshop 8 Monday, May 6 14:30 – 17:30 Sala Foscolo

CURRENT REFERENCE LISTENING ROOM STANDARDS: ARE THEY MEANINGFUL?

Chair: Todd Welti, Harman International, Northridge, CA, USA

Panelists: Sean Olive
Francis Rumsey
Andreas Silzle
Thomas Sporer

The ITU BS.1116 standard, “Methods for the Subjective Assessment of Small Impairments in Audio Systems …,” contains guidelines for standard listening environments used for assessing the subjective quality of audio systems. This includes factors such as dynamics and frequency/directivity response for loudspeakers, and the physical and acoustical properties of the room. This standardized playback environment should be consistent and representative of high-quality systems, yet also representative of systems that actually exist. This workshop takes a fresh look at the ITU BS.1116 standard and how it could be improved; some changes that in theory might improve the specification might not actually be practical. Many aspects of the discussion are relevant to reference listening room standards other than BS.1116, hence the more general workshop title.

Student/Career Event
EDUCATION FORUM
Monday, May 6, 14:30 – 16:30
Auditorium Loyola

Moderators: Philip Waldenberger, Chair, AES SDA, Europe and International Regions
Colin Pfund, Chair, AES SDA, North and Latin American Regions
Marija Kovacina, Vice Chair, AES SDA, Europe and International Regions

Student Roundtable

Come share your unique experience as a student of audio. Bring your thoughts and perspectives to an open discussion moderated by the AES Student Delegate Assembly officers, who want to encourage this dialog. How are you learning about audio? What is unique about your program and its facilities? How do co-curricular activities like the ones sponsored by AES and other organizations contribute to your experience? Explore strategies for making connections with the professional world and discuss the curricula and philosophies of your programs. Students, faculty, alumni, industry professionals, and anyone interested in commenting on the state of audio education are welcome to participate.

Session P15 Monday, May 6 15:00 – 16:30 Foyer

POSTERS: SPATIAL AUDIO

15:00

P15-1 Intelligent Acoustic Interfaces for Immersive Audio—Danilo Comminiello, Michele Scarpiniti, Raffaele Parisi, Aurelio Uncini, Sapienza University of Rome, Rome, Italy

Emerging audio technologies prioritize the perceptual quality of audio signals, offering users an immersive audio experience that involves both listening and acquisition of audio signals. In such a scenario a fundamental role is played by intelligent acoustic interfaces, which aim at acquiring audio information, processing it, and returning the processed information while fulfilling the quality requirements demanded by users. In this paper we introduce intelligent acoustic interfaces for immersive audio experiences and prove their effectiveness in the context of immersive speech communications. In particular, we introduce an intelligent acoustic interface composed of a combined adaptive beamforming scheme in conjunction with a microphone array, which is able to enhance the processed signals in immersive scenarios.
Convention Paper 8898

15:00

P15-2 The Effects of Spatial Depth in the Combinations of 3-D Imagery and 7-Channel Surround with Height Channels—Toru Kamekawa,1 Atsushi Marui,1 Toshihiko Date,2 Masaaki Enatsu3
1Tokyo University of the Arts, Tokyo, Japan
2AVC Networks Company, Panasonic Corporation, Osaka, Japan
3marimoRECORDS, Inc., Tokyo, Japan

The effect of height-channel loudspeakers on perceived spatial depth in combination with 3-D imagery was studied. The first experiment used a method of magnitude estimation, asking how near or far the combination of perceived visual and auditory events is. In the second experiment, subjects rated the suitability of the sound to the image using the same materials as the previous experiment. The results show that 7-channel and 5-channel surround were perceived as near and 2-channel stereo as far when no image was present. Regarding suitability of sound to an image, 3-D imagery with 7-channel surround gives a higher score at near distances.
Convention Paper 8899

15:00

P15-3 Comparative Analysis on Compact Representation for Spatial Variation of Individual Head-Related Transfer Functions Based on Singular Value Decomposition—Shouichi Takane, Akita Prefectural University, Yurihonjo, Akita, Japan

In this paper a compact representation of head-related transfer functions (HRTFs) or head-related impulse responses (HRIRs) based on singular value decomposition (SVD) was investigated, focusing on two points concerning the parameters used to construct the average vectors and covariance matrices. The first is how the derived eigenvectors reflect the properties of the HRTFs and/or HRIRs. High correlation was found between the SVD parameters and the amplitude of the HRTFs, over a wide region excluding the contralateral side. The second is the number of HRTFs or HRIRs required to construct the average vectors and covariance matrices: this number can be reduced to about 1/4 of the full set of directions for the HRTFs used. Among the three representations considered (HRIRs, HRTF amplitude, and HRTF log-amplitude), the HRTF amplitude was shown to be the most effective for constructing the SVD parameters.
Convention Paper 8900
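The compression idea can be sketched on synthetic data (smoothed random rows standing in for a measured HRIR set): subtract the average vector, factor the residual with SVD, and keep only the leading eigenvectors.

```python
import numpy as np

# Synthetic stand-in for a measured HRIR set: one row per direction.
# Low-pass smoothing makes the rows correlated, as real HRIRs are.
rng = np.random.default_rng(1)
n_dirs, n_taps = 72, 128
kernel = np.hanning(16)
kernel /= kernel.sum()
hrirs = np.stack([np.convolve(rng.standard_normal(n_taps), kernel, mode="same")
                  for _ in range(n_dirs)])

# Average vector and SVD of the mean-removed set.
mean = hrirs.mean(axis=0)
U, s, Vt = np.linalg.svd(hrirs - mean, full_matrices=False)

def reconstruct(k):
    # Keep only the first k eigenvectors (rows of Vt).
    return mean + (U[:, :k] * s[:k]) @ Vt[:k]

def rel_err(k):
    return np.linalg.norm(hrirs - reconstruct(k)) / np.linalg.norm(hrirs)
```

The reconstruction error shrinks as more eigenvectors are retained, and the full-rank reconstruction is exact; the paper's specific findings about HRTF amplitude versus log-amplitude are not reproduced here.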

15:00

P15-4 Calculation of Individualized Near-Field Head-Related Transfer Function Database Using Boundary Element Method—Yuanqing Rui, Guangzheng Yu, Bosun Xie, Yu Liu, South China University of Technology, Guangzhou, China

Measurement is the common method of obtaining far-field HRTFs. Due to measurement difficulties, near-field HRTF databases for an artificial head are rare, and an individualized database for human subjects has been unavailable. The present work uses a laser 3-D scanner to acquire geometrical surfaces of human subjects and then applies the boundary element method to calculate near-field HRTFs. An individualized near-field HRTF database of 56 human subjects is thus established. To evaluate the accuracy of the database, the HRTFs for KEMAR are also calculated and compared to measured ones.
Convention Paper 8901


15:00

P15-5 A Standardized Repository of Head-Related and Headphone Impulse Response Data—Michele Geronazzo, Fabrizio Granza, Simone Spagnol, Federico Avanzini, University of Padova, Padova, Italy

This paper proposes a repository for storing full- and partial-body Head-Related Impulse Responses (HRIRs/pHRIRs) and Headphone Impulse Responses (HpIRs) from several databases in a standardized environment. The main differences among the available databases concern coordinate systems, sound source stimuli, sampling frequencies, and other important specifications. The repository is organized so as to consider all these differences. The structure of our repository is an improvement with respect to the MARL-NYU data format, born as an attempt to unify HRIR databases. The introduced information supports flexible analysis and synthesis processes and robust headphone equalization.
Convention Paper 8902

15:00

P15-6 Influence of Different Microphone Arrays on IACC as an Objective Measure of Spaciousness—Marco Conceição,1,2 Dermot Furlong1
1Trinity College, Dublin, Ireland
2Instituto Politécnico do Porto, Porto, Portugal

Inter-aural cross correlation (IACC) measurements are used as physical measures relating to listeners’ experience of spaciousness in a comparative study of different microphone arrays, allowing an objective approach to exploring how microphone arrays affect perceived spaciousness for stereo and surround sound reconstructions. The different microphone arrays recorded simulated direct and indirect sound components. The recorded signals were played back in three different rooms, and IACC measurements of the reconstructed sound fields were made using a dummy head microphone system. The results show how microphone array details influence the IACC peak and lead to a better understanding of how spaciousness can be controlled for 2-channel stereo and 5.1 presentations. Parametric variation of microphone arrays can therefore be employed to facilitate spaciousness control for reconstructed sound fields.
Convention Paper 8885
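The underlying measure can be sketched as follows, assuming the common definition of IACC as the peak of the normalized interaural cross-correlation within a ±1 ms lag window (the paper's exact measurement conditions are not reproduced here):

```python
import numpy as np

def iacc(left, right, fs, max_lag_ms=1.0):
    # Peak of the normalized cross-correlation between the two ear
    # signals, searched over lags of +/- max_lag_ms (1 ms is the
    # conventional window for interaural delays).
    n = len(left)
    c = np.correlate(left, right, mode="full")     # lags -(n-1) .. n-1
    denom = np.sqrt(np.sum(left ** 2) * np.sum(right ** 2))
    max_lag = int(fs * max_lag_ms / 1000.0)
    center = n - 1                                  # index of zero lag
    window = c[center - max_lag:center + max_lag + 1]
    return np.max(np.abs(window)) / denom
```

Identical ear signals give an IACC of 1 (no spaciousness), while decorrelated signals give values near 0.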

Monday, May 6 15:00 Sala Saba

Technical Committee Meeting on Sound for Digital Cinema and Television

Student/Career Event
RECORDING COMPETITION—PART 2
Monday, May 6, 16:30 – 18:30
Auditorium Loyola

The Student Recording Competition is a highlight at each convention. A distinguished panel of judges participates in critiquing finalists of each category in an interactive presentation and discussion. This event presents stereo and surround recordings in these categories:

• Sound for Visual Media

• Traditional Acoustic Recording

The top three finalists in each category, as identified by our judges, present a short summary of their production intentions and the key recording and mix techniques used to realize their goals. They then play their projects for all who attend. Meritorious awards are determined here and will be presented at the closing Student Delegate Assembly Meeting (SDA-2). The competition is a great chance to hear the work of your fellow students at other educational institutions. Everyone learns from the judges’ comments even if your project isn’t one of the finalists, and it’s a great chance to meet other students and faculty members.

Professional Training 4 Monday, May 6 16:30 – 18:30 Sala Alighieri

RECORDING AND MIXING WITH UNIVERSAL AUDIO

Presenter: Wesley Bendall

An overview is given of Universal Audio’s history from the 1950s to the present, detailing various well-known pieces of analog gear with hints on how and why you would use them. Wesley will talk about the benefits that UA plug-ins can bring to a recording and a mix, and will explain and demonstrate the Apollo platform, especially the new Apollo 16.

Monday, May 6 16:30 Sala Montale

Standards Committee Meeting on SC-02-12, Audio Applications of Networks

Special Event
AN IMMERSIVE JAZZ CONCERT FEATURING THE GREG BURK TRIO
Monday, May 6, 20:00 – 21:00
Saint Louis College of Music
Via Urbana 49/a
00184 Roma

A special music concert will be held Monday evening featuring a performance by the Greg Burk Jazz Trio. As a unique aspect of the concert, Robert Schulein will be making a combined binaural and HD video recording for download from the AES web site following the convention. The audio/video recordings will be formatted for both headphone and loudspeaker listening (via crosstalk cancellation) for viewing by AES members. This will allow attendees an opportunity to attend the event live and recreate the experience later by means of a surround audio with video recording process compatible with today’s consumer viewing/listening technology.

Pianist/composer/educator Greg Burk has lived, studied, educated, and performed around the world. Over the last decade, living and working in Detroit, Boston, and now Rome, he has established himself not only as a vital sideman, but also as a leader with the rare ability to combine original compositions with memorable melodies and a desire to push the creative envelope as an improviser.


Session P16 Tuesday, May 7 09:00 – 12:30 Sala Carducci

SPATIAL AUDIO—PART 2: 3-D MICROPHONE AND LOUDSPEAKER SYSTEMS

Chair: Filippo Maria Fazi, University of Southampton, Southampton, Hampshire, UK

09:00

P16-1 Recording and Playback Techniques Employed for the “Urban Sounds” Project—Angelo Farina, Andrea Capra, Alberto Amendola, Simone Campanini, University of Parma, Parma, Italy

The “Urban Sounds” project, born from a cooperation between the Industrial Engineering Department at the University of Parma and the municipal institution La Casa della Musica, aims to record characteristic soundscapes in the town of Parma with a dual purpose: delivering to posterity an archive of recorded sound fields documenting Parma in 2012, employing advanced 3-D surround recording techniques, and creating a “musical” Ambisonics composition leading the audience through a virtual tour of the town. The archive includes recordings of various soundscapes considered unique to the town, such as streets, squares, schools, churches, meeting places, public parks, the train station, and the airport. This paper details the advanced digital sound processing techniques employed for recording, processing, and playback.
Convention Paper 8903

09:30

P16-2 Robust 3-D Sound Source Localization Using Spherical Microphone Arrays—Carl-Inge Colombo Nilsen,1,2 Ines Hafizovic,2 Sverre Holm1
1University of Oslo, Oslo, Norway
2Squarehead Technology AS, Oslo, Norway

Spherical arrays are gaining increased interest in spatial audio reproduction, especially in Higher Order Ambisonics. In many audio applications sound source detection and localization are of crucial importance, calling for robust localization methods suitable for spherical arrays. The well-known direction-of-arrival estimator, the ESPRIT algorithm, is not directly applicable to spherical arrays for 3-D applications. The eigenbeam ESPRIT (EB-ESPRIT) is based on the spherical harmonics framework and is derived especially for spherical arrays. However, the ESPRIT method is in general susceptible to errors in the presence of correlated sources, and spatial decorrelation is not possible for spherical arrays. In our new implementation of spherical harmonics-based ESPRIT (SA2ULA-ESPRIT) robustness against correlation is achieved by spatial decorrelation incorporated directly into the algorithm formulation. The simulated performance of the new algorithm is compared to EB-ESPRIT.
Convention Paper 8904

10:00

P16-3 Parametric Spatial Audio Coding for Spaced Microphone Array Recordings—Archontis Politis,1 Mikko-Ville Laitinen,1 Jukka Ahonen,2 Ville Pulkki1
1Aalto University, Espoo, Finland
2Akukon Ltd., Helsinki, Finland

Spaced microphone arrays for multichannel recording of music performances, when reproduced on a multichannel system, exhibit reduced inter-channel coherence that translates perceptually into a pleasant “enveloping” quality, at the expense of accurate localization of sound sources. We present a method to process spaced microphone recordings using the principles of Directional Audio Coding (DirAC), based on knowledge of the array configuration and the frequency-dependent microphone patterns. The method achieves quality equal to or better than the standard high-quality version of DirAC, and it improves on the common one-to-one channel playback of spaced multichannel recordings by offering improved and stable localization cues.
Convention Paper 8905

10:30

P16-4 Optimal Directional Pattern Design Utilizing Arbitrary Microphone Arrays: A Continuous-Wave Approach—Symeon Delikaris-Manias, Constantinos A. Valagiannopoulos, Ville Pulkki, Aalto University, Espoo, Finland

A frequency-domain method is proposed for designing directional patterns from arbitrary microphone arrays employing the complex Fourier series. A target directional pattern is defined, and an optimal set of sensor weights is determined in a least-squares sense, adopting a continuous-wave approach. It is based on discrete measurements with a high spatial sampling rate, which mitigates potential aliasing effects. Fourier analysis is a common method for audio signal decomposition; in this approach, however, a set of criteria is employed to define the optimal number of Fourier coefficients and microphones for the decomposition of the microphone array signals at each frequency band. Furthermore, low-frequency robustness is increased by smoothing the target patterns at those bands. The performance of the algorithm is assessed by calculating the directivity index and the sensitivity. Applications such as virtual microphone synthesis, beamforming, and binaural and loudspeaker rendering are presented.
Convention Paper 8906
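The least-squares step can be sketched for an assumed uniform circular array of omnidirectional microphones; the paper's criteria for choosing the number of Fourier coefficients, and its low-frequency smoothing, are omitted:

```python
import numpy as np

# Hypothetical uniform circular array: 8 omni microphones, 5 cm radius.
M, R, c, f = 8, 0.05, 343.0, 2000.0
k = 2 * np.pi * f / c
mic_ang = 2 * np.pi * np.arange(M) / M
theta = np.linspace(0, 2 * np.pi, 360, endpoint=False)   # dense angle grid

# Plane-wave steering matrix: response of each mic to each arrival angle.
A = np.exp(1j * k * R * np.cos(theta[:, None] - mic_ang[None, :]))

# Target directional pattern: a cardioid aimed at 0 rad.
target = 0.5 + 0.5 * np.cos(theta)

# Sensor weights in a least-squares sense, and the synthesized pattern.
w, *_ = np.linalg.lstsq(A, target.astype(complex), rcond=None)
pattern = np.abs(A @ w)
```

With a first-order target and eight sensors at this frequency the fit is essentially exact; real designs must also handle the frequencies where the array becomes ill-conditioned.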

11:00

P16-5 Layout Remapping Tool for Multichannel Audio Productions—Tim Schmele,1 David García-Garzón,2 Umut Sayin,1 Davide Scaini,1,2 Daniel Arteaga1,2
1Fundació Barcelona Media, Barcelona, Spain
2Universitat Pompeu Fabra, Barcelona, Spain

Several multichannel audio formats are present in the recording industry, with reduced interoperability among them. This diversity of formats leaves the end user with limited accessibility to content and/or audience. In addition, the preservation of recordings made for a particular format comes under threat should the format become obsolete. To tackle such issues, we present a layout-to-layout conversion tool that allows recordings designated for one particular layout to be converted to any other layout. This is done by decoding the existing recording to a layout-independent equivalent and then encoding it to the desired layout through different rendering methods. The tool has proven useful according to expert opinions. Simulations show that several consecutive conversions decrease spatial accuracy and increase overall gain, suggesting that consecutive conversions should be avoided and only a single conversion from the originally rendered material should be made.
Convention Paper 8907
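The decode/re-encode round trip can be sketched with a 2-D first-order ambisonic signal as a stand-in for the layout-independent equivalent; the paper's actual intermediate representation and rendering methods may differ:

```python
import numpy as np

def encoding_matrix(az):
    # 2-D first-order ambisonic encoding (W, X, Y) of plane waves
    # arriving from the given loudspeaker azimuths (radians).
    return np.stack([np.full_like(az, 1 / np.sqrt(2)), np.cos(az), np.sin(az)])

def layout_to_bformat(chans, az):
    # Treat each loudspeaker feed as a plane wave and encode it.
    return encoding_matrix(az) @ chans

def bformat_to_layout(bfmt, az):
    # Mode-matching decode: pseudo-inverse of the target layout's
    # encoding matrix.
    return np.linalg.pinv(encoding_matrix(az)) @ bfmt

def remap(chans, az_src, az_dst):
    return bformat_to_layout(layout_to_bformat(chans, az_src), az_dst)
```

Only the first-order sound-field description survives the round trip, not the original channel feeds, which illustrates why repeated conversions lose spatial accuracy.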

11:30

P16-7 Discrimination of Changing Loudspeaker Positions of 22.2 Multichannel Sound System Based on Spatial Impressions—Ikuko Sawaya, Kensuke Irie, Takehiro Sugimoto, Akio Ando, Kimio Hamasaki, Science & Technology Research Laboratories, Japan Broadcasting Corp., Setagaya, Tokyo, Japan

In 22.2 multichannel reproduction, sounds are sometimes reproduced by a loudspeaker arrangement different from the one used in production, and the difference in spatial impression between the two is not always clearly noticeable. In this paper we discuss the effects on spatial impression of moving some of the loudspeakers away from their reference positions in a 22.2 multichannel sound system. Subjective evaluation tests were carried out by altering the azimuth and elevation angles from the reference in each condition. Experimental results showed that listeners did not notice a difference in spatial impression from the reference for some loudspeaker arrangements. On the basis of these results, the ranges over which spatial impression remains equivalent to the reference are discussed for material carrying signals on all channels of the 22.2 multichannel sound system.
Convention Paper 8909

12:00

P16-8 Modeling Sound Localization of Amplitude-Panned Phantom Sources in Sagittal Planes—Robert Baumgartner, Piotr Majdak, Austrian Academy of Sciences, Vienna, Austria

Localization of sound sources in sagittal planes (front/back and top/down) relies on listener-specific monaural spectral cues. A functional model approximating human processing of spectro-spatial information is applied to assess localization of phantom sources created by amplitude panning of coherent loudspeaker signals. Based on model predictions we investigated the localization of phantom sources created by loudspeakers positioned in the front and in the back, the effect of loudspeaker span and panning ratio on localization performance in the median plane, and the amount of spatial coverage provided by common surround sound systems. Our findings are discussed in the light of previous results from psychoacoustic experiments.
Convention Paper 8910

Tutorial 15 Tuesday, May 7 09:00 – 11:00 Sala Foscolo

PARALLELIZATION OF AUDIO SIGNAL PROCESSING ALGORITHMS ON MULTICORE DSPS

Presenters: Carsten Tradowsky, Karlsruhe Institute of Technology (KIT), Karlsruhe, Germany; CTmusic, Karlsruhe, Germany
Jochen Schyma, Texas Instruments

Today, digital audio systems for music production are restricted because they do not exhaust the possibilities offered by modern hardware. One way to close this productivity gap is a high-level, model-based development approach.

This tutorial presents a concept for using a model-based development approach to describe audio signal processing algorithms. This description is used to compile C code from the model for evaluation on the target platform. The generated C code is then parallelized using the standard OpenMP compiler in TI’s Code Composer Studio (CCS) development environment.

The goal of this tutorial is to present the development of audio signal processing algorithms using Matlab, CCS, and the TMS320C6678, the eight-core Digital Signal Processor (DSP) from Texas Instruments (TI).

Workshop 9 Tuesday, May 7 09:00 – 11:30 Auditorium Loyola

ACOUSTIC ENHANCEMENT SYSTEMS—IMPLEMENTATIONS

Chair: Ben Kok, SCENA acoustic consultants, Uden, The Netherlands

Panelists: Steve Barbar
Peter Mapp
Thomas Sporer
Takayuki Watanabe
Wieslaw Woszczyk

Acoustic enhancement systems offer the possibility of changing the acoustics of a venue by electronic means. How this is achieved varies with the working principle and philosophy of the system implemented. In this workshop various researchers, consultants, and suppliers active in the field of enhancement systems will discuss working principles and implementations.

This workshop is closely related to the tutorial on acoustic enhancement systems; those not yet familiar with the applications and working principles of these systems are encouraged to attend the tutorial before attending the workshop.

Tuesday, May 7 09:00 Sala Saba

Technical Committee Meeting on Archiving, Restoration, and Digital Libraries


Session P17 Tuesday, May 7 10:00 – 11:30 Foyer

POSTERS: MEASUREMENTS AND MODELING

10:00

P17-1 Estimation of Overdrive in Music Signals—Lasse Vetter, Michael J. Terrell, Andrew J. R. Simpson, Andrew McPherson, Queen Mary University of London, London, UK

In this paper we report experimental and modeling results from an investigation of listeners’ ability to estimate overdrive in a signal. The term overdrive is used to characterize the result of systematic, level-dependent nonlinearity typical of audio equipment and processors (e.g., guitar amplifiers). Listeners (N=7) were given the task of estimating the degree of overdrive in music signals that had been processed with a static, saturating nonlinearity to introduce varying degrees of nonlinear distortion. A statistical model is proposed to account for the data, which is based on a measure of time-variance in the summed frequency-response deviation introduced by the nonlinearity. This provides a useful “black-box” metric that describes the perceived amount of overdrive introduced by an audio processing device.
Convention Paper 8911

10:00

P17-2 Mobile Audio Measurements Platform: Toward Audio Semantic Intelligence into Ubiquitous Computing Environments—Lazaros Vrysis, Charalampos A. Dimoulas, George M. Kalliris, George Papanikolaou, Aristotle University of Thessaloniki, Thessaloniki, Greece

This paper presents the implementation of a mobile software environment that provides a suite of professional-grade audio and acoustic analysis tools for smartphones and tablets. The suite includes sound level monitoring, real-time time-frequency analysis, reverberation time, and impulse response measurements, while feature-based intelligent content analysis is deployed in terms of long-term audio event detection and segmentation. The paper investigates the implementation of a flexible and user-friendly environment that can be easily used by non-specialists, providing the functionality and fidelity of special-purpose professional devices while eliminating mobile-interfacing and hardware limitations. Emphasis is given to the integration of additional capabilities that offer valuable amenities to the user, concerning the management of measurement sessions and intelligent cloud-based semantic analysis.
Convention Paper 8912

10:00

P17-3 System Identification Based on Hammerstein Models Using Cubic Splines—Michele Gasparini, Andrea Primavera, Laura Romoli, Stefania Cecchi, Francesco Piazza, Università Politecnica delle Marche, Ancona (AN), Italy

Nonlinear system modeling plays an important role in the field of digital audio systems, as most real-world devices show nonlinear behavior. Among nonlinear models, Hammerstein systems are nonlinear systems composed of a static nonlinearity cascaded with a linear filter. In this paper a novel approach for estimating the static nonlinearity is proposed, based on the introduction of an adaptive Catmull-Rom cubic spline, in order to overcome problems related to the adaptation of the high-order polynomials needed to identify highly nonlinear systems. Experimental results confirm the effectiveness of the approach, including comparisons with existing state-of-the-art techniques.
Convention Paper 8913
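The static part of such a model can be sketched with a uniform Catmull-Rom spline whose knot values would be the adapted parameters; the adaptation scheme itself is omitted, and the knot grid here is a fixed assumption:

```python
import numpy as np

def catmull_rom(x, knots_x, knots_y):
    # Evaluate a uniform Catmull-Rom spline at points x.
    # knots_x is a uniform grid; knots_y are the control values
    # (the parameters that an identification scheme would adapt).
    x = np.clip(x, knots_x[1], knots_x[-2] - 1e-12)
    h = knots_x[1] - knots_x[0]
    i = ((x - knots_x[0]) // h).astype(int)         # segment index
    t = (x - knots_x[i]) / h                        # local coordinate
    p0, p1, p2, p3 = (knots_y[i - 1], knots_y[i],
                      knots_y[i + 1], knots_y[i + 2])
    return 0.5 * (2 * p1 + (-p0 + p2) * t
                  + (2 * p0 - 5 * p1 + 4 * p2 - p3) * t ** 2
                  + (-p0 + 3 * p1 - 3 * p2 + p3) * t ** 3)

def hammerstein(x, knots_x, knots_y, fir):
    # Static spline nonlinearity cascaded with a linear FIR filter.
    return np.convolve(catmull_rom(x, knots_x, knots_y), fir)[:len(x)]
```

Catmull-Rom splines interpolate their control points and reproduce linear data exactly, which makes the sketch easy to sanity-check.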

10:00

P17-4 Impulse Responses Measured with MLS or Swept-Sine Signals: A Comparison between the Two Methods Applied to Noise Barrier Measurements—Paolo Guidorzi, Massimo Garai, University of Bologna, Bologna, Italy

A sound source and a microphone grid are used to measure a set of impulse responses for estimating the in-situ acoustical characteristics of noise barriers (sound reflection and airborne sound insulation), following the CEN/TS 1793-5 European standard guidelines as improved by the European project QUIESST. The impulse responses are measured using MLS (Maximum Length Sequence) and swept-sine signals. The acoustical characteristics of the noise barrier obtained with the two signals are generally equivalent, but in some critical measurement conditions a discrepancy can be found. Differences between, and relative advantages of, the MLS and swept-sine methods are analyzed and discussed in this paper.
Convention Paper 8914
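The swept-sine side of the comparison can be sketched with Farina's exponential sweep and its inverse filter; this is a minimal version without fade windows or regularization, not the measurement chain of the paper:

```python
import numpy as np

def ess_pair(f1, f2, dur, fs):
    # Exponential sine sweep and its inverse filter: the time-reversed
    # sweep with a decaying amplitude envelope that whitens the
    # combined sweep * inverse response.
    t = np.arange(int(dur * fs)) / fs
    L = dur / np.log(f2 / f1)
    sweep = np.sin(2 * np.pi * f1 * L * np.expm1(t / L))
    inv = sweep[::-1] * np.exp(-t / L)
    return sweep, inv

def measure_ir(system, f1=50.0, f2=3000.0, dur=1.0, fs=8000):
    # Play the sweep through the system and convolve the response with
    # the inverse filter; the linear IR peaks len(sweep)-1 samples in.
    sweep, inv = ess_pair(f1, f2, dur, fs)
    return np.convolve(system(sweep), inv)
```

A practical advantage of the sweep method is that harmonic distortion products are deconvolved to earlier times than the linear response, so they can be windowed out.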

10:00

P17-5 Polar Measurements of Harmonic and Multitone Distortion of Direct Radiating and Horn Loaded Transducers—Mattia Cobianchi,1 Fabrizio Mizzoni,2 Aurelio Uncini2
1Lavoce Italiana, Colonna (RM), Italy
2Sapienza University of Rome, Rome, Italy

While extensive literature is available on the topic of polar pattern measurements and predictions of loudspeakers’ fundamental SPL, only a single paper to our knowledge deals with the polar pattern of nonlinear distortions, in particular with the harmonic distortion products of cone-type loudspeakers. This paper contains the first results of a more thorough study intended to fill this gap in both measurement techniques and loudspeaker types. Relative and absolute harmonic distortion, as well as relative and absolute multitone distortion, have been measured for cone, dome, and horn-loaded transducers.
Convention Paper 8915

10:00

P17-6 An Efficient Nonlinear Acoustic Echo Canceller for Low-Cost Audio Devices—Danilo Comminiello,1 Antonio Grosso,2 Fabio Cagnetti,2 Aurelio Uncini1
1Sapienza University of Rome, Rome, Italy
2bdSound, Milan, Italy

One of the most challenging problems in modeling acoustic channels is the presence of nonlinearities generated by loudspeakers. This problem has become even more widespread due to the growing availability of low-cost devices that introduce larger amounts of distortion and decrease the quality of hands-free speech communication. To reduce the effect of loudspeaker nonlinearities on speech quality, nonlinear acoustic echo cancellers are adopted. In this paper we present a new adaptive filtering structure for the reduction of nonlinearities in the acoustic path, based on a nonlinear transformation of the input audio signal by means of functional links. We use this nonlinear model in conjunction with a linear filter, providing a nonlinear adaptive architecture for acoustic applications. Experimental results prove the effectiveness of the proposed model in reducing the loudspeaker nonlinearities affecting speech signals.
Convention Paper 8916
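A minimal functional-link canceller can be sketched as follows; the trigonometric expansion, filter length, and step size are assumptions for illustration, not the authors' exact design:

```python
import numpy as np

def functional_links(buf, order=2):
    # Trigonometric functional-link expansion of a tap buffer:
    # [buf, sin(pi buf), cos(pi buf), sin(2 pi buf), cos(2 pi buf), ...].
    feats = [buf]
    for p in range(1, order + 1):
        feats.append(np.sin(np.pi * p * buf))
        feats.append(np.cos(np.pi * p * buf))
    return np.concatenate(feats)

def nlms_aec(far, mic, taps=4, order=2, mu=0.5, eps=1e-6):
    # NLMS adaptation of a linear combiner over the functional-link
    # expansion of the far-end signal; returns the residual
    # (echo-cancelled) microphone signal.
    w = np.zeros(taps * (1 + 2 * order))
    e = np.zeros(len(mic))
    for n in range(taps - 1, len(mic)):
        u = functional_links(far[n - taps + 1:n + 1][::-1], order)
        r = mic[n] - w @ u
        w += mu * r * u / (u @ u + eps)
        e[n] = r
    return e
```

With a synthetic loudspeaker nonlinearity chosen to lie in the span of the expansion, the canceller converges and achieves substantial echo return loss enhancement.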

10:00

P17-7 Radiation Pattern Differences between Electric Guitar Amplifiers—Justin Mathew, Stephen Blackmore, New York University, New York, NY, USA

Various aspects of electric guitar amplifiers differentiate them from one another. Two of the major differentiating characteristics are frequency response and unique radiation patterns. These characteristics are directly related to differences in shape, size, and circuit configuration between different guitar amplifier models. In this paper the differences in radiation patterns of multiple guitar amplifiers are presented, as well as a method of classifying the differences.
Convention Paper 8917

Tuesday, May 7 10:00 Sala Saba

Technical Committee Meeting on Audio Recording and Mastering Systems

Tuesday, May 7 10:00 Sala Montale

Standards Committee Meeting, AESSC Plenary

Professional Training Session 5 Tuesday, May 7, 10:30 – 12:30 Sala Manzoni

AUDIO DESIGN WORKSHOP LIVE—PART 1

Presenters: Peter Larsen, LOUDSOFT; John Richards, Oxford Digital

10:30 – 11:30 Designing Transducers for Compact Active Speakers, Peter Larsen, LOUDSOFT

The LOUDSOFT FINE software suite for Loudspeaker Design and Test & Measurement is a circle of design tools, where each component supports the others. With these powerful FINE programs you can design all parts of the speaker system. The FINE R+D analysis/measurement system and the FINE QC control system supplement the design software. The Rub & Buzz function is the best on the market, finding 100% of failures in both active USB speakers and micro speakers. These tools are rapid and effective in modern transducer and speaker system development, and the simulations very closely match measurements of the devices when built.

11:30 – 12:30 Optimizing Compact Loudspeaker Performance—The Role of DSP, John Richards, Oxford Digital

The form factor of all consumer devices from cell phones to flat panel TVs is shrinking, and often the design is under the control of a stylist rather than anyone who understands acoustics. This, together with pressure on the bill-of-materials cost and increasing customer expectations, creates real challenges in obtaining good, differentiating audio quality in the product. DSP can mitigate many of the inherent deficiencies present. The tutorial shows how a combination of an integrated suite of audio effects together with automated frequency correction can be used to radically improve sound quality while providing rapid time to market.

Session EB3 Tuesday, May 7, 11:15 – 13:00 Sala Foscolo

ENGINEERING BRIEFS—PAPERS: PART 2

Chair: Brett Leonard, McGill University, Montreal, Quebec, Canada

11:15

EB3-1 Spatial Sound Reinforcement Using Wave Field Synthesis—Etienne Corteel,1 Hubert Westkemper,2 Cornelius Ihssen,3 Khoa-Van Nguyen1; 1Sonic Emotion Labs, Paris, France; 2Independent Tonmeister, Naples, Italy; 3Sonic Emotion Labs, Oberglatt, Switzerland

Spatial audio in sound reinforcement remains an open topic, requiring good level coverage and at the same time good localization accuracy over a very large listening area, typically the entire audience. Wave Field Synthesis offers high localization accuracy over an extended listening area, but the number of required loudspeakers, their placement on stage, and the resulting level coverage can be problematic. The paper addresses these issues, presenting a case study of a sound reinforcement system based on Wave Field Synthesis for the play “The Panic,” written by Rafael Spregelburd and directed by Luca Ronconi. The paper describes an improved Wave Field Synthesis rendering for sound reinforcement involving two arrays of loudspeakers at different heights, and addresses the practical implementation of the system in a theater and the overall installation: miking, real-time tracking of actors, and the loudspeakers used. Engineering Brief 91

11:30

EB3-2 Sync-AV—Workflow Tool for File-Based Video Shootings—Andreas Fitza, University of Applied Sciences Mainz, Mainz, Germany

The Sync-AV workflow tool eases the sorting and synchronization of video and audio footage without the need for expensive special hardware. It supports preproduction, shooting, and postproduction, and consists of three elements: a script-information and metadata-gathering iOS app that is synchronized with a server back-end and can be used to exchange information on-set; a server database with a web front-end that can sort files by their metadata, show dailies, and be used to distribute and manage information during pre-production; and a local import client that manages the footage ingest and sorts the files together. The client also synchronizes the video (which contains audio) with the separately recorded audio files, renames the files, and embeds the metadata. Engineering Brief 92

11:45

EB3-3 On the Optimum Microphone Array Configuration for Height Channels—Hyunkook Lee, Christopher Gribben, University of Huddersfield, Huddersfield, West Yorkshire, UK

To date no experimental data have been presented on the optimum microphone array configuration for new surround formats employing height channels. A series of subjective listening tests was conducted to investigate how the spacing between base and height microphones affects perceived spatial impression and overall preference. Four different spacings of 0, 0.5, 1, and 1.5 m were compared for various sound sources using a 9-channel loudspeaker setup. For sources with more continuous temporal characteristics, the spacing between the layers did not have any significant effect on spatial impression, whereas for more transient sources the 0 m layer appeared to produce a greater spatial impression than more spaced layers. Furthermore, depending on source type, the 0 m layer was preferred to, or rated similarly to, the spaced layers. Engineering Brief 93

12:00

EB3-4 Free Improv—The Hard Way—Justin Paterson, London College of Music, University of West London, London, UK

As a fringe genre, “Free Improvisation” does not normally attract large production budgets. Often time-constrained, the subsequent technological approach to the production tends to emphasize the naturalistic and neglects many of the tools and techniques that are commonplace in contemporary popular music. The author produced the album The Making of Quiet Things by The Number (featuring Keith Tippett). This album consciously employed a range of contemporary approaches such as creative and corrective automation, reverberation-matching, audio editing, and extreme compression, while maintaining an overall impression of minimal mediation. This paper considers and contextualizes such an approach, reflecting on the practice and its implications for the genre. Engineering Brief 94

12:15

EB3-5 International Experiences from a New Sound System Approach—Thomas Lagö, Alan Boyer, QirraSound Technologies LLC, Las Vegas, NV, USA

QirraSound's new sound system approach benefits from a high level of intelligibility and substantially lower feedback. These properties help in placing loudspeakers behind the performers, thus minimizing the need for monitor speakers. Substantial empirical testing has been done in multiple countries and applications, and results from these tests will be presented. Listeners and performers report increased feel and intelligibility, and even people with hearing loss and/or sensitivity to high sound levels can enjoy the sound. It has also been noticed that the ability to talk while music is playing is much better than with classical systems. An overall outline of these test results and feedback will be reported. Engineering Brief 95

12:30

EB3-6 Nonlinear Guitar Loudspeaker Simulation—Thomas Schmitz, Jean J. Embrechts, University of Liege, Liege, Belgium

In this study we simulated in real time the sound of a guitar amplifier loudspeaker, including its non-linear behavior. The simulation method is based on a non-linear convolution of the signal emitted by the instrument with the Volterra kernels, which were measured in anechoic conditions with a sine-sweep technique. The model has been implemented in a “VST” (Virtual Studio Technology) audio plugin. The loudspeaker simulation can be performed in real time with the Volterra kernels up to the third order and offers good accuracy. Informal tests revealed that the simulated and the real sound were very close, although approximately 50 percent of the tested musicians were still able to hear a small difference. Engineering Brief 96
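The nonlinear convolution with Volterra kernels that this brief relies on reduces, in a second-order truncation, to a linear convolution plus a double sum with a quadratic kernel. A direct-form sketch follows, with toy kernel values rather than measured ones, and without the sine-sweep measurement step:

```python
def volterra_output(x, h1, h2):
    """Second-order truncated Volterra series:
    y[n] = sum_k h1[k] x[n-k] + sum_{k1,k2} h2[k1][k2] x[n-k1] x[n-k2].
    Direct (unoptimized) evaluation with causal truncation."""
    N, M = len(x), len(h1)
    y = [0.0] * N
    for n in range(N):
        for k in range(min(M, n + 1)):
            y[n] += h1[k] * x[n - k]
        for k1 in range(min(M, n + 1)):
            for k2 in range(min(M, n + 1)):
                y[n] += h2[k1][k2] * x[n - k1] * x[n - k2]
    return y

# Impulse through a toy system: linear taps [1.0, 0.5] plus a small
# quadratic self-term h2[0][0] = 0.1
y = volterra_output([1.0, 0.0, 0.0], [1.0, 0.5], [[0.1, 0.0], [0.0, 0.0]])
```

The quadratic term only affects the first output sample here, since x is an impulse and only h2[0][0] is nonzero.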

12:45

EB3-7 Using Low-Latency Net-Based Solutions to Extend the Audio and Video Capabilities of a Studio Complex—Paul Ferguson, Edinburgh Napier University, Edinburgh, UK

Two low-latency IP-based systems, RedNet and LOLA, were selected by Edinburgh Napier University to link a new music building with existing broadcast and drama facilities and to allow low-latency audio and video collaboration with other institutions/organizations around the world. First to be examined will be their use of Focusrite RedNet and Audinate Dante to expand existing AES10 (MADI) point-to-point links. Second, an overview will be provided of the University's ongoing research with JANET and GARR (the UK and Italy National Research and Education Networks) into the use of the Italian LOLA system (LOw LAtency audio visual streaming system) to provide long-distance audio and video links for rehearsal and performance involving musicians in different countries. Engineering Brief 97


Tutorial 16 Tuesday, May 7, 11:45 – 13:15 Auditorium Loyola

MICROPHONES: THE PHYSICS, METAPHYSICS, AND PHILOSOPHY

Presenter: Ron Streicher, Pacific Audio-Visual Enterprises, Pasadena, CA, USA

Before you can place the first microphone in the studio, you need to develop a clear understanding of the sound that you want to emanate from the loudspeakers when the project is finished. To do this, you need to determine which elements are essential for creating the “sonic illusion,” and then decide how to balance the often conflicting elements and competing demands of technology vs. art. Microphone techniques—although critical—are only a part of this process. Equally important are the criteria for monitoring and evaluating the results. Using recorded examples and practical demonstrations, the various aspects of this creative process are developed and brought into focus.

Professional Training Session 6 Tuesday, May 7, 13:30 – 15:30 Sala Manzoni

AUDIO DESIGN WORKSHOP LIVE—PART 2

Presenters: Anthony Waldron, Audio EMC; Simon Wollard, Prism Sound

13:30 – 14:30 Audio Power Amplifiers; EMC Best Practice Revealed, Anthony Waldron, Audio EMC

The performance of analog and (now) digital audio systems is approaching an all-time high—yet we still suffer bad bouts of noise, distortion, and interference on a regular basis. The physical reasons for poor audio performance are explained, along with practical solutions for designing audio amplifiers and systems to maximize signal integrity and be immune to noise and interference.

14:30 – 15:30 Audio System Analysis—Tips and Tricks to Verify Your Designs, Simon Wollard, Prism Sound

Active loudspeakers are becoming ever more complex in terms of their underlying architecture. The number of input interfaces has grown significantly in recent years; sophisticated signal processors have evolved to greatly enhance performance; amplifiers have grown more capable in terms of self-protection, output power, and efficiency; transducers are ever-improving in terms of smoother response and lower distortion. This practical discussion will teach the audience how to take advantage of modern digital audio analysis techniques to provide far greater and more rapid insight into the performance of active loudspeakers than was previously possible with classical audio measurement techniques.

Session P18 Tuesday, May 7, 14:30 – 16:30 Sala Carducci

SPATIAL AUDIO—PART 3: AMBISONICS, WFS

Chair: Symeon Delikaris-Manias, Aalto University, Helsinki, Finland

14:30

P18-1 An Ambisonics Decoder for Irregular 3-D Loudspeaker Arrays—Daniel Arteaga, Fundacio Barcelona Media, Barcelona, Spain; Universitat Pompeu Fabra, Barcelona, Spain

We report on the practical implementation of an Ambisonics decoder for irregular 3-D loudspeaker layouts. The developed decoder, which uses a non-linear search algorithm to look for the optimal Ambisonics coefficients for each loudspeaker, has a number of features specially tailored for reproduction in real-world 3-D audio venues (for example, special 3-D audio installations, concert halls, audiovisual installations in museums, etc.). In particular, it performs well even for highly irregular speaker arrays, giving an acceptable listening experience over a large audience area. Convention Paper 8918
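For context, the simplest alternative to the paper's nonlinear search is a closed-form sampling (projection) decoder, which derives each loudspeaker's gain row directly from its direction. The sketch below is this textbook baseline for a 2-D first-order layout, not the optimized decoder described above:

```python
import math

def foa_decode_gains(spk_azimuths):
    """Sampling/projection decoder rows for 2-D first-order B-format
    (W, X, Y): each row is [1/sqrt(2), cos(az), sin(az)], following the
    customary -3 dB W convention."""
    return [[1.0 / math.sqrt(2.0), math.cos(az), math.sin(az)]
            for az in spk_azimuths]

def decode(bformat, gain_rows):
    """Apply the decoder rows to one B-format sample (W, X, Y)."""
    return [sum(g * b for g, b in zip(row, bformat)) for row in gain_rows]

# Plane wave from the front (azimuth 0) into a square layout
speakers = [0.0, math.pi / 2, math.pi, 3 * math.pi / 2]
feeds = decode([1.0 / math.sqrt(2.0), 1.0, 0.0], foa_decode_gains(speakers))
```

The front loudspeaker receives the largest feed, as expected; an irregular layout is exactly where this naive decoder degrades and a search-based optimization of the coefficients pays off.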

15:00

P18-2 The Effect of Low Frequency Reflections on Stereo Images—Jamie A. S. Angus, University of Salford, Salford, Greater Manchester, UK

This paper looks at the amount of absorption required to adequately control early reflections in reflection-controlled environments at low frequencies (< 700 Hz), where the Interaural Time Delay (ITD) cue is important. Most work has focused on wideband energy-time performance. In particular, it will derive the effect a given angle and strength of reflection has on the phantom image location using the Blumlein equations. These allow the effect of reflections as a function of frequency to be quantified. It will show that the effect of reflections is comparatively small for floor and ceiling reflections, but that lateral reflections depend on the phantom image location and get worse the more off-center the phantom image becomes. Convention Paper 8919 [This paper was not presented but is available for purchase]
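The kind of phantom-image prediction that a Blumlein-style analysis builds on can be illustrated with the classical stereophonic sine law, which maps an interchannel gain ratio to a perceived azimuth. This is the standard textbook relation, not the paper's frequency-dependent derivation:

```python
import math

def phantom_azimuth(g_left, g_right, spk_half_angle_deg=30.0):
    """Classical stereophonic 'sine law':
    sin(phi) = (gL - gR)/(gL + gR) * sin(phi0),
    for loudspeakers at +/- phi0; positive result points toward the left
    loudspeaker. Illustrative only."""
    ratio = (g_left - g_right) / (g_left + g_right)
    s = ratio * math.sin(math.radians(spk_half_angle_deg))
    return math.degrees(math.asin(s))
```

Equal gains place the image at center; feeding only one channel pushes it to that loudspeaker. An early lateral reflection effectively perturbs these gains (and their phase) at each frequency, which is how it shifts the phantom image.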

15:30

P18-3 Parametric Spatial Audio Reproduction with Higher-Order B-Format Microphone Input—Ville Pulkki,1 Archontis Politis,1 Giovanni Del Galdo,2 Achim Kuntz3; 1Aalto University, Aalto, Finland; 2Ilmenau University of Technology, Ilmenau, Germany; 3Fraunhofer Institute for Integrated Circuits IIS, Erlangen, Germany

A time-frequency-domain non-linear parametric method for spatial audio processing is presented here, which can utilize microphone input having directional patterns of any order. The method is based on dividing the sound field into overlapping or non-overlapping sectors. Local pressure and velocity signals are measured within each sector, and individual Directional Audio Coding (DirAC) processing is performed for each sector. It is shown that in certain acoustically complex conditions the sector-based processing enhances the quality compared to traditional first-order DirAC processing. Convention Paper 8920
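The first-order DirAC analysis that the sector-based method generalizes estimates direction of arrival from the time-averaged active intensity vector, formed from the pressure (W) and velocity (X, Y) signals. A minimal 2-D sketch per analysis frame (the higher-order sector processing in the paper is more involved):

```python
import math

def dirac_azimuth(w, x, y):
    """First-order DirAC direction estimate for one frame: azimuth of the
    time-averaged active intensity vector (mean of W*X, mean of W*Y)."""
    ix = sum(wn * xn for wn, xn in zip(w, x))
    iy = sum(wn * yn for wn, yn in zip(w, y))
    return math.degrees(math.atan2(iy, ix))

# Synthetic broadband plane wave arriving from 45 degrees
az = math.radians(45.0)
sig = [math.sin(0.2 * n) for n in range(256)]
w = sig
x = [math.cos(az) * s for s in sig]
y = [math.sin(az) * s for s in sig]
doa = dirac_azimuth(w, x, y)
```

For a single plane wave the estimate recovers the true azimuth; with several simultaneous sources the single intensity vector averages them, which is precisely the failure mode the sector-based analysis mitigates.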

16:00

P18-4 Wave Field Synthesis of Virtual Sound Sources with Axisymmetric Radiation Pattern Using a Planar Loudspeaker Array—Filippo Maria Fazi,1 Ferdinando Olivieri,1 Thibaut Carpentier,2 Markus Noisternig2; 1University of Southampton, Southampton, Hampshire, UK; 2UMR STMS IRCAM-CNRS-UPMC, Paris, France

A number of methods have been proposed for the application of Wave Field Synthesis to the reproduction of sound fields generated by point sources that exhibit a directional radiation pattern. However, a straightforward implementation of these solutions involves a large number of real-time operations that may lead to very high computational load. This paper proposes a simplified method to synthesize virtual sources with axisymmetric radiation patterns using a planar loudspeaker array. The proposed simplification relies on the symmetry of the virtual source radiation pattern and on the far-field approximation, although a near-field formula is also derived. The mathematical derivation of the method is presented and numerical simulations validate the theoretical results. Convention Paper 8921

Session EB4 Tuesday, May 7, 14:30 – 16:00 Foyer

ENGINEERING BRIEFS—POSTERS: PART 2

14:30

EB4-1 Spatial Acoustic Synthesis—Celambarasan Ramasamy, Mind Theatre, Pasadena, CA, USA

This method leverages binaural sound synthesis to present a novel way for listeners to experience musical performances on headphones. By separating out the notes from a musical instrument and synthesizing them individually in the space around the listener, the musical instrument is turned from a point source into a flowing musical volume that engulfs the listener. This can lead to interesting ways of thinking about a musical performance in terms of the distribution of the individual musical notes around the listener, without being tied down to the notion of an individual musical instrument. Engineering Brief 98

14:30

EB4-2 Enhancing the Learning of Stereo Microphone Techniques through the Use of a Simulated Learning Environment—Colin Dodds, Perth College, University of the Highlands and Islands, Perth, Scotland, UK

A key set of skills for aspiring recording engineers to acquire is that of good stereo microphone techniques. Within the world of education it is relatively straightforward to present the underpinning theory, but helping students gain the tacit knowledge necessary to achieve quality results can be difficult when access to suitable groups of musicians, spaces, and equipment is limited. A suitable learning environment was simulated within a computer program and tasks set to support the learning of stereo microphone techniques. A trial carried out on first-year undergraduate sound production students revealed that using the simulated learning environment enhanced both the students' knowledge of, and ability to apply, stereo microphone techniques. Engineering Brief 99

14:30

EB4-3 The Good Vibrations Problem—Derry FitzGerald, Dublin Institute of Technology, Dublin, Ireland

Many of the Beach Boys' records were mono only, as this was Brian Wilson's preferred format. However, starting in the mid-1990s, stereo mixes of many of these classics were created by synchronizing the tracks from the instrumental multitrack with those of the vocal multitrack. Unfortunately, for a number of tracks, including “Good Vibrations,” elements of the multitracks were missing, making a true stereo mix impossible. This paper deals with how stereo extraction mixes were created for a number of Beach Boys' songs using sound source separation techniques to separate sources from the original mono recordings, which were then panned to create stereo mixes. These mixes were used in reissues of Beach Boys albums in 2012. Engineering Brief 100
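The final panning step described above — placing the separated mono sources into a stereo image — can be sketched with a constant-power pan law. The pan law and positions here are illustrative; the source separation itself is the hard part and is not shown:

```python
import math

def pan_stem(mono, pan):
    """Constant-power pan of a mono stem; pan in [-1 (full L), +1 (full R)].
    Gains trace a quarter circle so gl^2 + gr^2 == 1 at every position."""
    theta = (pan + 1.0) * math.pi / 4.0
    gl, gr = math.cos(theta), math.sin(theta)
    return [s * gl for s in mono], [s * gr for s in mono]

def stereo_mix(stems_with_pans):
    """Sum a list of (mono_stem, pan) pairs into left/right buffers."""
    n = len(stems_with_pans[0][0])
    left, right = [0.0] * n, [0.0] * n
    for stem, pan in stems_with_pans:
        l, r = pan_stem(stem, pan)
        for i in range(n):
            left[i] += l[i]
            right[i] += r[i]
    return left, right

# Two 'separated' one-sample stems panned hard left and hard right
L, R = stereo_mix([([1.0], -1.0), ([1.0], 1.0)])
```

Constant-power panning keeps each stem's perceived loudness roughly stable as it moves across the image, which matters when rebuilding a stereo field from mono-derived stems.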

14:30

EB4-4 Architectural Acoustics and Electroacoustics in the Asisium Theatre: An Integrated Construction Work—Luca Quaranta, CP Progetti S.r.l., Rome, Italy

Based on the room's multi-functional needs, from conference use to cinema with Dolby Surround sound, we analyzed the room's acoustic response using a 3-D simulation model, referenced to acoustic parameters according to ISO 3382, thus directing the architectural design choices to obtain optimum acoustical conditions both on stage and in the audience. The electroacoustic design provides various control technologies for equalization and a 6.1 Dolby Digital distribution of the amplified audio signal. In this contribution we present the choices and design criteria, materials, and audio devices used to obtain the final result, underlining the key role of integrated architectural-acoustics and electroacoustics design. Engineering Brief 101

14:30

EB4-5 Expressive Physical Modeling of Keyboard Instruments: From Theory to Implementation—Stefano Zambon,1 Leonardo Gabrielli,2 Balázs Bank3; 1Viscount International S.p.A., Mondaino (RN), Italy; 2Università Politecnica delle Marche, Ancona, Italy; 3Budapest University of Technology and Economics, Budapest, Hungary

Physics-based algorithms for sound synthesis have been extensively studied in the past decades. Nevertheless, their use in commercial synthesizers is still limited due to the difficulty of achieving realistic and easily controllable sounds with current technology. In this Engineering Brief we present an overview of the models used in Physis Piano, a digital piano recently introduced to the market with dedicated physics-based algorithms for acoustic pianos, electric pianos (e.g., Rhodes, Wurlitzer, and Clavinet), and chromatic percussion (e.g., vibraphone, marimba, xylophone). The synthesis algorithms, which are based on standard techniques such as Modal Synthesis and Digital Waveguides, have been highly customized in order to faithfully reproduce the sound features of the original instruments, and are easily controllable by a set of meaningful, user-friendly parameters. Engineering Brief 102
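The Digital Waveguide family mentioned above can be illustrated by its simplest member, the Karplus-Strong plucked string: a noise-filled delay line repeatedly smoothed by a two-point average. This toy model is far cruder than the customized algorithms in Physis Piano:

```python
import random

def karplus_strong(freq_hz, sr=8000, n_samples=4000, seed=0):
    """Minimal Karplus-Strong plucked string. The delay-line length sets the
    pitch; the two-point average acts as a loss filter, so each pass through
    the loop damps the higher harmonics and the tone decays naturally."""
    random.seed(seed)
    period = int(sr / freq_hz)
    buf = [random.uniform(-1.0, 1.0) for _ in range(period)]
    out = []
    for i in range(n_samples):
        j = i % period
        out.append(buf[j])
        buf[j] = 0.5 * (buf[j] + buf[(j + 1) % period])
    return out

tone = karplus_strong(200.0)
```

The first period of the output is the raw noise burst; later periods carry progressively less energy as the averaging filter rounds the waveform toward its mean.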

Workshop 10 Tuesday, May 7, 14:30 – 16:30 Sala Foscolo

MIKING FOR PA

Chair: Eddy B. Brixen, EBB-consult, Smørum, Denmark

Panelists: Giacome De Caterini, Casal Bauer, Italy; Igor Fiorini, VDM Group, Italy; Cathrine Sadolin, Complete Vocal Institute, Copenhagen, Denmark; Henrik Sadolin, Complete Vocal Institute, Copenhagen, Denmark

Miking for PA is a very important task. Providing amplification for the spoken or singing voice or an acoustical music instrument requires good knowledge about the sound source, about the PA system, about the monitoring system—and about the microphones. This workshop takes you through some of the important issues and decisions when selecting the microphone with regard to peak level capacity, sensitivity, directivity, frequency response, etc. Getting balance, getting definition, getting the right timbre or “sound”—and still avoiding acoustical feedback, that's the thing. Recognized engineers and sound designers will generously share their experiences from their work on the stages. Warning: some of the attendees may pick up ideas that will change their habits forever…

Student/Career Event
STUDENT DELEGATE ASSEMBLY MEETING—PART 2
Tuesday, May 7, 14:30 – 16:00
Auditorium Loyola

At this meeting the SDA will elect a new vice chair. One vote will be cast by the designated representative from each recognized AES student section in the Europe and International Regions. Judges' comments and awards will be presented for the Recording Competitions. Plans for future student activities at local, regional, and international levels will be summarized.

AES 134th Convention Papers and CD-ROM
* * * * *

ORDERING INFORMATION AND PRICES

United States: Orders must be prepaid by Visa, Mastercard, or American Express (AMEX) credit card or in U.S. dollars drawn on a U.S. bank. Make checks payable to Audio Engineering Society. Send completed order form to: Audio Engineering Society, 60 East 42nd Street, Suite 2520, New York, NY 10165-2520, USA. Tel. +1 212 661 8528, Fax +1 212 682 0477, E-mail [email protected], Web site www.aes.org.

Continental Europe: Send completed order form, WITHOUT payment but indicating method of payment (credit card, Eurocheque, or cheque) in Euros to: Michael Williams, Ile du Moulin, 62 bis Quai de l'Artois, FR-94170 Le Perreux sur Marne, France. Tel. +33 1 4881 4632, Fax +33 1 4706 0648, E-mail [email protected]. You will receive a pro forma invoice indicating details of payment.

United Kingdom: Orders must be prepaid. Make cheques payable to Audio Engineering Society. Send completed order form to: AES British Section, P. O. Box 645, Slough SL1 8BJ, UK. Tel. +44 1628 663725, Fax +44 1628 667002, E-mail [email protected].

134th CONVENTION PAPERS (single copy)

Member: _______ x $5 (USA) = _______

Nonmember: _______ x $20 (USA) = _______

134th CD-ROM (114 Convention Papers)

Member: _______ x $140 (USA) = _______

Nonmember: _______ x $190 (USA) = _______

Total Amount Enclosed: _______

AES Member Number: _______

Name: _______

Address: _______

City/State/Country and Zip or Country Code: _______

VISA/MC and AMEX: _______ Exp. Date: _______

Signature: _______