Transcript

Expanding the Scope of Signal Processing

[from THE EDITOR]Li Deng

Area Editor, Feature [email protected]

www.ee.Columbia.edu/spm

IEEE SIGNAL PROCESSING MAGAZINE [2] MAY 2008

This special section in thisissue of IEEE Signal Pro-cessing Magazine is dedicat-ed to the theme of spokenlanguage technology, explor-

ing “signal” processing techniques forboth speech and language. The articles inthis issue cover “language” and/or“understanding” aspects of “signal” pro-cessing, beyond the conventional scopeof speech processing. This forms a starkcontrast to the special section in theSeptember 2005, speech technology inhuman-machine communication, whereonly half of the articles dealt with lan-guage processing while the focus was onthe more traditional tasks of speech

analysis and recognition with the trendtowards speech understanding.

This change represents a significantnew development in our field of signalprocessing, where the conventional wayof thinking about the signal as low-level,numerical-valued information is tran-scended to a new perspective concerningthe importance of high-level, symbolic-valued informational sources (e.g., lan-guage or text) that embed theirunderlying semantic contents. The IEEESignal Processing Society’s Constitutionreads: “… The Field of Interest of theSociety shall be the theory and applica-tion of filtering, coding, transmitting,estimating, detecting, analyzing, recog-nizing, synthesizing, recording, andreproducing signals by digital or analog

devices or techniques. The term ‘signal’includes audio, video, speech, image,communication, geophysical, sonar,radar, medical, musical, and othersignals…” (article II) While reviewingthis, I would naturally classify “language”into one of the nontraditional “other sig-nals” (albeit its prominence as witnessedby the articles in this special section).

Article II of our constitution also listsa range of signal/information processingtasks or application areas, each associat-ed with a particular type of signal.Language as a signal is intimately relatedto the task of understanding (i.e., discov-ering the underlying meaning or seman-tics embedded in the signal), which hasnot appeared in the traditional list oftasks. Hence, if we expand the traditionalDigital Object Identifier 10.1109/MSP.2008.920380

IEEE SIGNAL PROCESSING MAGAZINEShih-Fu Chang, Editor-in-Chief — Columbia

University

AREA EDITORSLi Deng, Feature Articles — MicrosoftAdriana Dumitras, Columns and Forums —

Apple Inc.Doug Williams, Special Sections —

Georgia Institute of TechnologyMin Wu, e-Newsletter — University of Maryland

EDITORIAL BOARDAlex Acero — MicrosoftJohn G. Apostolopoulos — Hewlett-Packard

LaboratoriesPhilip A. Chou — Microsoft ResearchIngemar Cox — University College LondonSadaoki Furui — Tokyo Institute of Technology,

JapanAlex Gershman — Darmstadt University of

Technology, GermanyYingbo Hua — University of California, RiversideAlex Kot — Nanyang Technological University,

SingaporeBede Liu — Princeton UniversityB.S. Manjunath — University of California,

Santa-BarbaraNasir Memon — Polytechnic UniversityAntonio Ortega — University of Southern

CaliforniaThrasos Pappas — Northwestern UniversitySoo-Chang Pei — National Taiwan University,

TaiwanMichael Picheny — IBM T.J. Watson

Research CenterVincent Poor — Princeton UniversityJose C. Principe — University of Florida

Majid Rabbani — Eastman Kodak CompanyTodd Reed — University of HawaiiPhillip A. Regalia — Catholic University

of AmericaHideaki Sakai — Kyoto University, JapanDan Scholfeld — University of Illinois

at ChicagoPeter Stoica — Uppsala University, SwedenLee Swindlehurst — Brigham Young UniversityMurat Tekalp — Koc University, TurkeyAhmed Tewfik — University of MinnesotaIsabel Trancoso — INESC, Portugal Xiaodong Wang — Columbia UniversityRabab Ward — University of British Columbia,

Canada

ASSOCIATE EDITORS—COLUMNS AND FORUMGhassan AlRegib — Georgia Institute

of TechnologyAlen Docef — Virginia Commonwealth UniversityDan Ellis — Columbia UniversityBerna Erol — Ricoh California Research CenterKonstantinos Konstantinides — Hewlett-PackardRick Lyons — Besser AssociatesAleksandra Mojsilovic — IBM T.J. Watson

Research CenterGeorge Moschytz — Bar-Ilan University, IsraelShrikanth Narayanan — University of Southern

CaliforniaFernando Pereira — Instituto Superior Tecnico,

PortugalC. Britton Rorabaugh — DRS C3 Systems Co.Greg Slabaugh — Medicsight PLC, U.K.Hari Sundaram — Arizona State UniversityWade Trappe — Rutgers UniversityStephen T.C. Wong — Methodist Hospital-

Cornell

ASSOCIATE EDITORS—e-NEWSLETTERNitin Chandrachoodan — India Institute

of Technologies Huaiyu Dai — North Carolina State UniversityPascal Frossard — EPFL, SwitzerlandAlessandro Piva — University of Florence, ItalyMihaela van der Schaar — University of

California, Los Angeles

IEEE PERIODICALS MAGAZINES DEPARTMENTGeraldine Krolin-Taylor — Senior Managing

EditorSusan Schneiderman — Business Development

Manager+1 732 562 3946 Fax: +1 732 981 1855

Dawn M. Melley — Editorial DirectorPeter M. Tuohy — Production DirectorFelicia Spagnoli — Advertising Production Mgr.Janet Dudar — Art DirectorGail A. Schnitzer — Assistant Art DirectorFran Zappulla — Staff Director, Publishing

Operations

IEEE SIGNAL PROCESSING SOCIETY José M.F. Moura — PresidentMos Kaveh — President-ElectMichael D. Zoltowski — Vice President,

Awards and MembershipAthina Petropulu — Vice President, ConferencesPetar Djuric — Vice President, FinanceK.J. Ray Liu — Vice President, Publications Alex Acero — Vice President, Technical

DirectionsMercy Kowalczyk — Executive Director and

Associate Editor

Authorized licensed use limited to: MICROSOFT. Downloaded on January 9, 2009 at 13:16 from IEEE Xplore. Restrictions apply.

[from THE EDITOR] continued

IEEE SIGNAL PROCESSING MAGAZINE [4] MAY 2008

signals to include language, then we alsowould expand the traditional signal pro-cessing tasks to include understanding.

When we associate each signal typewith the full range of the processingtasks, a matrix can be established inwhich each entry in the matrix representsa specific subfield in signal processing.For example, the row in the matrix forthe task of synthesis applies to all types ofsignals. And for each entry in this row, wewould have computer music (for theaudio/music signal), speech synthesis (forthe speech signal), and computer graph-ics (for the image signal). After expandingthe traditional signals to include lan-guage, we will have another entry callednatural language generation, a well-established research and application areain computational linguistics and artificialintelligence. Similarly, after expandingthe traditional signal processing tasks toinclude understanding, then we expandthe traditional matrix with one new row.For example, the entries in the matrix

corresponding to the speech columnwould contain not only the traditionalelements of speech coding, speech trans-mission, speech analysis, speechenhancement, speech synthesis, andspeech recognition but also a new ele-ment of speech understanding (or spokenlanguage understanding), which is thefocus of the issue in your hands now.

To extrapolate the above idea further, Ican propose new expansion of the signalprocessing tasks to include that ofretrieval/mining (a simpler task thanunderstanding). Then, the new row ofthis further expanded matrix will containthe new elements of music retrieval, spo-ken document retrieval, image retrieval,video search, and text search (informa-tion retrieval). Other possible expansionswould include the new task of securityand new bioinformatic/ genomic signals.These are not traditional signal process-ing and as was elegantly pointed out byProf. Ray Liu earlier, information pro-cessing is our destiny beyond the tradi-

tional scope of signal processing (see theeditorials written in the January andSeptember 2004 issues of this magazine).

Some of the new column elements inthe expanded matrix after including lan-guage/text in the scope of signal process-ing are: document summarization (forthe task of coding), text parsing (foranalysis), spelling/grammar correction(for enhancement), natural languagegeneration (for synthesis), natural lan-guage understanding (for understand-ing), and Web search (for retrieval).

What are practical benefits of con-structing the signal/task matrix just dis-cussed? First, we would see a naturalexpansion of the scope of signal process-ing with regard to the treatment of lan-guage/text as the signal and treatment ofunderstanding and retrieval as the pro-cessing tasks, which I outlined above.This would be viewed as the extension ofthe matrix size in both the row (process-ing task) and column (signal). Indeed,language adds a new dimension in thesignal set with its essential task of con-tent understanding, an ultimate goal ofhuman intelligence. Second, successfulmachine learning techniques developedfor one particular signal type may bemore naturally welcomed and examinedby researchers working on other signals.Such critical examination would facilitatethe potential unification of signal pro-cessing methodologies, enhancing theappreciation of and knowledge about thesimilarities and differences in processingtechniques across different signal types.This would be viewed as cross-columnpropagation from the perspective ofmatrix construction. Third, we can alsobenefit from cross-row propagation. Forexample, while extension from the recog-nition task to the understanding task isbecoming well established for the speechsignal (as clearly demonstrated in thisspecial section), are there and shouldthere be similar trends on task expansionfor other types of signals?

These are some of my thoughts aboutthis special section. Enjoy reading it. [SP]

360 697-3472 voice

360 697-7717 fax

[email protected]

Pioneer Hill Software

24460 Mason Rd

Poulsbo WA 98370PHS

High Performance

Real-time Spectrum Analyzer

Features2 ch, 24bit, 192kHz

FFT to 1048576 pts

1/1 to 1/96 Octave

Waveform Analysis

Spectrogram Plot

3-D Surface Plot

Transfer Functions

Cross Spectrum

Signal Generator

Data Logging

Test Automation

SpectraPLUS 5.0FFT Spectral Analysis System

Download 30 day FREE trial!

www.spectraplus.com

A cost effective portable signal analysis solution using your

sound card for data acquisition. High performance USB

sound cards are widely available – see website for details.

Authorized licensed use limited to: MICROSOFT. Downloaded on January 9, 2009 at 13:16 from IEEE Xplore. Restrictions apply.


Top Related