Acoustics Research Institute
Austrian Academy of Sciences
MPEG-7: Today's Multimedia Standard
Peter Balazs
http://www.kfs.oeaw.ac.at
Institut für Schallforschung der Österreichischen Akademie der Wissenschaften: A-1010 Wien; Liebiggasse 5. Tel. +43 1/4277-29500; Fax +43 1/4277-9296; http://www.kfs.oeaw.ac.at
OeAW-ISF
Peter Balazs
1999: started as a programmer at the ISF
2001: finished studies in mathematics (University of Vienna)
MPEG-7
• ISO/IEC standard: "Multimedia Content Description Interface"
• Multimedia data / metadata description system: low level to high level; content based
• Open system: inheritance
• Description of methods: normative and informative
<AudioDescriptor xsi:type="SoundModelStatePathType">
  <SoundModelRef>IDDogBarks</SoundModelRef>
  <StateRef>IDState1</StateRef> <RelativeFrequency>0.000</RelativeFrequency>
  <StateRef>IDState2</StateRef> <RelativeFrequency>0.000</RelativeFrequency>
  <StateRef>IDState3</StateRef> <RelativeFrequency>0.045</RelativeFrequency>
  <StateRef>IDState4</StateRef> <RelativeFrequency>0.000</RelativeFrequency>
  <StateRef>IDState5</StateRef> <RelativeFrequency>0.442</RelativeFrequency>
  <StateRef>IDState6</StateRef> <RelativeFrequency>0.513</RelativeFrequency>
</AudioDescriptor>
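As a rough illustration of how such a description might be consumed, the following Python sketch reads the fragment with the standard library's xml.etree.ElementTree and pairs each StateRef with the RelativeFrequency that follows it. The xmlns:xsi declaration is an assumption added so that the fragment parses on its own, and the helper name read_state_frequencies is hypothetical, not part of MPEG-7.

```python
import xml.etree.ElementTree as ET

# Assumption: the xsi namespace is declared here so the fragment is
# well-formed by itself; the slide does not show this declaration.
FRAGMENT = """
<AudioDescriptor xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
                 xsi:type="SoundModelStatePathType">
  <SoundModelRef>IDDogBarks</SoundModelRef>
  <StateRef>IDState1</StateRef> <RelativeFrequency>0.000</RelativeFrequency>
  <StateRef>IDState2</StateRef> <RelativeFrequency>0.000</RelativeFrequency>
  <StateRef>IDState3</StateRef> <RelativeFrequency>0.045</RelativeFrequency>
  <StateRef>IDState4</StateRef> <RelativeFrequency>0.000</RelativeFrequency>
  <StateRef>IDState5</StateRef> <RelativeFrequency>0.442</RelativeFrequency>
  <StateRef>IDState6</StateRef> <RelativeFrequency>0.513</RelativeFrequency>
</AudioDescriptor>
"""


def read_state_frequencies(xml_text):
    """Return (model id, {state id: relative frequency}) for one descriptor.

    Hypothetical helper: it assumes StateRef and RelativeFrequency elements
    alternate, as they do in the fragment above.
    """
    root = ET.fromstring(xml_text)
    model = root.findtext("SoundModelRef")
    states = [e.text for e in root.findall("StateRef")]
    freqs = [float(e.text) for e in root.findall("RelativeFrequency")]
    return model, dict(zip(states, freqs))


model, frequencies = read_state_frequencies(FRAGMENT)
# The state the sound spends most of its time in.
dominant = max(frequencies, key=frequencies.get)
```

In this fragment the relative frequencies add up to one, so they can be read as the share of frames assigned to each state of the referenced sound model.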
MPEG-7
• History
  Call for Proposals: October 1998
  Evaluation: February 1999
  First version of Working Draft (WD): December 1999
  Committee Draft (CD): October 2000
  Final Committee Draft (FCD): February 2001
  Final Draft International Standard (FDIS): July 2001
  International Standard (IS): September 2001
• Development
  Amendment Audio: May 2002
  Call for Proposals (Systems, version 2): July 2002
  MPEG-21 international standard: April 2009
XML = eXtensible Markup Language
• Metalanguage
• Hypertext
• Markup: markup = tag, e.g. <Befehl> ... </Befehl> (Befehl = command)
• Open standard

<?xml version="1.0"?>
<!-- XML test -->
<!DOCTYPE document [
  <!ELEMENT ADRESSE (Vorname, Nachname, Wohnort)>
  <!ELEMENT Vorname (#PCDATA)>
  ....
]>
<ADRESSE>
  <Vorname> Peter </Vorname>
  <Nachname> Balazs </Nachname>
  <Wohnort> Tulln </Wohnort>
</ADRESSE>
<ADRESSE> ........
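To make the role of the content model (Vorname, Nachname, Wohnort) concrete, here is a minimal Python sketch that parses one address record and checks the order of its child elements by hand. Python's standard library does not validate against a DTD, so the check is spelled out explicitly. The enclosing <document> root and the declarations for document, Nachname and Wohnort are assumptions made so that the example is a complete, well-formed file; the slide elides those parts.

```python
import xml.etree.ElementTree as ET

# Assumption: a <document> root wraps the records, and the declarations the
# slide elides ("....") follow the pattern of the Vorname declaration.
DOC = """<?xml version="1.0"?>
<!DOCTYPE document [
  <!ELEMENT document (ADRESSE*)>
  <!ELEMENT ADRESSE (Vorname, Nachname, Wohnort)>
  <!ELEMENT Vorname (#PCDATA)>
  <!ELEMENT Nachname (#PCDATA)>
  <!ELEMENT Wohnort (#PCDATA)>
]>
<document>
  <ADRESSE>
    <Vorname> Peter </Vorname>
    <Nachname> Balazs </Nachname>
    <Wohnort> Tulln </Wohnort>
  </ADRESSE>
</document>
"""

# The sequence required by the ADRESSE content model.
CONTENT_MODEL = ["Vorname", "Nachname", "Wohnort"]


def follows_content_model(address):
    """True if the children appear exactly in the declared order."""
    return [child.tag for child in address] == CONTENT_MODEL


root = ET.fromstring(DOC)
records = [
    {child.tag: child.text.strip() for child in address}
    for address in root.findall("ADRESSE")
    if follows_content_model(address)
]
```

A validating parser would perform this check from the DOCTYPE itself; the manual comparison only mirrors what the sequence operator "," in the content model demands.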
<Set ID="Viewer3" RunMode="Multiple">
  <Table ID="Settings">
    CursorOpts = 0 0 1 440
    SignalOpts = 1 1
  </Table>
  <Set ID="Profiles">
    <Table ID="Default">
      FrameOpts = 40 1 75 2 0 1
      GraphXY = 0 1e4 1 -80 50 1
      Method = 0 32 20 0 1 0 0 0 1 0 0
      Average = 0 0 99
    </Table>
  </Set>
</Set>
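The nested Set/Table layout above maps naturally onto nested dictionaries. The sketch below is one possible reading in Python, assuming that every line inside a Table has the form "name = numbers" and that all values are numeric; the actual STX file semantics are not described in these slides, and the function names parse_table and parse_set are hypothetical.

```python
import xml.etree.ElementTree as ET

SETTINGS = """
<Set ID="Viewer3" RunMode="Multiple">
  <Table ID="Settings">
    CursorOpts = 0 0 1 440
    SignalOpts = 1 1
  </Table>
  <Set ID="Profiles">
    <Table ID="Default">
      FrameOpts = 40 1 75 2 0 1
      GraphXY = 0 1e4 1 -80 50 1
      Method = 0 32 20 0 1 0 0 0 1 0 0
      Average = 0 0 99
    </Table>
  </Set>
</Set>
"""


def parse_table(table):
    """Turn the 'name = v1 v2 ...' lines of a Table into {name: [floats]}."""
    entries = {}
    for line in (table.text or "").splitlines():
        if "=" not in line:
            continue  # skip blank lines
        name, _, values = line.partition("=")
        entries[name.strip()] = [float(v) for v in values.split()]
    return entries


def parse_set(node):
    """Recursively convert a Set into a dict keyed by the ID attribute."""
    result = {}
    for child in node:
        if child.tag == "Table":
            result[child.get("ID")] = parse_table(child)
        elif child.tag == "Set":
            result[child.get("ID")] = parse_set(child)
    return result


root = ET.fromstring(SETTINGS)
config = parse_set(root)
```

With this reading, config["Profiles"]["Default"]["GraphXY"] holds the six numbers of the GraphXY line, and attributes such as RunMode remain available through root.get("RunMode").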
MPEG-7
• Descriptors: low level
• Description Schemes (DS): high level, container
• Description Definition Language (DDL): XML Schema, STX Schema
• System tools: ASCII text or binary
MPEG-7 Audio: Low Level Descriptors
• Single sample
• Segments: DS, compare to STX
Figure from [1]
MPEG-7 Audio: Low Level Descriptors
• Scalar
• Vector
• Single
• Series: a series of vectors = table, matrix
• Scalable Series
Figure from [2]
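The idea behind a scalable series, storing a long series of descriptor values at reduced resolution, can be sketched in a few lines of Python. This is a simplified illustration under the assumption that each group of `ratio` consecutive samples is summarised by its minimum, maximum and mean; it is not the normative definition from Part 4 of the standard, and the function name scale_series is hypothetical.

```python
def scale_series(values, ratio):
    """Summarise each run of `ratio` samples by its min, max and mean.

    The last group may be shorter when len(values) is not a multiple
    of `ratio`.
    """
    if ratio < 1:
        raise ValueError("ratio must be a positive integer")
    summary = []
    for start in range(0, len(values), ratio):
        chunk = values[start:start + ratio]
        summary.append({
            "min": min(chunk),
            "max": max(chunk),
            "mean": sum(chunk) / len(chunk),
        })
    return summary


# Illustrative input values, not taken from any measurement.
series = [1.0, 3.0, 2.0, 6.0, 4.0]
scaled = scale_series(series, 2)
```

Keeping the minimum and maximum next to the mean lets a reader of the reduced series still see the range of the original values within each group.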
MPEG-7 Audio: Low Level Descriptors
• Basic: AudioWaveform, AudioPower
• Basic Spectral: AudioSpectrumEnvelope, AudioSpectrumCentroid, AudioSpectrumSpread, AudioSpectrumFlatness
• Spectral Basis: AudioSpectrumBasis, AudioSpectrumProjection
• Signal Parameters: AudioHarmonicity, AudioFundamentalFrequency
• Timbral Temporal: LogAttackTime, TemporalCentroid
• Timbral Spectral: SpectralCentroid, HarmonicSpectralCentroid, HarmonicSpectralDeviation, HarmonicSpectralSpread, HarmonicSpectralVariation
Figure from [2]
• Silence
Figure from [1]
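To give a feel for what two of these low level descriptors measure, the following Python sketch computes a frame power and a spectral centroid for a test tone. Both functions are simplified textbook versions: the power is the mean square of the samples, and the centroid is the power-weighted mean frequency on a linear frequency axis. The normative MPEG-7 definitions (windowing, frequency scale, hop size) differ in detail and are not reproduced here; the function names and the test signal are assumptions for illustration.

```python
import cmath
import math


def audio_power(frame):
    """Mean square of the samples in one frame."""
    return sum(x * x for x in frame) / len(frame)


def spectral_centroid(frame, sample_rate):
    """Power-weighted mean frequency in Hz, using a plain DFT."""
    n = len(frame)
    weighted = total = 0.0
    for k in range(n // 2 + 1):
        coeff = sum(
            x * cmath.exp(-2j * math.pi * k * i / n)
            for i, x in enumerate(frame)
        )
        power = abs(coeff) ** 2
        weighted += (k * sample_rate / n) * power
        total += power
    return weighted / total if total > 0 else 0.0


# Test tone: 1000 Hz sine, 64 samples at 8000 Hz (exactly 8 periods).
sample_rate = 8000
frame = [math.sin(2 * math.pi * 1000 * i / sample_rate) for i in range(64)]

power = audio_power(frame)
centroid = spectral_centroid(frame, sample_rate)
```

For a pure tone that fits the frame exactly, the power is close to 0.5 and the centroid sits at the tone's frequency; for real signals the centroid indicates where the bulk of the spectral energy lies.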
MPEG-7 Audio: High Level DSs
• AudioSignature: AudioSpectrumFlatness
• Musical Instrument Timbre Description Tool:
  HarmonicInstrumentTimbre (LAT + timbral spectral)
  PercussiveInstrumentTimbre (timbral temporal + SpectralCentroid)
• Melody Description Tools: MelodyContour DS, MelodySequence DS
• General Sound Recognition and Indexing Description Tools:
  SpectralBasis; SoundClassificationModel: SoundModels, classification scheme;
  SoundModelStatePath, SoundModelStateHistogram
• Spoken Content Description Tools:
  SpokenContentHeader: WordLexicon, PhonLexicon
  SpokenContentLattice: WordLinks, PhonLinks
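The relation between a state path and a state histogram in the sound recognition tools can be sketched as follows: the path lists which model state is active in each frame, and the histogram gives the relative frequency of each state over the whole path. The Python below is a minimal illustration under that reading; the path values are invented for the example, and state_histogram is a hypothetical helper, not an MPEG-7 API.

```python
from collections import Counter


def state_histogram(path, n_states):
    """Relative frequency of states 1..n_states along a state path."""
    if not path:
        raise ValueError("state path must not be empty")
    counts = Counter(path)
    return [counts.get(state, 0) / len(path) for state in range(1, n_states + 1)]


# Hypothetical state path: one state index per analysis frame.
path = [3, 5, 5, 6, 6, 6, 5, 6]
histogram = state_histogram(path, 6)
```

Two sounds can then be compared through their histograms even when their paths have different lengths, since the histogram discards the order and duration of the frames.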
MPEG-7 Audio: Amendment
• New base types: optional attribute for channel
• Modification of the Spoken Content Description Tools: "acoustics only" score possible for speech recognition; prosody, syllables
• Audio Signal Quality DS: BackgroundNoiseLevel, BalanceType, DCoffsetType, BandwidthType;
  TransmissionTechnologyType: shellac, vinyl, ...
• Additional tools: tempo description, compact variable-precision representation (BAM)
• Linguistic Description Tools: semantic structure of linguistic data
MPEG-7
References:
[1] José M. Martínez, MPEG-7 Overview (version 8), ISO/IEC JTC1/SC29/WG11 N4980, Klagenfurt, July 2002, http://mpeg.telecomitalialab.com/standards/mpeg-7/mpeg-7.htm
[2] ISO/IEC, Information Technology – Multimedia Content Description Interface – Part 4: Audio, Geneva, July 2001
[3] Oliver Pott, Günter Wielange, XML Praxis und Referenz, Munich, 2001
[4] J. Bitzer, J. H. Martínez, Information Technology – Multimedia Content Description Interface – Part 4: Audio – Proposed Draft Amendment, Fairfax, May 2002
Links:
[5] MPEG Home Page, http://mpeg.telecomitalialab.com/
[6] Extensible Markup Language, http://www.w3.org/XML/
[7] STX, http://www.kfs.oeaw.ac.at/software.htm