Leveraging Wideband Codecs for VoIP Development
Laurent Amar, President, VoiceAge Corporation
Contents
Speech Communication/Coding Basics
Wideband Speech Description and Applications
Wideband Speech Codec Standards
Real-World Wideband VoIP Deployment
Wideband Momentum
What’s Next & Wrap Up
Speech Signal Basics
Understanding Speech Communication
Human Physiology and Perception are Key
Encode and exchange primarily the speech signal information that is important for human perception
Use human speech production and comprehension parameters to reduce bit rate and enhance communication quality
Speech Coding Attributes
Bit rate
• As low as possible
Delay
• As little as possible
Quality
• As high as possible
Complexity
• As algorithmically simple as possible to constrain platform processing and memory requirements
Robustness
• Effective operation under background noise and channel impairment conditions
Standards compliance
• Open, tested and interoperable solutions
As required by specific applications
Difficult to attain all of these often divergent objectives at the same time
Speech Synthesis Model
Used in CELP (Code Excited Linear Prediction) Speech Coding
[Figure: block diagram of the synthesis model: innovative excitation (1), long-term prediction (2) and short-term prediction (3) producing the synthesized speech; signals labeled c(n), v(n) and ŝ(n)]
1 = air from lungs
2 = vocal cords (periodicity)
3 = vocal tract (mouth + lips)
A very successful speech compression algorithm is based on Algebraic CELP: ACELP®
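The three stages of the synthesis model can be illustrated with a minimal sketch. It is not the ACELP algorithm itself: the function name, the single-tap long-term predictor, the gain parameters and the sign convention s(n) = u(n) + Σ a_k·s(n−k) are all assumptions chosen for illustration.

```python
def synthesize(innovation, lpc, pitch_lag, pitch_gain, code_gain):
    """Toy CELP-style synthesis of one block of samples (illustrative only).

    innovation : innovative (fixed-codebook) excitation, stage 1
    pitch_lag  : long-term predictor delay in samples, stage 2
    lpc        : short-term predictor coefficients a_1..a_p, stage 3
    """
    excitation = []  # total excitation after long-term prediction
    speech = []      # synthesized speech samples
    for n, c in enumerate(innovation):
        # Stage 2: long-term prediction repeats the excitation from one
        # pitch period ago (models vocal cord periodicity).
        past = excitation[n - pitch_lag] if n >= pitch_lag else 0.0
        u = pitch_gain * past + code_gain * c
        excitation.append(u)
        # Stage 3: short-term prediction, an all-pole filter that shapes
        # the excitation (models the vocal tract).
        s = u
        for k, a in enumerate(lpc, start=1):
            if n - k >= 0:
                s += a * speech[n - k]
        speech.append(s)
    return speech
```

Feeding a single pulse through the model shows both effects: the short-term filter makes the pulse decay, and the long-term predictor re-injects a scaled copy one pitch lag later.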
ACELP at the heart (overview)
Ask Redwan – glean from his presentation
CELP Decoder Principles
More on ACELP implementation
Ask Redwan or take block diagrams from the old poster.
CELP Encoder Principles
• The excitation parameters (codebook indices and gains) are determined by minimizing the perceptually weighted error between original and synthesized speech.
• Analysis-by-synthesis where a ‘local decoder’ (the orange part) exists inside the encoder.
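A rough sketch of the analysis-by-synthesis idea follows. It simplifies heavily: the error is a plain squared error rather than a perceptually weighted one, the codebook is searched exhaustively, and the function names and the tiny codebook in the usage note are hypothetical.

```python
def synth_filter(excitation, lpc):
    """Local decoder: all-pole filter s(n) = e(n) + sum_k a_k * s(n-k)."""
    out = []
    for n, e in enumerate(excitation):
        s = e
        for k, a in enumerate(lpc, start=1):
            if n - k >= 0:
                s += a * out[n - k]
        out.append(s)
    return out


def search_codebook(target, codebook, lpc):
    """Return (index, gain, error) for the codevector that best matches target.

    Each candidate is synthesized with the local decoder, scaled by its
    least-squares optimal gain, and compared against the target signal.
    """
    best = None
    for index, codevector in enumerate(codebook):
        y = synth_filter(codevector, lpc)
        energy = sum(v * v for v in y)
        if energy == 0.0:
            continue  # an all-zero candidate cannot match anything
        corr = sum(t * v for t, v in zip(target, y))
        gain = corr / energy
        error = sum((t - gain * v) ** 2 for t, v in zip(target, y))
        if best is None or error < best[2]:
            best = (index, gain, error)
    return best
```

For example, with `codebook = [[1, 0, 0, 0], [0, 1, 0, 0], [1, 0, -1, 0]]` and `lpc = [0.5]`, a target built by synthesizing the second entry and doubling it is matched by index 1 with a gain close to 2. Only the index and gain would need to be transmitted, which is where the bit rate saving comes from.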
International Standards Using ACELP
AMR Standard Codec Family at a Glance
Built on a solid, market-proven ACELP® technology foundation
G.722.2
A Complete Suite of Low Bit Rate Speech and Audio Coding Solutions
What is Wideband Telephony?
An Emerging Opportunity to Deliver Vastly Improved Speech Quality
• Substantially increases transmitted speech information
• Double the bandwidth
• Enables digital end-to-end packet-based telephony services to deliver much better speech quality than traditional PSTN circuit-switched telephony
• VoIP quality differentiator
Hearing is believing! Visit VoiceAge at booth #305 for a demo
Also visit the listening room at www.voiceage.com to hear samples
Why Wideband VoIP Telephony Now?
Enabling Technologies and Consumer Perceptions are Converging
Improved presence, naturalness and intelligibility
• Reduces listener fatigue
• Improved hands-free/speakerphone sound quality
Improves speaker and speech recognition
High-quality low bit rate wideband codecs
• e.g., G.722.2/AMR-WB & VMR-WB at rates ranging from 7–24 kbps
Rising user awareness of enhanced sound quality
• Wideband teleconferencing
• Wideband enterprise IP telephony
• Wireless/VoIP multimedia services
Driving up expectations!
Interoperable wideband codec solutions over end-to-end digital networks help pave the way for fixed/mobile convergence
Wideband Telephony Benefits: Typical Speech Signal Acoustics
[Figure: spectrogram of a speech sample, frequency (0 to 7000 Hz) versus time (0 to 3.0 s), with the 200-3400 Hz and 50-7000 Hz bands marked]
Wideband telephony covers much more speech signal information
Improved voice quality and intelligibility (e.g., s & f differentiation)
Improved speech naturalness, presence and comfort
“Everyone looked extremely confused about the news”
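The "double the bandwidth" claim can be checked with simple arithmetic on the two bands marked on the spectrogram. The sketch below is only that arithmetic; the constant and function names are hypothetical, and the minimum sampling rate is the Nyquist bound (practical systems commonly sample narrowband at 8 kHz and wideband at 16 kHz).

```python
NARROWBAND_HZ = (200, 3400)   # traditional telephone band from the slide
WIDEBAND_HZ = (50, 7000)      # wideband telephony band from the slide


def bandwidth(band):
    """Width of a (low, high) frequency band in Hz."""
    low, high = band
    return high - low


def min_sampling_rate(band):
    """Nyquist bound: at least twice the highest frequency in the band."""
    return 2 * band[1]


ratio = bandwidth(WIDEBAND_HZ) / bandwidth(NARROWBAND_HZ)
print(f"narrowband: {bandwidth(NARROWBAND_HZ)} Hz, wideband: {bandwidth(WIDEBAND_HZ)} Hz")
print(f"ratio: {ratio:.2f}")
```

The ratio comes out slightly above 2. The extra range above 3400 Hz is where much of the energy of fricatives such as "s" and "f" sits, which is why they are easier to tell apart in wideband.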
Wideband Telephony Applications
Scope is much wider than VoIP Telephony
VoIP hi-fi telephony (G.722.2)
Cellular wireless hi-fi telephony (AMR-WB & VMR-WB)
Wi-Fi VoIP telephony
Converged wireless/wire-line telephony
Multi-point audio and video teleconferencing
Video telephony audio coding
Call center conversation recording and archiving
Speech and speaker recognition-based systems
Digital radio broadcasting and field reporting
Hi-fi ringtones
The Standard Solution Advantage
Interoperable, Open and Fully Tested
Open, collaborative and competitive process
Requirements specifically address target applications
Published algorithms and source code
• Permits wider and more effective scrutiny
Rigorous comparative testing under diverse conditions
• Background noise types and levels
• Spoken languages
• Speaker types
• Various network impairments
Ensures that the best technologies are chosen
Evolution of Wideband Standards
A steady progression of high-quality speech coding technologies
[Figure: timeline of speech codec standards, grouped by standards body]
3GPP, narrowband
• 1987: FR, 13 kb/s
• 1994: HR, 5.6 kb/s
• 1995: EFR, 12.2 kb/s
• 1999: AMR-NB, 4.75-12.2 kb/s
ITU-T, narrowband
• 1972: G.711, 64 kb/s
• 1984: G.726, 32 kb/s
• 1992: G.728, 16 kb/s
• 1995: G.729, 6.4, 8, 11.8 kb/s
ITU-T, wideband
• 1988: G.722, 48, 56, 64 kb/s
• 1999: G.722.1, 24, 32 kb/s
3GPP/ITU-T, interoperable wideband
• 2001-2002: AMR-WB/G.722.2, 6.6-23.85 kb/s
3GPP2
• 1993: IS-96A, Rate-Set I
• 1995: IS-96A, Rate-Set II
• 1997: EVRC, Rate-Set I
• 2000: SMV, Rate-Set I
• 2004: VMR-WB (Source Controlled), Rate-Set I & II
G.722.2/AMR-WB and VMR-WB Standards
• 3GPP 1999 TS 26.111 recommends AMR-WB for (3G-324H) multimedia telephone handsets
• 3GPP 2001 TS 26.190 defines the AMR-WB codec
• ITU-T 2002 G.722.2 recommended for wideband speech
• 3GPP2 (2004) C.S0052, “Source-Controlled Variable-Rate Multimode Wideband Speech Codec (VMR-WB), Service Options 62 and 63 for Spread Spectrum Systems,” specifies the VMR-WB codec for cdma2000® systems
• 3GPP 2005 TS.235 requires packet-switched multimedia terminals at 16 kHz and PoC terminals to support AMR-WB
• OMA 2005 Push-to-Talk User Plane states the PoC server must support AMR and AMR-WB media parameters
Widespread success in international standards competitions
G.722.2 Subjective Testing Results
AMR-WB Characterization Test: Clean Condition Test (English Language)
[Figure: bar chart of MOS (scale 1.0 to 5.0) under two conditions, No Tandem at -26 dBov and Self-Tandem at -26 dBov, for G.722 at 64 and 48 kbps and G.722.2 at 8.85, 12.65, 18.25 and 23.05 kbps]
G.722.2/AMR-WB Delivers Excellent Wideband Speech Quality Even at Low Bit Rates (e.g., MOS at 8.85 kbps exceeds G.722 at 48 kbps)
Enabling Wideband VoIP Telephony
The Key Underpinnings
Wideband speech coding technology is ready – what else is needed for mass adoption?
Wideband capable terminal device speakers and microphones
More and more network elements and end-devices equipped with compatible wideband codecs
• Standard wideband codecs ensure smooth interoperability
• Software-driven terminals enable downloading of the latest enhancements to standard wideband codecs
• Relevant application servers and network infrastructure gear need to support the necessary wideband standard codecs
Fully digital packet-based VoIP networks that are readily configurable to support wideband telephony
Implementation Considerations
Interoperability
• Important to eliminate or reduce transcoding
  Transcoding adds cost, delay and jitter
  Degrades speech quality
Complexity
• Tradeoff between bit rate and complexity/memory
• An important design consideration for handheld devices
• Miniaturization trends, Moore’s law and other innovations are still going strong though
Quality of Service
• Robust real-world performance, need to consider:
  Packet loss – Counter with concealment and FEC methods
  Background noise – Mitigate with noise suppression
  Delay and jitter – Minimize delay and manage jitter
Total bit rate available
• Codec & system/channel coding both contribute
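To see how the codec and the transport both contribute to the total bit rate, here is a back-of-the-envelope sketch for VoIP. It assumes 20 ms of speech per frame and uncompressed IPv4 (20 bytes), UDP (8 bytes) and RTP (12 bytes) headers; the function and parameter names are hypothetical, and the codec-specific RTP payload header is left as an optional parameter rather than modeled exactly.

```python
import math

FRAME_MS = 20                            # assumed speech duration per frame
IPV4_UDP_RTP_HEADER_BYTES = 20 + 8 + 12  # uncompressed IPv4 + UDP + RTP headers


def ip_bit_rate_kbps(codec_kbps, frames_per_packet=1, payload_header_bits=0):
    """Approximate on-the-wire bit rate (kbps) at the IP layer."""
    # kbps * ms = bits; round() absorbs floating-point noise (12.65 * 20 -> 253)
    speech_bits = round(codec_kbps * FRAME_MS) * frames_per_packet
    payload_bytes = math.ceil((speech_bits + payload_header_bits) / 8)
    packet_bytes = payload_bytes + IPV4_UDP_RTP_HEADER_BYTES
    packet_ms = FRAME_MS * frames_per_packet
    return packet_bytes * 8 / packet_ms  # bits per ms equals kbps


for name, rate in [("G.722.2 @ 12.65", 12.65), ("G.711 @ 64", 64)]:
    print(f"{name} kbps codec rate -> {ip_bit_rate_kbps(rate):.1f} kbps at the IP layer")
```

Under these assumptions the packet headers outweigh the 12.65 kbps speech payload, so packing two frames per packet lowers the total rate noticeably, at the cost of an extra 20 ms of delay.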
Enabling Transcoder Free Interoperability
Enabling Seamless Communication across Wireline, Wireless and Wi-Fi Networks
Growing Real-World Wideband Deployment Momentum
Teleconferencing system vendors
• Wideband telephony deployment pioneers – have a very compelling wideband speech application
Hi-fi ringtones (True Tones)
• Increasing deployment in newer mobile phones from major vendors
Enterprise IP phone systems
• Campus LAN environments provide an ideal platform for rolling out wideband telephony
Emerging wideband VoIP services for the masses
• Provide an opportunity for service providers to differentiate VoIP offering to the mass market
• Broadband Internet access is quickly becoming the norm, helping VoIP become mainstream
• Increasing availability of wideband speech capable devices
• Softphone clients like Xten™’s eyeBeam™ are integrating wideband codecs (G.722.2)
Wideband in Enterprise VoIP
Enterprises are deploying wideband VoIP telephony
• Intra-site GbE/10 GbE LANs widely deployed
  Facilitate converged IT corporate data and VoIP voice communications over a common network infrastructure
Intra-site networks primed for VoIP with wideband
• Improves communications effectiveness and productivity within a corporate network
  Little or no additional cost needed
  Improves mission critical communications (e.g., hospitals)
• Compression and robustness are important for cost-effective communications between sites over a WAN
  Also significant when reaching out to mobile employees (either over cellular at remote sites or over a WLAN connection within a site or campus)
Wideband VoIP over Xten Softphones
Xten™ eyeBeam™ has readily implemented and demonstrated G.722.2 VoIP
Enabling a higher quality conversation with the same/similar bandwidth as narrowband codecs
Service providers can provide a higher value service for the same cost
Supports interoperability between SIP and 3G cellular network devices without audio signal transcoding
G.722.2 readily integrated and demonstrated on the eyeBeam
Reduces the need for operators to purchase, operate and maintain additional equipment such as transcoders and wideband capable hard-phones
Enables service providers to roll out VoIP services with superior voice quality
[Figure: Xten™ eyeBeam™ softphone]
Beyond Wideband Speech, what’s next?
Teleconferencing solution pioneers are introducing new audio enhancements:
• Ultra-wideband, which typically increases the transmitted speech bandwidth to 14 – 16 kHz
  Further increases the richness of conversational voice quality
• Stereo sound and spatial sound
  Gives a better sense of speaker directionality for remote meetings
Audio improvements also driven by multimedia services, such as:
• On-line gaming, audiovisual telephony and rich messaging
Emerging hybrid speech and stereo audio codecs effectively meet these emerging needs with efficient use of channel capacity, e.g.:
• The AMR-WB+ hi-fi audio compression codec (selected by the 3GPP for mobile multimedia services), encompasses essentially the full human audio spectrum with parametric stereo, even at low bit rates.
Summary
Wideband speech is beginning to gain real-world momentum
• The key enablers are widely available (end-user devices, end-to-end digital networks, interoperable standard WB codecs, …)
User expectations for improved audio quality are rising
• Video telephony, audiovisual conferencing, remote collaboration and other multimedia services are expected to be extremely popular for both business and residential use
Once wideband speech communication is widely deployed and available, users will increasingly expect it as the norm
The stage is set for widespread wideband VoIP – it is time for the main players (you, the developers) to make it happen
What are you waiting for?
Go make it happen!
Abbreviations/Glossary
3GPP: Third Generation Partnership Project (Standards body defining GSM evolution to 3G networks)
3GPP2: Third Generation Partnership Project 2 (Standards body defining CDMA evolution to 3G)
AMR: Adaptive Multi-Rate (standard narrowband speech codec for GSM and WCDMA networks)
AMR-WB/G.722.2: Adaptive Multi-Rate Wideband (standard wideband speech codec for GSM and WCDMA networks and ITU-T (as G.722.2))
AMR-WB+: Extended Adaptive Multi-Rate Wideband (standard wideband speech and hi-fi audio codec)
CDMA: Code Division Multiple Access (Technology behind the second most popular cellular networks)
BTS: Base Transceiver Station
BSS: Base Station System
CNG: Comfort Noise Generation (decoder feature that generates comfort noise to avoid listener annoyance when the encoder at the far-end is not transmitting due to silence)
GSM: Global System for Mobile (most widely deployed cellular mobile technology)
ITU-T: International Telecommunications Union – Telecommunications standardization sector
MOS: Mean Opinion Score (a subjective test methodology for evaluating speech quality)
OMA: Open Mobile Alliance (an organization formed to facilitate the global user adoption of mobile data services)
PoC: Push-to-talk over Cellular (walkie-talkie like service over cellular networks)
VAD: Voice Activity Detection (an encoder feature that detects when the user is speaking)
VMR-WB: Variable Rate Multi-mode Wideband (standard wideband speech codec for CDMA2000®)
WCDMA: Wideband CDMA (Technology adopted by GSM networks for their evolution to 3G)
wMOPS: weighted Million Operations Per Second (measure of codec complexity)