

In CELP coders, the past excitation signal used to build the adaptive codebook is the main source of error propagation when a frame is lost. We present a novel resynchronization technique that uses very low bit rate side information to correct the past excitation signal after a frame erasure. The novelty of this technique is that the correction is computed at the encoder in a closed-loop fashion, based on the actual error introduced by the concealment. Objective and subjective test results show that this approach is a promising area for future research on frame loss recovery.

A frame loss is simulated at the encoder (concealment) in order to determine the correction that should be applied to the past excitation signal (adaptive codebook).
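The closed-loop principle can be illustrated with a minimal sketch. It is not the codec's actual procedure: the toy concealment (repeating the last pitch cycle with attenuation), the function names, and the use of error energy as the quantity handed to the correction stage are all assumptions made for illustration.

```python
from typing import Callable, List, Sequence


def conceal_frame(old_good_exc: Sequence[float], pitch_lag: int,
                  frame_len: int, attenuation: float = 0.9) -> List[float]:
    """Toy concealment (assumption): repeat the last pitch cycle, attenuated.

    Returns the "bad" past excitation a decoder would hold after losing
    one frame, with the same length as the input buffer.
    """
    buf = list(old_good_exc)
    for _ in range(frame_len):
        buf.append(attenuation * buf[-pitch_lag])
    return buf[-len(old_good_exc):]


def error_energy(good: Sequence[float], bad: Sequence[float]) -> float:
    """Energy of the difference between "good" and "bad" past excitation."""
    return sum((g - b) ** 2 for g, b in zip(good, bad))


def closed_loop_correction(old_good_exc: Sequence[float],
                           new_good_exc: Sequence[float],
                           pitch_lag: int, frame_len: int,
                           estimate: Callable = error_energy,
                           attenuation: float = 0.9):
    """Simulate the loss at the encoder, then compare against the truth.

    old_good_exc: encoder state before the frame (delayed internal state).
    new_good_exc: encoder state after the frame was encoded normally.
    """
    bad_exc = conceal_frame(old_good_exc, pitch_lag, frame_len, attenuation)
    return estimate(new_good_exc, bad_exc)
```

Because the encoder knows both the "good" state and the state the concealment would produce, any estimator passed as `estimate` works on the actual concealment error rather than on a prediction of it.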

TYPE B (LOST ALIGN.): For stationary voiced signals, the correction consists of a gain (g) and a shift (Δ).
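One way such a gain and shift could be found is an exhaustive search that minimizes the squared error between the "good" excitation and a shifted, scaled copy of the "bad" one. The sketch below assumes this criterion; the search range, the zero-filling at the buffer edges, the sign convention of the shift, and the function name are illustrative choices, not details taken from the poster.

```python
from typing import List, Sequence, Tuple


def estimate_gain_and_shift(good: Sequence[float], bad: Sequence[float],
                            max_shift: int) -> Tuple[float, int]:
    """Return (gain, shift) minimizing sum((good[i] - gain * bad[i - shift])^2).

    A positive shift delays the "bad" excitation (assumed convention).
    Samples shifted in from outside the buffer are taken as zero.
    """
    n = len(good)
    best_err, best_gain, best_shift = float("inf"), 1.0, 0
    for shift in range(-max_shift, max_shift + 1):
        shifted: List[float] = [
            bad[i - shift] if 0 <= i - shift < n else 0.0 for i in range(n)
        ]
        energy = sum(s * s for s in shifted)
        if energy == 0.0:
            continue  # nothing to scale for this shift
        # Least-squares optimal gain for this shift.
        gain = sum(g * s for g, s in zip(good, shifted)) / energy
        err = sum((g - gain * s) ** 2 for g, s in zip(good, shifted))
        if err < best_err:
            best_err, best_gain, best_shift = err, gain, shift
    return best_gain, best_shift
```

For example, if the "bad" excitation carries unit pulses two samples early and at half the amplitude of the "good" one, the search returns a gain of 2.0 and a shift of 2.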

TYPE A (LOST ONSETS): Side information describes the position and amplitude of the largest pulse.
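A minimal sketch of this idea follows. It assumes the largest pulse is taken from the "good" past excitation at the encoder and that the decoder simply overwrites one sample of its "bad" excitation; both the function names and the way the decoder applies the pulse are hypothetical, and quantization of the two parameters is left out.

```python
from typing import List, Sequence, Tuple


def largest_pulse(good_exc: Sequence[float]) -> Tuple[int, float]:
    """Encoder side: position and signed amplitude of the largest-magnitude sample."""
    position = max(range(len(good_exc)), key=lambda i: abs(good_exc[i]))
    return position, good_exc[position]


def restore_onset(bad_exc: Sequence[float], position: int,
                  amplitude: float) -> List[float]:
    """Decoder side (assumed behavior): place the transmitted pulse in the buffer."""
    fixed = list(bad_exc)
    fixed[position] = amplitude
    return fixed
```

With a "good" excitation of `[0.1, -0.2, 0.0, -1.5, 0.3]`, the side information would be position 3 and amplitude -1.5.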

We have demonstrated a concept that can be applied to any CELP codec. It is very efficient for single frame loss, requires a very limited bit rate (13 bits per frame), and adds minimal complexity overhead at the encoder and none at the decoder.
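To put the 13 bits per frame in perspective, the side information rate can be worked out from the frame duration. The 20 ms value used below is the AMR-WB frame length and is an assumption for other CELP codecs; the helper name is hypothetical.

```python
def side_info_bit_rate(bits_per_frame: int, frame_ms: float) -> float:
    """Side information rate in bit/s for a given frame duration in ms."""
    return bits_per_frame * 1000.0 / frame_ms


# 13 bits per 20 ms frame (AMR-WB frame length assumed)
rate = side_info_bit_rate(13, 20.0)  # 650.0 bit/s
```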

Various improvements for errors of type A and B, and various solutions for errors of type C, are proposed in the paper.

Analysis of actual “good” and “bad” past excitation signals shows that typical CELP concealment introduces three types of errors, each characterized by strong error propagation:

Type A: Lost onsets
Type B: Lost alignments
Type C: Waveform mismatch

The determination of the correction is done in the LPC excitation domain. The correction information depends on a signal classification step. To demonstrate the concept, we have chosen to concentrate on errors of types A and B.

(a) no frame lost; (b) standard decoder; (c) modified decoder using side information; (d) and (e) error signals for the standard and modified decoders.

AB comparison test between the standard and a modified AMR-WB codec; one lost frame every 10 frames; 32 sentence pairs (4 speakers); 6 experienced listeners.

[Figure: AB test preference histogram comparing the STANDARD and MODIFIED codecs]

University of Sherbrooke, Faculté de Génie
2500, boul. de l’Université, Sherbrooke (Québec)
J1K 2R1 Canada

IMPROVED FRAME LOSS RECOVERY USING CLOSED-LOOP ESTIMATION OF VERY LOW BIT RATE SIDE INFORMATION

Philippe Gournay [email protected]

VoiceAge Corporation, 750 Chemin Lucerne, Suite 250
Montreal (Quebec) H3R 2H6 Canada

1. Abstract

[Figure: modified encoder block diagram. Blocks: Standard Encoder, Concealment, Correction, Encoder Internal State, z^-1 delay. Signals: Audio In, Bitstream Out, Side Information, old “good” past exc., new “good” past exc.]

[Figure: “good” and “bad” past excitation signals, annotated with the gain g, the shift Δ, and T0]

2. Modified Encoder

3. Estimation of the Correction

4. Performance evaluation

5. Conclusions and Perspectives

[Preference scale: Strong, Slight, None, Slight, Strong; axis range 0 to 60]

[Figure: TYPE A (LOST ONSET), panels (a) to (e). Error signals show the effect of a restored onset.]

[Figure: TYPE B (LOST ALIGN.), panels (a) to (e). Error signals show a faster reconvergence.]

The lost frame is marked on the plots.

Interspeech 2008, Brisbane, Australia