development of speaker verification under limited data and condition

*DEVELOPMENT OF SPEAKER VERIFICATION UNDER LIMITED DATA AND CONDITION

Under guidance of

Dr. G. Pradhan

NIT PATNA (ECE dept.)

NAME-PAMMI KUMARI

M.TECH 2nd yr (ECE dept.)

ROLL NO.-1329005

• Introduction

• Summary of Literature review

• Issues in existing speaker verification systems

• Motivation for the present work

• Baseline speaker verification system

• Experimental results

• Proposal for future work

*OUTLINE

To develop a voice password based speaker verification system

To study the impact of text mismatch on the performance of the voice password based speaker verification system

Develop a voice password based speaker verification system in text-independent mode

Explore methods to model speaker information under limited data conditions

Most applications use short-duration speech signals of around 3-5 s, but speaker verification systems perform poorly on such short signals

This degradation in performance is due to phonetic variability between the training and testing speech data

Objective and Motivation for this work

SPEAKER VERIFICATION: Speaker verification is the process of verifying the identity of a claimant. It performs a one-to-one comparison between a newly input voiceprint and the stored voiceprint of the claimed identity.

* INTRODUCTION

Fig: Block diagram of speaker verification system (blocks: Input speech, Feature extraction, Similarity, Reference model (Speaker #M), Speaker ID (#M), Threshold, Decision, Verification result)

*Modular representation of voice password based speaker verification system

Fig: Voice password speaker verification system (training path: Speech, Pre-processing, Feature extraction, Model building, Reference model; testing path: Speech, Identity claim, Pre-processing, Feature extraction, Comparison, Decision logic, Accept/reject)

Cont….

• When an identity claim is made by a speaker, the speech data is compared with the model of the speaker whose identity is claimed.

• A threshold is used to reach the decision.

• If the similarity of the test speech data to the target model is above the threshold, the speaker is accepted; otherwise the claim is rejected.

• This process involves a binary decision (accept/reject) about the claimed identity regardless of the population size.

• Hence, the performance of the verification system does not depend on the size of the population.
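The accept/reject rule can be sketched in a few lines. This is a minimal illustration, assuming a similarity score where a higher value means a closer match to the claimed speaker's model; the function name `verify` and the example scores are hypothetical and not taken from the original system.

```python
def verify(score: float, threshold: float) -> bool:
    """Return True (accept the claim) when the similarity score
    reaches the threshold, otherwise False (reject the claim)."""
    return score >= threshold


# Hypothetical scores, for illustration only.
print(verify(0.82, 0.5))  # accepted
print(verify(0.31, 0.5))  # rejected
```

Because only the claimed speaker's model is scored, the cost of one decision does not grow with the number of enrolled speakers.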

* A speaker verification system comprises three stages:

• In the first stage, pre-processing and feature extraction are performed over a database of speakers.

• The second stage is model generation, in which speaker models are built from the feature vectors that represent speaker-specific characteristics.

• The third stage is the decision, which accepts or rejects the claimed identity of a speaker.
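To make the modeling and decision stages concrete, the sketch below scores feature vectors against a Gaussian mixture model, the model type used by the baseline system in this work. Diagonal covariances, the function names, and all parameter values are assumptions made for illustration; the slides do not specify them.

```python
import math


def gmm_log_likelihood(x, weights, means, variances):
    """Log-likelihood of one feature vector x under a GMM with
    diagonal covariances (an assumption, not stated in the slides)."""
    log_terms = []
    for w, mu, var in zip(weights, means, variances):
        log_det = sum(math.log(v) for v in var)
        dist = sum((xi - mi) ** 2 / vi for xi, mi, vi in zip(x, mu, var))
        log_terms.append(
            math.log(w)
            - 0.5 * (len(x) * math.log(2 * math.pi) + log_det + dist)
        )
    # Log-sum-exp over the mixture components for numerical stability.
    peak = max(log_terms)
    return peak + math.log(sum(math.exp(t - peak) for t in log_terms))


def average_score(frames, weights, means, variances):
    """Average per-frame log-likelihood of an utterance."""
    total = sum(gmm_log_likelihood(f, weights, means, variances) for f in frames)
    return total / len(frames)
```

The average score of a test utterance would then be compared with a threshold to accept or reject the claim.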

Basic block diagram of a biometric system (blocks: Sensor, Pre-processing, Feature extraction, Template generator, Stored template, Matcher, Application device)

* Speaker verification can be classified into:

1) Text-dependent

2) Text-independent

Text-dependent speaker verification: the system is based on the utterance of fixed, predetermined phrases.

Text-independent speaker verification: the reference utterance (what is spoken in training) and the test utterance (what is uttered in actual use) may have completely different content.

*Literature review

• Research in the field of speaker recognition was initially carried out in the 1950s at Bell Laboratories using isolated digits [1].

• In 2000, the major elements of a Gaussian mixture model (GMM)-based speaker verification system, used successfully in several NIST Speaker Recognition Evaluations (SREs), were described.

• From 1960 to 1990, most research focused on the extraction of speaker-specific information from speech data and on the development of text-dependent speaker verification systems.

• From 1990 to 2005, speaker recognition methods shifted from template-based pattern matching to statistical modeling. Different statistical modeling methods, such as GMM and GMM-UBM, were proposed.

• From 2005 to 2014, most research focused on compensating for mismatches and on developing practical verification systems. Different compensation methods, such as i-vectors and PLDA, were proposed.

[1] K. H. Davis et al., "Automatic recognition of spoken digits," J.A.S.A., 24 (6), pp. 637-642, 1952.

* Cont…

• In the speech analysis stage, though techniques have been developed to improve speaker verification performance, no particular analysis technique is specifically meant for the limited data condition.

• Segmental analysis under the limited data condition provides few feature vectors, which leads to poor speaker models and, in turn, to degraded performance.

* Issues in existing speaker verification system

• Most applications use short-duration speech signals of around 3-5 s, but speaker verification systems perform poorly on such short signals.

• This degradation in performance is due to phonetic variability between the training and testing speech data.

• The phonetic variability may be reduced by artificially generating multiple utterances.

• Most SV systems perform score normalization using cohort-centric normalization. Speaker-centric score normalization may give better results.

* MOTIVATION FOR THE PRESENT WORK
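As a point of reference for the score normalization mentioned above, the sketch below shows one common form of cohort-based normalization: the raw score is shifted and scaled by the mean and standard deviation of the scores obtained against a set of cohort (impostor) models. The function name and the example numbers are hypothetical; the slides do not say which normalization variant was used.

```python
import statistics


def cohort_normalize(raw_score, cohort_scores):
    """Normalize a raw verification score with cohort statistics.

    cohort_scores: scores of the same test utterance against a set of
    cohort (impostor) models. Illustrative only.
    """
    mu = statistics.mean(cohort_scores)
    sigma = statistics.pstdev(cohort_scores)
    if sigma == 0:
        # Degenerate cohort: fall back to mean removal only.
        return raw_score - mu
    return (raw_score - mu) / sigma


# Hypothetical scores, for illustration only.
normalized = cohort_normalize(4.0, [1.0, 2.0, 3.0])
```

A speaker-centric variant would instead derive the statistics from scores tied to the claimed speaker's own model.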

• The baseline speaker verification system uses the following parameters:

• VAD threshold: 0.1 of the average energy

• Features: MFCC

• Feature vector: 39 dimensions, with a 20 ms frame size and a 2 ms shift

• Modeling: GMM

• GMM size: 8, 16, 32, 64

* BASELINE SPEAKER VERIFICATION SYSTEM
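The framing and voice activity detection settings listed for the baseline can be sketched as follows. This is a minimal illustration using the stated 20 ms frame size, 2 ms shift, and a threshold of 0.1 times the average energy. The 8000 Hz sampling rate in the example, the use of per-frame energy, and the function names are assumptions, since the slides do not give them.

```python
def frame_signal(signal, sample_rate, frame_ms=20, shift_ms=2):
    """Split a list of samples into overlapping frames."""
    frame_len = int(sample_rate * frame_ms / 1000)
    shift = int(sample_rate * shift_ms / 1000)
    return [
        signal[start:start + frame_len]
        for start in range(0, len(signal) - frame_len + 1, shift)
    ]


def energy_vad(frames, ratio=0.1):
    """Keep frames whose energy exceeds ratio * average frame energy."""
    energies = [sum(x * x for x in frame) for frame in frames]
    if not energies:
        return []
    threshold = ratio * sum(energies) / len(energies)
    return [f for f, e in zip(frames, energies) if e > threshold]


# Toy signal: 20 ms of silence followed by 20 ms of constant amplitude,
# at an assumed sampling rate of 8000 Hz.
signal = [0.0] * 160 + [1.0] * 160
frames = frame_signal(signal, sample_rate=8000)
voiced = energy_vad(frames)
```

Only the retained frames would be passed on to MFCC extraction and GMM training.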

*Experimental result

For original data

34.613  32.87  32.0971  32.4634

* Experimental result

For 15 s test and 15 s train data

27.4725  25.1374  23.6722  22.6190

• Extraction of features that reduce the impact of phonetic variability.

• Different residual or behavioral features may be extracted in addition to MFCC for speaker verification.

• In this project we considered the GMM modeling technique; in future work, other techniques such as i-vectors may be used.

* Proposal for future work

*Thank you
