
Subjective Speech Quality Measurement: Comparison of Laboratory Test Results and Results of Test with Parallel Task

HAKOB AVETISYAN

DEPARTMENT OF MEASUREMENT, FACULTY OF ELECTRICAL ENGINEERING

CZECH TECHNICAL UNIVERSITY IN PRAGUE

11‐May‐17 HAKOB AVETISYAN

Subjective testing

• Listening quality: e.g. Mean Opinion Score (MOS), ITU‐T P.800

• Listening quality testing under noisy conditions: S‐MOS, N‐MOS, G‐MOS, ITU‐T P.835

• Conversational quality: ITU‐T P.805

• Intelligibility testing: MRT or ITU‐T P.807

• etc.


Laboratory environment

Real life environment



We have to somehow occupy subjects’ minds


Mentally oriented task only

Physically oriented task only


Our experiment


• Laser shooting simulator (muted sound)

• 3 subjects at a time

• 1 shooter and 2 counters

• Light indicators indicate the shooter

• 24 subjects in total

• P.835 methodology

• Sennheiser HD600 headphones

• Speech samples reused from a previous lab‐comparison experiment (AMR‐WB12k, EVS_WB13k; Cafeteria, Mensa, Road and Pub noise; plus reference samples)


S‐MOS (per condition)

[Scatter plot: Test A (no parallel task; Prague, July 2015) vs. Test B (parallel task; Prague, January 2017), both axes 1 to 5]

Correlation: 0.97

N‐MOS (per condition)

[Scatter plot: Test A (no parallel task; Prague, July 2015) vs. Test B (parallel task; Prague, January 2017), both axes 1 to 5]

Correlation: 0.98

G‐MOS (per condition)

[Scatter plot: Test A (no parallel task; Prague, July 2015) vs. Test B (parallel task; Prague, January 2017), both axes 1 to 5]

Correlation: 0.99
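The per‐condition correlations reported for S‐MOS, N‐MOS and G‐MOS can be illustrated with a short Python sketch. It assumes the reported figure is a Pearson correlation coefficient between the per‐condition mean scores of Test A and Test B; the slides do not name the coefficient. The function name `pearson` and the score lists are hypothetical placeholders, not data from the experiment.

```python
import math


def pearson(x, y):
    """Pearson correlation between two equally long lists of scores."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two lists of equal length with at least 2 values")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    # Sums of cross-products and squared deviations from the means
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for constant input")
    return sxy / math.sqrt(sxx * syy)


# Hypothetical per-condition MOS values (one entry per test condition);
# these are illustrative numbers, NOT the results of Test A or Test B.
test_a = [1.8, 2.5, 3.1, 3.9, 4.6]
test_b = [2.0, 2.4, 3.3, 3.7, 4.5]

r = pearson(test_a, test_b)
print(f"correlation between Test A and Test B: {r:.2f}")
```

A value close to 1 means the two tests rank and space the conditions almost identically, even if the absolute scores differ.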

Average CI95 of the tests

Test                     S‐MOS   N‐MOS   G‐MOS
A (no parallel task)     0.13    0.12    0.11
B (with parallel task)   0.22    0.19    0.21
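One possible way to obtain an average CI95 figure like those in the table is sketched below in Python. It rests on several assumptions that the slides do not confirm: that CI95 denotes the half‐width of the 95% confidence interval of each per‐condition mean, that a normal approximation (factor 1.96) is used rather than a Student t quantile, and that the half‐widths are then averaged over all conditions. The function names and the votes are hypothetical.

```python
import math
import statistics

Z95 = 1.96  # assumption: normal approximation for a 95% interval


def ci95_half_width(votes):
    """Half-width of the 95% confidence interval of the mean of one condition."""
    if len(votes) < 2:
        raise ValueError("need at least 2 votes to estimate the spread")
    # Sample standard deviation divided by sqrt(n) is the standard error
    return Z95 * statistics.stdev(votes) / math.sqrt(len(votes))


def average_ci95(votes_by_condition):
    """Mean of the per-condition CI95 half-widths."""
    widths = [ci95_half_width(v) for v in votes_by_condition.values()]
    return sum(widths) / len(widths)


# Hypothetical votes on the 1-5 scale; NOT data from the experiment.
votes_by_condition = {
    "condition_1": [4, 5, 4, 4, 3, 5, 4, 4],
    "condition_2": [2, 3, 2, 1, 2, 3, 2, 2],
}

print(f"average CI95: {average_ci95(votes_by_condition):.2f}")
```

Under these assumptions, a larger average CI95 for Test B than for Test A at the same number of votes would indicate a larger spread of votes when subjects perform a parallel task.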


Conclusions

• A P.835‐like experiment was performed with 24 listeners carrying out a parallel task, and its results were compared with those of a rigorous P.835 test

• Differences were discussed

• The P.835 methodology seems to be too complex for subjects performing a parallel task (risk of assessing the wrong parameter, etc.); it is to be replaced by plain P.800 in our future MOS experiments


Future work

• Intelligibility testing: ITU‐T Recommendation P.807

• 48 native and 48 non‐native English speakers

• Newly created samples

• New software (using LabVIEW programming environment)

• Comparison between tests without and with parallel task


ETSI DTR/STQ‐255



Questions ?

Thank You !
