Face recognition (yüz tanıma)


TRANSCRIPT

  • Slide 1/63

    Human behaviour analysis and interpretation based on the video modality: postures, facial expressions and head movements

    A. Benoit, L. Bonnaud, A. Caplier, N. Eveno, V. Girondel, Z. Hammal, M. Rombaut

  • Slide 2/63

    Introduction

    The "looking at people" domain: automatic analysis and interpretation of human actions (gestures, behaviour, expressions). It needs low-level information (a video analysis step answering "how are things?") and high-level interpretation (data fusion answering "what is happening?").

  • Slide 3/63

    Applications

    - multimodal interactions and interfaces (cf. the SIMILAR NoE)
    - mixed reality systems
    - smart rooms
    - smart surveillance systems (hypovigilance detection, detection of distress cases: elderly people surveillance, bus surveillance)
    - e-learning assistance

  • Slide 4/63

    Outline

    1. Global posture recognition
    2. Facial expressions recognition
    3. Head motion analysis and interpretation
    4. Conclusion

    Which expression? Which head motion? Which posture?

  • Slide 5/63

    System overview

    1. Low-level data extraction:
       - segmentation
       - temporal tracking
       - skin detection and face localization
    2. Static posture recognition:
       - low-level data
       - belief theory: definitions, models, data fusion, decision
    3. Results:
       - training and test sets
       - recognition results
       - video sequence

  • Slide 6/63

    System overview

    [Figure: processing pipeline. An indoor scene filmed by a static camera feeds the low-level data extraction stages (segmentation, temporal tracking, skin detection / face localization), whose outputs drive static posture recognition (e.g. output "sitting").]

  • Slide 7/63

    Low-level data: person segmentation

    Adaptive background removal algorithm: consecutive frame differences + adaptive reference image.

    A. Caplier, L. Bonnaud and J.-M. Chassery, "Robust fast extraction of video objects combining frame differences and reference image," in Proc. IEEE International Conference on Image Processing, pp. 785-788, September 2001.

    Low-level data computed:
    - rectangular bounding box (SRBB)
    - principal axes box (SPAB)
    - gravity center

    [Figure: segmented person with SRBB, SPAB and gravity center overlaid.]
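    The frame-difference-plus-reference scheme lends itself to a compact sketch. Below is a minimal Python illustration assuming grey-level float frames; the function names, the adaptation rate alpha and the threshold are hypothetical and do not come from the cited paper.

```python
import numpy as np

def update_reference(ref, frame, alpha=0.05):
    """Adapt the reference (background) image towards the current frame.
    alpha is an illustrative adaptation rate, not a value from the paper."""
    return (1.0 - alpha) * ref + alpha * frame

def segment_person(prev, curr, ref, diff_thresh=15.0):
    """Combine a consecutive-frame difference with a reference-image
    difference, as the slide describes; the threshold is illustrative."""
    moving = np.abs(curr - prev) > diff_thresh     # inter-frame change
    foreground = np.abs(curr - ref) > diff_thresh  # differs from background
    return moving | foreground                     # boolean foreground mask

# Usage on synthetic grey-level frames (float arrays in [0, 255]):
h, w = 240, 320
ref = np.full((h, w), 100.0)
prev = ref.copy()
curr = ref.copy()
curr[100:180, 140:200] += 60.0                     # a "person" appears
mask = segment_person(prev, curr, ref)
ref = update_reference(ref, curr)                  # keep the background fresh
print(mask.sum(), "foreground pixels")
```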

  • Slide 8/63

    Low-level data: skin detection

    Colour space YCbCr: no conversions needed, luminance and chrominance kept apart. Skin databases: Von Luschan's, plus one obtained with the camera.

    Method: thresholding in the CbCr plane with initial thresholds Cb ∈ [86, 140], Cr ∈ [139, 175].

    [Figure: skin pixel distributions in the Y/Cb and Y/Cr planes.]
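    A minimal sketch of the thresholding step, using the initial Cb/Cr intervals given on this slide. The adapt_ranges helper illustrates the interval adaptation mentioned on the face-localization slide; its shrink factor is an assumption.

```python
import numpy as np

# Initial thresholds from the slide: Cb in [86, 140], Cr in [139, 175].
CB_RANGE = (86, 140)
CR_RANGE = (139, 175)

def skin_mask(cb, cr, cb_range=CB_RANGE, cr_range=CR_RANGE):
    """Threshold the Cb and Cr planes; Y is deliberately ignored, so the
    detector is largely insensitive to luminance."""
    return ((cb >= cb_range[0]) & (cb <= cb_range[1]) &
            (cr >= cr_range[0]) & (cr <= cr_range[1]))

def adapt_ranges(cb, cr, mask, shrink=0.8):
    """Illustrative threshold adaptation: translate and shrink each interval
    towards the mean Cb/Cr of the pixels currently detected as skin
    (assumes the mask is non-empty)."""
    out = []
    for plane, (lo, hi) in ((cb, CB_RANGE), (cr, CR_RANGE)):
        m = plane[mask].mean()
        half = (hi - lo) / 2.0 * shrink
        out.append((m - half, m + half))
    return tuple(out)
```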

  • Slide 9/63

    Low-level data: temporal tracking

    Computation of SRBB overlaps between consecutive frames (forward and backward matching).

    Low-level data computed:
    - identification numbers (IDs)
    - temporal split/merge information

    [Figure: SRBB overlap between frames T-1 and T.]
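    One plausible way to implement the overlap-based matching, as a sketch; the box format and the split/merge bookkeeping are illustrative assumptions, not the authors' exact scheme.

```python
def overlap(a, b):
    """Intersection area of two rectangular boxes given as (x0, y0, x1, y1)."""
    w = min(a[2], b[2]) - max(a[0], b[0])
    h = min(a[3], b[3]) - max(a[1], b[1])
    return max(w, 0) * max(h, 0)

def match_boxes(prev_boxes, curr_boxes):
    """Forward/backward matching: each current box inherits the ID of the
    previous box it overlaps most; split/merge is flagged when several boxes
    share a best match. Assumes both lists are non-empty."""
    forward = {i: max(range(len(curr_boxes)),
                      key=lambda j: overlap(prev_boxes[i], curr_boxes[j]))
               for i in range(len(prev_boxes))}
    backward = {j: max(range(len(prev_boxes)),
                       key=lambda i: overlap(prev_boxes[i], curr_boxes[j]))
                for j in range(len(curr_boxes))}
    merges = [j for j in backward if list(forward.values()).count(j) > 1]
    splits = [i for i in forward if list(backward.values()).count(i) > 1]
    return forward, backward, merges, splits
```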

  • Slide 10/63

    Low-level data: face localization

    Automatic threshold adaptation: translation and reduction of the detection intervals towards the Cb and Cr mean values.

    Face and hands identification: sorted lists of skin patches, with criteria related to temporal tracking and human morphology.

    V. Girondel, A. Caplier, and L. Bonnaud, "Hands Detection and Tracking for Interactive Multimedia Applications," in Proc. International Conference on Computer Vision and Graphics, pp. 282-287.

    [Figure: adapted detection intervals in the CbCr plane; face, left hand and right hand patches labelled.]

  • Slide 11/63

    System overview

    1. Low-level data extraction:
       - segmentation
       - temporal tracking
       - skin detection and face localization
    2. Static posture recognition:
       - low-level data
       - belief theory: definitions, models, data fusion, decision
    3. Results:
       - training and test sets
       - recognition results
       - video sequence

  • Slide 12/63

    Posture recognition: measures

    Reference posture: Da Vinci's Vitruvian Man (standing, arms stretched horizontally).

    Distance measurements D1, D2, D3, D4; ideas: person height and shape compactness.

    Normalization: $r_i = D_i / D_i^{ref}$

    [Figure: SRBB and SPAB with the measured distances on the reference posture.]

  • Slide 13/63

    Posture recognition: belief theory

    Advantages:
    - handles imprecise, conflicting data
    - not computationally expensive (unlike HMMs or NNs)

    Universe: $\Omega = \{H_i\}_{i=1..N}$, with $2^N$ subsets $A$ of $\Omega$.

    Hypotheses: the $H_i$ are disjoint; if they are also exhaustive, the universe is closed, else it is open.

    Considered postures: standing (H1), sitting (H2), squatting (H3) and lying (H4); one hypothesis added for unknown postures (H0).

    Belief mass distribution m: a confidence degree in each subset A, with $m: 2^\Omega \to [0, 1]$ and $\sum_{A \subseteq \Omega} m(A) = 1$.

  • Slide 14/63

    Posture recognition: measures evolution

    [Figure: example of the evolution of the r1 measurement over the frames of a sequence.]

  • Slide 15/63

    Posture recognition: measures modeling

    Belief mass distributions $m_{r_i}$ model the imprecision of the measurements.

    [Figure: belief mass models as functions of each r_i.]

  • Slide 16/63

    Posture recognition: data fusion

    Final belief mass distribution: $m_{r_{1234}} = m_{r_1} \oplus m_{r_2} \oplus m_{r_3} \oplus m_{r_4}$

    Orthogonal sum of Dempster: $(m_1 \oplus m_2)(A) = \sum_{B \cap C = A} m_1(B)\, m_2(C)$

    Example: see the sketch below.

    Conflict: a non-null belief mass on the empty set indicates contradictory measurements.
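    Dempster's orthogonal sum is straightforward to implement over masses indexed by subsets of hypotheses. A minimal sketch, with masses stored as dicts keyed by frozensets; the normalization step and the toy values are illustrative.

```python
from itertools import product

def dempster(m1, m2, normalize=True):
    """Orthogonal sum of two belief mass distributions.
    Masses are dicts mapping frozensets of hypotheses to mass values."""
    combined, conflict = {}, 0.0
    for (b, mb), (c, mc) in product(m1.items(), m2.items()):
        inter = b & c
        if inter:
            combined[inter] = combined.get(inter, 0.0) + mb * mc
        else:
            conflict += mb * mc          # mass falling on the empty set
    if normalize and conflict < 1.0:     # Dempster's normalization
        combined = {a: v / (1.0 - conflict) for a, v in combined.items()}
    return combined, conflict

# Toy example with two of the postures (H1 standing, H2 sitting):
m_r1 = {frozenset({"H1"}): 0.6, frozenset({"H1", "H2"}): 0.4}
m_r2 = {frozenset({"H2"}): 0.5, frozenset({"H1", "H2"}): 0.5}
fused, k = dempster(m_r1, m_r2)
print(fused, "conflict:", k)             # contradictory part goes into k
```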

  • Slide 17/63

    System overview

    1. Low-level data extraction:
       - segmentation
       - temporal tracking
       - skin detection and face localization
    2. Static posture recognition:
       - low-level data
       - belief theory: definitions, models, data fusion, decision
    3. Results:
       - training and test sets
       - recognition results
       - video sequence

  • Slide 18/63

    Results: implemented system and computing time

    Implemented system:
    - Sony DFW-VL500 camera
    - YCbCr 4:2:0 format, 30 fps, 640x480 resolution
    - low-end PC: 1.8 GHz
    - unoptimized C++ code

    Computing time:
    - segmentation: 42%
    - temporal tracking: 3%
    - skin detection and face localization: 50%
    - static posture recognition: 5%

    Results obtained at approximately 11 fps.

  • Slide 19/63

    Results: databases

    Training set:
    - 6 video sequences, different persons of various heights
    - 10 consecutive postures
    - normal postures, facing the camera

    Test set:
    - 6 video sequences, other persons of various heights
    - 7 different postures
    - free postures, i.e. moving the arms, sitting sideways

  • Slide 20/63

    Results: recognition rates

    Training set: mean recognition rate 88.2%.
    Test set: mean recognition rate 80.8%.

  • Slide 21/63

    Results: video example

  • Slide 22/63

    Outline

    1. Global posture recognition
    2. Facial expressions recognition
    3. Head motion analysis and interpretation
    4. Conclusion

    Which expression? Which head motion? Which posture?

  • Slide 23/63

    Facial expressions analysis

    1. Assumptions
    2. Facial features segmentation: low-level data
    3. Facial expressions recognition: high-level interpretation
    4. Facial expression recognition based on audio: a step towards a multimodal system

  • Slide 24/63

    Assumptions

    Facial expression recognition is based on the analysis of the deformations of the permanent facial features (lips, eyes and brows). 6 universal emotions: surprise, joy, disgust, anger, fear, sadness.

    Is it possible to recognize the facial expression?

  • Slide 25/63

    Facial expressions analysis

    1. Assumptions and applications
    2. Facial features segmentation: low-level data
    3. Facial expressions recognition: high-level interpretation
    4. Facial expression recognition based on audio: a step towards a multimodal system

  • Slide 26/63

    Facial features extraction: model choice

    - Open eye: circle, parabola, Bézier curve
    - Brow: Bézier curve
    - Closed eye: line
    - External lips: 4 cubic curves, 2 broken lines

    The more complex the model, the more deformations it can represent.

    [Figure: feature models with their control points P1..P7.]

  • Slide 27/63

    Facial features extraction: model initialisation

    Detection of characteristic points (eye corners, mouth corners) using luminance and chrominance gradient information.

    [Figure: detected characteristic points marked on a face image.]

  • Slide 28/63

    Facial features extraction: model deformations (1)

    Maximisation of the gradient flow of luminance and/or chrominance through the model curve:

    $E = \sum_{p \,\in\, \text{circle}} \vec{\nabla} I(p) \cdot \vec{n}(p)$

    where $\vec{\nabla} I(p)$ is the image gradient at curve point $p$ and $\vec{n}(p)$ the normal to the curve at $p$.
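    As an illustration, this flux can be evaluated numerically by sampling the curve and summing the dot products of the image gradient with the curve normals. A sketch for the circle model (e.g. the iris), assuming nearest-pixel sampling; a real implementation would interpolate.

```python
import numpy as np

def gradient_flux(image, cx, cy, r, n_points=64):
    """Flux of the image gradient through a circle: the energy
    E = sum over p of grad I(p) . n(p) from the slide."""
    gy, gx = np.gradient(image.astype(float))
    theta = np.linspace(0.0, 2.0 * np.pi, n_points, endpoint=False)
    xs = np.clip(np.round(cx + r * np.cos(theta)).astype(int), 0, image.shape[1] - 1)
    ys = np.clip(np.round(cy + r * np.sin(theta)).astype(int), 0, image.shape[0] - 1)
    nx, ny = np.cos(theta), np.sin(theta)          # outward normals
    return float(np.sum(gx[ys, xs] * nx + gy[ys, xs] * ny))

# The deformation loop would move control points (here, grow the radius)
# in the direction that maximizes this energy. Toy image: a dark,
# iris-like blob on a bright background.
img = np.ones((100, 100)); img[40:60, 40:60] = 0.0
best_r = max(range(5, 30), key=lambda r: gradient_flux(img, 50, 50, r))
print("radius maximizing gradient flux:", best_r)
```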

  • Slide 29/63

    Facial features extraction: model deformations (2)

    - Single control point displacement: gradient flow of luminance
    - 2 or 3 control points displacement: gradient flow of luminance
    - Mouth corners displacement: gradient flow of chrominance and luminance

  • Slide 30/63

    Facial features extraction: some results

    Flexibility and accuracy of the chosen models.

  • Slide 31/63

    Facial expressions analysis

    1. Assumptions and applications
    2. Facial features segmentation: low-level data
    3. Facial expressions recognition: high-level interpretation
    4. Facial expression recognition based on audio: a step towards a multimodal system

  • Slide 32/63

    Facial expressions recognition: recognition on facial skeletons

    [Figure: facial skeletons for the six expressions: disgust, fear, anger, surprise, joy, sadness.]

  • Slide 33/63

    Facial expressions recognition: characteristic distances

    Facial feature deformations are related to characteristic distances D1..D5. Neutral expression => reference values Dni.

    [Figure: distances D1..D5 measured on the face.]

    Joy: {open mouth} => D3 > Dn3 and D4 > Dn4; {mouth corners pulled backwards} => D5 < Dn5; {slackened brows} => D2 unmodified.

    Surprise: {raised brows} => D2 > Dn2; {wide-open eyes} => D1 > Dn1; {open mouth} => D3 < Dn3 and D4 > Dn4.

  • Slide 34/63

    Facial expressions recognition: distances discretisation

    Each distance is discretised into 3 states relative to Dni, the distance for the neutral expression:
    - S: stable (Di close to Dni)
    - C+: Di >> Dni
    - C-: Di << Dni
    plus doubt states SC+: Di > Dni (state S or C+) and SC-: Di < Dni (state S or C-).

    [Figure: D2 evolution (surprise); D5 evolution (smile).]

  • Slide 35/63

    Facial expressions recognition: basis of rules

                 D1      D2      D3      D4      D5
    joy          C-      S/C-    C+      C+      C-
    surprise     C+      C+      C-      C+      C+
    disgust      C-      C-      S/C+    C+      S/C-
    anger        C+      C-      S       S/C-    S
    sadness      C-      C+      S       S       S
    fear         S/C+    S/C+    S/C-    S/C+    S
    neutral      S       S       S       S       S
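    Read as hard constraints, the table can be queried directly; the evidence-based combination on the following slides refines this. A toy sketch (the symbolic-state encoding as strings is an assumption):

```python
# Rules table from this slide: for each expression, the expected symbolic
# state of D1..D5 ("S/C+" means either state is compatible).
RULES = {
    "joy":      ["C-",   "S/C-", "C+",   "C+",   "C-"],
    "surprise": ["C+",   "C+",   "C-",   "C+",   "C+"],
    "disgust":  ["C-",   "C-",   "S/C+", "C+",   "S/C-"],
    "anger":    ["C+",   "C-",   "S",    "S/C-", "S"],
    "sadness":  ["C-",   "C+",   "S",    "S",    "S"],
    "fear":     ["S/C+", "S/C+", "S/C-", "S/C+", "S"],
    "neutral":  ["S",    "S",    "S",    "S",    "S"],
}

def compatible_expressions(states):
    """Return the expressions whose rule row matches the observed states
    of D1..D5. An empty result corresponds to the reject class E8."""
    return [expr for expr, row in RULES.items()
            if all(s in cell.split("/") for s, cell in zip(states, row))]

print(compatible_expressions(["C+", "C+", "C-", "C+", "C+"]))  # ['surprise']
```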

  • Slide 36/63

    Facial expressions recognition: evidence mass distribution and modelling

    To each Di is associated a mass of evidence $m_{D_i}: 2^\Omega \to [0, 1]$, $A \mapsto m_{D_i}(A)$.

    Modelling: the thresholds (a..h) related to each Di are estimated in a training step (analysis of the distance evolutions for 4 facial expressions and 13 different persons).
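    A sketch of one way such a mass model could look, assuming piecewise-linear (trapezoidal) shapes between learned thresholds; both the shape and the numerical thresholds below are assumptions, not values from the training step.

```python
def trapezoid(x, a, b, c, d):
    """Piecewise-linear membership: rises on [a, b], flat on [b, c],
    falls on [c, d]. The trapezoidal shape is an assumption here."""
    if x <= a or x >= d:
        return 0.0
    if b <= x <= c:
        return 1.0
    return (x - a) / (b - a) if x < b else (d - x) / (d - c)

def mass_from_distance(d_ratio):
    """Toy mass distribution over the states C-, S, C+ for one distance,
    expressed as the ratio Di / Dni; thresholds are illustrative."""
    m = {"C-": trapezoid(d_ratio, 0.0, 0.0, 0.7, 0.9),
         "S":  trapezoid(d_ratio, 0.7, 0.9, 1.1, 1.3),
         "C+": trapezoid(d_ratio, 1.1, 1.3, 2.0, 2.0)}
    total = sum(m.values()) or 1.0
    return {k: v / total for k, v in m.items()}    # masses sum to 1

print(mass_from_distance(1.2))   # doubt shared between S and C+
```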

  • Slide 37/63

    Facial expressions recognition: method principle (1)

    1. Measure the distances Di and determine their symbolic states.
    2. Compute the mass distribution for each Di state.
    3. With the basis of rules, compute the evidence mass for each expression and each Di.
    4. Combine the evidence mass distributions so as to take all the measures into account before taking a decision.

  • Slide 38/63

    Facial expressions recognition: method principle (2)

    Mass of evidence combination (orthogonal sum):

    $m_D(A) = (m_{D_1} \oplus m_{D_2})(A) = \sum_{B \cap C = A} m_{D_1}(B) \cdot m_{D_2}(C)$

    where A, B and C are expressions or subsets of expressions.

    Reject class (E8: unknown): an expression which is different from all the expressions described in the rules table.

  • Slide 39/63

    Facial expressions recognition: method principle (3)

  • Slide 40/63

    Facial expressions recognition: results (1)

    [Figures: example results on the Hammal-Caplier database and the Cohn-Kanade database.]

  • Slide 41/63

    Facial expressions recognition: results (2)

    neutral: 100%; unknown: 100%; joy: 100%

    3 frames from neutral to joy => transitory expression.

  • Slide 42/63

    Facial expressions recognition: results (3)

    [Table: confusion matrix on the Hammal-Caplier database (21 samples for each considered expression, 630 images). Rows: system output (E1 joy, E2 surprise, E3 disgust, E7 neutral, doubt E1/E3, doubt E2/E6, E8 unknown, other); columns: presented expression (joy, surprise, disgust, neutral). Total recognition rates per expression: 84.44%, 51.72%, 88%, 87.26%.]

  • Slide 43/63

    Facial expressions analysis

    1. Assumptions
    2. Facial features segmentation: low-level data
    3. Facial expressions recognition: high-level interpretation
    4. Facial expression recognition based on audio: a step towards a multimodal system

  • Slide 44/63

    Facial expressions analysis based on audio (collaboration with Mons University)

    Idea: expressions leave a characteristic signature in the speech signal => use statistical speech features such as speech rate, SPI, energy and pitch.

    Problem: the expression classes are different for speech. After a preliminary study, 2 classes are suitable: active (joy, surprise, anger) and passive (neutral, sadness).

    Perspectives: definition of a multimodal system for facial expression recognition.

  • Slide 45/63

    Outline

    1. Global posture recognition
    2. Facial expressions recognition
    3. Head motion analysis and interpretation
    4. Conclusion

    Which expression? Which head motion? Which posture?

  • Slide 46/63

    Head motion interpretation

    1. Introduction
    2. Head motion estimation: biological modelling
    3. Examples of head motion interpretation

  • Slide 47/63

    Introduction

    Idea: global head motions such as nods, and local facial motions such as blinking, are involved in the human-to-human communication process.

    Aim: automatic analysis and interpretation of such gestures.

  • Slide 48/63

    Head motion interpretation

    1. Introduction
    2. Head motion estimation: biological modelling
    3. Examples of head motion interpretation

  • Slide 49/63

    Head motion estimation: biological modelling

    Algorithm overview: human visual system modelling

  • Slide 50/63

    Head motion estimation: retina filtering

    OPL stage: spatio-temporal filtering
    - contour enhancement
    - noise attenuation
    - removal of illumination variations

    IPL stage: temporal high-pass filtering dedicated to moving stimuli
    - extraction of moving contours (perpendicular to the motion direction)
    - removal of static contours
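    A very reduced sketch of the two stages, standing in for the full retina model: spatial smoothing as a stand-in for the OPL filter, and subtraction of a running temporal average as the IPL high-pass. Parameters and structure are illustrative.

```python
import numpy as np

def box_blur(img, k=3):
    """Crude spatial smoothing, standing in for the OPL contour-enhancing
    spatio-temporal filter of the real retina model."""
    pad = k // 2
    p = np.pad(img.astype(float), pad, mode="edge")
    out = np.zeros(img.shape, dtype=float)
    for dy in range(k):
        for dx in range(k):
            out += p[dy:dy + img.shape[0], dx:dx + img.shape[1]]
    return out / (k * k)

class RetinaSketch:
    """OPL + IPL sketch: smooth each frame spatially, then subtract a
    running temporal average so that only moving contours survive."""
    def __init__(self, tau=0.8):
        self.tau, self.avg = tau, None   # tau: illustrative memory factor
    def process(self, frame):
        opl = box_blur(frame)
        if self.avg is None:
            self.avg = opl.copy()
        ipl = opl - self.avg             # temporal high-pass: static parts cancel
        self.avg = self.tau * self.avg + (1.0 - self.tau) * opl
        return ipl
```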

  • Slide 51/63

    Head motion estimation: log-polar spectrum computation

    Computation of the spectrum of the retina-filtered image in the log-polar domain => the spectrum becomes easier to analyse:
    - roll and zoom = global translations of the energy spectrum
    - pan and tilt = local translations of the energy spectrum
    - translations = no change in the energy spectrum from frame to frame
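    A minimal sketch of the log-polar spectrum and of the cumulated energy-per-orientation curve used on the following slides; the grid sizes and the nearest-neighbour resampling are assumptions.

```python
import numpy as np

def logpolar_spectrum(img, n_theta=90, n_rho=32):
    """Amplitude spectrum of the (retina-filtered) image resampled on a
    log-polar grid, plus the cumulated energy per orientation."""
    f = np.abs(np.fft.fftshift(np.fft.fft2(img)))
    cy, cx = np.array(f.shape) // 2
    rho_max = np.log(min(cx, cy))
    theta = np.linspace(0.0, np.pi, n_theta, endpoint=False)
    rho = np.exp(np.linspace(0.0, rho_max, n_rho))
    t, r = np.meshgrid(theta, rho)
    xs = np.clip((cx + r * np.cos(t)).astype(int), 0, f.shape[1] - 1)
    ys = np.clip((cy + r * np.sin(t)).astype(int), 0, f.shape[0] - 1)
    lp = f[ys, xs]                       # rows: log-radius, cols: orientation
    return lp, theta, lp.sum(axis=0)     # cumulated energy per orientation

# The abscissa of the peak of the orientation curve estimates the
# direction of the spectral energy (perpendicular to the moving contours):
img = np.zeros((128, 128)); img[:, 60:68] = 1.0      # vertical contours
_, theta, e = logpolar_spectrum(img)
print("dominant orientation (deg):", np.degrees(theta[np.argmax(e)]))  # ~0
```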

  • Slide 52/63

    Head motion estimation: log-polar spectrum interpretation (1)

    The maxima of energy lie on the contours perpendicular to the motion direction => cumulated curve of energy per orientation.

    The abscissa of the maxima gives the motion direction; the temporal evolution of the abscissa gives the motion type; the amplitude of the maximum is proportional to the motion amplitude => the energy decreases, or vanishes, when the motion stops.

  • Slide 53/63

    Head motion estimation: log-polar spectrum interpretation (2)

  • Slide 54/63

    Head motion estimation: log-polar spectrum interpretation (3)

    Each minimum of energy is related to a motion stop.

  • Slide 55/63

    Head motion estimation: log-polar spectrum interpretation (4)

    To summarize, properties of the retina-filtered energy spectrum in the log-polar domain:
    - maxima of energy are associated with the contours perpendicular to the motion
    - no motion = no energy
    - orientation of the energy maxima = motion direction
    - movement of the energy maxima = motion type

  • Slide 56/63

    Head motion interpretation

    1. Introduction
    2. Head motion estimation: biological modelling
    3. Examples of head motion interpretation

  • Slide 57/63

    Head nods of approbation or negation (1)

    Idea: detection of periodic head motions ("I am still with you").

    Approach: put a biological head motion detector on a face bounding box and monitor all the head movements.

    Goal: recognition of head nods of approbation and negation.
    - approbation: periodic head tilting
    - negation: periodic head panning

    A sketch of such a periodicity test is given below.
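    The periodicity can be tested, for instance, on the temporal signal produced by the motion detector. A sketch using a dominant-peak test on the Fourier spectrum; the nodding frequency band and the peak-to-mean ratio are illustrative assumptions.

```python
import numpy as np

def is_periodic(signal, fps=30.0, fmin=0.5, fmax=4.0, ratio=3.0):
    """Flag a head-motion signal (e.g. the motion energy over time) as
    periodic when one frequency in a plausible nodding band dominates."""
    x = np.asarray(signal, dtype=float)
    x = x - x.mean()                            # drop the DC component
    spec = np.abs(np.fft.rfft(x))
    freqs = np.fft.rfftfreq(len(x), d=1.0 / fps)
    band = (freqs >= fmin) & (freqs <= fmax)
    if not band.any() or spec.mean() == 0:
        return False
    return spec[band].max() > ratio * spec.mean()

# A 2 Hz nod sampled at 30 fps should be flagged; noise should not:
t = np.arange(90) / 30.0
print(is_periodic(np.sin(2 * np.pi * 2.0 * t)))               # True
print(is_periodic(np.random.default_rng(0).normal(size=90)))  # likely False
```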

  • Slide 58/63

    Head nods of approbation or negation (2)

  • Slide 59/63

    Blinking detection

    Blink: vertical motion of the eyelid.
    Approach: put a biological motion detector on a bounding box around the eyes.

  • Slide 60/63

    Yawning detection

    Yawn: vertical motion of the mouth.
    Approach: put a biological motion detector on a bounding box around the mouth.

  • Slide 61/63

    Hypovigilance detection

    Hypovigilance:
    - short or long eye closings
    - multiple head rotations
    - frequent yawning

    Approach: combine the information coming from the 3 biological motion detectors, as sketched below.
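    A toy sketch of such a combination; the decision rule and all thresholds are invented for illustration and are not from the presentation.

```python
def hypovigilance_alarm(blink_durations, head_rotations, yawns, window=60.0):
    """Toy fusion of the three detector outputs over a time window (s):
    a long eye closure, or a restless head combined with frequent yawning,
    raises the alarm."""
    long_closure = any(d > 0.5 for d in blink_durations)   # closures in seconds
    restless_head = head_rotations >= 3
    frequent_yawns = yawns / (window / 60.0) >= 2          # yawns per minute
    return long_closure or (restless_head and frequent_yawns)

print(hypovigilance_alarm([0.1, 0.8], head_rotations=1, yawns=0))  # True
```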

  • Slide 62/63

    Outline

    1. Global posture recognition
    2. Facial expressions recognition
    3. Head motion analysis and interpretation
    4. Conclusion

    Which expression? Which head motion? Which posture?

  • Slide 63/63

    Conclusion

    Analysis and recognition of human activities based on video and image data.

    Unified approach: extraction of low-level data and a fusion process for high-level semantic interpretation.

    Correlation with applications; example: project 4 of the eNTERFACE Workshop, on detecting the attention level of a driver.