Detection and tracking of pianist hands and fingers

Dmitry O. Gorodnichy (1) and Arjun Yogeswaran (2, 3)
1 Institute for Information Technology, National Research Council Canada
2 School of Information Technology and Engineering, University of Ottawa
3 Department of Music, University of Ottawa
http://synapse.vit.iit.nrc.ca/piano

Canadian Conference on Computer and Robot Vision (CRV'06), Quebec City, QC, Canada, June 7-9, 2006



Goals

1. To recognize pianist hands (left or right) and fingers (1, 2, 3, 4, 5) as s/he plays a piano.
2. To "see" each (otherwise "blind") MIDI event:
– old way: pitch, volume, etc.
– new way: + person, hand, finger

Examples of applications:
1. Intelligent MIDI record/replay: "Play MIDI of the left hand only"
2. Write finger number (suggestion) on top of each played note
3. Augmentation & virtualization of piano performance
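To make the "old way / new way" distinction concrete, here is a minimal sketch of an annotated MIDI event and of the "left hand only" replay filter. The class name, field names and the "L"/"R" labels are illustrative assumptions, not the format used by the authors' software.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class AnnotatedMidiEvent:
    # "Old way": what a plain MIDI note event already carries.
    pitch: int                      # MIDI note number, 0-127
    velocity: int                   # key velocity ("volume"), 0-127
    time_ms: int                    # hypothetical timestamp field
    # "New way": labels added by video recognition (None if undetermined).
    person: Optional[str] = None
    hand: Optional[str] = None      # assumed labels: "L" or "R"
    finger: Optional[int] = None    # 1..5


def left_hand_only(events: List[AnnotatedMidiEvent]) -> List[AnnotatedMidiEvent]:
    """Keep only the events annotated as played by the left hand."""
    return [e for e in events if e.hand == "L"]


events = [
    AnnotatedMidiEvent(pitch=48, velocity=70, time_ms=0, hand="L", finger=5),
    AnnotatedMidiEvent(pitch=64, velocity=80, time_ms=10, hand="R", finger=1),
    AnnotatedMidiEvent(pitch=55, velocity=60, time_ms=500),  # hand/finger unknown
]
left = left_hand_only(events)
```

Events whose hand could not be determined are dropped by this filter; a replay tool might instead want to keep them and flag them.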

Motivation: for Computer Vision

Unique unbiased testbed for hand/finger detection.

• In other applications (HCI, robotics, sign language), hands and fingers move in order to be detected (i.e. to send visual information).
• Pianists' hands/fingers are not used to send visual information. They are extremely flexible and have an unlimited set of states.


Motivation: for Music Teaching

1. Video-conferencing (VC) for distant piano learning.
• A conventional session includes transmission of a video image only.
• Video recognition technology allows one to also transmit the annotated video image.

Also for:

2. storing detailed information regarding music pieces

3. searchable databases (as in [4])

4. facilitating the production of music sheets

5. score driven synthetic hand/finger motion generation (as in [9])


Setup, Video Input, Recognition Output

• In the Piano Pedagogy studio lab (with MIDI-equipped grand piano)
• In a home environment (with Yamaha MIDI keyboard)

Camera view from above

What the computer would do: keyboard rectification, key recognition, hand detection, finger detection


Step 1a. Image rectification

1. Top and bottom corners of the black keys are detected: the lowest and highest points of dilated black blobs that satisfy a ratio criterion.

2. Two lines are fitted to the detected corners.

3. The image is rotated to make these lines parallel to Ox, and the image part between them is cut out.

Step 1b. Recognizing the "C" key

4. Black blobs are counted to detect "C".
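One way to count black blobs to find "C" relies on the keyboard layout: black keys come in alternating groups of two and three, and C is the white key just to the left of each group of two. The sketch below assumes a rectified image in which pitch increases from left to right, and takes the x-centres of the black blobs as input. The function names, the gap threshold and the half-gap offset are illustrative assumptions rather than the authors' procedure.

```python
from statistics import median
from typing import List


def group_black_keys(xs: List[float], gap_factor: float = 1.5) -> List[List[float]]:
    """Split black-key x-centres into groups wherever a wide gap occurs."""
    xs = sorted(xs)
    if not xs:
        return []
    gaps = [b - a for a, b in zip(xs, xs[1:])]
    if not gaps:
        return [xs]
    # Most gaps are the narrow in-group ones, so the median gap is narrow.
    threshold = gap_factor * median(gaps)
    groups = [[xs[0]]]
    for x, gap in zip(xs[1:], gaps):
        if gap > threshold:
            groups.append([x])      # wide gap: a new group starts here
        else:
            groups[-1].append(x)
    return groups


def estimate_c_positions(xs: List[float]) -> List[float]:
    """Approximate x-centre of each C key: half a narrow gap left of a 2-group."""
    ordered = sorted(xs)
    gaps = [b - a for a, b in zip(ordered, ordered[1:])]
    if not gaps:
        return []
    narrow = median(gaps)
    # Note: a group cut off by the image border may look like a 2-group;
    # a fuller version would discard groups touching the border.
    return [g[0] - 0.5 * narrow for g in group_black_keys(ordered) if len(g) == 2]


# Two octaves of black keys, in units of one white-key width.
black_x = [1, 2, 4, 5, 6, 8, 9, 11, 12, 13]
c_keys = estimate_c_positions(black_x)
```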


Step 2a. Hand detection

1. A background model of the piano (detected in Step 1) is maintained: IBG / DBG += new data AND I(no motion over several frames*), i.e. the background image IBG and its per-pixel variation DBG are updated only where no motion has been seen over several frames.

2. Hands are detected as foreground: FG = |I − IBG| > 2 * DBG.

3. When they are detected:
– the skin model is updated (UCS-masked HCr 2D histogram)
– the number of hands is detected by K-means clustering
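The background update and the foreground test FG = |I − IBG| > 2 * DBG can be sketched per pixel as follows. This is only an illustration under assumptions: DBG is treated as a running mean absolute deviation, the image is a flat list of grey values (a real implementation would use an array library), and `alpha`, `still_frames`, `motion_eps` and `min_dev` are hypothetical parameters.

```python
from typing import List


class BackgroundModel:
    def __init__(self, first_frame: List[float], alpha: float = 0.05,
                 still_frames: int = 5, motion_eps: float = 3.0,
                 min_dev: float = 2.0) -> None:
        self.mean = [float(v) for v in first_frame]   # I_BG
        self.dev = [min_dev] * len(first_frame)       # D_BG (assumed deviation)
        self.prev = list(first_frame)
        self.still = [0] * len(first_frame)           # frames without motion
        self.alpha = alpha
        self.still_frames = still_frames
        self.motion_eps = motion_eps
        self.min_dev = min_dev

    def foreground(self, frame: List[float]) -> List[bool]:
        """FG = |I - I_BG| > 2 * D_BG, evaluated per pixel."""
        return [abs(v - m) > 2.0 * d
                for v, m, d in zip(frame, self.mean, self.dev)]

    def update(self, frame: List[float]) -> None:
        """Blend new data into the model only where nothing moved recently."""
        for i, v in enumerate(frame):
            if abs(v - self.prev[i]) <= self.motion_eps:
                self.still[i] += 1
            else:
                self.still[i] = 0
            if self.still[i] >= self.still_frames:
                diff = abs(v - self.mean[i])
                self.mean[i] += self.alpha * (v - self.mean[i])
                self.dev[i] = max(self.min_dev,
                                  self.dev[i] + self.alpha * (diff - self.dev[i]))
            self.prev[i] = v


model = BackgroundModel([100, 100, 100, 100])
for _ in range(6):
    model.update([100, 100, 100, 100])       # empty keyboard
hand_frame = [100, 100, 30, 100]             # a hand covers pixel 2
mask = model.foreground(hand_frame)
model.update(hand_frame)
```

With this rule a hand that rests motionless for longer than `still_frames` would slowly be absorbed into the background, so the threshold has to be chosen with the playing style in mind.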


Step 2b. Hand Tracking

Technique: deformable box-shaped template,

1. where only gradual changes of (x, y, h, w, Vx, Vy) are allowed (compared to the previous frame)

2. initialized by
• foreground detection, or
• skin colour tracking (by backprojecting the 2D histogram of HCr learnt in Step 2a)
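Histogram backprojection can be sketched as follows: a 2D histogram over (H, Cr) is learnt from pixels known to be skin, and each new pixel is then scored by the value of its histogram bin. The bin count, the 0..255 scaling of both channels and the function names are assumptions made for the example.

```python
from typing import List, Sequence, Tuple

BINS = 16            # assumed number of bins per channel
STEP = 256 // BINS   # assumes H and Cr are both scaled to 0..255


def learn_skin_histogram(skin_pixels: Sequence[Tuple[int, int]]) -> List[List[float]]:
    """Build a 2D (H, Cr) histogram from skin pixels, scaled so the peak is 1."""
    hist = [[0.0] * BINS for _ in range(BINS)]
    for h, cr in skin_pixels:
        hist[h // STEP][cr // STEP] += 1.0
    peak = max(max(row) for row in hist)
    if peak > 0.0:
        hist = [[count / peak for count in row] for row in hist]
    return hist


def backproject(pixels: Sequence[Tuple[int, int]],
                hist: List[List[float]]) -> List[float]:
    """Score each (H, Cr) pixel by the value of the histogram bin it falls in."""
    return [hist[h // STEP][cr // STEP] for h, cr in pixels]


skin_samples = [(10, 150), (12, 152), (11, 155), (100, 40)]
skin_hist = learn_skin_histogram(skin_samples)
scores = backproject([(9, 149), (200, 200), (101, 41)], skin_hist)
```

Thresholding the scores gives a skin mask from which the box template can be initialized.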

Foreground detection extracts blobs corresponding to hand images (left column).
Hand template tracking allows one to detect partially occluded hands (right column).
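The "only gradual changes" constraint is what lets the template survive partial occlusion: when the measured blob suddenly shrinks or jumps, the box follows it only by a bounded step. A minimal sketch of such an update is given below; the limits `max_shift`, `max_resize` and `max_accel` and the update rule itself are illustrative assumptions, not the authors' tracker.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass
class HandBox:
    x: float
    y: float
    w: float
    h: float
    vx: float = 0.0
    vy: float = 0.0


def _clamp(value: float, limit: float) -> float:
    """Restrict value to the interval [-limit, +limit]."""
    return max(-limit, min(limit, value))


def update_box(box: HandBox, measured: Tuple[float, float, float, float],
               max_shift: float = 8.0, max_resize: float = 4.0,
               max_accel: float = 3.0) -> HandBox:
    """Move the box towards the measured blob, bounding every per-frame change."""
    mx, my, mw, mh = measured
    # Velocity may differ from the previous velocity by at most max_accel.
    vx = box.vx + _clamp((mx - box.x) - box.vx, max_accel)
    vy = box.vy + _clamp((my - box.y) - box.vy, max_accel)
    # Position and size may change by at most max_shift / max_resize.
    dx = _clamp(vx, max_shift)
    dy = _clamp(vy, max_shift)
    return HandBox(
        x=box.x + dx,
        y=box.y + dy,
        w=box.w + _clamp(mw - box.w, max_resize),
        h=box.h + _clamp(mh - box.h, max_resize),
        vx=dx,
        vy=dy,
    )


start = HandBox(x=100, y=50, w=60, h=80)
small_move = update_box(start, (102, 51, 61, 80))   # ordinary motion
occluded = update_box(start, (160, 50, 20, 80))     # blob jumps and shrinks
```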


Step 3. Detecting fingers

Pianist fingers:

• Unlike in other applications, these fingers are never protruded! They are mostly bent towards the keyboard (away from the camera), often touching and occluding each other, and tightly grouped together.

• Low-resolution video makes it even more difficult to separate them.

However: in the camera image these fingers are seen as convex objects!

Once hands are detected, fingers are detected by a new edge detection technique that scans hand areas searching for crevices.


Crevice-detection operator

Conventional edge detection (Canny, Harris) doesn't use a priori information about finger shapes => it returns too many or too few pixels.

Definition: crevices are locations in the image where two convex shapes meet.

Finger edges are detected using the crevice-detection operator.

Crevice-detection operator: scans I(x) in one direction and marks a single pixel x*, where I(x), after going down, goes up.

Requires post-processing:

• Method 1: merging adjacent pixels

• Method 2: filling a blob in between two “crevice edge pixels” x1* and x2* on the same line
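A one-dimensional sketch of the operator and of post-processing Method 2 is shown below. It scans one image row, marks the bottom of each dip as x*, and then fills the pixels between consecutive crevice pixels on that row. The `min_depth` noise threshold and the function names are assumptions added for the example; the slides do not specify how shallow dips are handled.

```python
from typing import List


def crevice_pixels(row: List[float], min_depth: float = 30.0) -> List[int]:
    """Mark x* where I(x), after going down by min_depth, goes up by min_depth."""
    marks: List[int] = []
    if not row:
        return marks
    high = low = row[0]    # last peak, and lowest value seen since that peak
    low_x = 0
    for x in range(1, len(row)):
        v = row[x]
        if v < low:
            low, low_x = v, x                  # still going down
        elif high - low >= min_depth and v - low >= min_depth:
            marks.append(low_x)                # went down, now going up
            high = low = v
            low_x = x
        elif v > high:
            high = low = v                     # new peak: restart the descent
            low_x = x
    return marks


def fill_between_crevices(marks: List[int], width: int) -> List[bool]:
    """Method 2: fill the pixels between consecutive crevice pixels on a line."""
    mask = [False] * width
    for x1, x2 in zip(marks, marks[1:]):
        for x in range(x1 + 1, x2):
            mask[x] = True
    return mask


# Bright convex fingers separated by dark crevices along one scan line.
scan_line = [200, 150, 60, 150, 200, 150, 60, 150, 200]
crevices = crevice_pixels(scan_line)
finger_mask = fill_between_crevices(crevices, len(scan_line))
```

Applying the scan to every row of a detected hand area yields crevice pixels that line up along finger boundaries, which is where Method 1 (merging adjacent pixels) would apply.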


Step 4. Associating to MIDI events

C-MIDI program interface

When a MIDI signal is received (i.e. a piano key was pressed), the hand and finger that are believed to have pressed the key are shown (the hand is highlighted in red; the finger number is shown at the top of the image).
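The association itself can be sketched as a nearest-neighbour lookup: the MIDI note number is mapped to the x-centre of its detected key, and the closest detected finger is reported. The `key_centres` map, the `max_dist` cut-off and the function name are hypothetical; fingers are numbered from right to left in the image and hands are labelled 1 or 2, following the labelling of the C-MIDI annotation window.

```python
from typing import Dict, List, Optional, Tuple


def annotate_note(note: int,
                  key_centres: Dict[int, float],
                  hands: Dict[int, List[float]],
                  max_dist: float = 10.0) -> Optional[Tuple[int, int]]:
    """Return (hand, finger) for a MIDI note, or None if undetermined.

    key_centres maps a MIDI note number to the x-centre of its key in the image;
    hands maps a hand label (1 or 2) to the x-positions of its detected fingers.
    """
    key_x = key_centres.get(note)
    if key_x is None:
        return None
    best: Optional[Tuple[int, int]] = None
    best_dist = max_dist
    for hand, finger_xs in hands.items():
        # Fingers are numbered 1..5 from right to left in the image.
        ordered = sorted(finger_xs, reverse=True)
        for number, fx in enumerate(ordered, start=1):
            dist = abs(fx - key_x)
            if dist <= best_dist:
                best, best_dist = (hand, number), dist
    return best


key_centres = {60: 100.0, 62: 120.0, 64: 140.0, 72: 400.0}   # 60 = middle C
hands = {1: [20.0, 35.0, 50.0, 65.0, 80.0],
         2: [98.0, 118.0, 139.0, 160.0, 181.0]}
label_c = annotate_note(60, key_centres, hands)
label_e = annotate_note(64, key_centres, hands)
label_far = annotate_note(72, key_centres, hands)
```

Returning None when no finger is close enough mirrors the rule that the annotation is omitted when the finger cannot be determined.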


C-MIDI output

Windows on the GUI screen show (in clockwise order from top):

• image captured by camera;
• computed background image of the keyboard (used to detect hands as foreground);
• binarized image (used i. to detect black keys, ii. for video-MIDI calibration);
• automatically detected piano keys (highlighted as white rectangles);
• segmented blobs in foreground images (coloured by # of blobs);
• final finger and hand detection results (shown upside down, as the camera views them, at top left, and vertically flipped for viewing by a pianist, at bottom middle)
  – The label of the finger that played a key is shown at the top of the image.
• results of vision-based MIDI annotation (in a separate window at bottom right): each received MIDI event receives a visual label
  – for hand (either 1 or 2, i.e. left or right)
  – for finger (either 1, 2, 3, 4, or 5, counted from right to left) that played it.

When the finger cannot be determined, the annotation is omitted.


DEMO (Recorded LIVE)

• Three music pieces
– of increasing complexity (speed, finger/hand motion)
– played by a professional piano teacher.

Limitations

Temporal boundaries: video processing is practically real-time: annotating 1/8 notes at MM 160 (and faster) is possible

Spatial boundaries: 4 (5) octaves; small (10-year-olds') hands are borderline

Behavioral: overlapping of hands, overlapping by head, etc.

Environmental (lighting, shadows, colours): different pianos, different auditoriums, different hand colours

Acknowledgements

- Partially supported by SSHRC and CFI grants to UofO Music Dept.

- MIDI events reader coding helped by Mihir Sharma (SITE, UofO)

- Team members' influences: Gilles Comeau (UofO), Emond Bruno and Martin Brooks (IIT, NRC)