PhoneGuide: Adaptive Image Recognition on Mobile Phones
Erich Bruns and Oliver Bimber
EG - Mobile Graphics Workshop 2009


Source: people.csail.mit.edu/kapu/EG_09_MGW/EG-workshop_PhoneGuide.pdf


PhoneGuide

1. Take a photo of the exhibit
2. The phone detects the correct exhibit (classification)
3. Receive multimedia information

Motivation

• Enhanced museum guidance
  • Rich multimedia presentations (images, video, audio, text, computer graphics)
  • Visitors use their own mobile devices
  • Little or no acquisition and maintenance cost for museums
• Challenges
  • Uncontrolled public environments
  • Varying illumination and user behavior
  • Keeping computation time low
• Why not labels (numbers, barcodes) [Rohs et al. '05]?
  • They distract from the overall appearance
• Why not electronic tags [Chang et al. '08]?
  • Not possible for densely located objects in showcases, etc.

Motivation

• Why not client-server based [Ruf et al. '08, Luley et al. '05, Hare et al. '04]?
• Advantages of on-device recognition
  • Decentralized, scalable
  • Classification requests are processed in parallel, independent of the number of users
  • No sophisticated infrastructure, which saves costs
  • Independent of network coverage ("Museum 411")
• Why adaptive?
  • Recent approaches port existing desktop algorithms to mobile devices without taking much account of the benefits of these devices [Bay et al. '06, Takacs et al. '08]
  • In contrast, we think that if the context of mobile phones is considered, classification performance can be improved and can outperform related traditional approaches
• Advantages of mobile phones for object recognition applications
  • Location-based (e.g. information is retrieved at/for particular locations [Paletta et al. '08])
  • User-centered (e.g. visitors can give feedback on recognized objects; the system should distinguish between important and less important exhibits)
  • Multi-user system (e.g. multiple devices can interoperate)

Overview

[Diagram labels: image recognition; adaptation; objects; subobjects; user interface; pervasive tracking; P2P communication; pervasive storage*; pathway awareness*]

* in progress

Technical Overview

• Server: preprocessing
  - Keyframe extraction
  - Clustering
  - Neural network training
  - …
• Mobile phones: image recognition
  - User feedback
  - Sensor awareness
  - Multimedia presentation
  - …
• Server and mobile phones communicate twice: at the beginning and at the end of a visit

Image Recognition (Objects) - Preprocessing

• Capturing video from different perspectives
• Keyframe extraction
• Clustering
• M patches: compute 40 global color features per patch (histogram, mean, variance)
• Train M 3-layer neural networks (for each patch)
• Transfer once: the neural networks are used for on-device object recognition

Mobile Phone Enabled Museum Guidance with Adaptive Classification
E. Bruns, B. Brombach and O. Bimber, IEEE Computer Graphics and Applications, vol. 28, no. 4, pp. 98-102, July/August 2008

Adaptive Training of Video Sets for Image Recognition on Mobile Phones
E. Bruns and O. Bimber, Journal of Personal and Ubiquitous Computing, 2008
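
The per-patch feature computation could be sketched roughly as follows. This is an illustrative Python sketch, not the authors' Java ME code: the grid size, the number of histogram bins, and the names `split_into_patches` and `patch_features` are assumptions. The slides do not say how the 40 features are composed, so this sketch simply yields 3 × (bins + 2) values per patch.

```python
from statistics import mean, pvariance

def split_into_patches(image, rows, cols):
    """Split an image (list of rows of (r, g, b) tuples) into rows*cols patches.

    Each patch is returned as a flat list of pixels.
    """
    height, width = len(image), len(image[0])
    patches = []
    for pr in range(rows):
        for pc in range(cols):
            y0, y1 = pr * height // rows, (pr + 1) * height // rows
            x0, x1 = pc * width // cols, (pc + 1) * width // cols
            patches.append([image[y][x] for y in range(y0, y1) for x in range(x0, x1)])
    return patches

def patch_features(pixels, bins=8):
    """Global color features of one patch: per-channel histogram, mean, variance."""
    features = []
    for channel in range(3):
        values = [p[channel] for p in pixels]
        histogram = [0] * bins
        for v in values:
            histogram[min(v * bins // 256, bins - 1)] += 1
        # Normalize so the histogram does not depend on the patch size.
        features.extend(count / len(values) for count in histogram)
        features.append(mean(values) / 255.0)
        features.append(pvariance(values) / 255.0 ** 2)
    return features

# Usage: a synthetic 160x120 image split into a hypothetical 4x3 grid (M = 12).
image = [[(x % 256, y % 256, (x + y) % 256) for x in range(160)] for y in range(120)]
patches = split_into_patches(image, rows=3, cols=4)
vectors = [patch_features(p) for p in patches]
```

Each entry of `vectors` would be the input to the neural network trained for that patch position.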

Image Recognition (Objects) - Classification

• Capture one image and split it into M patches
• Compute 40 features per patch (histogram, mean, variance)
• Evaluate the M 3-layer neural networks on the patch features
• Combine the network outputs by average voting (∑) to obtain the result
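
The average-voting step can be illustrated with a short Python sketch. It assumes that each of the M patch networks returns one score per object; the function name `average_vote` and the example numbers are hypothetical.

```python
def average_vote(patch_outputs):
    """Average the per-patch network outputs and rank the objects.

    patch_outputs: list of M lists, each holding one score per object.
    Returns (object_index, averaged_score) pairs, best first.
    """
    num_objects = len(patch_outputs[0])
    averaged = [
        sum(output[obj] for output in patch_outputs) / len(patch_outputs)
        for obj in range(num_objects)
    ]
    return sorted(enumerate(averaged), key=lambda pair: pair[1], reverse=True)

# Usage: outputs of M = 3 hypothetical patch networks for 4 objects.
outputs = [
    [0.1, 0.7, 0.1, 0.1],
    [0.2, 0.5, 0.2, 0.1],
    [0.6, 0.2, 0.1, 0.1],
]
ranking = average_vote(outputs)
best_object, best_score = ranking[0]
```

Averaging lets a single misleading patch (the third network above) be outvoted by the others.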

Image Recognition (Objects) - Results

• Field study with 139 objects in the City Museum of Weimar
  • Recognition rate: 92.6% for an experienced user
  • Recognition rate: 82.0% for 15 inexperienced users
• Computation time (Nokia N95, Java ME)
  • Feature extraction: ~400 ms (160x120 pixels)
  • Classification (neural networks): ~100 ms
  • No scaling with increasing number of training samples
  • Little scaling with increasing number of objects

Image Recognition (Subobjects)

• Motivation
  • Global features do not allow detecting multiple objects in a single image
• Problem of local image features
  • Local image features (e.g. SIFT/SURF) are computationally expensive and do not allow detecting small or completely occluded objects
• Idea
  • Carry out a consecutive classification step based on image classification and spatial relationships (defined by angles and distances) for identifying multiple objects in a single image

Subobject Detection through Spatial Relationships on Mobile Phones
B. Brombach, E. Bruns and O. Bimber, International Conference on Intelligent User Interfaces (IUI'09), 2009
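
A spatial relationship defined by an angle and a distance can be sketched in Python as below. This is a simplified illustration under stated assumptions: subobjects are reduced to their bounding-box centers, camera rotation is ignored, and the `scale` and `tolerance` parameters as well as all function names are hypothetical rather than taken from the paper.

```python
import math

def spatial_relationship(center_a, center_b):
    """Angle (radians) and distance from subobject A to subobject B."""
    dx, dy = center_b[0] - center_a[0], center_b[1] - center_a[1]
    return math.atan2(dy, dx), math.hypot(dx, dy)

def predict_center(center_a, relationship, scale=1.0):
    """Predict where B should be, given A's detected center and the stored SR."""
    angle, distance = relationship
    return (center_a[0] + scale * distance * math.cos(angle),
            center_a[1] + scale * distance * math.sin(angle))

def is_consistent(detected_b, predicted_b, tolerance):
    """Validate a detection: accept it if it lies close to the SR prediction."""
    return math.dist(detected_b, predicted_b) <= tolerance

# Usage: SR taken from a training frame, applied to a new image.
sr = spatial_relationship((40, 30), (100, 110))   # A -> B in the training frame
predicted = predict_center((50, 20), sr)          # A detected at (50, 20)
```

The same stored relationship serves both purposes: narrowing down where to search for B, and checking whether a detected B is plausible.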

Image Recognition (Subobjects) - Preprocessing

[Diagram labels: bounding boxes of subobjects in first frame; subobject tracking; non-subobjects; SR extraction; clustering of equally sized subobjects; training of neural network classifiers; subobject classifiers; group classifiers; data; transfer]

Image Recognition (Subobjects) - Classification

• Spatial relationships (SR)
  • Predict subobject locations
  • Validate detected subobjects
• Searching subobjects
  • Search in the given region at multiple scales
  • Start searching in the center of the region
  • Search mask iterates spirally (+ neighborhood search)
• If a subobject was found, apply SR
  • Use SR to refine search regions
  • The more SR can be applied, the better the search regions
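
The spiral iteration of the search mask might look like the following Python sketch. It covers only the spiral order starting at the region center; the multi-scale search and the neighborhood search from the slide are left out, and `spiral_positions`, `search_subobject`, and the `classify` callback are hypothetical names.

```python
def spiral_positions(width, height, step):
    """Yield (x, y) search-mask positions inside a width x height region,
    starting at the center and moving outward in a square spiral."""
    cx, cy = width // 2, height // 2
    x, y = 0, 0
    dx, dy = step, 0
    leg_length, steps_in_leg, legs_done = 1, 0, 0
    max_radius = max(width, height)
    while abs(x) <= max_radius and abs(y) <= max_radius:
        px, py = cx + x, cy + y
        if 0 <= px < width and 0 <= py < height:
            yield px, py
        x, y = x + dx, y + dy
        steps_in_leg += 1
        if steps_in_leg == leg_length:
            steps_in_leg = 0
            dx, dy = -dy, dx          # turn 90 degrees
            legs_done += 1
            if legs_done % 2 == 0:    # every second leg gets longer
                leg_length += 1

def search_subobject(region_size, classify, step=1):
    """Return the first position whose patch the classifier accepts, else None."""
    for position in spiral_positions(*region_size, step):
        if classify(position):
            return position
    return None
```

Starting at the center means the search stops early whenever the predicted location is close to the true one.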

Image Recognition (Subobjects) - Classification

• After a minimum number of subobjects are reliably detected
  • Apply only the subobject classifier
  • Increase emphasis of SR
• For the reliability functions of image classifiers and SR, as well as more details, see the paper


Image Recognition (Subobjects) - Results

• Laboratory experiment
  • 12 subobject groups, 3 to 8 subobjects per group (avg. 5.4)
  • 68% faster than subobject detection via brute force
  • 50% faster than brute force with early stopping
• User study
  • In the City Museum of Weimar, Germany
  • Recognition rate: 94.4% for an experienced user
  • Recognition rate: 85.9% for 15 inexperienced users
  • Max: 100%, min: 52.4%, per subobject group
• Computation time (Nokia N95, Java ME)
  • Average recognition time of 2.85 seconds
  • Max: 4.45 seconds, min: 1.25 seconds, per subobject group

Overview

[Overview diagram, repeated]

User Interface

• After an image has been classified, a probability-sorted list of result candidates is presented on the device
• The user can select the correct object through the user interface if the classification has failed
• The mapping of photographed image and real object ID is stored on the mobile phone and transferred to the server
• The server retrains the neural networks with the new images
• Wrong assignments are rejected through clustering

Continuous adaptation (supervised learning)
• Users' preferences
• Photographed perspectives

Opens opportunities for further adaptation techniques

[Figure: probability-sorted objects list]

Adaptive Training of Video Sets for Image Recognition on Mobile Phones
E. Bruns and O. Bimber, Journal of Personal and Ubiquitous Computing, 2008
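
The feedback loop on the phone side could be sketched as follows in Python. The names `candidate_list` and `FeedbackLog`, the object labels, and the `top_n` cutoff are assumptions for illustration; server-side retraining and the clustering-based rejection of wrong assignments are not shown.

```python
def candidate_list(probabilities, top_n=5):
    """Object IDs sorted by descending probability (the list shown to the user)."""
    ranked = sorted(probabilities.items(), key=lambda item: item[1], reverse=True)
    return [object_id for object_id, _ in ranked[:top_n]]

class FeedbackLog:
    """Stores image -> object ID mappings until they are transferred to the server."""

    def __init__(self):
        self.pending = []

    def record(self, image_id, chosen_object_id):
        self.pending.append((image_id, chosen_object_id))

    def flush(self):
        """Return all stored mappings and clear the log."""
        batch, self.pending = self.pending, []
        return batch

# Usage with hypothetical classifier output.
candidates = candidate_list({"vase": 0.15, "helmet": 0.70, "coin": 0.08, "map": 0.07}, top_n=3)
log = FeedbackLog()
log.record("img_0001", candidates[1])   # user corrects the result to the 2nd candidate
batch = log.flush()
```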

Overview

[Overview diagram, repeated]

Pervasive Tracking

• Motivation
  • Object recognition rate decreases with an increasing number of objects
• Idea
  • Distribute Bluetooth (BT) beacons for rough location estimation of museum visitors
  • For each location cell, train one neural network

[Diagram: two Bluetooth beacons with signal cell A, signal cell B, and the signal cell intersection (A ∩ B)]

Enabling Mobile Phones To Support Large-Scale Museum Guidance
E. Bruns, B. Brombach, T. Zeidler and O. Bimber, IEEE Multimedia, vol. 14, no. 2, pp. 16-25, April 2007
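
One way to picture the per-cell classifier selection is the Python sketch below. It assumes that a location cell is identified by the set of beacons currently in range; the beacon IDs, the classifier names, and the fallback are hypothetical, and Bluetooth discovery itself is not shown.

```python
def location_cell(visible_beacons):
    """A location cell is identified by the set of beacons currently in range,
    so the overlap of two signal cells (A ∩ B) forms a cell of its own."""
    return frozenset(visible_beacons)

# Hypothetical mapping from location cells to the network trained for each cell.
cell_classifiers = {
    frozenset({"A"}): "network_cell_A",
    frozenset({"B"}): "network_cell_B",
    frozenset({"A", "B"}): "network_cell_A_and_B",
}

def classifier_for_cell(visible_beacons, fallback="network_all_objects"):
    """Pick the classifier for the current cell; fall back if the cell is unknown."""
    return cell_classifiers.get(location_cell(visible_beacons), fallback)
```

Because each network only has to separate the objects of its own cell, the number of candidates per classification stays small even in a large museum.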

Phone-to-Phone Communication

• Motivation
  • Global color features of object recognition are variant to illumination
  • Current adaptation process is not in real-time
• Problem
  • Retraining neural networks is too time-consuming
• Idea
  • Exchange classification data through phone-to-phone communication for real-time adaptation (cf. car-to-car communication)

[Image courtesy: http://www.awe-communications.com]

Phone-to-Phone Communication for Adaptive Image Classification
E. Bruns and O. Bimber, International Conference on Advances in Mobile Computing & Multimedia (MoMM'08), 2008

P2P Communication

[Diagram labels: captured image; neural network; first classification step (probabilities, object IDs); mapping (time weighted); nearest neighbor; second classification step (votes); final result; local object database (LODB); global object database (GODB); synchronization (phone-to-phone communication)]

P2P Communication - Results

• Evaluation
  • Simulation with real image data and realistic parameters
  • Rapidly changing illumination
  • Approximated real illumination data, recorded during one day
  • 84 and 252 visitors/day
• Result
  • Recovered classification rate after 25 min
  • The more visitors, the faster the adaptation
  • Independent of illumination changes without ad-hoc synchronization

[Plot labels: illumination; 84 visitors; 252 visitors; 25 minutes]


Outlook

- Work in Progress -


Pervasive Storage

• Motivation
  • P2P is only effective in well-attended museums
• Idea
  • Extend the Bluetooth module that is already applied for tracking with memory for storing GODB data
  • Extend the sensor box with an illumination sensor for being aware of the current lighting condition for adaptation
    • No additional computation time
• Approach
  • Load and store the GODB through Bluetooth once per location cell
  • Cluster captured images based on their illumination state
    • Generate multiple neural networks and choose the classifier based on the current illumination state

[Photo: sensor box with Bluetooth module, illumination sensor, and memory; dimension labels 3 cm and 7 cm]

Mobile Phone Enabled Museum Guidance with Adaptive Classification
E. Bruns, B. Brombach and O. Bimber, IEEE Computer Graphics and Applications, vol. 28, no. 4, pp. 98-102, July/August 2008
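
Choosing a classifier by illumination state can be illustrated with a small Python sketch. It assumes that each illumination cluster is summarized by a single scalar center in arbitrary sensor units; the centers, classifier names, and function names are hypothetical.

```python
def nearest_state(reading, state_centers):
    """Index of the illumination cluster whose center is closest to the reading."""
    return min(range(len(state_centers)), key=lambda i: abs(state_centers[i] - reading))

# Hypothetical cluster centers (sensor units) and one classifier per cluster.
illumination_centers = [50.0, 300.0, 900.0]
illumination_classifiers = ["network_dim", "network_normal", "network_bright"]

def classifier_for_illumination(sensor_reading):
    """Pick the network trained on images captured under similar lighting."""
    return illumination_classifiers[nearest_state(sensor_reading, illumination_centers)]
```

The lookup is a constant-time decision on the phone, which fits the slide's point that the sensor adds no computation time to recognition.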

Pathway Awareness

• Motivation
  • Bluetooth tracking gives only a rough estimation of the visitors' location
  • The smaller the number of potential result candidates / feature space, the higher the classification rate might be
• Idea
  • Determine the pathways of the museum visitors to weight the importance of exhibits and consequently their classifiers
• Approach
  • Determine pathways in a field study
  • Use transition probabilities for a weighted nearest neighbor approach (cf. Cost et al. '93, Amlacher et al. '08)
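
How transition probabilities might be estimated from observed pathways and then used as weights is sketched below in Python. This does not implement the weighted nearest neighbor classifier itself: it only reweights generic classifier scores, and the `floor` parameter, the function names, and the example pathways are assumptions.

```python
from collections import Counter, defaultdict

def transition_probabilities(pathways):
    """Estimate P(next exhibit | current exhibit) from observed visitor pathways."""
    counts = defaultdict(Counter)
    for path in pathways:
        for current, following in zip(path, path[1:]):
            counts[current][following] += 1
    return {
        current: {nxt: n / sum(followers.values()) for nxt, n in followers.items()}
        for current, followers in counts.items()
    }

def reweight(scores, previous_exhibit, transitions, floor=0.05):
    """Weight classifier scores by how likely each exhibit follows the previous one.

    `floor` keeps exhibits off the usual pathways from being ruled out entirely.
    """
    priors = transitions.get(previous_exhibit, {})
    weighted = {obj: s * max(priors.get(obj, 0.0), floor) for obj, s in scores.items()}
    total = sum(weighted.values())
    return {obj: w / total for obj, w in weighted.items()}

# Usage with four hypothetical pathways through exhibits 1-4.
paths = [[1, 2, 3], [1, 2, 4], [1, 3, 4], [1, 2, 3]]
transitions = transition_probabilities(paths)
result = reweight({2: 0.4, 3: 0.5, 4: 0.1}, previous_exhibit=1, transitions=transitions)
```

In this example exhibit 3 has the highest raw score, but exhibit 2 wins after weighting because most visitors walk from exhibit 1 to exhibit 2.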

Pathway Awareness

• Field experiment for determining pathways (70 visitors)
• Pathway: threshold ≥ 2 visitors
• Pathway: threshold ≥ 3 visitors
• Pathway: threshold ≥ 4 visitors
• Pathway: threshold ≥ 5 visitors (main route)

[Figures: observed visitor pathways, one figure per threshold]

Pathway Awareness + Time

• Pathway: threshold ≥ 5 visitors

[Figure: pathways annotated with times (0:18 s, 0:23 s, 0:28 s, 0:45 s, 1:09 s)]

Thank you.

www.uni-weimar.de/ar/phoneguide

Thanks to:

Naturkundemuseum Erfurt

Stadtmuseum Weimar
