Can you trust what you see? The magic of visual perception
TRANSCRIPT
![Page 1: Can you trust what you see? The magic of visual perception](https://reader031.vdocuments.mx/reader031/viewer/2022030304/587854de1a28ab68198b7095/html5/thumbnails/1.jpg)
Can you trust what you see? The magic of visual perception
Oge Marques, PhD, Professor
College of Engineering and Computer Science, Florida Atlantic University – Boca Raton, FL (USA)
The Distinguished Speakers Program is made possible by
For additional information, please visit http://dsp.acm.org/
About ACM
ACM, the Association for Computing Machinery, is the world’s largest educational and scientific computing society, uniting educators, researchers, and professionals to inspire dialogue, share resources, and address the field’s challenges.
ACM strengthens the computing profession’s collective voice through strong leadership, promotion of the highest standards, and recognition of technical
excellence.
ACM supports the professional growth of its members by providing opportunities for life-long learning, career development, and professional
networking.
With over 100,000 members from over 100 countries, ACM works to advance computing as a science and a profession. www.acm.org
A man enters a room…
Source: https://www.youtube.com/watch?v=zNbF006Y5x4
Surprised?
• The video is called “Assumptions”
• The author (and actor) is British professional magician and the only Professor in the Public Understanding of Psychology (University of Hertfordshire), Richard Wiseman
• For more: http://richardwiseman.wordpress.com/
My background
• Oge Marques, PhD – Professor of Engineering and Computer Science at FAU
– Research focus: intelligent processing of visual information (a blend of image processing, computer vision, human vision, artificial intelligence, and machine learning).
– Ten years ago, I decided to study human vision and actively interact with researchers in the field.
– Here are some of the things I’ve learned along the way…
Facebook: https://www.facebook.com/ProfessorOgeMarques
Goals of this talk
• To explore together several visual perception phenomena that challenge our common knowledge of how well we make decisions based on the information that arrives at our brain through our eyes.
• To examine possible applications of human vision knowledge to the solution of computer vision research questions.
Visual illusions
• Serious vision research – “Errors of perception (phenomena of illusions) can be due to knowledge being inappropriate or being misapplied. So illusions are important for investigating cognitive processes of vision.” (Richard Gregory)
• Fun (party tricks) – “Tricks work only because magicians know, at an intuitive level, how we look at the world. […] Magicians were taking advantage of these cognitive illusions long before any scientist identified them.” (Stephen Macknik and Susana Martinez-Conde)
Speaking of fun tricks…
Source: https://www.youtube.com/watch?v=r6h02WuxmVY
Warm up
• What do you see?
Source: Frisby and Stone (2012)
Warm up
• Which circle is bigger?
Source: https://en.wikipedia.org/wiki/Ebbinghaus_illusion#/media/File:Mond-vergleich.svg
Warm up
• Which line is longer?
Source: https://s-media-cache-ak0.pinimg.com/originals/5a/a5/34/5aa534b42bf7c6cd61e1b710a360d056.gif
Warm up
• Which line is longer?
Warm up
• Which line is longer?
Source: http://ww2.justanswer.com/uploads/fael/2011-08-06_032529_ponzoillusionapplet.gif
Warm up
• Duck or rabbit?
Source: Frisby and Stone (2012)
Warm up
• Which dot is different from all the others?
What do we know about visual perception?
Not much compared to what we don’t know
Source: Barenholtz (2009)
Ignorance
Knowledge
State of the art
Things we DO know
• Visible light
Things we DO know
• Eye (retinal image)
Things we DO know
• Eye-to-brain path
Things we DO know
• Vision for ACTION vs. vision for RECOGNITION
What?
Where?
Example of what we DON’T know (yet)
• The moon seems larger when it is near the horizon than when it is high in the sky. Why?
• It fools the human brain, but cannot be captured in a photo.
• Many competing theories, no consensus.
Source: https://freethoughtblogs.com/singham/files/2014/02/moonrise-timelapse-over-la.jpg
The moon illusion
How scientists learn about human vision
• Patients with brain damage or eye conditions
• Direct access to the brain
– Single-cell recording
– Modern brain imaging and activity recording devices
• Controlled experiments
– Calibrated monitors and rooms
– Eye-tracking devices
– Psychophysics
Can you trust your brain?
• “Our brains are brilliant instruments, able to reason, synthesize, remember and imagine at an extraordinary pitch and rate. We trust them immediately and innately – and have reasons to be deeply proud of them too.
• However, these brains […] are also very subtly and dangerously flawed machines, flawed in ways that typically don’t announce themselves to us and therefore give us few clues as to how on guard we should be about our mental processes.”
(Alain de Botton, “The faulty walnut”)
Source: http://www.thebookoflife.org/the-faulty-walnut/
Sometimes we see what is not there
Source: Palmer (1999)
Count the black dots…
Source: http://www.slideshare.net/mrg3515/optical-illusions-8167051/3
Sometimes we only see half of the story
Source: Wikimedia Commons
Sometimes we must make a ‘best guess’
Source: http://www.slideshare.net/mrg3515/optical-illusions-8167051/3
Sometimes we even combine two or more illusions
Source: Goldstein (2002)
Sometimes we have trouble with (relative) brightness and contrast
Source: Wikimedia Commons
Sometimes we have trouble with (relative) brightness and contrast
Source: Wikimedia Commons
Sometimes we have trouble with color (constancy)
Source: http://www.lottolab.org/
“The dress”
• On Feb 26, 2015, this dress “broke the Internet”
– #whiteandgold
or
– #blackandblue?
“The dress”: a simplified explanation
• Most of the time, our visual system does a remarkable job of inferring the ambient lighting conditions at any given time and discounting their contribution to color computations.
• But in this image, the cues to the lighting conditions are particularly ambiguous.
• Is the light illuminating the dress bright and yellowish, or is it dim and blueish? Your brain has to make a guess.
Source: http://web.mit.edu/bcs/nklab/what_color_is_the_dress.shtml
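In computational terms, the “discounting the illuminant” step this slide describes is a white-balance problem. A minimal sketch of one classic heuristic, the gray-world assumption, is below; it is not from the talk, and the function name and toy data are hypothetical:

```python
import numpy as np

def gray_world_balance(image):
    """Estimate the illuminant as the per-channel mean (gray-world
    assumption) and divide it out, keeping overall brightness."""
    image = image.astype(np.float64)
    illuminant = image.reshape(-1, 3).mean(axis=0)   # per-channel estimate
    gain = illuminant.mean() / illuminant            # neutralize the color cast
    return np.clip(image * gain, 0, 255)

# A scene viewed under a bluish cast: blue channel uniformly inflated.
rng = np.random.default_rng(0)
scene = rng.uniform(50, 200, size=(8, 8, 3))
bluish = scene * np.array([0.8, 0.9, 1.3])
corrected = gray_world_balance(bluish)
```

After correction the three channel means coincide, i.e. the cast is gone; when the true illumination is ambiguous (as with the dress photo), such heuristics, like the brain, must simply guess.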
“The dress” meets the color cube
• An experiment by Rosa Lafer-Sousa (Kanwisher Lab, MIT) combined the dress with Beau Lotto’s color cube. Here are the results:
Source: http://web.mit.edu/bcs/nklab/what_color_is_the_dress.shtml
“The dress”
• But what color is it?
• Think the controversy is over? Think again!
Sometimes we miss seeing things…
…because they happen too fast
Sometimes we miss seeing things…
…because they happen too slowly
Source: O’Regan
Sometimes we miss seeing things…
…because something (else) flashes/flickers
Source: Skoda (https://www.youtube.com/watch?v=qpPYdMs97eE)
And sometimes interpretation changes with viewing distance
Source: Torralba & Oliva (2006)
Sometimes our prior knowledge gets in the way…
Source: Adelson (1995)
Sometimes our prior knowledge gets in the way…
Source: Adelson (1995)
Sometimes…
…our interpretation of an image depends on whether we are looking at its parts or taking it as a whole
Sometimes…
…the things that we struggle to see for the first time become surprisingly easy from the second time on
Source: Goldstein (2002)
Source: Goldstein (2002)
Sometimes we know we’re being fooled…
Source: YouTube
Ames room: explanation
Source: Goldstein (2002)
Sometimes we know that what we’re seeing is not what is there…
…but we still can’t help it.
Source: Gregory (2006)
Applications to multimedia research
• Computational modeling of visual attention
– Image retrieval
– Object detection
• Face recognition
– Game: Guess That Face
Visual Attention
We can only pay attention to part of the visual scene
Which part?
Source: Yarbus (1967)
We can only pay attention to part of the visual scene
Which part?
Source: Yarbus (1967)
We can only pay attention to part of the visual scene
• Contemporary computer models
Source: http://www.saliencytoolbox.net/
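Contemporary saliency models such as the Saliency Toolbox build on Itti and Koch’s center-surround architecture. A heavily simplified sketch of the center-surround idea on the intensity channel alone follows (the real models also use color and orientation channels and across-scale combination; the function name and toy scene are hypothetical):

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def center_surround_saliency(image, center_sigma=1.0, surround_sigma=8.0):
    """Crude saliency map: per-pixel contrast between a fine ('center')
    and a coarse ('surround') Gaussian-blurred intensity map."""
    intensity = image.mean(axis=2) if image.ndim == 3 else image
    center = gaussian_filter(intensity, center_sigma)
    surround = gaussian_filter(intensity, surround_sigma)
    saliency = np.abs(center - surround)
    return saliency / (saliency.max() + 1e-12)   # normalize to [0, 1]

# A dark scene with one bright blob: the blob should dominate the map.
scene = np.zeros((64, 64))
scene[30:34, 40:44] = 1.0
smap = center_surround_saliency(scene)
peak = np.unravel_index(smap.argmax(), smap.shape)
```

The peak of the map lands on the isolated bright region, mirroring how a locally distinct item pops out to human observers.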
Our work
Visual Attention + Image Retrieval
Hindawi Publishing Corporation, EURASIP Journal on Advances in Signal Processing, Volume 2007, Article ID 43450, 17 pages. doi:10.1155/2007/43450
Research Article: An Attention-Driven Model for Grouping Similar Images with Image Retrieval Applications
Oge Marques,1 Liam M. Mayron,1 Gustavo B. Borba,2 and Humberto R. Gamba2
1 Department of Computer Science and Engineering, Florida Atlantic University, Boca Raton, FL 33431-0991, USA; 2 Programa de Pos-Graduacao em Engenharia Eletrica e Informatica Industrial, Universidade Tecnologica Federal do Parana (UTFPR), Curitiba, Parana 80230-901, Brazil
Received 1 December 2005; Revised 3 August 2006; Accepted 26 August 2006
Recommended by Gloria Menegaz
Recent work in the computational modeling of visual attention has demonstrated that a purely bottom-up approach to identifying salient regions within an image can be successfully applied to diverse and practical problems from target recognition to the placement of advertisement. This paper proposes an application of a combination of computational models of visual attention to the image retrieval problem. We demonstrate that certain shortcomings of existing content-based image retrieval solutions can be addressed by implementing a biologically motivated, unsupervised way of grouping together images whose salient regions of interest (ROIs) are perceptually similar regardless of the visual contents of other (less relevant) parts of the image. We propose a model in which only the salient regions of an image are encoded as ROIs whose features are then compared against previously seen ROIs and assigned cluster membership accordingly. Experimental results show that the proposed approach works well for several combinations of feature extraction techniques and clustering algorithms, suggesting a promising avenue for future improvements, such as the addition of a top-down component and the inclusion of a relevance feedback mechanism.
Copyright © 2007 Hindawi Publishing Corporation. All rights reserved.
1. INTRODUCTION
The dramatic growth in the amount of digital images available for consumption and the popularity of inexpensive hardware and software for acquiring, storing, and distributing images have fostered considerable research activity in the field of content-based image retrieval (CBIR) [1] during the past decade [2, 3]. Simply put, in a CBIR system users search the image repository providing information about the actual contents of the image, which is often done using another image as an example. A content-based search engine translates this information in some way as to query the database (based on previously extracted and stored indexes) and retrieve the candidates that are more likely to satisfy the user’s request.
In spite of the large number of related papers, prototypes, and several commercial solutions, the CBIR problem has not been satisfactorily solved. Some of the open problems include the gap between the image features that can be extracted using image processing algorithms and the semantic concepts to which they may be related (the well-known semantic gap problem [4–6], which can often be translated as “the discrepancy between the query a user ideally would and the one it actually could submit to an information retrieval system” [7]), the lack of widely adopted testbeds and benchmarks [8, 9], and the inflexibility and poor functionality of most existing user interfaces, to name just a few.
Some of the early CBIR solutions extract global features and index an image based on them. Other approaches take into account the fact that, in many cases, users are searching for regions or objects of interest as opposed to the entire picture. This has led to a number of proposed solutions that do not treat the image as a whole, but rather deal with portions (regions or blobs) within an image, such as [10, 11], or focus on objects of interest, instead [12]. The object-based approach for the image retrieval problem has grown to become an area of research referred to as object-based image retrieval (OBIR) in the literature [12–14].
Object- and region-based approaches usually must rely on image segmentation algorithms, which leads to a number of additional problems. More specifically, they must employ strong segmentation—“a division of the image data into regions in such a way that region T contains the pixels of the silhouette of object O in the real world and nothing else” [3], which is unlikely to succeed for broad image domains. A frequently used alternative to strong segmentation is weak segmentation, in which “region T is within bounds of object
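The pipeline named in the abstract – encode only the salient regions as ROIs, extract features from them, then assign cluster membership – can be sketched as follows. The mean-color descriptor and the toy k-means here are illustrative stand-ins, not the paper’s actual feature extractors or clustering algorithms:

```python
import numpy as np

def roi_feature(image, roi):
    """Hypothetical ROI descriptor: mean color inside the salient box."""
    r0, r1, c0, c1 = roi
    return image[r0:r1, c0:c1].reshape(-1, 3).mean(axis=0)

def kmeans(features, k, iters=20, seed=0):
    """Toy k-means assigning each ROI descriptor a cluster membership."""
    rng = np.random.default_rng(seed)
    centroids = features[rng.choice(len(features), k, replace=False)]
    for _ in range(iters):
        dists = np.linalg.norm(features[:, None] - centroids[None], axis=2)
        labels = dists.argmin(axis=1)
        for j in range(k):
            if (labels == j).any():
                centroids[j] = features[labels == j].mean(axis=0)
    return labels

# Two reddish and two greenish 'salient regions' from four toy images.
images = [np.full((16, 16, 3), c, float) for c in
          ([200, 30, 30], [190, 40, 35], [20, 180, 40], [25, 170, 50])]
rois = [(4, 12, 4, 12)] * 4
feats = np.array([roi_feature(im, roi) for im, roi in zip(images, rois)])
labels = kmeans(feats, k=2)
```

The point of the design, as the abstract notes, is that cluster membership depends only on the salient region, so visually dissimilar backgrounds do not pull perceptually similar images apart.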
Our work
Visual attention + object detection (using a game)
Ask’nSeek: a new game for object detection and labeling
Axel Carlier1, Oge Marques2, and Vincent Charvillat1
1 IRIT-ENSEEIHT, University of Toulouse, France – {Axel.Carlier, Vincent.Charvillat}@enseeiht.fr
2 Florida Atlantic University, USA [email protected]
Abstract. This paper proposes a novel approach to detect and label objects within images and describes a two-player web-based guessing game – Ask’nSeek – that supports these tasks in a fun and interactive way. Ask’nSeek asks users to guess the location of a hidden region within an image with the help of semantic and topological clues. The information collected from game logs is combined with results from content analysis algorithms and used to feed a machine learning algorithm that outputs the outline of the most relevant regions within the image and their names. Two noteworthy aspects of the proposed game are: (i) it solves two computer vision problems – object detection and labeling – in a single game; and (ii) it learns spatial relations within the image from game logs. The game has been evaluated through user studies, which confirmed that it was easy to understand, intuitive, and fun to play.
1 Introduction
There are many open problems in computer vision (e.g., object detection) for which state-of-the-art solutions still fall short of performing perfectly. The realization that many of those tasks are arduous for computers and yet relatively easy for humans has inspired many researchers to approach those problems from a ‘human computation’ viewpoint, using methods that include crowdsourcing (“a way of solving problems based on a large number of small contributions from a large number of different persons”) and games – often called, more specifically, “games with a purpose (GWAPs)” [1].
In this paper we propose a novel approach to solving a subset of computer vision problems – namely object detection and labeling³ – using games and describe Ask’nSeek, a two-player web-based guessing game targeted at the tasks of object detection and labeling. Ask’nSeek asks users to guess the location of a small rectangular region hidden within an image with the help of semantic and topological clues (e.g., “to the right of the bus”), by clicking on the image location which they believe corresponds to (one of the points of) the hidden region. Once enough games have been played using a given image, our novel machine learning algorithm combines user-provided input (coordinates of clicked points and spatial relationships between points and regions – ‘above’, ‘below’, ‘left’, ‘right’, ‘on’, ‘partially on’, or ‘none’) with results from off-the-shelf computer vision algorithms applied to the image, to produce the outline (bounding box) of the most relevant regions within the image and their associated labels. These
³ In this paper we use the phrase object labeling to refer to the process of assigning a textual label to an object’s bounding box.
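The way Ask’nSeek’s logs constrain the hidden region can be illustrated with a toy aggregator. The paper combines game logs with content-analysis results via machine learning; the rule-based intersection below is only a simplified stand-in, and every name and coordinate in it is hypothetical:

```python
def estimate_region(on_points, relations, width, height):
    """Intersect spatial-relation constraints from game logs to bound
    the hidden region, then tighten using the 'on' clicks. A crude
    stand-in for the paper's learning-based combination."""
    x0, y0, x1, y1 = 0, 0, width, height
    for rel, (px, py) in relations:
        if rel == 'right':      # region lies right of the clicked point
            x0 = max(x0, px)
        elif rel == 'left':     # region lies left of the clicked point
            x1 = min(x1, px)
        elif rel == 'below':    # region lies below it (y grows downward)
            y0 = max(y0, py)
        elif rel == 'above':
            y1 = min(y1, py)
    if on_points:               # clicks reported to land on the region
        xs, ys = zip(*on_points)
        x0, x1 = max(x0, min(xs)), min(x1, max(xs))
        y0, y1 = max(y0, min(ys)), min(y1, max(ys))
    return x0, y0, x1, y1

# Toy logs for a hidden region near (60..80, 40..50) in a 100x100 image.
box = estimate_region(
    on_points=[(62, 42), (78, 48)],
    relations=[('right', (55, 45)), ('below', (70, 35))],
    width=100, height=100)
```

Each additional game log shrinks the feasible box, which is why the paper waits until “enough games have been played” before producing an outline.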
Face Recognition
We seem to be particularly good at recognizing famous/familiar faces even when they’re blurry
even though the effective resolution in that region is very limited. Recognition performance changes only slightly after obscuring the gait or body, but is affected dramatically when the face is hidden, as illustrated in Fig. 2. This does not appear to be a skill that can be acquired through general experience; even police officers with extensive forensic experience perform poorly unless they are familiar with the target individuals. The fundamental question this finding, and others like it [49], [66], bring up is the following: How does the facial representation and matching strategy used by the visual system change with increasing familiarity, so as to yield greater tolerance to degradations? We do not yet know exactly what aspect of the increased experience with a given individual leads to an increase in the robustness of the encoding; is it the greater number of views seen or is the robustness an epiphenomenon related to some biological limitations such as slow memory consolidation rates? Notwithstanding our limited understanding, some implications for computer vision are already evident. In considering which aspects of human performance to take as benchmarks, we ought to draw a distinction between familiar and unfamiliar face recognition. The latter may end up being a much more modest goal than the former and might constitute a false goal towards which to strive. The appropriate benchmark for evaluating machine-based face recognition systems is human performance with familiar faces.
3) Result 3: High-Frequency Information by Itself Does Not Lead to Good Face Recognition Performance: We have long been enamored of edge maps as a powerful initial representation for visual inputs. The belief is that edges capture the most important aspects of images (the discontinuities) while being largely invariant to shallow shading gradients that are often the result of illumination variations. In the context of human vision as well, line drawings appear to be sufficient for recognition purposes. Caricatures and quick pen portraits are often highly recognizable. Do these observations mean that high spatial frequencies are critical, or at least sufficient, for face recognition? Several researchers have examined the contribution of different spatial frequency bands to face recognition [14], [21]. Their findings suggest that high spatial frequencies might not be too important for face perception. In the particular domain of line drawings, Graham Davies and his colleagues have reported [16] that images which contain exclusively contour information are very difficult to recognize (specifically, they found that subjects could recognize only 47% of the line drawings compared to 90% of the original photographs; see Fig. 3). How can we reconcile such findings with the observed recognizability of line drawings in everyday experience? Bruce and colleagues [6], [7] have convincingly argued that such depictions do, in fact, contain significant photometric cues and that the contours included in such a depiction by an accomplished artist correspond not just to a low-level edge map, but in
Fig. 2. Frames from video sequences used in Burton et al. [10] study. (a) Original input. (b) Body obscured. (c) Face obscured. Based on results from such manipulations, researchers concluded that recognition of familiar individuals in low-resolution video is based largely on facial information.
Fig. 1. Unlike current machine-based systems, human observers are able to handle significant degradations in face images. For instance, subjects are able to recognize more than half of all familiar faces shown to them at the resolution depicted here. Individuals shown in order are: Michael Jordan, Woody Allen, Goldie Hawn, Bill Clinton, Tom Hanks, Saddam Hussein, Elvis Presley, Jay Leno, Dustin Hoffman, Prince Charles, Cher, and Richard Nixon.
Sinha et al., “Face Recognition by Humans: Nineteen Results Researchers Should Know About,” Proceedings of the IEEE, Vol. 94, No. 11, November 2006, p. 1950.
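The spatial-frequency studies cited in this excerpt work by filtering face images into frequency bands. A minimal sketch of such a band split using a Gaussian low-pass filter is shown below; it is illustrative only (the cited studies used carefully calibrated face stimuli, not a random array):

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def split_frequency_bands(image, sigma=3.0):
    """Split an image into a low-spatial-frequency band (what survives
    blurring) and the high-frequency residual (edge-like detail)."""
    low = gaussian_filter(image.astype(np.float64), sigma)
    high = image - low
    return low, high

rng = np.random.default_rng(1)
face_like = rng.uniform(0, 255, size=(32, 32))  # stand-in for a face image
low, high = split_frequency_bands(face_like)
```

The two bands sum back to the original, so experiments can present observers with either band in isolation; the finding above is that familiar-face recognition survives surprisingly well on the low band alone.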
Our work
Our work
Let’s play the game!
http://tinyurl.com/guessthatface
Going back to our original question…
Can you trust what you see?
Source: Torralba (MIT)
Source: Torralba (MIT)
Thank you!
• Check my Facebook page for related resources
– https://www.facebook.com/ProfessorOgeMarques