![Page 1: Proceedings of the IEEE 2010 Antonio Torralba, MIT Jenny Yuen, MIT Bryan C. Russell, MIT](https://reader038.vdocuments.mx/reader038/viewer/2022110322/56649d2c5503460f94a0304c/html5/thumbnails/1.jpg)
LabelMe: Online Image Annotation and Applications
Proceedings of the IEEE 2010
Antonio Torralba, MIT
Jenny Yuen, MIT
Bryan C. Russell, MIT

Outline
- Introduction
- Web Annotation and Data Statistics
  - A. Data Set Evolution and Distribution of Objects
  - B. Study of Online Labelers
- The Space of LabelMe Images
  - A. Distribution of Scene Types
  - B. The Space of Images
  - C. Recognition by Scene Alignment
- Beyond 2-D Images
  - A. From Annotations to 3-D
  - B. Video Annotation
- Conclusion

Introduction
- From small data sets to large data sets
- In 2005, the online tool LabelMe was created
- LabelMe provides functionality for drawing polygons to outline the spatial extent of objects in images

Web Annotation and Data Statistics
- A. Data Set Evolution and Distribution of Objects
- B. Study of Online Labelers

The Features of the LabelMe Database
- Object class recognition
- Learning about objects embedded in a scene
- High-quality labeling
- Many diverse object classes
- Many diverse images
- Many noncopyrighted images
- Open and dynamic

Data Set Evolution and Distribution of Objects(1/2)
(a) Number of annotated objects
(b) Number of images with at least one annotated object
(c) Number of unique object descriptions

Data Set Evolution and Distribution of Objects(2/2)
- The observation suggests two learning problems:
1) Learning from few training samples (N → 1)
2) Learning with millions of samples (N → ∞)

Study of Online Labelers
- From July 7, 2008 to March 19, 2009
(a) Number of new annotations provided by individual users
(b) Distribution of the length of time it takes to label an object

The Space of LabelMe Images
- A. Distribution of Scene Types
- B. The Space of Images
- C. Recognition by Scene Alignment

Distribution of Scene Types(1/1)
- Let's start from cognitive psychology
- Next, we study how many configurations of 4 objects are present
- The distribution follows a power law (n = 1, 2, 4, 8)

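The counting step can be illustrated with a small sketch. It assumes, as a labeled guess, that a scene configuration is the set of names of the n largest annotated objects in an image. The `(name, area)` input format and both function names are hypothetical.

```python
from collections import Counter


def scene_type(objects, n):
    """Return a hashable scene type for one image.

    `objects` is a list of (name, area) pairs. ASSUMPTION: the scene type
    is the sorted tuple of names of the n largest objects by area.
    """
    largest = sorted(objects, key=lambda obj: obj[1], reverse=True)[:n]
    return tuple(sorted(name for name, _ in largest))


def scene_type_counts(images, n):
    """Count how many images share each scene type, most frequent first."""
    counts = Counter(scene_type(objs, n) for objs in images)
    return [count for _, count in counts.most_common()]


images = [
    [("sky", 50), ("building", 30), ("road", 20)],
    [("building", 60), ("sky", 40), ("car", 5)],
    [("wall", 70), ("floor", 30)],
]
print(scene_type_counts(images, n=2))
```

Plotting the returned counts against their rank on log-log axes is one way to check whether they roughly follow a power law.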
The Space of Images(1/3)
- Define “Semantic Distance”:
1) Assign each pixel to a single object category
2) Divide the image into N × N nonoverlapping windows and build a histogram for each window
3) Use spatial pyramid matching over object labels

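The three steps above can be sketched as follows. This is a simplified illustration, not the authors' implementation. It assumes each image is already an integer label map with one category index per pixel, uses histogram intersection per window, and weights all pyramid levels equally. The function names, the `levels` parameter, and the equal weighting are assumptions.

```python
import numpy as np


def window_histograms(label_map, num_classes, n):
    """Split an H x W integer label map into n x n nonoverlapping windows
    and return an (n, n, num_classes) array of normalized label histograms."""
    h, w = label_map.shape
    rows = np.linspace(0, h, n + 1).astype(int)
    cols = np.linspace(0, w, n + 1).astype(int)
    hists = np.zeros((n, n, num_classes))
    for i in range(n):
        for j in range(n):
            window = label_map[rows[i]:rows[i + 1], cols[j]:cols[j + 1]]
            counts = np.bincount(window.ravel(), minlength=num_classes)
            hists[i, j] = counts / max(window.size, 1)
    return hists


def semantic_distance(labels_a, labels_b, num_classes, levels=(1, 2, 4)):
    """Return 1 minus the mean histogram intersection over several grids.

    ASSUMPTION: equal weight per level; pyramid matching schemes often
    weight finer levels more heavily.
    """
    scores = []
    for n in levels:
        ha = window_histograms(labels_a, num_classes, n)
        hb = window_histograms(labels_b, num_classes, n)
        # Intersection of two normalized histograms lies in [0, 1].
        scores.append(np.minimum(ha, hb).sum(axis=2).mean())
    return 1.0 - float(np.mean(scores))


sky = np.zeros((4, 4), dtype=int)   # every pixel labeled category 0
road = np.ones((4, 4), dtype=int)   # every pixel labeled category 1
print(semantic_distance(sky, sky, num_classes=2))
print(semantic_distance(sky, road, num_classes=2))
```

Under these assumptions, two images with identical label maps have distance 0, and two images that share no labels have distance 1.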
Process of Defining Semantic Distance(2/3)
The Space of Images(3/3)
- A visualization of 12201 images that are fully annotated

Recognition by Scene Alignment
- Given a new image as input, we use the GIST descriptor to compute the distance

The Power of a Large-Scale Database
- An algorithm that provides an upper bound: find the nearest neighbor of the input image and use it as a labeling of the input image
- This result gives us a hint about “How many more images do we need to label?”

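Nearest-neighbor label transfer can be sketched in a few lines. The example assumes that a fixed-length descriptor (for instance a GIST vector) has already been computed for every image; descriptor extraction is not shown. The Euclidean distance and the name `nearest_neighbor_labels` are assumptions made for illustration.

```python
import numpy as np


def nearest_neighbor_labels(query_desc, db_descs, db_labels):
    """Return the annotation of the database image closest to the query.

    query_desc: (D,) descriptor of the input image.
    db_descs:   (M, D) descriptors of the annotated database images.
    db_labels:  length-M sequence of annotations, one per database image.
    ASSUMPTION: Euclidean distance between descriptors.
    """
    dists = np.linalg.norm(db_descs - query_desc, axis=1)
    idx = int(np.argmin(dists))
    return db_labels[idx], idx, float(dists[idx])


db_descs = np.array([[0.0, 0.0], [1.0, 1.0], [5.0, 5.0]])
db_labels = ["street", "office", "beach"]
label, idx, dist = nearest_neighbor_labels(np.array([0.9, 1.2]), db_descs, db_labels)
print(label, idx)
```

As the database grows, the nearest neighbor tends to be closer to the query, which is why this kind of matching benefits from a large annotated collection.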
Beyond 2-D Images
- A. From Annotations to 3-D
- B. Video Annotation

From Annotations to 3-D(1/7)
- Object labels contain implicit information that can be observed by analyzing the overlap between object boundaries
- Object types
  - Ground objects
  - Standing objects
  - Attached objects
- Relations between objects
  - Supported-by
  - Part-of

From Annotations to 3-D(2/7)
- Learning the relationships between objects
1) Part-of: evaluate the frequency of high relative overlap between polygons
2) Supported-by: the bottom part of the object's polygon lies inside the supporting object

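The two geometric cues can be illustrated on a single pair of objects. The sketch assumes each polygon has already been rasterized to a boolean mask, with row indices increasing downward as in image coordinates. The function names and the `bottom_frac` parameter are hypothetical. Note that the slide describes part-of as a frequency across many images, whereas this sketch only scores one pair.

```python
import numpy as np


def relative_overlap(inner_mask, outer_mask):
    """Fraction of inner_mask's area that also lies inside outer_mask."""
    area = inner_mask.sum()
    if area == 0:
        return 0.0
    return float(np.logical_and(inner_mask, outer_mask).sum() / area)


def bottom_inside_fraction(obj_mask, support_mask, bottom_frac=0.1):
    """Fraction of the object's bottom band that lies inside support_mask.

    ASSUMPTION: the "bottom part" is the lowest `bottom_frac` of the
    object's rows, and at least one row.
    """
    rows = np.flatnonzero(obj_mask.any(axis=1))
    if rows.size == 0:
        return 0.0
    top, bottom = rows[0], rows[-1]
    band = max(1, int(round((bottom - top + 1) * bottom_frac)))
    bottom_band = np.zeros_like(obj_mask)
    bottom_band[bottom - band + 1:bottom + 1] = obj_mask[bottom - band + 1:bottom + 1]
    return relative_overlap(bottom_band, support_mask)


road = np.zeros((10, 10), dtype=bool)
road[6:10, :] = True
car = np.zeros((10, 10), dtype=bool)
car[4:7, 2:5] = True      # the lowest row of the car touches the road
wheel = np.zeros((10, 10), dtype=bool)
wheel[5:7, 2:3] = True    # lies fully inside the car polygon

print(relative_overlap(wheel, car))       # part-of cue
print(bottom_inside_fraction(car, road))  # supported-by cue
```

A pair would then be flagged as a candidate relation when its score exceeds a chosen threshold, and for part-of, when that happens often across the data set.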
From Annotations to 3-D(3/7)
From Annotations to 3-D(4/7)
- Reconstructing a 3-D model for the input image
1) Define the object type
2) Define the polygon edge type
3) Compute the real distance between objects
- Object type: ground objects (green), standing objects (red), attached objects (yellow)
- Edge type: contact (white), attached (gray), occlusion (black)

From Annotations to 3-D(5/7)
From Annotations to 3-D(6/7)
- More labeling improves the quality
- However, if the labeling goes wrong

From Annotations to 3-D(7/7)
Video Annotation(1/1)
Conclusion
- LabelMe is a web-based tool that allows the labeling of objects and their locations in images
- LabelMe has collected a large annotated database of images with many different scene and object classes
- LabelMe can recover a 3-D description of an image
- The next goal is expanding the video database, which offers a promising direction for computer vision and computer graphics