looking for a needle in video haystack #appsummit14
Post on 22-Nov-2014
Finding the Needle in the Video Haystack
Dr. Gerald Friedland, Director, Audio and Multimedia Research
International Computer Science Institute, Berkeley, CA
friedland@icsi.berkeley.edu
The Internet is Multimedia
Multimedia in the Internet is Growing
• YouTube alone has claimed 48, then 72, then 100 hours of video uploaded every minute.
• Youku (Chinese YouTube) claims 80k video uploads per day.
• Flickr, Instagram, Liveleak, Vimeo...
The Opportunity
• Consumer-produced multimedia allows empirical studies at never-before-seen scale:
  – sociology
  – medicine
  – economics
  – …
• Problem: Videos need to be searchable beyond keywords.
Our Approach
Multimodal exploitation of video content, including audio and temporal information.
[Figure: example video "Cameron learns to catch" (http://www.youtube.com/watch?v=o6QXcP3Xvus) with labeled audio events: ball sound, male voice (near), child's voice (distant), child's whoop (distant), room tone.]
Location Estimation
J. Choi, G. Friedland, V. Ekambaram, K. Ramchandran: "Multimodal Location Estimation of Consumer Media: Dealing with Sparse Training Data," in Proceedings of IEEE ICME 2012, Melbourne, Australia, July 2012.
Bayesian graphical framework
[Figure: example graph whose nodes are labeled with tag sets: {berkeley, sathergate, campanile}, {berkeley, haas}, {campanile}, {campanile, haas}.]
Node: Geolocation of the image
Edge: Correlated locations (e.g. common tag)
Edge potential: Strength of an edge (e.g. posterior distribution of locations given common tags)
$p(x_i, x_j \mid \{t_i^k\} \cap \{t_j^k\})$
$p(x_j \mid \{t_j^k\})$, $p(x_i \mid \{t_i^k\})$
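The graph structure described on this slide can be illustrated with a minimal sketch. It assumes that two images are connected whenever they share at least one tag, which is only the "common tag" example the slide gives for an edge; the figure itself may connect nodes differently. The image ids (`img1` to `img4`) and the function name `build_tag_graph` are hypothetical. The tag sets are the ones shown on the slide. The sketch stops at the graph structure and does not estimate the posterior distributions used as potentials, since those would require training data that is not part of this transcript.

```python
from itertools import combinations

# Tag sets as shown on the slide; the image ids are hypothetical.
image_tags = {
    "img1": {"berkeley", "sathergate", "campanile"},
    "img2": {"berkeley", "haas"},
    "img3": {"campanile"},
    "img4": {"campanile", "haas"},
}


def build_tag_graph(image_tags):
    """Return {(id_a, id_b): common_tags} for every image pair sharing a tag.

    Assumption: an edge exists exactly when the two tag sets intersect.
    """
    edges = {}
    for a, b in combinations(sorted(image_tags), 2):
        common = image_tags[a] & image_tags[b]
        if common:  # no shared tag means no edge
            edges[(a, b)] = common
    return edges


edges = build_tag_graph(image_tags)
for (a, b), common in sorted(edges.items()):
    print(a, b, sorted(common))
```

Each resulting edge carries the set of common tags, which is the conditioning set an edge potential of the kind named on the slide would use.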