Igor Rosenberg
Summer internship
Creating a building detector
June 16th to September 15th
in Dublin City University, Ireland
Supervisor: Alan Smeaton
2
Environment
DCU: Dublin City University
CDVP : Centre for Digital Video Processing
(25 people)
My lab:
- 1 professor
- 3 postdocs
- 5 PhD students
3
My activities
- little modules
- building detector
- visiting Ireland
4
Fischlar: video enhancement
Adding content to a video so that it can be used as search data.
For example: separating shots, extracting stories in news video, finding text in the video
5
Adding information to the MPEG7 descriptor
XML
<VideoSegment>
  <StartTime 00:15:16>
  <TextAnnotation> This is the shot where the sun sets. </TextAnnotation>
  <Frame height=… width= … >
  <FileFormat ….>
</VideoSegment>
Manual annotations
Audio information
One video
XML descr.
MPEG1
6
Just to get back into coding
Width & height of keyframes
ASR
Frame rate
XML descriptor
(read the extracted images)
(read the time stamps)
(read mpeg1)
Creation of thumbnails from the keyframes (changing size of images)
7
Closed captions
0.00 this  0.10 is  0.15 the  0.22 time  0.35 of  0.41 red  0.44 and  0.50 Sean
ASR (time stamps)
XML descriptor
CC (precise)
Shot boundary
This is the time of redemption
ENS ‘99
Shot boundary
8
ASR: w1 …….....w4 ….w7 …...
CC: ...w1 ….w4……………w7...
Rules:
- order is preserved: x > y => match(x) > match(y)
- maximum number of matches
- loose word matching: “tree” = match(“Trees”)
Closed captions: matching
9
M(Ua, Vb) = f( M(U, V), M(Ua, V), M(U, Vb) )
Closed captions : dynamic programming
[Diagram: aligning "Time is up, man!" with "Time is cut up", one sub-problem at a time]
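The matching rules and the recurrence M(Ua, Vb) = f(M(U, V), M(Ua, V), M(U, Vb)) fit a longest-common-subsequence style dynamic program. The sketch below is one possible reading, not the code written during the internship: the names `normalize`, `align` and `timestamp_captions` are hypothetical, and the loose matching rule (lowercase, drop punctuation and a trailing "s") is an assumption based only on the "tree" / "Trees" example.

```python
def normalize(word):
    """Loose word key so that "Trees" matches "tree" (assumed rule)."""
    key = "".join(ch for ch in word.lower() if ch.isalnum())
    # Drop a plural "s" only on longer words, so "is" and "this" stay intact.
    return key[:-1] if len(key) > 4 and key.endswith("s") else key


def align(asr_words, cc_words):
    """Return (asr_index, cc_index) pairs of a largest order-preserving matching."""
    a = [normalize(w) for w in asr_words]
    b = [normalize(w) for w in cc_words]
    n, m = len(a), len(b)
    # M[i][j] = maximum number of matches between a[:i] and b[:j]
    M = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            if a[i - 1] == b[j - 1]:
                M[i][j] = M[i - 1][j - 1] + 1
            else:
                M[i][j] = max(M[i - 1][j], M[i][j - 1])
    # Walk back through the table to recover the matched pairs.
    pairs = []
    i, j = n, m
    while i > 0 and j > 0:
        if a[i - 1] == b[j - 1]:
            pairs.append((i - 1, j - 1))
            i, j = i - 1, j - 1
        elif M[i - 1][j] >= M[i][j - 1]:
            i -= 1
        else:
            j -= 1
    pairs.reverse()
    return pairs


def timestamp_captions(asr, cc_words):
    """asr is a list of (time, word); returns [(time or None, cc_word), ...]."""
    cc_to_asr = {j: i for i, j in align([w for _, w in asr], cc_words)}
    return [(asr[cc_to_asr[j]][0] if j in cc_to_asr else None, word)
            for j, word in enumerate(cc_words)]
```

Fed the ASR words and the caption "This is the time of redemption" from the closed captions slide, the first five caption words pick up the ASR time stamps and "redemption" stays without one.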
10
Alignment
ASR not aligned to the video time (slight offset ~ ±30 sec).
ASR            VIDEO
05:30.2 word   05:48.5 word
The ASR delay file is man-made: errors
BUT TREC changed the guidelines: work thrown in the bin
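For the alignment step, correcting a constant offset between ASR time and video time could look like the sketch below. It assumes the time stamps are written as minutes:seconds (as in the 05:30.2 / 05:48.5 example) and that one offset applies to a whole file; the function names are hypothetical.

```python
def parse_timestamp(text):
    """Convert "mm:ss.s" to seconds (format assumed from the slide's example)."""
    minutes, seconds = text.split(":")
    return int(minutes) * 60 + float(seconds)


def offset_seconds(asr_time, video_time):
    """Delay to add to an ASR time stamp to land on the video time."""
    return parse_timestamp(video_time) - parse_timestamp(asr_time)


def shift_asr(asr_words, offset):
    """Apply the offset to every (time, word) pair, rounded to the millisecond."""
    return [(round(time + offset, 3), word) for time, word in asr_words]
```

With the slide's pair of time stamps the offset comes out at about 18.3 seconds, inside the ±30 sec range mentioned above.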
11
Research
12
Building detector
Given an image, say if a building can be seen
Literature : 40 % precision
Use for TREC - one of the features to detect: landscape/cityscape?
13
Ideas
- Region segmentation
- Dominant color
- Texture homogeneity
- Edge histogram
- Support Vector Machine to aggregate results
Extract possible building regions before anything else
14
Then evaluate each region
Values could describe:
- dominant color
- texture homogeneity
- measure of how straight the lines are
- …
[Figure: each extracted region labelled with its own "Values = …"]
15
Finally sum up these values
Have to decide on strategy:
- mean?
- highest score?
- values + importance in the image?
- Support Vector Machine (once trained, decides without heuristics)
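The three heuristic strategies can be compared on a toy example. This is a minimal sketch under assumptions: each region is reduced to a hypothetical `(score, area_fraction)` pair, "importance in the image" is taken to mean the region's share of the image area, and the 0.5 threshold is arbitrary. The SVM option is not shown.

```python
def mean_score(regions):
    """Plain average of the per-region scores."""
    return sum(score for score, _ in regions) / len(regions)


def highest_score(regions):
    """Score of the single most building-like region."""
    return max(score for score, _ in regions)


def area_weighted_score(regions):
    """Average of the scores, weighted by each region's share of the image."""
    total_area = sum(area for _, area in regions)
    return sum(score * area for score, area in regions) / total_area


def contains_building(regions, strategy=mean_score, threshold=0.5):
    """Boolean decision; threshold and default strategy are assumptions."""
    if not regions:
        return False
    return strategy(regions) >= threshold
```

A small, very building-like region (score 0.9 on 10 % of the image) is enough for `highest_score` to answer yes, while `area_weighted_score` lets the large low-scoring regions outvote it, which is the trade-off the slide asks about.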
16
What my utility does
Extracts regions from image
examines regions with different tools
Sums up the results
returns boolean
17
Tools not used
Canny Edge detector
Fast Fourier Transform
Line kernel
Hough transform
Sobel is enough
Time’s up
Doesn’t work
Works only in simple cases
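Since Sobel is the edge operator that was kept, here is a minimal pure-Python sketch of it, feeding a coarse edge-orientation histogram of the kind listed under Ideas. The bin count, the magnitude threshold and the function names are assumptions; the image is a plain list of grayscale rows, and border pixels are skipped.

```python
import math

# Standard 3x3 Sobel kernels for the horizontal and vertical derivatives.
KX = [[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]]
KY = [[-1, -2, -1], [0, 0, 0], [1, 2, 1]]


def sobel(image):
    """Return (magnitude, angle) grids; border pixels are left at zero."""
    h, w = len(image), len(image[0])
    mag = [[0.0] * w for _ in range(h)]
    ang = [[0.0] * w for _ in range(h)]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            gx = sum(KX[j][i] * image[y + j - 1][x + i - 1]
                     for j in range(3) for i in range(3))
            gy = sum(KY[j][i] * image[y + j - 1][x + i - 1]
                     for j in range(3) for i in range(3))
            mag[y][x] = math.hypot(gx, gy)
            ang[y][x] = math.atan2(gy, gx)
    return mag, ang


def edge_histogram(image, bins=4, min_magnitude=1.0):
    """Count strong edge pixels per gradient-orientation bin over [0, pi)."""
    mag, ang = sobel(image)
    hist = [0] * bins
    for mag_row, ang_row in zip(mag, ang):
        for m, a in zip(mag_row, ang_row):
            if m >= min_magnitude:
                theta = a % math.pi  # opposite directions share a bin
                hist[min(int(theta / math.pi * bins), bins - 1)] += 1
    return hist
```

On an image with a single vertical step edge, all strong pixels fall into the first bin; a region whose histogram is concentrated in one or two bins has mostly straight, parallel edges, which is one plausible cue for a building.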
What should be added
- better regional weighting
- better overall measure
- SVM
19
Results
• Number of images tested: 268
• Precision: 29.7 %
• Recall: 7.46 %
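For reference, these are the standard definitions of the two measures, with TP the images correctly reported as containing a building, FP the images wrongly reported as containing one, and FN the building images that were missed. The slide does not give the underlying counts, so none are filled in here.

```latex
\[
\text{precision} = \frac{TP}{TP + FP},
\qquad
\text{recall} = \frac{TP}{TP + FN}
\]
```

Read this way, a recall of 7.46 % means the detector rarely answers "building" at all, and a precision of 29.7 % means that fewer than one in three of those answers is right.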
20
This research experience was cool…
I met lots of people
Learnt a lot about programming
Practised English.
21
That’s it folks!
- Get yourselves a good supervisor
- Don’t go out on the second last week
- Don’t go to Ireland (filthy weather)
- Start early
Thank you!
22
Structure of fischlár
Cocoon
• configure
• process
Tomcat
• configure
• process