Improvised Architecture of a Visual Attention Model


BARATH MUTHU KUMAR BLENU4CSE08023

RAVIKIRAN CH BLENU4CSE08027

V SUBASHINI BLENU4CSE08510

GUIDE

DR. AMUDHA J

ASSOCIATE PROFESSOR

AMRITA SCHOOL OF ENGINEERING, BANGALORE

Amrita School of Engineering, Bangalore-35

Problem statement

Detailed Design (Training)

Detailed Design (Testing)

Coding Guidelines

Implementation

Performance Evaluation

References

Conclusion

Given an image frame or a video, analyse it using the improvised VAM to generate the salient region and find the target object.

[Block diagram: Detailed Design (Training). The input image passes through the VAM bottom-up module (pyramid construction -> feature maps -> conspicuity maps -> saliency map) and a winner-take-all stage; the extracted features feed a decision tree classifier, which performs feature selection. Selected feature types: SM, 90FM2, BYFM3, BYFM1.]

FEATURE NO.   TYPE
1             RGFM1
2             SM
3             RGFM3
4             BYFM4
5             90FM2
6             INTFM1
7             INTFM5
8             BYFM3
9             0FM1
10            BYFM1
11            135FM3

[Block diagram: Detailed Design (Testing). The test image enters the top-down module, which computes only the required features (TYPE: SM, 90FM2, BYFM3, BYFM1); the classifier then detects the target object.]

IDE:

Uses the Code::Blocks cross-platform IDE.

Code::Blocks console project (.cbp).

Variable Naming Convention:

Variable names reflect the variable's use.

For example, the variable that stores the original image read from the hard disk is named original_img.

Each variable is documented beside its declaration, stating the data it holds.

Function Naming Convention:

Functions are named after their functionality.

For example, the function that builds the pyramids is named find_pyramids().

A separate function performs each modular task in the project.

Data Structures:

Uses the data structures in the “cxcore.h” header of the OpenCV library.

Ex: “Mat” to store an image.

Also makes use of C++ vectors and scalars.

Predefined headers:

cv.h
highgui.h
cxcore.h
ml.h

User-defined headers:

color.h
intensity.h
orientation.h
saliency.h
Winner_takeall.h

Source files:

color_feature.cpp
Intensity_feature.cpp
orientation_feature.cpp
Saliency.cpp

buildPyramid()

pyrUp()

absdiff()

minMaxLoc()

Winner_takeall.cpp

Function name: buildPyramid

Number of Parameters: 3

Parameters:

1. Actual image

2. Variable of type vector to store different levels of Pyramid

3. Number of levels

Syntax:

buildPyramid(image, dest_vec, no_of_levels)

Output:

Images at different levels of pyramids

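As a rough illustration of what pyramid construction computes, here is a minimal plain-C++ sketch that halves the image at each level by 2x2 block averaging. This is a simplification: OpenCV's buildPyramid applies a 5x5 Gaussian filter before downsampling, and the Image alias and function names below are illustrative, not the project's actual code.

```cpp
#include <vector>
#include <cstddef>

// Illustrative image type: a 2D grid of intensities.
using Image = std::vector<std::vector<double>>;

// Halve an image by averaging each 2x2 block (stand-in for
// OpenCV's Gaussian-filter-then-downsample step).
Image downsample(const Image& src) {
    std::size_t h = src.size() / 2, w = src[0].size() / 2;
    Image dst(h, std::vector<double>(w));
    for (std::size_t y = 0; y < h; ++y)
        for (std::size_t x = 0; x < w; ++x)
            dst[y][x] = (src[2*y][2*x]   + src[2*y][2*x+1] +
                         src[2*y+1][2*x] + src[2*y+1][2*x+1]) / 4.0;
    return dst;
}

// Build a pyramid: level 0 is the input, each further level half its size.
std::vector<Image> build_pyramid(const Image& img, int levels) {
    std::vector<Image> pyr{img};
    for (int i = 1; i <= levels; ++i)
        pyr.push_back(downsample(pyr.back()));
    return pyr;
}
```

Center-surround feature maps then compare a fine level (center) against an upsampled coarse level (surround), which is where pyrUp and absdiff come in.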
Function name: pyrUp

Number of Parameters: 3

Parameters:

1. Actual image

2. Variable to store resized image

3. Destination size

Syntax:

pyrUp(image, dest, size)

Output:

Resized image

Function name: absdiff

Number of Parameters: 3

Parameters:

1. Image 1

2. Image 2

3. Destination to store difference of image1 and image2

Syntax:

absdiff(image1, image2, dest)

Output:

Difference image

Function name: minMaxLoc

Number of Parameters: 3

Parameters:

1. Image

2. Variable to store minimum intensity

3. Variable to store maximum intensity

Syntax:

minMaxLoc(image, &min, &max)

Output:

Minimum and Maximum intensities in an image

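The winner-take-all stage essentially reduces to scanning the saliency map for its extremes, which is the service minMaxLoc provides. A minimal plain-C++ sketch, assuming a flat row-major map (the MinMax struct and function name are illustrative, not OpenCV's):

```cpp
#include <vector>
#include <cstddef>

// Result of a min/max scan: extreme values and their flat indices.
struct MinMax { double min_val, max_val; std::size_t min_idx, max_idx; };

// Scan a flat saliency map for its minimum and maximum entries;
// the maximum's location is the "winner" attended next.
MinMax min_max_loc(const std::vector<double>& map) {
    MinMax r{map[0], map[0], 0, 0};
    for (std::size_t i = 1; i < map.size(); ++i) {
        if (map[i] < r.min_val) { r.min_val = map[i]; r.min_idx = i; }
        if (map[i] > r.max_val) { r.max_val = map[i]; r.max_idx = i; }
    }
    return r;
}
```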
color_feature.cpp

Called through “color_feature()”

Finds colour pyramids

Finds RG,BY colour maps

Colour Feature maps.

Colour conspicuity map

color_feature.cpp

R = r - (g+b)/2

G = g - (r+b)/2

B = b - (r+g)/2

Y = (r+g)/2 - |r-g|/2 - b

RG_fmap = |(R(c) - G(c)) - (R(s) - G(s))|

BY_fmap = |(B(c) - Y(c)) - (B(s) - Y(s))|
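The opponency formulas above can be sketched per pixel in plain C++; in the project they are applied to whole pyramid levels, and the struct and function names here are illustrative only.

```cpp
#include <cmath>

// Color opponency channels for one pixel with components r, g, b.
struct Opponency { double R, G, B, Y; };

Opponency opponency(double r, double g, double b) {
    Opponency o;
    o.R = r - (g + b) / 2.0;                           // red vs. green+blue
    o.G = g - (r + b) / 2.0;                           // green vs. red+blue
    o.B = b - (r + g) / 2.0;                           // blue vs. red+green
    o.Y = (r + g) / 2.0 - std::fabs(r - g) / 2.0 - b;  // yellow channel
    return o;
}

// Center-surround color feature maps: absolute difference of the
// opponency contrast at a fine (center) and a coarse (surround) scale.
double rg_fmap(const Opponency& c, const Opponency& s) {
    return std::fabs((c.R - c.G) - (s.R - s.G));
}
double by_fmap(const Opponency& c, const Opponency& s) {
    return std::fabs((c.B - c.Y) - (s.B - s.Y));
}
```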

intensity_feature.cpp

Called through “intensity_feature()”

Finds intensity pyramids

Intensity Feature maps.

Intensity conspicuity map

intensity_feature.cpp

I=(r+g+b)/3.

Intensity_fmap=|I(c)-I(s)|

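The two intensity formulas above, shown per pixel in plain C++ for brevity (the project applies them to whole pyramid levels; function names are illustrative):

```cpp
#include <cmath>

// Intensity channel: I = (r + g + b) / 3.
double intensity(double r, double g, double b) {
    return (r + g + b) / 3.0;
}

// Intensity feature map entry: |I(c) - I(s)|, the center-surround contrast.
double intensity_fmap(double center_I, double surround_I) {
    return std::fabs(center_I - surround_I);
}
```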
orientation_feature.cpp

Called through “orientation_feature()”

Finds orientation pyramids

Finds orientation feature maps

Uses a Gabor function with a kernel size of 21x21.

Feature maps are found for four orientation angles, viz. 0, 45, 90 and 135 degrees.

orientation_feature.cpp

The Gabor function has the following parameters:

λ -> wavelength of the sinusoidal factor

θ -> orientation angle

ψ -> phase offset

σ -> standard deviation of the Gaussian envelope

γ -> spatial aspect ratio

g(x, y) = exp(-(x'^2 + γ^2 y'^2) / (2σ^2)) cos(2π x'/λ + ψ)

where,

x' = x cos θ + y sin θ

y' = -x sin θ + y cos θ
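One Gabor kernel coefficient, following the standard formula with these parameters, might be computed as below; sampling x and y over a 21x21 grid (e.g. -10..10) yields the kernel the project convolves with at each of the four angles. This is an illustrative sketch, not the project's actual code.

```cpp
#include <cmath>

// One coefficient of a Gabor kernel at offset (x, y) from the center.
double gabor(double x, double y,
             double lambda, double theta, double psi,
             double sigma, double gamma) {
    const double PI = std::acos(-1.0);
    double xp =  x * std::cos(theta) + y * std::sin(theta);  // rotated x'
    double yp = -x * std::sin(theta) + y * std::cos(theta);  // rotated y'
    double envelope = std::exp(-(xp * xp + gamma * gamma * yp * yp)
                               / (2.0 * sigma * sigma));     // Gaussian
    return envelope * std::cos(2.0 * PI * xp / lambda + psi); // carrier
}
```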

saliency.cpp

Uses the results from color_feature.cpp, intensity_feature.cpp

and orientation_feature.cpp

Finds the average of all the conspicuity maps to obtain the saliency map:

Saliency_map = (color_consp + int_consp + orient_consp) / 3;

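The averaging step can be sketched as a per-pixel mean of the three conspicuity maps, here over flat arrays (an illustrative sketch; the project operates on image matrices):

```cpp
#include <vector>
#include <cstddef>

// Saliency map as the per-pixel average of the three conspicuity maps,
// matching Saliency_map = (color_consp + int_consp + orient_consp) / 3.
std::vector<double> saliency_map(const std::vector<double>& color_consp,
                                 const std::vector<double>& int_consp,
                                 const std::vector<double>& orient_consp) {
    std::vector<double> sal(color_consp.size());
    for (std::size_t i = 0; i < sal.size(); ++i)
        sal[i] = (color_consp[i] + int_consp[i] + orient_consp[i]) / 3.0;
    return sal;
}
```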
Training images:

Signboard Class    No. of samples
Pedestrian SB      16
Bike SB            16
Crossing SB        16
Total              48

Testing images:

Signboard Class    No. of samples
Pedestrian SB      6
Bike SB            6
Crossing SB        6
Total              18

Two classes in total:

1. Signboard (SB)

2. Non-signboard (NSB)

Every object detected must belong to one of the above-mentioned classes.

Four categories for every classification:

1. True Positive

2. True Negative

3. False Positive

4. False Negative

Consider the following example:

A study evaluating a new test that screens people for a disease.

Each person taking the test will

1. Either have the disease (sick class)

2. Or not have the disease (not-sick class)

The test result may be

1. Positive -> indicates disease

2. Negative -> no disease

True Positive OR True Negative:

The outcome matches the class the sample actually belongs to.

i.e., with respect to our example:

Healthy people correctly diagnosed as healthy, and vice versa.

False Positive OR False Negative:

The outcome does not match the sample's original class.

i.e., with respect to our example:

Healthy people incorrectly diagnosed as sick, and vice versa.

Confusion Matrix:

• A 2D array showing all four possible classifications.

• Shown below (rows: actual class; columns: predicted class):

         SB                NSB
SB       True Positive     False Negative
NSB      False Positive    True Negative

Detection Rate:

Ratio of the total number of objects correctly detected to the total number of detections.

Gives the efficiency of the system.

Detection Rate = (true+ve + true-ve) / total detections

where total detections = sum of all the values in the confusion matrix.

Sensitivity:

Relates to the system's ability to identify positive results.

Sensitivity = (true+ve) / (true+ve + false-ve)

It gives the probability of the classification being correct given that the object is a signboard.

Specificity:

Relates to the system's ability to identify negative results.

Specificity = (true-ve) / (true-ve + false+ve)

It gives the probability of the classification being correct given that the object is not a signboard.

Precision:

A measure of the system's accuracy.

The proportion of samples classified under class x that truly belong to class x.

Precision = (true+ve) / (true+ve + false+ve)

Computation Time:

Time taken to construct the required feature maps and to detect and classify the object in a given image.

Depends on the number of feature maps to be constructed.

The lower the computation time, the more efficient the system.

Completed Literature survey

Completed design

Completed 60% of implementation

• N.V.P. Kiran Yarlagadda, "Computational Attention Model for Traffic Sign Detection System", M.Tech thesis, July 2011.

• S. Frintrop, E. Rome and H. I. Christensen, "Computational Visual Attention Systems and their Cognitive Foundations", ACM Transactions, Vol. 7, No. 1, 2011.

• B. Alefs, G. Eschemann, H. Ramoser and C. Beleznai, "Road Sign Detection from Edge Orientation Histograms", IEEE Intelligent Vehicles Symposium, 2007.

• N. Dalal and B. Triggs, "Histograms of Oriented Gradients for Human Detection", IEEE Computer Vision and Pattern Recognition, 2005.

• N. Ouerhani, "Visual Attention: From Bio-inspired Modelling to Real-time Implementation", 2003.

• L. Itti, C. Koch and E. Niebur, "A Model of Saliency-Based Visual Attention for Rapid Scene Analysis", IEEE Trans. on PAMI, 20(11), 1998, pp. 1254–1259.

• http://ilab.usc.edu/bu/

• http://opencv.willowgarage.com

• http://www.websters-online-dictionary.org/

THANK YOU
