
CS B657 Spring 2016 Final Project Report

Usernames:

1. Adithya Nagaraj Tirumale (aditnaga)
2. Akash Ram Gopal (agopal)
3. Sumit Kumar Dey (skdey)

Introduction:
Nowadays, a lot of attention is being given to the ability of a car to drive itself. One of the many important capabilities of a self-driving car is detecting traffic signs, in order to provide safety and security for the people not only inside the car but also outside of it. The traffic environment consists of different elements whose main purpose is to regulate the flow of traffic and make sure each driver adheres to the rules, so as to provide a safe and secure environment to all parties concerned.

We have focused our project on US traffic signs, and a few of the traffic signs in our dataset are shown in the figure below. We used the LISA traffic sign dataset [3]. The dataset consisted of 48 different types of US traffic signs. About 75% of the frames were in grayscale and the rest in color.

The problem we are trying to solve has some advantages: traffic signs are designed to be unique, so object variation is small, and traffic signs are clearly visible to the driver/system [1]. The other side of the coin is that we have to contend with varying lighting and weather conditions [1].

The main objective of our project is to design and construct a computer-based system which can automatically detect road signs, so as to provide assistance to the user or the machine so that they can take appropriate action. The proposed approach consists of building a model using convolutional neural networks and extracting traffic signs from an image using color information. We used convolutional neural networks (CNNs) to classify the traffic signs and color-based segmentation to extract/crop signs from images.

Background and related work:
Many different techniques have been applied to detect traffic signs. Most of these techniques are based on HOG and SIFT features. In our approach we use biologically inspired convolutional neural networks to build a model which can predict the type of traffic sign.


One such related work based on convolutional neural networks is published in 'Traffic Sign Recognition with Multi-Scale Convolutional Networks' by Pierre Sermanet and Yann LeCun [4].

Methods:
The problem of traffic sign recognition is twofold:

1) Extracting a potential traffic sign from an image.
Traffic signs are designed so that they appear unique and are easily identifiable to the human eye. Traffic signs in the United States of America have 3 main colors: red, white, and yellow. Other colors like orange and blue are also used. In our approach we concentrate on red, white, and yellow traffic signs. Since the color of a traffic sign stands out against the background, we can use color information to narrow down our areas of interest (the parts of the image potentially containing a traffic sign).
Since RGB images are susceptible to variations in lighting, we work in the HSV (Hue, Saturation, Value) color space. Once we have the HSV image, our next goal is to define our areas of interest (i.e., the ranges of yellow, red, and white) so that we can segment the HSV image based on these 3 colors. The color ranges used are as follows:

Color    Lower Range (HSV)    Upper Range (HSV)
Yellow   [10, 50, 50]         [30, 255, 255]
Red      [170, 50, 50]        [185, 255, 255]
White    [0, 0, 50]           [120, 15, 255]

The next step is to use these color ranges to create binary masks, one for each of the 3 colors. For example, the red binary mask has 0 assigned to all regions which are not in the red range and 1 assigned to all regions which are in the red range. The red, yellow, and white binary masks for an example image are shown below:
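A minimal sketch of this masking step with OpenCV, using the ranges from the table above (the image path and variable names are illustrative):

```python
import cv2
import numpy as np

# Load a frame and convert from BGR (OpenCV's default ordering) to HSV,
# which is less sensitive to lighting changes than raw RGB.
frame = cv2.imread("frame.png")                     # illustrative path
hsv = cv2.cvtColor(frame, cv2.COLOR_BGR2HSV)

# Color ranges from the table above, as (H, S, V) lower/upper bounds.
# Note: OpenCV stores hue in [0, 179], so an upper hue of 185 is
# effectively clipped to 179.
ranges = {
    "yellow": ((10, 50, 50), (30, 255, 255)),
    "red":    ((170, 50, 50), (185, 255, 255)),
    "white":  ((0, 0, 50), (120, 15, 255)),
}

# One binary mask per color: pixels inside the range become 255, the rest 0.
masks = {name: cv2.inRange(hsv, np.array(lo), np.array(hi))
         for name, (lo, hi) in ranges.items()}
```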


As seen from the above example, the original image is segmented based on color. We know that traffic signs usually occur as closed shapes such as rectangles, triangles, diamonds, etc. We can use this property to extract closed shapes from each of the 3 binary masks. This can be done using 'Topological Structural Analysis of Digitized Binary Images by Border Following' [5]. We used the OpenCV implementation of this algorithm [6].
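Continuing from the previous sketch, the contour extraction with OpenCV's findContours (its implementation of the Suzuki-Abe border-following algorithm [5][6]) might look like this:

```python
import cv2

# Extract the outer closed contours from each binary mask.
# The two-value return signature below matches OpenCV 2.4 and 4.x.
contours_per_color = {}
for name, mask in masks.items():
    contours, _ = cv2.findContours(mask, cv2.RETR_EXTERNAL,
                                   cv2.CHAIN_APPROX_SIMPLE)
    contours_per_color[name] = contours
```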

The extracted contours from the binary masks are as follows:

As we can see from these images, we have narrowed down the areas of interest from the entire image. These areas of interest are further refined based on the size of each contour.
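A sketch of this size-based refinement; the area thresholds below are illustrative placeholders, not the values used in the project:

```python
import cv2

MIN_AREA, MAX_AREA = 300, 30000      # illustrative size thresholds

# Keep only contours whose area is plausible for a sign, and store the
# corresponding image patch (axis-aligned bounding box) as a candidate.
candidates = []
for name, contours in contours_per_color.items():
    for c in contours:
        if MIN_AREA <= cv2.contourArea(c) <= MAX_AREA:
            x, y, w, h = cv2.boundingRect(c)
            candidates.append((name, frame[y:y + h, x:x + w]))
```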

Once we have refined the set of areas of interest, we use the convolutional neural network which we are going to build in the next step to predict the type of this sign (or if it is not a sign).

2) Predicting the type of the extracted traffic sign.
From the areas of interest extracted in the previous step, we want to determine whether each one is a sign or not, and if it is a sign, what type of sign it actually is. For this purpose, we train a convolutional neural network. The data used to train and test the CNN was obtained from http://cvrr.ucsd.edu/LISA/lisa-traffic-sign-dataset.html. It had about 6000 frames and 49 different types of traffic signs. For each frame, the coordinate positions of the traffic sign in the image were given. Using these positions, the traffic signs were cropped out to use for training the CNN.
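A hedged sketch of the cropping step; the annotation file name, delimiter, and column names below are assumptions about the LISA annotation format, not taken from the report:

```python
import csv
import cv2

# Assumed annotation layout: one row per sign with the frame filename,
# the sign type, and the bounding box as upper-left / lower-right
# corners. Adjust the delimiter and column names to the actual LISA
# annotation file.
crops, labels = [], []
with open("annotations.csv") as f:                  # illustrative path
    for row in csv.DictReader(f, delimiter=";"):
        img = cv2.imread(row["Filename"])
        x1 = int(row["Upper left corner X"])
        y1 = int(row["Upper left corner Y"])
        x2 = int(row["Lower right corner X"])
        y2 = int(row["Lower right corner Y"])
        crops.append(img[y1:y2, x1:x2])
        labels.append(row["Annotation tag"])
```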


A CNN is inspired by the connections between the neurons in the visual cortex of animals [7]. Since traffic signs contain unique shapes such as arrows, words, circles, and so on, it is useful to convert the traffic sign into a more useful form by applying a Laplacian operation to it. We can apply the Laplacian operation by convolving the following kernel with the input image:

 0  -1   0
-1   4  -1
 0  -1   0

Consider the following traffic sign and its Laplacian:
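A minimal sketch of this preprocessing step with OpenCV; the grayscale conversion, the fixed 32x32 input size, and the scaling are assumptions not stated in the report:

```python
import cv2
import numpy as np

# The 3x3 Laplacian kernel shown above.
laplacian_kernel = np.array([[ 0, -1,  0],
                             [-1,  4, -1],
                             [ 0, -1,  0]], dtype=np.float32)

def preprocess(sign_crop, size=32):
    """Convert a cropped sign to its Laplacian, resized for the CNN.
    The grayscale conversion, 32x32 size, and scaling are assumptions."""
    gray = cv2.cvtColor(sign_crop, cv2.COLOR_BGR2GRAY)
    gray = cv2.resize(gray, (size, size))
    return cv2.filter2D(gray, cv2.CV_32F, laplacian_kernel) / 255.0
```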

The Laplacian is then fed into the CNN, whose architecture is shown below:

5x5 Convolution
2x2 Convolution
800 Fully Connected
256 Fully Connected
49 Softmax units
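A hedged sketch of this architecture in Keras; the framework choice, the numbers of convolution filters, the activations, and the 32x32x1 input shape are assumptions, while the layer sizes and the 49-way softmax come from the report:

```python
from tensorflow.keras import layers, models

def build_cnn(num_classes=49, input_shape=(32, 32, 1)):
    """Sketch of the architecture above: a 5x5 convolution, a 2x2
    convolution, 800- and 256-unit fully connected layers, and a 49-way
    softmax. Filter counts, activations, and input size are assumptions."""
    return models.Sequential([
        layers.Conv2D(32, (5, 5), activation="relu", input_shape=input_shape),
        layers.Conv2D(64, (2, 2), activation="relu"),
        layers.Flatten(),
        layers.Dense(800, activation="relu"),
        layers.Dense(256, activation="relu"),
        layers.Dense(num_classes, activation="softmax"),
    ])
```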


The learning rate used to train the CNN was 0.001 and the momentum was 0.9. The CNN was trained for 200 iterations (magic numbers). Once the CNN has been trained, it is used to predict the sign for each of the contours obtained in step 1. Each contour is assigned the sign with the maximum probability in the CNN's output. The following example image shows the predicted signs for all the contours:
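Continuing the Keras sketch, training with the stated learning rate and momentum and assigning each candidate contour its most probable class could look as follows; reading "200 iterations" as 200 epochs, the label encoding, and training on all crops here (rather than on the split described in the Results section) are simplifying assumptions:

```python
import numpy as np
from tensorflow.keras.optimizers import SGD

# Encode the string labels gathered earlier as integers and stack the
# Laplacian-preprocessed crops into one array.
class_names = sorted(set(labels))
X = np.stack([preprocess(c)[..., None] for c in crops])
y = np.array([class_names.index(lbl) for lbl in labels])

model = build_cnn(num_classes=len(class_names), input_shape=X.shape[1:])
model.compile(optimizer=SGD(learning_rate=0.001, momentum=0.9),
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
# "200 iterations" is read here as 200 epochs; for brevity this sketch
# trains on all crops, whereas the report uses the split described in
# the Results section.
model.fit(X, y, epochs=200)

# Assign each candidate region from step 1 the class with the highest
# predicted probability from the softmax output.
batch = np.stack([preprocess(roi)[..., None] for _, roi in candidates])
predicted = [class_names[i] for i in model.predict(batch).argmax(axis=1)]
```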

We can also use the trained CNN to compute the Accuracy, Precision, Recall, and F1 score metrics on the test set. These results are discussed in the next section.

Results:
The following table gives the Accuracy, Precision, Recall, and F1 score metrics on the test set. The test set was obtained by splitting the whole dataset into 70% training data and 30% validation and test data; of that 30%, 15% of the dataset was used as test data.
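One plausible reading of this split is 70% train / 15% validation / 15% test. A sketch of the split and the metric computation with scikit-learn (the use of scikit-learn, the random seeds, and macro averaging over classes are assumptions):

```python
from sklearn.model_selection import train_test_split
from sklearn.metrics import (accuracy_score, precision_score,
                             recall_score, f1_score)
from tensorflow.keras.optimizers import SGD

# 70% train, 30% held out; the held-out portion is divided evenly into
# validation and test sets (about 15% of the full dataset each).
X_tr, X_hold, y_tr, y_hold = train_test_split(X, y, test_size=0.30, random_state=0)
X_val, X_test, y_val, y_test = train_test_split(X_hold, y_hold, test_size=0.50, random_state=0)

# Train a fresh copy of the network on the training split only, then
# report the four metrics on the test split.
clf = build_cnn(num_classes=len(class_names), input_shape=X.shape[1:])
clf.compile(optimizer=SGD(learning_rate=0.001, momentum=0.9),
            loss="sparse_categorical_crossentropy", metrics=["accuracy"])
clf.fit(X_tr, y_tr, validation_data=(X_val, y_val), epochs=200)

y_pred = clf.predict(X_test).argmax(axis=1)
print("Accuracy :", accuracy_score(y_test, y_pred))
print("Precision:", precision_score(y_test, y_pred, average="macro"))
print("Recall   :", recall_score(y_test, y_pred, average="macro"))
print("F1 score :", f1_score(y_test, y_pred, average="macro"))
```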


The following results assume that the traffic sign is perfectly cropped from the image. This may not be the case when we extract traffic signs from an image without prior knowledge of their position.

Metric      Score
Accuracy    86.9%
Precision   0.8638
Recall      0.8694
F1 Score    0.8633

Conclusion:
From these results we can see that the CNN does a good job of classifying different types of traffic signs when the extracted signs are cropped perfectly from the image. Our approach fails to give good results when the signs extracted from test images are cropped incorrectly. Another drawback of our approach is that when the color of a traffic sign varies, which may be due to bad weather conditions or poor camera quality, the image masks obtained are not perfect and hence the signs are not detected properly. Future improvements can be made to the extraction of signs from test images by using more advanced segmentation methods.

References:
[1] https://bartlab.org/Dr.%20Jackrit's%20Papers/ney/3.KRS036_Final_Submission.pdf
[2] http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.695.3606&rep=rep1&type=pdf
[3] http://cvrr.ucsd.edu/LISA/lisa-traffic-sign-dataset.html
[4] http://yann.lecun.com/exdb/publis/pdf/sermanet-ijcnn-11.pdf
[5] Suzuki, S. and Abe, K., Topological Structural Analysis of Digitized Binary Images by Border Following. CVGIP 30(1), pp. 32-46 (1985).
[6] http://docs.opencv.org/2.4/doc/tutorials/imgproc/shapedescriptors/find_contours/find_contours.html
[7] https://en.wikipedia.org/wiki/Convolutional_neural_network