
Ball Detection via Machine Learning

RAFAEL OSORIO

Master of Science Thesis
Stockholm, Sweden 2009


Ball Detection via Machine Learning

RAFAEL OSORIO

Master's Thesis in Computer Science (30 ECTS credits)
at the School of Computer Science and Engineering
Royal Institute of Technology, year 2009
Supervisor at CSC was Örjan Ekeberg
Examiner was Anders Lansner

TRITA-CSC-E 2009:004
ISRN-KTH/CSC/E--09/004--SE
ISSN-1653-5715

Royal Institute of Technology
School of Computer Science and Communication
KTH CSC
SE-100 44 Stockholm, Sweden
URL: www.csc.kth.se


Abstract

This thesis evaluates a method for real-time detection of footballs in low-resolution images. The company Tracab uses a system of 8 camera pairs that cover the whole pitch during a football match. By using stereo vision it is possible to track the players and the ball in order to extract statistical data. In this report a method proposed by Viola and Jones is evaluated to see if it can be used to detect footballs in the images extracted by the cameras. The method is based on a boosting algorithm called AdaBoost and has mainly been used for face detection. A cascade of boosted classifiers is trained from examples of positive and negative images of footballs. In this report the images are much smaller than the typical objects that the method was developed for, and a question that this thesis tries to answer is whether the method is applicable to objects of such small sizes.

The Support Vector Machine (SVM) method has also been tested to see if the performance of the classifier can be improved. Since the SVM method is time-consuming, it has been tested as a last step in the classifier cascade, using features selected by the boosting process as input.

In addition to this, a database of images of footballs from 6 different matches, consisting of 10317 images used for training and 2221 images used for testing, has been produced. Results show that detection can be made with improved performance compared to Tracab's existing software.


Sammanfattning (Swedish abstract)

Ball detection via machine learning

This report examines a method for real-time detection of footballs in low-resolution images. The company Tracab uses 8 camera pairs that together cover an entire football pitch during a match. With the help of stereo vision it is possible to follow the players and the ball, in order to then offer statistics to fans. In this report a method developed by Viola and Jones is evaluated to see whether it can be used to detect footballs in the images from the 16 cameras. The method is based on a boosting algorithm called AdaBoost, which has mainly been used for face detection. A cascade of boosted classifiers is trained from positive and negative example images of footballs. In this report the ball images used are smaller than the usual objects that the method was created for. One question that this report tries to answer is whether this method is applicable to objects that small.

Support Vector Machines (SVM) have also been tested to see if the performance of the classifier can be raised. Since SVM is a slow method, it has been integrated as a last step in the trained cascade. Features from the Viola and Jones method have been used as input to the SVM.
A database consisting of a training set and a test set has been created from 6 matches. The training set consists of 10317 images and the test set consists of 2221 images. The results show that detection can be carried out with higher precision than Tracab's current software.


Contents

Introduction
  1.1 Background
  1.2 Objective of the thesis
  1.3 Hit rate vs. false positive rate
  1.4 Related Work
  1.5 Thesis Outline
Image Database
  2.1 Ball tool
  2.2 Images
    2.2.1 Training set
    2.2.2 Negatives
    2.2.3 Test set
    2.2.4 Five-a-side
    2.2.5 Correctness
Theoretical background
  3.1 Overview
  3.2 Features
    3.2.1 Haar features
    3.2.2 Integral Image
  3.5 AdaBoost
    3.5.1 Analysis
    3.5.2 Weak classifiers
    3.5.3 Boosting
  3.6 Cascade
    3.6.1 Bootstrapping
  3.7 Support Vector Machine
    3.7.1 Overfitting
    3.7.2 Non-linearly separable data
    3.7.3 Features extracted with AdaBoost
  3.8 Tying it all together
Method
  4.1 Training
  4.2 Step size and scaling
  4.3 Masking out the audience
  4.4 Number of stages
  4.5 Brightness threshold
  4.6 SVM
  4.7 OpenCV
Results
  5.1 ROC-curves
  5.2 Training results
  5.3 Using different images for training
    5.3.1 Image size
    5.3.2 Image sets
    5.3.3 Negative images
  5.4 Step Size
  5.5 Real and Gentle AdaBoost
  5.6 Minimum hit rate and max false alarm rate
  5.7 Brightness threshold
  5.8 Number of stages
  5.9 Support Vector Machine
  5.10 Compared to existing detection
  5.11 Five-a-side
  5.12 Discussion
Conclusions and future work
  6.1 Conclusions
  6.2 Future work
Bibliography
Appendix 1
  Training set:
  Test set 1:


Chapter 1

Introduction

In this chapter the circumstances of the problem are presented, as well as the goal of the thesis. Related work is described and an outline of the thesis is given.

1.1 Background

This Master's thesis was performed at Svenska Tracab AB. Tracab has developed real-time camera-based technology for locating the positions of football players and the ball during football matches. Eight pairs of cameras are installed around the pitch, controlled by a cluster of computers. Fig 1 shows how pairs of cameras give stereo vision and how this makes it possible to calculate the X and Y coordinates of an object on the pitch.

Fig 1 - Eight camera pairs cover the pitch giving stereo vision.

With this information it is possible to extract statistics such as the total distance covered by a player, a heat map where a warmer color means that the player has spent more time in that area of the pitch, completed passes, the speed and acceleration of the ball and of the players, and much more. The whole process is carried out in real time (25 times per second). The system is semi-automatic and is staffed with operators during the game. All moving objects that are player-like are shown as targets by the system. The operators need to assign the players to a target, since no face recognition or shirt number identification is done to identify the players. They must also remove targets that are not subject to tracking, e.g. medics and ball boys.

One big advantage of the system is that it does not interfere with the game in any way. No transmitters or any other kind of device on the players or the ball are used.

1.2 Objective of the thesis

The objective of this Master's thesis is to improve the ball detection using machine learning techniques. Today the existing ball tracking method primarily uses the movement of an object to recognize the ball, instead of its appearance. In this report we will see if it is possible to shift the focus from using the movement to doing object detection in every frame. A key attribute of the method used is that it has to be fast enough for real-time usage.
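Per-frame appearance-based detection of this kind can be pictured as running a trained cascade classifier over every incoming image. The thesis builds on OpenCV (see section 4.7), but the sketch below is purely illustrative: the cascade file name, the video source and all detection parameters are assumed placeholders, not values taken from the thesis.

```python
# Illustrative sketch only: applying a trained Viola-Jones cascade to every
# frame of a video stream. "ball_cascade.xml", "match.avi" and the detection
# parameters are hypothetical placeholders.
import time

import cv2

cascade = cv2.CascadeClassifier("ball_cascade.xml")  # hypothetical trained cascade
capture = cv2.VideoCapture("match.avi")              # hypothetical camera/video source

while True:
    ok, frame = capture.read()
    if not ok:
        break
    start = time.perf_counter()
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    # The ball is only a few pixels wide, so the minimum search window is tiny
    # and the scale step is kept small.
    candidates = cascade.detectMultiScale(
        gray, scaleFactor=1.05, minNeighbors=2, minSize=(4, 4), maxSize=(20, 20)
    )
    elapsed = time.perf_counter() - start
    # At 25 frames per second the whole detection step has at most 40 ms per frame.
    print(f"{len(candidates)} ball candidates in {elapsed * 1000:.1f} ms")

capture.release()
```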
Tracab's technology is already good at detecting moving balls against a static background, so an aim for this project is to produce reasonable ball hypotheses in more difficult situations such as:

• The ball is partially occluded by players.
• The lighting conditions are uneven, especially when the sun only lights up a part of the pitch.
• Other objects, like the head of a player or the socks of a player, look like the ball.
• The ball is still, e.g. at a free kick.

A classifier is to be trained to detect footballs, based on a labeled data set of ball / non-ball image regions from images captured by Tracab's cameras. When this report talks about image regions, a smaller sub-window that is part of the whole image is meant (left of figure 2). When it talks only about an image, the whole image is meant (right of figure 2).

Fig 2 - Example of an image region and an image captured by Tracab's cameras.

The classifier needs to be somewhat robust to changes in ball size, and preferably also ball color, since these differ between situations. One big difference between this project and previous studies of object detection, such as the paper by Viola and Jones, is the size of the object [32]. Here it is very small, only a few pixels wide. A big question is whether the method presented in this report can be applied to objects of this size.

Even with reasonably good detection of the ball, it is difficult to tell the ball apart from other objects using only techniques based on the analysis of still images. One way of solving this is to examine the trajectory of the object in a sequence of images and discard objects that do not move like a ball. Also, if the classifier detects the ball most of the time, only missing a few frames at a time, it is possible to do post-processing to calculate the most likely...
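The trajectory idea mentioned above can be pictured with a small example. The following is only an illustrative sketch of such a consistency check, not the post-processing actually used in this thesis; the candidate format, the displacement threshold and the minimum track length are all assumptions.

```python
# Hypothetical sketch: keep only detection chains whose frame-to-frame motion
# stays below a plausible maximum ball displacement. All thresholds are made up.
from math import hypot

MAX_STEP_PX = 30  # assumed upper bound on ball movement between consecutive frames


def link_candidates(frames):
    """frames: list of lists of (x, y) candidate centres, one list per frame.
    Returns trajectories (lists of points) that move smoothly enough to be a ball.
    Tracks that cannot be extended are dropped in this simplified version."""
    tracks = []
    for points in frames:
        extended = []
        for track in tracks:
            # Extend a track with the nearest candidate that is close enough.
            near = [p for p in points
                    if hypot(p[0] - track[-1][0], p[1] - track[-1][1]) <= MAX_STEP_PX]
            if near:
                best = min(near, key=lambda p: hypot(p[0] - track[-1][0],
                                                     p[1] - track[-1][1]))
                extended.append(track + [best])
        # Every candidate may also start a new track of its own.
        extended.extend([[p] for p in points])
        tracks = extended
    return [t for t in tracks if len(t) >= 3]  # require some temporal support


# Example: three frames of candidates; only the slowly moving point survives.
frames = [[(100, 100), (400, 50)], [(104, 103)], [(109, 107), (10, 10)]]
print(link_candidates(frames))
```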