

Page 1: Computer Vision for Card Games - cs229.stanford.edu/proj2017/final-posters/5141186.pdf


Computer Vision for Card Games

Matias Castillo, Benjamin Goeing, Jesper Westell

Classical Machine Learning Methods

We tested two classical machine learning methods, using scikit-learn, each with and without L2 regularization:

● Multi-class one-vs-all (OVA) classification, using log loss, where we produced a list of classifiers and

● Multi-class OVA classification with L2 regularization, using lambda 0.1

● Multi-class SVM, using hinge loss

● Multi-class SVM with L2 regularization, using lambda 0.1
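The four configurations listed above can be sketched with scikit-learn roughly as follows. This is a minimal sketch under stated assumptions, not the authors' code: the estimator choices (`OneVsRestClassifier` around `LogisticRegression`, and `LinearSVC`), the synthetic `make_blobs` data standing in for flattened card images, and the mapping of lambda 0.1 to `C = 1 / 0.1` are all assumptions. These estimators always apply some L2 penalty, so the "unregularized" variants are approximated with a large `C`.

```python
from sklearn.datasets import make_blobs
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.multiclass import OneVsRestClassifier
from sklearn.svm import LinearSVC

N_CLASSES = 4          # small stand-in for the 52 card classes
LAMBDA = 0.1           # regularization strength quoted on the poster
C_REG = 1.0 / LAMBDA   # assumed mapping: scikit-learn's C is an inverse strength
C_WEAK = 1e3           # large C approximates "no regularization" (assumption)

# Synthetic features stand in for flattened, downsized card images.
X, y = make_blobs(n_samples=400, centers=N_CLASSES, n_features=8, random_state=0)
X_train, X_dev, y_train, y_dev = train_test_split(
    X, y, test_size=0.25, random_state=0
)

def make_models():
    """Build the four model variants: OVA log loss and SVM hinge, each +/- L2."""
    return {
        "OVA log loss": OneVsRestClassifier(
            LogisticRegression(C=C_WEAK, max_iter=2000)
        ),
        "OVA log loss + L2": OneVsRestClassifier(
            LogisticRegression(C=C_REG, max_iter=2000)
        ),
        # LinearSVC trains one-vs-rest binary SVMs for multi-class problems.
        "SVM hinge": LinearSVC(loss="hinge", C=C_WEAK, dual=True, max_iter=20000),
        "SVM hinge + L2": LinearSVC(loss="hinge", C=C_REG, dual=True, max_iter=20000),
    }

models = make_models()
results = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    # Comparing train and dev accuracy is how overfitting would show up.
    results[name] = (model.score(X_train, y_train), model.score(X_dev, y_dev))
    print(f"{name}: train={results[name][0]:.3f} dev={results[name][1]:.3f}")

# One-vs-all keeps one binary classifier per class.
print(len(models["OVA log loss"].estimators_), "binary classifiers")
```

The `estimators_` attribute of the fitted one-vs-all wrapper is a list with one binary classifier per class, which is one way to read the "list of classifiers" mentioned in the first bullet.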

Data Augmentation and Engineering

● For this project, we created our own dataset by taking 10 pictures of each playing card lying on a table

● We then wrote a script to augment every image, using different combinations of zoom, brightness, contrast, rotation, etc.

● This resulted in 1,050 examples per class for a total of 54,600 images, which we divided into 90% train, 5% dev, and 5% test

● As the training set was too large to fit into memory, we trained on random mini-batches that were read into memory sequentially, and we also downsized the images
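The pipeline described above (augment each photo, count the resulting dataset, split it, and iterate over mini-batches) can be illustrated with a short sketch. It assumes the Pillow library; the parameter ranges, the placeholder image, and the helper names `augment`, `split_counts`, and `minibatches` are hypothetical and not taken from the authors' script. The figure of 105 variants per photo is simply 1,050 examples per class divided by 10 photos, assuming that count covers every image derived from a photo.

```python
import random
from PIL import Image, ImageEnhance

PHOTOS_PER_CARD = 10       # from the poster
NUM_CLASSES = 52           # from the poster
VARIANTS_PER_PHOTO = 105   # 1,050 examples per class / 10 photos (derived)

def augment(img, rng):
    """Return one randomly zoomed, brightened, contrast-adjusted, rotated copy."""
    width, height = img.size
    zoom = rng.uniform(1.0, 1.3)  # hypothetical zoom range
    crop_w, crop_h = int(width / zoom), int(height / zoom)
    left, top = (width - crop_w) // 2, (height - crop_h) // 2
    # Zoom in by cropping the centre and scaling back to the original size.
    out = img.crop((left, top, left + crop_w, top + crop_h)).resize((width, height))
    out = ImageEnhance.Brightness(out).enhance(rng.uniform(0.7, 1.3))
    out = ImageEnhance.Contrast(out).enhance(rng.uniform(0.7, 1.3))
    return out.rotate(rng.uniform(-20.0, 20.0))  # canvas size is unchanged

def split_counts(total):
    """Split a dataset size into 90% train, 5% dev, and the remainder as test."""
    n_train = total * 90 // 100
    n_dev = total * 5 // 100
    return n_train, n_dev, total - n_train - n_dev

def minibatches(indices, batch_size, rng):
    """Yield shuffled index batches; only one batch of images needs loading at a time."""
    order = list(indices)
    rng.shuffle(order)
    for start in range(0, len(order), batch_size):
        yield order[start:start + batch_size]

rng = random.Random(0)
photo = Image.new("RGB", (64, 96), (200, 30, 30))  # placeholder for a real card photo
variants = [augment(photo, rng) for _ in range(VARIANTS_PER_PHOTO)]
total_images = NUM_CLASSES * PHOTOS_PER_CARD * len(variants)
print(total_images, split_counts(total_images))
```

With the poster's numbers, the split works out to 49,140 training, 2,730 dev, and 2,730 test images.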

Project Objective

● For our joint CS229/CS221 project, we are developing a program that can play real-world card games against a human player

● The goal of our CS229 project is to develop a computer vision method that can recognize playing cards lying on a table with 100% accuracy

● This is a 52-class classification problem
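One way to frame the 52-class problem is to give every card an integer label from 0 to 51 and a matching one-hot target vector. The sketch below is only illustrative: the suit and rank ordering and the helper names `card_to_label`, `label_to_card`, and `one_hot` are assumptions, since the poster does not say how classes were indexed.

```python
SUITS = ["clubs", "diamonds", "hearts", "spades"]  # hypothetical ordering
RANKS = ["A", "2", "3", "4", "5", "6", "7", "8", "9", "10", "J", "Q", "K"]
NUM_CLASSES = len(SUITS) * len(RANKS)  # 4 suits x 13 ranks = 52

def card_to_label(rank, suit):
    """Map a (rank, suit) pair to an integer class label in [0, 51]."""
    return SUITS.index(suit) * len(RANKS) + RANKS.index(rank)

def label_to_card(label):
    """Invert card_to_label."""
    suit_idx, rank_idx = divmod(label, len(RANKS))
    return RANKS[rank_idx], SUITS[suit_idx]

def one_hot(label):
    """Return a 52-element target vector with a single 1 at the class index."""
    return [1 if k == label else 0 for k in range(NUM_CLASSES)]

print(NUM_CLASSES, card_to_label("Q", "hearts"), label_to_card(37))
```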

Training a custom CNN from scratch:

● In order to reach our goal of 100% accuracy, we decided to design and train a custom CNN from scratch using TensorFlow
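The layer settings reported on the poster (filters [4,4,3,8] and [2,2,8,16], max pooling with 6x6 and 4x4 windows at strides 6 and 4, and "same" padding) can be traced through the network to see how the tensor shapes shrink. The sketch below makes several assumptions that the poster does not confirm: a 96x96 RGB input (the downsized resolution is not stated), convolution stride 1, "same" padding on both the convolution and pooling layers, and a single dense layer mapping the flattened features to 52 classes. The filter shapes are read in TensorFlow's [height, width, in_channels, out_channels] order.

```python
import math

FILTERS = [(4, 4, 3, 8), (2, 2, 8, 16)]  # [height, width, in_ch, out_ch], from the poster
POOLS = [(6, 6), (4, 4)]                 # (window, stride), from the poster
NUM_CLASSES = 52

def same_padding_out(size, stride):
    """Spatial output size under TensorFlow-style "SAME" padding: ceil(size / stride)."""
    return math.ceil(size / stride)

def trace_shapes(height, width, conv_stride=1):
    """Return (layer name, output shape) pairs for the two conv/pool stages."""
    shapes = []
    h, w = height, width
    for stage, (filt, pool) in enumerate(zip(FILTERS, POOLS), start=1):
        out_ch = filt[3]
        pool_stride = pool[1]
        # With "SAME" padding the window size does not affect the output size.
        h, w = same_padding_out(h, conv_stride), same_padding_out(w, conv_stride)
        shapes.append((f"conv{stage}+relu", (h, w, out_ch)))
        h, w = same_padding_out(h, pool_stride), same_padding_out(w, pool_stride)
        shapes.append((f"maxpool{stage}", (h, w, out_ch)))
    return shapes

def conv_weight_count(filter_shape):
    """Number of weights in one convolution filter bank (biases excluded)."""
    fh, fw, in_ch, out_ch = filter_shape
    return fh * fw * in_ch * out_ch

shapes = trace_shapes(96, 96)  # 96x96 is a hypothetical input size
flat_features = math.prod(shapes[-1][1])
for name, shape in shapes:
    print(name, shape)
print("flattened features:", flat_features)
print("conv weights:", [conv_weight_count(f) for f in FILTERS])
print("dense weights to 52 classes:", flat_features * NUM_CLASSES)
```

Under these assumptions the aggressive pooling strides reduce a 96x96 image to a 4x4x16 volume, so the network stays very small.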

FUTURE WORK

● All of our card images come from the same deck that we purchased

● In future work, we plan to use more varied card sets (e.g. with different themes)

● In addition, we are planning to train a model that can recognize multiple cards at the same time

Results: Convolutional Neural Net

● We achieved 100.0% train, dev, and test accuracy with the CNN after 17 epochs

● We used a ReLU activation function, max pooling with window sizes 6x6 and 4x4 and strides 6 and 4 respectively, and “same” padding

● Filters used are [4,4,3,8] and [2,2,8,16]

● We used the cross-entropy loss as the cost function
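For reference, a standard form of the multi-class cross-entropy cost over m training examples and the 52 card classes is written out here. The notation is an assumption rather than the poster's own: y_k^(i) is 1 when example i belongs to class k and 0 otherwise, z_k^(i) is the network's raw score for class k, and ŷ_k^(i) is the softmax probability derived from those scores.

```latex
J = -\frac{1}{m} \sum_{i=1}^{m} \sum_{k=1}^{52} y_k^{(i)} \log \hat{y}_k^{(i)},
\qquad
\hat{y}_k^{(i)} = \frac{\exp\big(z_k^{(i)}\big)}{\sum_{j=1}^{52} \exp\big(z_j^{(i)}\big)}
```

Because each target is one-hot, only the log-probability of the true card contributes to the inner sum for a given example.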


Results from classical methods

● The multi-class OVA performs better than the SVM model (85% vs. 82% dev set accuracy)

● There are still noticeable overfitting issues with both models

● Applying L2 regularization improved dev set accuracy in both models (from 85% to 89% in OVA and from 82% to 85% in SVM).
