Teaching Your Computer To Play Video Games: A Presentation For The Bainbridge BARN, September 18, 2016

Upload: ehrenbrav

Post on 23-Jan-2017




About Me

Tech enthusiast; hardware and software hacker; particular interest in machine learning

Pros: This presentation is free of charge!

Cons: No training in computer science, embedded systems design, electrical engineering, or software development


What Is Machine Learning?

● A way for computers to learn without being explicitly programmed

● It allows machines to make predictions about the future after studying examples from the past

● Forms the basis of much of today’s artificial intelligence

● One of the hottest areas of computer science today!


Why Video Games?

● They are easy to set up and let you control every aspect of the learning environment

● They are fun

● They can be directly compared against human performance

Here’s one of my favorite examples


Consider Spam Filtering...


● It is impossible to predict every possible way a spam email could be written...

● You could try programming a bunch of rules:

○ “Cheap Meds From Canada” -> SPAM

○ “Your Medication Has Shipped” -> NOT SPAM

● This rapidly becomes intractable: you are likely to get too many false positives and false negatives


Consider Spam Filtering...

● A better way is to show the machine a bunch of human-labeled examples, and let it generalize a way to identify spam from these

● This is called Supervised Learning, because we train our system on examples that a human has already labeled

● Basis for most spam-filtering systems today
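Here is a minimal sketch of that idea, assuming a tiny naive Bayes word-count classifier (one of the algorithm families named on the next slide). The training messages, the `not_spam` label name, and the `train` and `classify` helpers are all made up for illustration; a real filter would learn from many thousands of labeled emails.

```python
import math
from collections import Counter

# Hypothetical hand-labeled training examples (made up for illustration).
TRAINING = [
    ("cheap meds from canada", "spam"),
    ("cheap watches buy now", "spam"),
    ("your medication has shipped", "not_spam"),
    ("meeting moved to friday", "not_spam"),
]

def train(examples):
    """Count how often each word appears under each label."""
    word_counts = {"spam": Counter(), "not_spam": Counter()}
    label_counts = Counter()
    for text, label in examples:
        label_counts[label] += 1
        word_counts[label].update(text.split())
    return word_counts, label_counts

def classify(text, word_counts, label_counts):
    """Return the label with the highest naive Bayes log-probability."""
    vocab = set(word_counts["spam"]) | set(word_counts["not_spam"])
    total = sum(label_counts.values())
    scores = {}
    for label in word_counts:
        n_words = sum(word_counts[label].values())
        score = math.log(label_counts[label] / total)
        for word in text.split():
            # Add-one smoothing so unseen words do not zero out the score.
            count = word_counts[label][word] + 1
            score += math.log(count / (n_words + len(vocab)))
        scores[label] = score
    return max(scores, key=scores.get)

model = train(TRAINING)
print(classify("cheap meds", *model))
print(classify("your meeting has moved", *model))
```

Nobody wrote a rule saying "cheap" means spam. The classifier picked that up from the labeled examples, which is the whole point of supervised learning.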


So How Does It Work?

● There are many types of algorithms that are used for learning

● These have colorful names:

○ Naive Bayesian Classifiers

○ Support Vector Machines

○ Random Forests

● But we’ll focus here on Neural Networks since they are currently some of the most widely used and are so cool


Neural Networks

● Neural networks were inspired by studying how our brains work

● These consist of multiple layers of interconnected nodes (like neurons)

● They take an input (like a video image), pass it through, and yield an output (like a label)


Neural Networks

Each of these connections has a weight associated with it.

[Figure: network diagram with connection weights .3, .5, and .8]


Neural Networks

Information propagates through each layer of the network, adjusted by the weights

[Figure: network diagram built up over three slides. An input of 100 crosses connections weighted .3, .5, and .8 and arrives as 30, 50, and 80; each of those three values then appears three times on the connections to the next layer]
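The arithmetic in that diagram can be sketched in a few lines. The `forward` helper is a hypothetical name, and this sketch only does the weighted sums shown on the slides; real networks typically also pass each sum through a nonlinear function, which is left out here.

```python
def forward(inputs, weights):
    """Weighted sum for each output node.

    weights[j][i] is the weight on the connection from input i to output j.
    """
    return [sum(w * x for w, x in zip(row, inputs)) for row in weights]

# One input node holding 100, feeding three nodes through weights .3, .5, .8.
layer_output = forward([100], [[0.3], [0.5], [0.8]])
print(layer_output)
```

The three results come out at roughly 30, 50, and 80, matching the numbers on the slides.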


How Neural Networks Learn

At each step, you compare the actual output (e.g., 78% chance it’s a cat) with the expected output (e.g., yes, it’s a cat)

[Figure: a pixel value (183) enters the network, which outputs NOT CAT: 22 and CAT: 78]


How Neural Networks Learn

The weights are then adjusted to bring the actual output closer to the expected output. Rinse and repeat...

[Figure: the same network with its weights labeled; for the pixel value 183 the outputs are now NOT CAT: 20 and CAT: 80]


Neural Network Learning

● Adjusting these weights is how the network learns

● Real-life networks may have millions of weights spread over many layers

● This process allows the network to learn complex behaviors and, we hope, an ability to generalize concepts beyond what it was explicitly taught
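The compare-then-adjust loop from the last few slides can be sketched with a single artificial neuron. This is a simplified stand-in, not a full multi-layer network: the pixel values, starting weights, and the `predict` and `update` helpers are all hypothetical, and the adjustment rule is plain gradient descent on one example.

```python
import math

def predict(weights, pixels):
    """Squash the weighted sum into a 0..1 'chance it's a cat'."""
    total = sum(w * p for w, p in zip(weights, pixels))
    return 1.0 / (1.0 + math.exp(-total))

def update(weights, pixels, expected, learning_rate=0.1):
    """Nudge each weight to shrink the gap between actual and expected output."""
    error = predict(weights, pixels) - expected
    return [w - learning_rate * error * p for w, p in zip(weights, pixels)]

pixels = [0.72, 0.09, 0.31]   # hypothetical pixel values, scaled to 0..1
weights = [0.5, -0.2, 0.1]    # hypothetical starting weights

before = predict(weights, pixels)
for _ in range(100):
    # expected=1.0 means "yes, it's a cat"
    weights = update(weights, pixels, expected=1.0)
after = predict(weights, pixels)
print(before, after)
```

Each pass moves the output a little closer to the expected answer, so `after` ends up higher than `before`. A real network does the same thing across millions of weights and many examples.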


Neural Network Topology

● There are multiple ways of connecting the nodes in a neural network

● Many of these designs seek to reduce the number of weights you need and to combat a central problem of machine learning: overfitting

● Overfitting means your network performs great so long as it’s working with data it’s already seen. But it fails miserably when it needs to generalize to data it hasn’t seen
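To make overfitting concrete, here is a deliberately exaggerated sketch. The task (is a number above 50?), the `memorizer` and `generalizer` models, and all other names are invented for illustration. The memorizer is an extreme caricature of an overfit network: it stores every training example exactly and has nothing useful to say about anything else.

```python
import random

rng = random.Random(0)  # fixed seed so the run is repeatable

def make_data(n):
    """Hypothetical task: a number is labeled 1 if it is above 50, else 0."""
    return [(x, int(x > 50)) for x in (rng.uniform(0, 100) for _ in range(n))]

train_set = make_data(200)
test_set = make_data(200)

# Overfit "model": memorizes every training example, guesses 0 otherwise.
memory = dict(train_set)

def memorizer(x):
    return memory.get(x, 0)

def mean(values):
    return sum(values) / len(values)

# Simpler model: one threshold halfway between the two class averages.
threshold = (mean([x for x, y in train_set if y == 0]) +
             mean([x for x, y in train_set if y == 1])) / 2

def generalizer(x):
    return int(x > threshold)

def accuracy(model, data):
    """Fraction of examples the model labels correctly."""
    return sum(model(x) == y for x, y in data) / len(data)

print("memorizer  ", accuracy(memorizer, train_set), accuracy(memorizer, test_set))
print("generalizer", accuracy(generalizer, train_set), accuracy(generalizer, test_set))
```

The memorizer is perfect on the data it has already seen and close to a coin flip on new data, while the one-number model does well on both.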


Neural Networks

● Recently, a type called convolutional neural networks has been achieving amazing results, particularly for problems that involve classifying images or video

● Moreover, when you incorporate many layers (5, 6, 7, and more), the power of these networks is astounding

● This is where the phrase deep learning comes from, since these networks have many layers
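The core operation of a convolutional layer can be sketched as sliding a small grid of weights (a kernel) across the image. The image, the kernel, and the `convolve2d` helper below are hypothetical, and like most deep learning libraries this sketch does not flip the kernel, so strictly speaking it computes a cross-correlation.

```python
def convolve2d(image, kernel):
    """Slide the kernel over the image, taking a weighted sum at each position."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(image[r + i][c + j] * kernel[i][j]
                for i in range(kh) for j in range(kw))
            for c in range(out_w)
        ]
        for r in range(out_h)
    ]

# Hypothetical 4x4 image: dark on the left, bright on the right.
image = [
    [0, 0, 9, 9],
    [0, 0, 9, 9],
    [0, 0, 9, 9],
    [0, 0, 9, 9],
]

# A 2x2 kernel that responds where brightness increases from left to right.
edge_kernel = [
    [-1, 1],
    [-1, 1],
]

feature_map = convolve2d(image, edge_kernel)
for row in feature_map:
    print(row)   # each row is [0, 18, 0]: the kernel fires only at the edge
```

The same four weights are reused at every position, which is one way this kind of network keeps the number of weights down. In a trained network the kernel values are learned rather than written by hand.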


Stanford’s Image Classifier

● Here’s a deep convolutional neural network in action from a 2014 competition!

● This network was trained on 1.2 million images, each labeled with one of 1,000 categories

● It was then tested on images it had not seen before…

● And it achieved an error rate of only 5.1% compared with how humans would classify the images

Check it out!


Text Generation

● Another type of neural network (called a recurrent neural network) is great for sequential problems, like predicting the next word in a sentence: “There are so many clouds in the ____.”

● A fun trick with these is to train them on a body of text (like the Bible or the complete works of Shakespeare) and see what they spit out...
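To get a feel for "train on text, then see what it spits out", here is a much simpler stand-in. To be clear, this is not a recurrent neural network: it is a word-pair (bigram) lookup table that only remembers which word followed which. The corpus and the `train_bigrams` and `generate` helpers are made up for illustration.

```python
import random
from collections import defaultdict

def train_bigrams(text):
    """Record which words were seen following each word."""
    followers = defaultdict(list)
    words = text.split()
    for current, nxt in zip(words, words[1:]):
        followers[current].append(nxt)
    return followers

def generate(followers, start, length, rng):
    """Repeatedly sample a next word given only the current word."""
    words = [start]
    for _ in range(length - 1):
        options = followers.get(words[-1])
        if not options:
            break  # dead end: this word was never followed by anything
        words.append(rng.choice(options))
    return " ".join(words)

corpus = "the clouds in the sky drift over the hills and the sky turns dark"
bigrams = train_bigrams(corpus)
print(generate(bigrams, "the", 8, random.Random(0)))
```

A recurrent network plays the same game of predicting what comes next, but it carries a learned memory of everything it has read so far instead of looking only at the previous word.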


Computer-Generated Bible and Shakespeare Verses

● 1 Chronicles 4:7 Then came them out of the house of brass; and in the midst is to him, and was done with him with the new moon: for in the city of Jeshua ye shall put him speed, as the horn of me plagued among them that hath need.

● Second Senator: They are away this miseries, produced upon my soul, Breaking and strongly should be buried, when I perish The earth and thoughts of many states.

Source: http://karpathy.github.io/2015/05/21/rnn-effectiveness/


Image Captioning

● An even more challenging machine learning task is automatically generating captions for images

● This often combines deep convolutional networks with recurrent neural networks

● Here’s an example of Google’s work on this subject...


Image Captioning

Source: https://research.googleblog.com/2014/11/a-picture-is-worth-thousand-coherent.html


Image Captioning

This is really starting to approach how our own minds seem to learn...


Unsupervised Learning

● Most examples so far have been of Supervised Learning, where the machine is trained on a bunch of human-labeled examples; Unsupervised Learning is where the human does not provide any labels

● Reinforcement Learning, usually treated as a category of its own, likewise needs no labeled examples: it simply provides an environment for the machine to play in, and the machine is given rewards and penalties based on its actions

● It’s up to the machine to figure out the best strategy...


Learning To Play Video Games

● This is how we can teach a machine to play video games:

○ The score is the reward

○ The machine gets to press any buttons it wants

Here’s a video demonstrating Google’s Atari project (from 9:25)


How Reinforcement Learning Works

● A neural network is at the heart of this kind of Reinforcement Learning

● For video games, the input is the screen itself at each frame, and the output is an estimate of the value of each possible move (right, up, jump, etc.)

● The machine records its experiences at each point in time: the screen, the action it took, the reward it received, and the resulting screen afterwards
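The record-keeping in the last bullet can be sketched as a fixed-size memory of experiences. The `Experience` and `ReplayMemory` names are hypothetical, the screens are just strings here rather than real frames, and drawing random samples to learn from is an assumption based on how this style of training is commonly described.

```python
import random
from collections import deque, namedtuple

# One remembered moment: what was seen, what was done, and what happened next.
Experience = namedtuple("Experience", ["screen", "action", "reward", "next_screen"])

class ReplayMemory:
    def __init__(self, capacity):
        # A bounded deque silently drops the oldest experience when full.
        self.buffer = deque(maxlen=capacity)

    def record(self, screen, action, reward, next_screen):
        self.buffer.append(Experience(screen, action, reward, next_screen))

    def sample(self, batch_size, rng=random):
        """Pick random past experiences to learn from."""
        return rng.sample(list(self.buffer), min(batch_size, len(self.buffer)))

memory = ReplayMemory(capacity=3)
for frame in range(5):
    memory.record(f"screen{frame}", "right", 1, f"screen{frame + 1}")

print(len(memory.buffer))        # only the 3 most recent experiences remain
print(memory.buffer[0].screen)   # the oldest one still remembered
```

The `capacity` setting plays the role of a memory size: once the memory is full, the oldest experiences are forgotten.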


How Reinforcement Learning Works (cont.)

● The machine then compares its prediction of the reward it would get for a given screen and a given move with the actual result it received

● The network’s weights are adjusted to bring the two closer

● Rinse and repeat
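Here is a minimal sketch of that predict, compare, and adjust loop. As a simplification, it swaps the neural network for a plain lookup table of value estimates (tabular Q-learning), so "adjusting the weights" becomes "adjusting a table entry". The screen names, the `q_update` helper, and the parameter values are hypothetical.

```python
from collections import defaultdict

def q_update(q, screen, action, reward, next_screen, actions,
             learning_rate=0.1, discount=0.9):
    """Move the predicted value of (screen, action) toward the observed result."""
    predicted = q[(screen, action)]
    best_next = max(q[(next_screen, a)] for a in actions)
    # Actual result: the reward received plus the best value expected afterwards.
    target = reward + discount * best_next
    q[(screen, action)] = predicted + learning_rate * (target - predicted)
    return q[(screen, action)]

ACTIONS = ["left", "right", "jump"]
q = defaultdict(float)   # every value estimate starts at 0

# Replay the same experience many times: pressing "right" at "start" earns 1 point.
for _ in range(50):
    q_update(q, "start", "right", 1.0, "goal", ACTIONS)

print(q[("start", "right")])   # creeps toward 1.0, the reward actually received
print(q[("start", "left")])    # never tried, so still 0.0
```

Each update closes only a fraction of the gap between prediction and result, so the estimate approaches the true reward gradually rather than jumping to it.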


Reinforcement Learning

● Many concepts of Reinforcement Learning are analogous to how our own minds work:

○ Learning Rate: how fast the network should adapt to new information

○ Explore vs. Exploit: how much to try new things versus simply maxing out the best strategy you’ve found so far
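One common way to handle the explore-versus-exploit trade-off is an "epsilon-greedy" rule, sketched below. The technique name is standard, but the value estimates, the `choose_action` helper, and the choice of 0.2 for epsilon are assumptions made for this example.

```python
import random

def choose_action(values, epsilon, rng=random):
    """With probability epsilon try something random; otherwise take the best move."""
    if rng.random() < epsilon:
        return rng.choice(list(values))      # explore
    return max(values, key=values.get)       # exploit

# Hypothetical value estimates for each button.
estimates = {"left": 0.1, "right": 0.7, "jump": 0.4}

rng = random.Random(42)
picks = [choose_action(estimates, epsilon=0.2, rng=rng) for _ in range(1000)]
print(picks.count("right") / len(picks))
```

With epsilon at 0.2, the machine mostly presses "right" (its current best guess) but still spends a share of its time trying the other buttons, in case its estimates are wrong.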


Reinforcement Learning

● Many concepts of Reinforcement Learning are analogous to how our own minds work:

○ Memory Size: how long should we maintain our memory of past experiences

○ Discount Rate: how much should we discount future rewards over immediate rewards
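The discount rate is easy to see with a little arithmetic. In this sketch, `discounted_return` is a hypothetical helper and the reward sequence is made up: nothing happens for two steps, then 10 points arrive.

```python
def discounted_return(rewards, discount):
    """Total value of a reward sequence, shrinking each step into the future."""
    total = 0.0
    # Work backwards so each earlier step discounts everything after it.
    for reward in reversed(rewards):
        total = reward + discount * total
    return total

rewards = [0, 0, 10]   # hypothetical: the payoff arrives two steps from now

print(discounted_return(rewards, discount=1.0))   # no discounting
print(discounted_return(rewards, discount=0.9))   # mild discounting
print(discounted_return(rewards, discount=0.5))   # heavy discounting
```

With no discounting the delayed 10 points are worth the full 10; at 0.9 they are worth about 8.1, and at 0.5 only 2.5. A low discount rate makes the machine short-sighted, and a high one makes it willing to wait.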


Super Mario Bros.

● My own project was to apply Google’s methods to play Super Mario Bros.

Here’s how it started…

And here’s how it was doing after about 72 hours...


The State Of The Art

● The next step for Mario: why run a single game when you can run eight?


The State Of The Art

● Advances in machine learning are happening extremely fast!

○ More powerful machines

○ The proliferation of open-source tools

○ The availability of tasks (like video games) we can use to measure our progress


Where To Learn More

● Google Atari Project, and the paper in Nature

● My fork of this project to play Super Mario Bros.

● A text generator using recurrent neural nets

● The latest-and-greatest A3C algorithm for training Atari

● The latest (free) tools of machine learning: Theano, Torch, TensorFlow, and Chainer