![Page 1: AWS re:Invent 2016: Transforming Industrial Processes with Deep Learning (MAC301)](https://reader033.vdocuments.mx/reader033/viewer/2022052405/586f852b1a28ab54768b4fed/html5/thumbnails/1.jpg)
© 2016, Amazon Web Services, Inc. or its Affiliates. All rights reserved.
Marshall Tappen and Ernesto Gonzalez
Amazon Fulfillment Technologies
November 30, 2016
MAC301
Transforming Industrial
Processes with Deep Learning
![Page 2: AWS re:Invent 2016: Transforming Industrial Processes with Deep Learning (MAC301)](https://reader033.vdocuments.mx/reader033/viewer/2022052405/586f852b1a28ab54768b4fed/html5/thumbnails/2.jpg)
What to Expect from the Session
• Description of how Amazon Fulfillment Technologies has
used computer vision to improve our processes.
• Walk through how we combined deep learning and
traditional computer vision to automate an industrial
process.
• What challenges and opportunities do deep learning
classifiers create?
![Page 3: AWS re:Invent 2016: Transforming Industrial Processes with Deep Learning (MAC301)](https://reader033.vdocuments.mx/reader033/viewer/2022052405/586f852b1a28ab54768b4fed/html5/thumbnails/3.jpg)
Overview of fulfillment process
![Page 4: AWS re:Invent 2016: Transforming Industrial Processes with Deep Learning (MAC301)](https://reader033.vdocuments.mx/reader033/viewer/2022052405/586f852b1a28ab54768b4fed/html5/thumbnails/4.jpg)
One thing you have to understand about
fulfillment centers
Bins can hold anything
![Page 5: AWS re:Invent 2016: Transforming Industrial Processes with Deep Learning (MAC301)](https://reader033.vdocuments.mx/reader033/viewer/2022052405/586f852b1a28ab54768b4fed/html5/thumbnails/5.jpg)
Misplaced inventory “disappears”
Amazon Confidential 5
Associate rearranged inventory when picking items.
![Page 6: AWS re:Invent 2016: Transforming Industrial Processes with Deep Learning (MAC301)](https://reader033.vdocuments.mx/reader033/viewer/2022052405/586f852b1a28ab54768b4fed/html5/thumbnails/6.jpg)
Misplaced inventory “disappears”
We call this an inventory defect.
![Page 7: AWS re:Invent 2016: Transforming Industrial Processes with Deep Learning (MAC301)](https://reader033.vdocuments.mx/reader033/viewer/2022052405/586f852b1a28ab54768b4fed/html5/thumbnails/7.jpg)
Items fall out of pods
![Page 8: AWS re:Invent 2016: Transforming Industrial Processes with Deep Learning (MAC301)](https://reader033.vdocuments.mx/reader033/viewer/2022052405/586f852b1a28ab54768b4fed/html5/thumbnails/8.jpg)
Our solution: use computer vision to locate
inventory defects
![Page 9: AWS re:Invent 2016: Transforming Industrial Processes with Deep Learning (MAC301)](https://reader033.vdocuments.mx/reader033/viewer/2022052405/586f852b1a28ab54768b4fed/html5/thumbnails/9.jpg)
First step: get a physical system to capture
images
Diagram labels: Station, Outbound frame, Inbound frame, Totes and conveyance
![Page 10: AWS re:Invent 2016: Transforming Industrial Processes with Deep Learning (MAC301)](https://reader033.vdocuments.mx/reader033/viewer/2022052405/586f852b1a28ab54768b4fed/html5/thumbnails/10.jpg)
Capture set of images as pod arrives at the station
Diagram labels: Arrival Image Tower, Departure Image Tower, Station
![Page 11: AWS re:Invent 2016: Transforming Industrial Processes with Deep Learning (MAC301)](https://reader033.vdocuments.mx/reader033/viewer/2022052405/586f852b1a28ab54768b4fed/html5/thumbnails/11.jpg)
Associate interacts with pod
Diagram labels: Arrival Image Tower, Departure Image Tower, Station
![Page 12: AWS re:Invent 2016: Transforming Industrial Processes with Deep Learning (MAC301)](https://reader033.vdocuments.mx/reader033/viewer/2022052405/586f852b1a28ab54768b4fed/html5/thumbnails/12.jpg)
Photographed again as pod leaves
Diagram labels: Arrival Image Tower, Departure Image Tower, Station
![Page 13: AWS re:Invent 2016: Transforming Industrial Processes with Deep Learning (MAC301)](https://reader033.vdocuments.mx/reader033/viewer/2022052405/586f852b1a28ab54768b4fed/html5/thumbnails/13.jpg)
General strategy
• We want to take advantage of deep learning.
• The cameras capture images of an entire pod, but we
need data at the bin level.
• We will have a two-step process:
1. Extracting bins from images
2. Analyzing bin images
![Page 14: AWS re:Invent 2016: Transforming Industrial Processes with Deep Learning (MAC301)](https://reader033.vdocuments.mx/reader033/viewer/2022052405/586f852b1a28ab54768b4fed/html5/thumbnails/14.jpg)
Computer vision step 1: pod image to bin
images
![Page 15: AWS re:Invent 2016: Transforming Industrial Processes with Deep Learning (MAC301)](https://reader033.vdocuments.mx/reader033/viewer/2022052405/586f852b1a28ab54768b4fed/html5/thumbnails/15.jpg)
No problem, use 2-D barcodes!
![Page 16: AWS re:Invent 2016: Transforming Industrial Processes with Deep Learning (MAC301)](https://reader033.vdocuments.mx/reader033/viewer/2022052405/586f852b1a28ab54768b4fed/html5/thumbnails/16.jpg)
No problem, use 2-D barcodes!
Bands block the barcodes
![Page 17: AWS re:Invent 2016: Transforming Industrial Processes with Deep Learning (MAC301)](https://reader033.vdocuments.mx/reader033/viewer/2022052405/586f852b1a28ab54768b4fed/html5/thumbnails/17.jpg)
Solution, if we can detect the trays
![Page 18: AWS re:Invent 2016: Transforming Industrial Processes with Deep Learning (MAC301)](https://reader033.vdocuments.mx/reader033/viewer/2022052405/586f852b1a28ab54768b4fed/html5/thumbnails/18.jpg)
And we can detect the sides
![Page 19: AWS re:Invent 2016: Transforming Industrial Processes with Deep Learning (MAC301)](https://reader033.vdocuments.mx/reader033/viewer/2022052405/586f852b1a28ab54768b4fed/html5/thumbnails/19.jpg)
We have a set of points to match with a recipe of the
pod’s geometry
![Page 20: AWS re:Invent 2016: Transforming Industrial Processes with Deep Learning (MAC301)](https://reader033.vdocuments.mx/reader033/viewer/2022052405/586f852b1a28ab54768b4fed/html5/thumbnails/20.jpg)
Map the coordinate system of the database to
the face of the pod in the image
![Page 21: AWS re:Invent 2016: Transforming Industrial Processes with Deep Learning (MAC301)](https://reader033.vdocuments.mx/reader033/viewer/2022052405/586f852b1a28ab54768b4fed/html5/thumbnails/21.jpg)
Detecting the side of a pod: downsample image
and convert to grayscale
2046 x 2046 image downsampled to a 512 x 512 image
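A minimal NumPy sketch of the grayscale-and-downsample step. The luma weights, the block-averaging method, and the integer factor of 4 are assumptions for illustration; the talk does not say how the resizing was done, and the synthetic image here is far smaller than the one on the slide.

```python
import numpy as np

def to_grayscale(rgb):
    """Convert an H x W x 3 RGB array to float grayscale (BT.601 luma weights, an assumption)."""
    weights = np.array([0.299, 0.587, 0.114])
    return rgb.astype(np.float64) @ weights

def downsample(gray, factor):
    """Shrink a 2-D image by an integer factor by averaging each factor x factor block."""
    h, w = gray.shape
    h, w = h - h % factor, w - w % factor  # crop so both sides divide evenly
    blocks = gray[:h, :w].reshape(h // factor, factor, w // factor, factor)
    return blocks.mean(axis=(1, 3))

# Synthetic stand-in for a pod photograph (hypothetical data).
rgb = np.random.default_rng(0).integers(0, 256, size=(64, 64, 3), dtype=np.uint8)
small = downsample(to_grayscale(rgb), 4)
print(small.shape)
```

Working on a smaller grayscale image keeps the later template-matching step cheap, since its cost grows with the number of pixels.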
![Page 22: AWS re:Invent 2016: Transforming Industrial Processes with Deep Learning (MAC301)](https://reader033.vdocuments.mx/reader033/viewer/2022052405/586f852b1a28ab54768b4fed/html5/thumbnails/22.jpg)
Correlate* with left rail template
Filter
* In practice, we use normalized cross-correlation
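Normalized cross-correlation can be sketched directly in NumPy as below. This is an illustrative brute-force version, not the talk's implementation; the function name and the synthetic data are assumptions. In OpenCV, `cv2.matchTemplate` with `cv2.TM_CCOEFF_NORMED` computes the same kind of score map much faster.

```python
import numpy as np

def normalized_cross_correlation(image, template):
    """Return a map of NCC scores in [-1, 1], one per valid template position."""
    th, tw = template.shape
    t = template - template.mean()
    t_norm = np.sqrt((t ** 2).sum())
    out_h = image.shape[0] - th + 1
    out_w = image.shape[1] - tw + 1
    scores = np.zeros((out_h, out_w))
    for y in range(out_h):
        for x in range(out_w):
            patch = image[y:y + th, x:x + tw]
            p = patch - patch.mean()          # remove local brightness offset
            denom = np.sqrt((p ** 2).sum()) * t_norm
            if denom > 0:                     # flat patches keep a score of 0
                scores[y, x] = (p * t).sum() / denom
    return scores

# Hypothetical data: cut a "rail template" out of a random image and find it again.
rng = np.random.default_rng(0)
image = rng.random((20, 30))
template = image[5:9, 7:13].copy()
scores = normalized_cross_correlation(image, template)
best = np.unravel_index(np.argmax(scores), scores.shape)
print(best)
```

Because each patch is mean-subtracted and scaled by its own norm, the score does not change when the image is uniformly brightened or its contrast is scaled, which is the usual reason to prefer it over plain correlation.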
![Page 23: AWS re:Invent 2016: Transforming Industrial Processes with Deep Learning (MAC301)](https://reader033.vdocuments.mx/reader033/viewer/2022052405/586f852b1a28ab54768b4fed/html5/thumbnails/23.jpg)
Threshold
![Page 24: AWS re:Invent 2016: Transforming Industrial Processes with Deep Learning (MAC301)](https://reader033.vdocuments.mx/reader033/viewer/2022052405/586f852b1a28ab54768b4fed/html5/thumbnails/24.jpg)
Fit a line (similar process for the other side)
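The threshold and line-fitting steps could look roughly like this. It is a sketch under assumptions: the threshold value, the least-squares fit, and the choice to model the rail as x = slope * y + intercept (reasonable for a near-vertical rail) are not taken from the talk.

```python
import numpy as np

def fit_rail_line(scores, threshold):
    """Threshold a correlation map and fit x = slope * y + intercept to the surviving pixels."""
    ys, xs = np.nonzero(scores > threshold)
    if ys.size < 2:
        raise ValueError("not enough pixels above threshold to fit a line")
    slope, intercept = np.polyfit(ys, xs, deg=1)  # least-squares straight line
    return slope, intercept

# Synthetic response map (hypothetical): strong responses along a slightly tilted rail.
scores = np.zeros((50, 50))
for y in range(50):
    scores[y, int(round(0.1 * y + 10))] = 0.9

slope, intercept = fit_rail_line(scores, threshold=0.5)
print(slope, intercept)
```

Parameterizing x as a function of y avoids the near-infinite slope a vertical rail would have in the usual y = m * x + b form.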
![Page 25: AWS re:Invent 2016: Transforming Industrial Processes with Deep Learning (MAC301)](https://reader033.vdocuments.mx/reader033/viewer/2022052405/586f852b1a28ab54768b4fed/html5/thumbnails/25.jpg)
We can detect trays in the same way
![Page 26: AWS re:Invent 2016: Transforming Industrial Processes with Deep Learning (MAC301)](https://reader033.vdocuments.mx/reader033/viewer/2022052405/586f852b1a28ab54768b4fed/html5/thumbnails/26.jpg)
We can detect trays in the same way
Now we have locations to tie the virtual template to the image!
![Page 27: AWS re:Invent 2016: Transforming Industrial Processes with Deep Learning (MAC301)](https://reader033.vdocuments.mx/reader033/viewer/2022052405/586f852b1a28ab54768b4fed/html5/thumbnails/27.jpg)
Transformation between image and pod
physical coordinates is called a homography
We can verify that it works by calculating the boundary of each bin in the image and coloring it in.
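As an illustration of the homography step, the sketch below estimates the 3 x 3 matrix from four point correspondences with the direct linear transform and then maps a bin's corners from pod coordinates into the image. All coordinates and function names are made up for the example; OpenCV's `cv2.findHomography` is the usual library route.

```python
import numpy as np

def estimate_homography(src, dst):
    """Direct linear transform: find H with dst ~ H @ src from >= 4 point pairs."""
    rows = []
    for (x, y), (u, v) in zip(src, dst):
        rows.append([-x, -y, -1, 0, 0, 0, u * x, u * y, u])
        rows.append([0, 0, 0, -x, -y, -1, v * x, v * y, v])
    # The solution is the right singular vector with the smallest singular value.
    _, _, vt = np.linalg.svd(np.asarray(rows, dtype=float))
    h = vt[-1].reshape(3, 3)
    return h / h[2, 2]

def apply_homography(h, points):
    """Map N x 2 points through H, dividing out the projective scale."""
    pts = np.hstack([np.asarray(points, dtype=float), np.ones((len(points), 1))])
    mapped = pts @ h.T
    return mapped[:, :2] / mapped[:, 2:3]

# Hypothetical pod-face corners in physical units and their detected pixel locations.
pod_corners = [(0, 0), (100, 0), (100, 200), (0, 200)]
image_corners = [(50, 40), (460, 60), (440, 480), (70, 470)]
H = estimate_homography(pod_corners, image_corners)

# Hypothetical bin rectangle taken from the pod's geometry "recipe".
bin_corners = [(0, 50), (50, 50), (50, 100), (0, 100)]
print(apply_homography(H, bin_corners))
```

Once H is known, every bin boundary stored in pod coordinates can be projected into the photograph, which is what makes cropping per-bin images possible.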
![Page 28: AWS re:Invent 2016: Transforming Industrial Processes with Deep Learning (MAC301)](https://reader033.vdocuments.mx/reader033/viewer/2022052405/586f852b1a28ab54768b4fed/html5/thumbnails/28.jpg)
How can we use computer vision?
• Automatic identification of every item?
![Page 29: AWS re:Invent 2016: Transforming Industrial Processes with Deep Learning (MAC301)](https://reader033.vdocuments.mx/reader033/viewer/2022052405/586f852b1a28ab54768b4fed/html5/thumbnails/29.jpg)
How can we use computer vision?
• Automatic identification of every item? (TOO HARD)
• Automatic counting of every item?
![Page 30: AWS re:Invent 2016: Transforming Industrial Processes with Deep Learning (MAC301)](https://reader033.vdocuments.mx/reader033/viewer/2022052405/586f852b1a28ab54768b4fed/html5/thumbnails/30.jpg)
What does computer vision need to tell us?
• Automatic identification of every item? (TOO HARD)
• Automatic counting of every item? (TOO HARD)
![Page 31: AWS re:Invent 2016: Transforming Industrial Processes with Deep Learning (MAC301)](https://reader033.vdocuments.mx/reader033/viewer/2022052405/586f852b1a28ab54768b4fed/html5/thumbnails/31.jpg)
Instead, we can look for changes
Image pair: inbound to the station, outbound from the station
![Page 32: AWS re:Invent 2016: Transforming Industrial Processes with Deep Learning (MAC301)](https://reader033.vdocuments.mx/reader033/viewer/2022052405/586f852b1a28ab54768b4fed/html5/thumbnails/32.jpg)
Our first attempt was with hand-engineered
computer vision
![Page 33: AWS re:Invent 2016: Transforming Industrial Processes with Deep Learning (MAC301)](https://reader033.vdocuments.mx/reader033/viewer/2022052405/586f852b1a28ab54768b4fed/html5/thumbnails/33.jpg)
It’s hard!
Must be robust to items rolling or shuffling inside
the bin, illumination changes, specularity, etc.
![Page 34: AWS re:Invent 2016: Transforming Industrial Processes with Deep Learning (MAC301)](https://reader033.vdocuments.mx/reader033/viewer/2022052405/586f852b1a28ab54768b4fed/html5/thumbnails/34.jpg)
The big insight
• We realized our problem was just binary classification.
• Two images in, one label out.
• Why not try this deep-learning thing?
![Page 35: AWS re:Invent 2016: Transforming Industrial Processes with Deep Learning (MAC301)](https://reader033.vdocuments.mx/reader033/viewer/2022052405/586f852b1a28ab54768b4fed/html5/thumbnails/35.jpg)
We did the simplest thing possible
• Take the first image, convert it to grayscale, and put it in the red channel of a new image
• Take the second image and put it in the blue channel
• Now, we have a single image to pass to the neural network
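The channel-stacking trick above can be sketched in a few lines of NumPy. The function name is hypothetical, and leaving the green channel at zero is an assumption, since the slide does not say what went there. Note that OpenCV stores images in BGR order, so the channel indices would differ there.

```python
import numpy as np

def stack_before_after(before_gray, after_gray):
    """Pack two grayscale images into one 3-channel image (RGB channel order)."""
    if before_gray.shape != after_gray.shape:
        raise ValueError("both bin images must have the same shape")
    combined = np.zeros(before_gray.shape + (3,), dtype=before_gray.dtype)
    combined[..., 0] = before_gray  # red channel: image taken on arrival
    combined[..., 2] = after_gray   # blue channel: image taken on departure
    # Green channel stays zero (assumption; not specified in the talk).
    return combined

# Hypothetical 4 x 4 bin crops.
before = np.full((4, 4), 100, dtype=np.uint8)
after = np.full((4, 4), 180, dtype=np.uint8)
pair = stack_before_after(before, after)
print(pair.shape)
```

The appeal of this encoding is that an off-the-shelf image classifier expecting a 3-channel input can consume the before/after pair without any architectural change.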
![Page 36: AWS re:Invent 2016: Transforming Industrial Processes with Deep Learning (MAC301)](https://reader033.vdocuments.mx/reader033/viewer/2022052405/586f852b1a28ab54768b4fed/html5/thumbnails/36.jpg)
It worked great!
Chart compares: Best Hand-Engineered Model, CIFAR CNN, Krizhevsky’s CNN
![Page 37: AWS re:Invent 2016: Transforming Industrial Processes with Deep Learning (MAC301)](https://reader033.vdocuments.mx/reader033/viewer/2022052405/586f852b1a28ab54768b4fed/html5/thumbnails/37.jpg)
Processing pipeline
Pod Image → Bin Extraction → Bin Images → Defect Detection
![Page 38: AWS re:Invent 2016: Transforming Industrial Processes with Deep Learning (MAC301)](https://reader033.vdocuments.mx/reader033/viewer/2022052405/586f852b1a28ab54768b4fed/html5/thumbnails/38.jpg)
Implementation details
• Implemented in OpenCV in Python
• C++ extensions for some steps
• Neural net uses Caffe
• Trained on G2 instances
• Runs on CPU in FC server room
• Can tolerate latency in our current use-pattern
![Page 39: AWS re:Invent 2016: Transforming Industrial Processes with Deep Learning (MAC301)](https://reader033.vdocuments.mx/reader033/viewer/2022052405/586f852b1a28ab54768b4fed/html5/thumbnails/39.jpg)
Software architecture
Architecture diagram (labels listed as extracted; the connections between boxes are not recoverable from the text):
• Regions: Site Server Room, AWS
• EC2 services: Inventory Event Correlator, VBI Service, Remote Count Website (Defect Detection), Inventory Bin Count Elimination
• Other components: Capture Event Data Router, Bin Extraction Process, Auto Count Process, Local Storage Service, Camera Controller, File Pusher, Barcode Extraction, Edge Device(s), Applications
• Operations: Put Pod Face Images, Put Bin Images, Get Pod Images, Get Bin Image, Get Bin Defect Result, Get Bin Space Available, Get Work for Remote Counting
• Messaging and storage: SNS, SQS, DynamoDB, HTTP POST
![Page 40: AWS re:Invent 2016: Transforming Industrial Processes with Deep Learning (MAC301)](https://reader033.vdocuments.mx/reader033/viewer/2022052405/586f852b1a28ab54768b4fed/html5/thumbnails/40.jpg)
How can we use computer vision?
• Automatic identification of every item? (TOO HARD)
• Automatic counting of every item?
![Page 41: AWS re:Invent 2016: Transforming Industrial Processes with Deep Learning (MAC301)](https://reader033.vdocuments.mx/reader033/viewer/2022052405/586f852b1a28ab54768b4fed/html5/thumbnails/41.jpg)
Could we just count the number of items in the
bin?
• At this point, we have lots of data.
• Some of it has errors from inventory defects, but
networks have proven resilient to this kind of thing.
• Why not just train a network to directly count the items in a bin?
![Page 42: AWS re:Invent 2016: Transforming Industrial Processes with Deep Learning (MAC301)](https://reader033.vdocuments.mx/reader033/viewer/2022052405/586f852b1a28ab54768b4fed/html5/thumbnails/42.jpg)
Using a convolutional neural network
• We used the Caffe implementation of GoogLeNet [1]
[1] Christian Szegedy, Wei Liu, Yangqing Jia, Pierre Sermanet, Scott Reed, Dragomir Anguelov, Dumitru Erhan, Vincent Vanhoucke, and Andrew Rabinovich. Going deeper with convolutions. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2015.
![Page 43: AWS re:Invent 2016: Transforming Industrial Processes with Deep Learning (MAC301)](https://reader033.vdocuments.mx/reader033/viewer/2022052405/586f852b1a28ab54768b4fed/html5/thumbnails/43.jpg)
Maps cleanly onto classification paradigm
• Treat it as a multi-class classification problem
Diagram: neural network producing per-class scores (0.1, 0.2, 0.4, 0.4 on the slide)
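Treating counting as multi-class classification means each class index stands for an item count and the prediction is the most probable class. The sketch below shows only that final step; the raw scores, the number of classes, and the function names are invented for illustration and do not come from the talk's model.

```python
import numpy as np

def softmax(logits):
    """Turn raw network outputs into probabilities that sum to 1."""
    shifted = logits - np.max(logits)  # subtract the max for numerical stability
    exp = np.exp(shifted)
    return exp / exp.sum()

def predict_count(logits):
    """Class index i is read as 'i items in the bin'; return (count, confidence)."""
    probs = softmax(np.asarray(logits, dtype=float))
    count = int(np.argmax(probs))
    return count, float(probs[count])

# Hypothetical raw scores for the classes 0, 1, 2, and 3 items.
count, confidence = predict_count([0.5, 1.0, 3.0, 0.2])
print(count, round(confidence, 3))
```

One consequence of this framing is that the largest countable quantity is fixed by the number of output classes chosen at training time.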
![Page 44: AWS re:Invent 2016: Transforming Industrial Processes with Deep Learning (MAC301)](https://reader033.vdocuments.mx/reader033/viewer/2022052405/586f852b1a28ab54768b4fed/html5/thumbnails/44.jpg)
This saved the project
• Hit the targets we needed
• Eliminated a lot of hardware (no more before/after shots
needed)
• Made the project cost effective
• Here is what we learned:
• Don’t focus on algorithms, focus on DATA
![Page 45: AWS re:Invent 2016: Transforming Industrial Processes with Deep Learning (MAC301)](https://reader033.vdocuments.mx/reader033/viewer/2022052405/586f852b1a28ab54768b4fed/html5/thumbnails/45.jpg)
How else can we use this data?
• We want to find free space in the bin without having to label data.
• We can guess from dimensions of items.
• But where is the space?
![Page 46: AWS re:Invent 2016: Transforming Industrial Processes with Deep Learning (MAC301)](https://reader033.vdocuments.mx/reader033/viewer/2022052405/586f852b1a28ab54768b4fed/html5/thumbnails/46.jpg)
Train model to predict emptiness from an image
Diagram labels: GoogLeNet, Conv (3*3), Avg Pool, Conv, Emptiness score
This is a noisy, probably incorrect estimate!
![Page 47: AWS re:Invent 2016: Transforming Industrial Processes with Deep Learning (MAC301)](https://reader033.vdocuments.mx/reader033/viewer/2022052405/586f852b1a28ab54768b4fed/html5/thumbnails/47.jpg)
But we can use layers in the network to find where the
space actually is!
Diagram labels: GoogLeNet, Conv (3*3), Avg Pool, Conv, emptiness score
1024 channels, 3*3
![Page 48: AWS re:Invent 2016: Transforming Industrial Processes with Deep Learning (MAC301)](https://reader033.vdocuments.mx/reader033/viewer/2022052405/586f852b1a28ab54768b4fed/html5/thumbnails/48.jpg)
Original image | Activation map | Binary map (shown for two examples)
And it works!
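One way to get a spatial map out of a network that ends in average pooling is class activation mapping: weight each feature channel by its contribution to the score and sum before pooling. The sketch below assumes the score is a weighted sum of globally averaged channels, which the slides suggest but do not state. The feature tensor, weights, and threshold are random or made-up stand-ins, and 8 channels are used instead of 1024 to keep the example small.

```python
import numpy as np

def activation_map(features, weights):
    """features: C x H x W conv output; weights: C score weights. Returns an H x W map."""
    return np.tensordot(weights, features, axes=([0], [0]))

def binary_map(act, threshold):
    """Mark locations whose activation exceeds the threshold as 'empty space'."""
    return act > threshold

# Hypothetical stand-ins for the last conv layer's output and the score weights.
rng = np.random.default_rng(0)
features = rng.random((8, 5, 5))
weights = rng.normal(size=8)

act = activation_map(features, weights)
mask = binary_map(act, threshold=act.mean())

# Under the stated assumption, averaging the map reproduces the pooled score.
pooled_score = weights @ features.mean(axis=(1, 2))
print(np.isclose(act.mean(), pooled_score))
```

The identity checked at the end is why the map is meaningful: the scalar score is just the spatial average of the map, so high-valued locations are the ones pushing the score up.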
![Page 49: AWS re:Invent 2016: Transforming Industrial Processes with Deep Learning (MAC301)](https://reader033.vdocuments.mx/reader033/viewer/2022052405/586f852b1a28ab54768b4fed/html5/thumbnails/49.jpg)
We are releasing a dataset
![Page 50: AWS re:Invent 2016: Transforming Industrial Processes with Deep Learning (MAC301)](https://reader033.vdocuments.mx/reader033/viewer/2022052405/586f852b1a28ab54768b4fed/html5/thumbnails/50.jpg)
Takeaways
• We have great pattern recognition machinery now.
• Focus on the data:
• How can you get lots of it?
• What can you get for free?
• How much labeling do you really need?
• Is there a proxy problem?
![Page 51: AWS re:Invent 2016: Transforming Industrial Processes with Deep Learning (MAC301)](https://reader033.vdocuments.mx/reader033/viewer/2022052405/586f852b1a28ab54768b4fed/html5/thumbnails/51.jpg)
Thank you!
![Page 52: AWS re:Invent 2016: Transforming Industrial Processes with Deep Learning (MAC301)](https://reader033.vdocuments.mx/reader033/viewer/2022052405/586f852b1a28ab54768b4fed/html5/thumbnails/52.jpg)
Remember to complete
your evaluations!