NTC TensorFlow Deep Learning Quick-Start Class, Part 3: Computer Vision Applications
![Page 1: NTC_TENSORFLOW深度學習快速上手班_Part3_電腦視覺應用](https://reader034.vdocuments.mx/reader034/viewer/2022051709/5878c9281a28ab26728b696d/html5/thumbnails/1.jpg)
TensorFlow Deep Learning Quick-Start Class
Part 3: Computer Vision Applications
By Mark Chang
![Page 2: NTC_TENSORFLOW深度學習快速上手班_Part3_電腦視覺應用](https://reader034.vdocuments.mx/reader034/viewer/2022051709/5878c9281a28ab26728b696d/html5/thumbnails/2.jpg)
• Introduction to computer vision
• Model selection and parameter tuning
• Image recognition in practice
![Page 3: NTC_TENSORFLOW深度學習快速上手班_Part3_電腦視覺應用](https://reader034.vdocuments.mx/reader034/viewer/2022051709/5878c9281a28ab26728b696d/html5/thumbnails/3.jpg)
Introduction to Computer Vision
![Page 4: NTC_TENSORFLOW深度學習快速上手班_Part3_電腦視覺應用](https://reader034.vdocuments.mx/reader034/viewer/2022051709/5878c9281a28ab26728b696d/html5/thumbnails/4.jpg)
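Computer Vision
• Computer vision is the science of making machines "see".
• It uses computers in place of the human eye to perform machine vision tasks such as recognizing, tracking, and measuring targets, and then processes the images further.
• https://zh.wikipedia.org/wiki/%E8%AE%A1%E7%AE%97%E6%9C%BA%E8%A7%86%E8%A7%89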
![Page 5: NTC_TENSORFLOW深度學習快速上手班_Part3_電腦視覺應用](https://reader034.vdocuments.mx/reader034/viewer/2022051709/5878c9281a28ab26728b696d/html5/thumbnails/5.jpg)
Image Recognition
http://www.cs.toronto.edu/~fritz/absps/imagenet.pdf
![Page 6: NTC_TENSORFLOW深度學習快速上手班_Part3_電腦視覺應用](https://reader034.vdocuments.mx/reader034/viewer/2022051709/5878c9281a28ab26728b696d/html5/thumbnails/6.jpg)
Object Detection
http://papers.nips.cc/paper/5207-deep-neural-networks-for-object-detection.pdf
![Page 7: NTC_TENSORFLOW深度學習快速上手班_Part3_電腦視覺應用](https://reader034.vdocuments.mx/reader034/viewer/2022051709/5878c9281a28ab26728b696d/html5/thumbnails/7.jpg)
Image Completion
http://arxiv.org/abs/1601.06759
![Page 8: NTC_TENSORFLOW深度學習快速上手班_Part3_電腦視覺應用](https://reader034.vdocuments.mx/reader034/viewer/2022051709/5878c9281a28ab26728b696d/html5/thumbnails/8.jpg)
Artistic Creation
http://arxiv.org/abs/1508.06576
![Page 9: NTC_TENSORFLOW深度學習快速上手班_Part3_電腦視覺應用](https://reader034.vdocuments.mx/reader034/viewer/2022051709/5878c9281a28ab26728b696d/html5/thumbnails/9.jpg)
Convolutional Neural Networks
![Page 10: NTC_TENSORFLOW深度學習快速上手班_Part3_電腦視覺應用](https://reader034.vdocuments.mx/reader034/viewer/2022051709/5878c9281a28ab26728b696d/html5/thumbnails/10.jpg)
Image Recognition
• The same digit may appear in different parts of an image
• Yet all of these images represent the same digit
![Page 11: NTC_TENSORFLOW深度學習快速上手班_Part3_電腦視覺應用](https://reader034.vdocuments.mx/reader034/viewer/2022051709/5878c9281a28ab26728b696d/html5/thumbnails/11.jpg)
Local Connectivity
Each neuron sees only a small patch of the image
![Page 12: NTC_TENSORFLOW深度學習快速上手班_Part3_電腦視覺應用](https://reader034.vdocuments.mx/reader034/viewer/2022051709/5878c9281a28ab26728b696d/html5/thumbnails/12.jpg)
Parameter Sharing
Neurons of the same "type" share the same weights
![Page 13: NTC_TENSORFLOW深度學習快速上手班_Part3_電腦視覺應用](https://reader034.vdocuments.mx/reader034/viewer/2022051709/5878c9281a28ab26728b696d/html5/thumbnails/13.jpg)
Parameter Sharing
Neurons of different "types" have different weights
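To make the effect of local connectivity and parameter sharing concrete, the sketch below compares how many parameters a fully connected layer and a convolutional layer would need to produce an output volume of the same size. The sizes (a 28×28×1 input, 32 filters of 5×5) are borrowed from the MNIST example later in these slides; the helper names `dense_params` and `conv_params` are illustrative and not part of the course code.

```python
def dense_params(in_units, out_units):
    """Weights plus biases when every output unit sees every input unit."""
    return in_units * out_units + out_units


def conv_params(filter_h, filter_w, in_channels, out_channels):
    """Weights plus biases when each filter is shared across all positions."""
    return filter_h * filter_w * in_channels * out_channels + out_channels


# A 28x28x1 image mapped to a 28x28x32 output volume.
in_units = 28 * 28 * 1
out_units = 28 * 28 * 32

fully_connected = dense_params(in_units, out_units)
convolutional = conv_params(5, 5, 1, 32)

print(fully_connected)  # 19694080
print(convolutional)    # 832
```

Because each of the 32 filter "types" reuses its 5×5 weights at every image position, the convolutional layer needs only a few hundred parameters.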
![Page 14: NTC_TENSORFLOW深度學習快速上手班_Part3_電腦視覺應用](https://reader034.vdocuments.mx/reader034/viewer/2022051709/5878c9281a28ab26728b696d/html5/thumbnails/14.jpg)
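Convolutional Neural Networks • Convolutional Layer
[Figure: an input volume labeled width, height, and depth is connected through weights to an output volume labeled width and depth; the weights are marked as shared.]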
![Page 15: NTC_TENSORFLOW深度學習快速上手班_Part3_電腦視覺應用](https://reader034.vdocuments.mx/reader034/viewer/2022051709/5878c9281a28ab26728b696d/html5/thumbnails/15.jpg)
Convolutional Neural Networks
• Stride
• Padding
[Figure: four panels illustrating Stride = 1, Stride = 2, Padding = 0, and Padding = 1.]
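A short sketch of how stride and padding determine the output size along one dimension, using the standard relation floor((input − filter + 2·padding) / stride) + 1. The slide does not state the input or filter sizes in its figure, so the 7-wide input and 3-wide filter below are assumptions chosen only for illustration, as is the function name.

```python
def conv_output_size(input_size, filter_size, stride=1, padding=0):
    """Number of output positions along one dimension of a convolution."""
    return (input_size - filter_size + 2 * padding) // stride + 1


# Hypothetical sizes: a 7-wide input and a 3-wide filter.
for stride, padding in [(1, 0), (2, 0), (1, 1), (2, 1)]:
    size = conv_output_size(7, 3, stride, padding)
    print("stride=%d padding=%d -> output size %d" % (stride, padding, size))
```

A larger stride shrinks the output, while zero padding lets the filter reach the borders and so enlarges it.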
![Page 16: NTC_TENSORFLOW深度學習快速上手班_Part3_電腦視覺應用](https://reader034.vdocuments.mx/reader034/viewer/2022051709/5878c9281a28ab26728b696d/html5/thumbnails/16.jpg)
Visual Cognition
http://www.nature.com/neuro/journal/v8/n8/images/nn0805-975-F1.jpg
![Page 17: NTC_TENSORFLOW深度學習快速上手班_Part3_電腦視覺應用](https://reader034.vdocuments.mx/reader034/viewer/2022051709/5878c9281a28ab26728b696d/html5/thumbnails/17.jpg)
Feature Extraction
![Page 18: NTC_TENSORFLOW深度學習快速上手班_Part3_電腦視覺應用](https://reader034.vdocuments.mx/reader034/viewer/2022051709/5878c9281a28ab26728b696d/html5/thumbnails/18.jpg)
Convolutional Neural Networks • Pooling Layer
Input (4×4, depth = 1), pooled over 2×2 windows with no overlap, no padding, and no weights:

    1 3 2 4
    5 7 6 8
    0 0 4 4
    6 6 0 0

Maximum Pooling:

    7 8
    6 4

Average Pooling:

    4 5
    3 2
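The pooling example on this slide can be reproduced with a few lines of NumPy. This is a minimal sketch that assumes non-overlapping 2×2 windows on a single-channel input whose height and width are even; `pool_2x2` is an illustrative helper, not a TensorFlow function.

```python
import numpy as np


def pool_2x2(x, reduce_fn):
    """Apply reduce_fn over non-overlapping 2x2 windows of a 2-D array."""
    h, w = x.shape
    # Axes become (block row, row within block, block column, column within block).
    blocks = x.reshape(h // 2, 2, w // 2, 2)
    return reduce_fn(blocks, axis=(1, 3))


x = np.array([[1, 3, 2, 4],
              [5, 7, 6, 8],
              [0, 0, 4, 4],
              [6, 6, 0, 0]], dtype=np.float32)

max_pooled = pool_2x2(x, np.max)
avg_pooled = pool_2x2(x, np.mean)

print(max_pooled)  # rows: [7, 8] and [6, 4]
print(avg_pooled)  # rows: [4, 5] and [3, 2]
```

Neither operation has trainable weights; each output value depends only on the four inputs inside its window.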
![Page 19: NTC_TENSORFLOW深度學習快速上手班_Part3_電腦視覺應用](https://reader034.vdocuments.mx/reader034/viewer/2022051709/5878c9281a28ab26728b696d/html5/thumbnails/19.jpg)
Convolutional Neural Networks
[Figure: an Input Layer followed by two Convolutional Layer and Pooling Layer pairs, with the Receptive Fields marked.]
![Page 20: NTC_TENSORFLOW深度學習快速上手班_Part3_電腦視覺應用](https://reader034.vdocuments.mx/reader034/viewer/2022051709/5878c9281a28ab26728b696d/html5/thumbnails/20.jpg)
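Convolutional Neural Networks
[Figure: an Input Image enters the Input Layer; a Convolutional Layer with Receptive Fields (sizes shown only in the figure) produces Filter Responses; a Max-pooling Layer with Width = 3, Height = 3 produces further Filter Responses.]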
![Page 21: NTC_TENSORFLOW深度學習快速上手班_Part3_電腦視覺應用](https://reader034.vdocuments.mx/reader034/viewer/2022051709/5878c9281a28ab26728b696d/html5/thumbnails/21.jpg)
Image Recognition in Practice
![Page 22: NTC_TENSORFLOW深度學習快速上手班_Part3_電腦視覺應用](https://reader034.vdocuments.mx/reader034/viewer/2022051709/5878c9281a28ab26728b696d/html5/thumbnails/22.jpg)
Convolutional Neural Network Implementation
https://github.com/ckmarkoh/ntc_deeplearning_tensorflow/blob/master/sec3/convnet.ipynb
![Page 23: NTC_TENSORFLOW深度學習快速上手班_Part3_電腦視覺應用](https://reader034.vdocuments.mx/reader034/viewer/2022051709/5878c9281a28ab26728b696d/html5/thumbnails/23.jpg)
MNIST
• Digit recognition
• Multi-class classification: 0 to 9
https://www.tensorflow.org/versions/r0.7/images/MNIST.png
![Page 24: NTC_TENSORFLOW深度學習快速上手班_Part3_電腦視覺應用](https://reader034.vdocuments.mx/reader034/viewer/2022051709/5878c9281a28ab26728b696d/html5/thumbnails/24.jpg)
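Create Variables & Operators

    import tensorflow as tf

    def weight_variable(shape):
        # Small random initial weights drawn from a truncated normal distribution.
        return tf.Variable(tf.truncated_normal(shape, stddev=0.1))

    def bias_variable(shape):
        # Slightly positive initial biases.
        return tf.Variable(tf.constant(0.1, shape=shape))

    def conv2d(x, W):
        # Stride-1 convolution; 'SAME' padding keeps the spatial size.
        return tf.nn.conv2d(x, W, strides=[1, 1, 1, 1], padding='SAME')

    def max_pool_2x2(x):
        # 2x2 max pooling with stride 2 halves the height and width.
        return tf.nn.max_pool(x, ksize=[1, 2, 2, 1], strides=[1, 2, 2, 1], padding='SAME')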
![Page 25: NTC_TENSORFLOW深度學習快速上手班_Part3_電腦視覺應用](https://reader034.vdocuments.mx/reader034/viewer/2022051709/5878c9281a28ab26728b696d/html5/thumbnails/25.jpg)
Computational Graph

    # Inputs: flattened 28x28 images and one-hot labels for the 10 digit classes.
    x_ = tf.placeholder(tf.float32, [None, 784], name="x_")
    y_ = tf.placeholder(tf.float32, [None, 10], name="y_")
    x_image = tf.reshape(x_, [-1,28,28,1])

    # First convolutional layer (32 filters of 5x5) followed by 2x2 max pooling.
    W_conv1 = weight_variable([5, 5, 1, 32])
    b_conv1 = bias_variable([32])
    h_conv1 = tf.nn.relu(conv2d(x_image, W_conv1) + b_conv1)
    h_pool1 = max_pool_2x2(h_conv1)

    # Second convolutional layer (64 filters of 5x5) followed by 2x2 max pooling.
    W_conv2 = weight_variable([5, 5, 32, 64])
    b_conv2 = bias_variable([64])
    h_conv2 = tf.nn.relu(conv2d(h_pool1, W_conv2) + b_conv2)
    h_pool2 = max_pool_2x2(h_conv2)

    # Fully connected layer on the flattened 7x7x64 feature maps.
    W_fc1 = weight_variable([7 * 7 * 64, 1024])
    b_fc1 = bias_variable([1024])
    h_pool2_flat = tf.reshape(h_pool2, [-1, 7*7*64])
    h_fc1 = tf.nn.relu(tf.matmul(h_pool2_flat, W_fc1) + b_fc1)

    # Dropout, controlled by the keep_prob placeholder.
    keep_prob = tf.placeholder(tf.float32)
    h_fc1_drop = tf.nn.dropout(h_fc1, keep_prob)

    # Output layer: softmax over the 10 digit classes.
    W_fc2 = weight_variable([1024, 10])
    b_fc2 = bias_variable([10])
    y = tf.nn.softmax(tf.matmul(h_fc1_drop, W_fc2) + b_fc2)
![Page 26: NTC_TENSORFLOW深度學習快速上手班_Part3_電腦視覺應用](https://reader034.vdocuments.mx/reader034/viewer/2022051709/5878c9281a28ab26728b696d/html5/thumbnails/26.jpg)
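Convolutional Neural Networks
Tensor shapes through the network (n is the batch size):
• x_image: n×28×28×1
• h_conv1: n×28×28×32
• h_pool1: n×14×14×32
• h_conv2: n×14×14×64
• h_pool2: n×7×7×64
• h_fc1: n×1024
• y: n×10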
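The shapes listed on this slide follow from TensorFlow's padding='SAME' rule, under which each spatial dimension of the output equals ceil(input / stride). The sketch below traces them in plain Python without TensorFlow; the helpers `same_output_size`, `conv_same`, and `max_pool_same` are illustrative names, and the batch dimension n is left out.

```python
import math


def same_output_size(input_size, stride):
    """Spatial size under padding='SAME': ceil(input_size / stride)."""
    return int(math.ceil(float(input_size) / stride))


def conv_same(shape, out_channels, stride=1):
    """Shape after a 'SAME' convolution with the given number of filters."""
    height, width, _ = shape
    return (same_output_size(height, stride),
            same_output_size(width, stride),
            out_channels)


def max_pool_same(shape, stride=2):
    """Shape after 'SAME' max pooling; the channel count is unchanged."""
    height, width, channels = shape
    return (same_output_size(height, stride),
            same_output_size(width, stride),
            channels)


# Per-example shapes (height, width, channels).
x_image = (28, 28, 1)
h_conv1 = conv_same(x_image, 32)
h_pool1 = max_pool_same(h_conv1)
h_conv2 = conv_same(h_pool1, 64)
h_pool2 = max_pool_same(h_conv2)
flat_size = h_pool2[0] * h_pool2[1] * h_pool2[2]

for name, shape in [("x_image", x_image), ("h_conv1", h_conv1),
                    ("h_pool1", h_pool1), ("h_conv2", h_conv2),
                    ("h_pool2", h_pool2)]:
    print("%s: %s" % (name, "x".join(str(d) for d in shape)))
print("flattened size: %d" % flat_size)
```

The flattened size of 7 × 7 × 64 = 3136 is the input width of the first fully connected layer.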
![Page 27: NTC_TENSORFLOW深度學習快速上手班_Part3_電腦視覺應用](https://reader034.vdocuments.mx/reader034/viewer/2022051709/5878c9281a28ab26728b696d/html5/thumbnails/27.jpg)
Reshape

    x_image = tf.reshape(x_, [-1,28,28,1])

[Figure: x_, of shape n×784, is reshaped into x_image, of shape n×28×28×1.]
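The same reshape can be tried out in NumPy, which, like tf.reshape, fills the new shape in row-major order and accepts -1 for a dimension to be inferred. This is only a sketch: the batch size of 3 and the synthetic pixel values are assumptions made for illustration.

```python
import numpy as np

n = 3  # hypothetical batch size
x = np.arange(n * 784, dtype=np.float32).reshape(n, 784)

# -1 lets the library infer the batch dimension from the total element count.
x_image = x.reshape(-1, 28, 28, 1)

print(x_image.shape)  # (3, 28, 28, 1)

# Pixel k of a flat row lands at row k // 28, column k % 28 of the image.
k = 1 * 28 + 2
print(x_image[0, 1, 2, 0] == x[0, k])  # True
```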
![Page 28: NTC_TENSORFLOW深度學習快速上手班_Part3_電腦視覺應用](https://reader034.vdocuments.mx/reader034/viewer/2022051709/5878c9281a28ab26728b696d/html5/thumbnails/28.jpg)
Convolutional Layer

    W_conv1 = weight_variable([5, 5, 1, 32])
    b_conv1 = bias_variable([32])

[Figure: W_conv1 is drawn as 32 filters of size 5×5 with input depth 1; b_conv1 holds 32 biases, one per filter.]
![Page 29: NTC_TENSORFLOW深度學習快速上手班_Part3_電腦視覺應用](https://reader034.vdocuments.mx/reader034/viewer/2022051709/5878c9281a28ab26728b696d/html5/thumbnails/29.jpg)
Convolutional Layer

    tf.nn.conv2d(x, W, strides=[1, 1, 1, 1], padding='SAME') + b

[Figure: a 5×5 filter slides over a 28×28 input of depth 1 with strides=1 and padding='SAME', giving a 28×28 output. Dimension order: [ batch, in_height, in_width, in_channels ].]
![Page 30: NTC_TENSORFLOW深度學習快速上手班_Part3_電腦視覺應用](https://reader034.vdocuments.mx/reader034/viewer/2022051709/5878c9281a28ab26728b696d/html5/thumbnails/30.jpg)
Convolutional Layer

    tf.nn.conv2d(x, W, strides=[1, 1, 1, 1], padding='SAME') + b

[Figure: the input of shape n×28×28×1 becomes an output of shape n×28×28×32; height and width stay at 28.]
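To show why a stride-1 convolution with padding='SAME' keeps the 28×28 size, here is a naive NumPy sketch for a single image and a single input channel. It is not how TensorFlow implements the operation; `conv2d_same` is an illustrative helper that assumes odd filter sizes, and the random image and filters stand in for real data and trained weights. Like tf.nn.conv2d, it slides the filter without flipping it.

```python
import numpy as np


def conv2d_same(image, kernel):
    """Stride-1, zero-padded ('SAME') sliding-window product for one 2-D channel."""
    kh, kw = kernel.shape
    # For odd filter sizes, padding (k - 1) / 2 on each side preserves the size.
    ph, pw = (kh - 1) // 2, (kw - 1) // 2
    padded = np.pad(image, ((ph, ph), (pw, pw)), mode="constant")
    out = np.zeros(image.shape, dtype=np.float32)
    for i in range(image.shape[0]):
        for j in range(image.shape[1]):
            out[i, j] = np.sum(padded[i:i + kh, j:j + kw] * kernel)
    return out


rng = np.random.RandomState(0)
image = rng.rand(28, 28).astype(np.float32)              # stand-in for one digit
filters = rng.randn(32, 5, 5).astype(np.float32) * 0.1   # 32 filters of size 5x5

# Stack the 32 filter responses along the last (channel) axis.
responses = np.stack([conv2d_same(image, f) for f in filters], axis=-1)
print(responses.shape)  # (28, 28, 32)
```

Each of the 32 filters yields one 28×28 response map, which is where the output depth of 32 comes from.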
![Page 31: NTC_TENSORFLOW深度學習快速上手班_Part3_電腦視覺應用](https://reader034.vdocuments.mx/reader034/viewer/2022051709/5878c9281a28ab26728b696d/html5/thumbnails/31.jpg)
ReLU

    h_conv1 = tf.nn.relu(conv2d(x_image, W_conv1) + b_conv1)

$$\mathrm{ReLU}(n_{in}) = \begin{cases} n_{in} & \text{if } n_{in} > 0 \\ 0 & \text{otherwise} \end{cases}$$

Input:

    -0.5  0.2  0.3 -0.1
     0.2 -0.3 -0.4 -1.1
     2.1 -2.1  0.1  1.2
     0.2  3.0 -0.3  0.5

Output:

    0    0.2  0.3  0
    0.2  0    0    0
    2.1  0    0.1  1.2
    0.2  3.0  0    0.5
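The ReLU example above can be checked with NumPy, where an element-wise maximum against zero gives the same result as the piecewise definition. This is a minimal sketch; the variable names are illustrative.

```python
import numpy as np

n_in = np.array([[-0.5,  0.2,  0.3, -0.1],
                 [ 0.2, -0.3, -0.4, -1.1],
                 [ 2.1, -2.1,  0.1,  1.2],
                 [ 0.2,  3.0, -0.3,  0.5]])

# Negative entries become 0; positive entries pass through unchanged.
relu_out = np.maximum(n_in, 0.0)
print(relu_out)
```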
![Page 32: NTC_TENSORFLOW深度學習快速上手班_Part3_電腦視覺應用](https://reader034.vdocuments.mx/reader034/viewer/2022051709/5878c9281a28ab26728b696d/html5/thumbnails/32.jpg)
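Pooling Layer

    tf.nn.max_pool(x, ksize=[1, 2, 2, 1], strides=[1, 2, 2, 1], padding='SAME')

[Figure: ksize=[1, 2, 2, 1] drawn as a 1×2×2×1 window, that is, a 2×2 spatial window applied to one example and one channel at a time.]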
![Page 33: NTC_TENSORFLOW深度學習快速上手班_Part3_電腦視覺應用](https://reader034.vdocuments.mx/reader034/viewer/2022051709/5878c9281a28ab26728b696d/html5/thumbnails/33.jpg)
Pooling Layer

    tf.nn.max_pool(x, ksize=[1, 2, 2, 1], strides=[1, 2, 2, 1], padding='SAME')

[Figure: with strides=2 and padding='SAME', a 28×28 input is pooled down to 14×14.]
![Page 34: NTC_TENSORFLOW深度學習快速上手班_Part3_電腦視覺應用](https://reader034.vdocuments.mx/reader034/viewer/2022051709/5878c9281a28ab26728b696d/html5/thumbnails/34.jpg)
Pooling Layer

    h_pool1 = max_pool_2x2(h_conv1)

[Figure: the input of shape n×28×28×32 becomes an output of shape n×14×14×32.]
![Page 35: NTC_TENSORFLOW深度學習快速上手班_Part3_電腦視覺應用](https://reader034.vdocuments.mx/reader034/viewer/2022051709/5878c9281a28ab26728b696d/html5/thumbnails/35.jpg)
Reshape

    h_pool2_flat = tf.reshape(h_pool2, [-1, 7*7*64])

[Figure: h_pool2, of shape n×7×7×64, is flattened into h_pool2_flat, of shape n × 7*7*64.]
![Page 36: NTC_TENSORFLOW深度學習快速上手班_Part3_電腦視覺應用](https://reader034.vdocuments.mx/reader034/viewer/2022051709/5878c9281a28ab26728b696d/html5/thumbnails/36.jpg)
GoogLeNet Image Recognition
https://github.com/ckmarkoh/ntc_deeplearning_tensorflow/blob/master/sec3/googlenet.ipynb
![Page 37: NTC_TENSORFLOW深度學習快速上手班_Part3_電腦視覺應用](https://reader034.vdocuments.mx/reader034/viewer/2022051709/5878c9281a28ab26728b696d/html5/thumbnails/37.jpg)
GoogLeNet
http://www.cs.unc.edu/~wliu/papers/GoogLeNet.pdf
A 22-layer deep network
![Page 38: NTC_TENSORFLOW深度學習快速上手班_Part3_電腦視覺應用](https://reader034.vdocuments.mx/reader034/viewer/2022051709/5878c9281a28ab26728b696d/html5/thumbnails/38.jpg)
Training Data
• ILSVRC 2014 Classification Challenge
  – http://www.image-net.org/challenges/LSVRC/2014/
• Dataset: 1000 categories
  – Training: 1,200,000
  – Validation: 50,000
  – Testing: 100,000
![Page 39: NTC_TENSORFLOW深度學習快速上手班_Part3_電腦視覺應用](https://reader034.vdocuments.mx/reader034/viewer/2022051709/5878c9281a28ab26728b696d/html5/thumbnails/39.jpg)
Inception Module
![Page 40: NTC_TENSORFLOW深度學習快速上手班_Part3_電腦視覺應用](https://reader034.vdocuments.mx/reader034/viewer/2022051709/5878c9281a28ab26728b696d/html5/thumbnails/40.jpg)
Load Computational Graph

    import numpy as np
    import tensorflow as tf

    model_fn = 'tensorflow_inception_graph.pb'
    graph = tf.Graph()
    sess = tf.InteractiveSession(graph=graph)
    # Read the serialized graph definition in binary mode.
    with open(model_fn, 'rb') as f:
        graph_def = tf.GraphDef.FromString(f.read())
    t_input = tf.placeholder(np.float32, name='input')
    imagenet_mean = 139
    # Subtract the mean pixel value and add a batch dimension.
    t_preprocessed = tf.expand_dims(t_input - imagenet_mean, 0)
    tf.import_graph_def(graph_def, {'input': t_preprocessed})
    t_output = graph.get_tensor_by_name("import/output2:0")
![Page 41: NTC_TENSORFLOW深度學習快速上手班_Part3_電腦視覺應用](https://reader034.vdocuments.mx/reader034/viewer/2022051709/5878c9281a28ab26728b696d/html5/thumbnails/41.jpg)
Load Label

    import json

    # Map class indices (stored as strings) to human-readable labels.
    with open("label.json") as f:
        labels = json.load(f)

Excerpt of the label table:

    1: "kit fox, Vulpes macrotis",
    2: "English setter",
    3: "Siberian husky",
    4: "Australian terrier",
    ......
    998: "stole",
    999: "carbonara",
    1000: "dumbbell"
![Page 42: NTC_TENSORFLOW深度學習快速上手班_Part3_電腦視覺應用](https://reader034.vdocuments.mx/reader034/viewer/2022051709/5878c9281a28ab26728b696d/html5/thumbnails/42.jpg)
Run Computational Graph

    import PIL.Image

    def load_image(imgfile):
        # Resize the image to 224x224 and convert it to float32.
        return np.float32(PIL.Image.open(imgfile).resize((224,224)))

    def get_class(image):
        # Look up the label of the highest-scoring output.
        return labels[str(np.argmax(sess.run([t_output], {t_input: load_image(image)})))]

    print(get_class('img/img1.jpg'))

Output:

    leaf beetle, chrysomelid
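The code on this slide reports only the single most likely class. As a possible extension, the sketch below lists the k most probable labels from an output vector using NumPy. It follows the slide's convention of looking labels up by the string form of the index, but the function `top_k_labels`, the four-class output, and the toy label table are hypothetical and stand in for the real network output and label.json.

```python
import numpy as np


def top_k_labels(probabilities, labels, k=5):
    """Return the k most probable (label, probability) pairs, best first."""
    probabilities = np.asarray(probabilities).ravel()
    best = np.argsort(probabilities)[::-1][:k]
    return [(labels[str(i)], float(probabilities[i])) for i in best]


# Hypothetical 4-class output and label table, for illustration only.
toy_labels = {"0": "class a", "1": "class b", "2": "class c", "3": "class d"}
toy_output = [[0.1, 0.6, 0.05, 0.25]]

top2 = top_k_labels(toy_output, toy_labels, k=2)
for label, p in top2:
    print("%s: %.2f" % (label, p))
```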
![Page 43: NTC_TENSORFLOW深度學習快速上手班_Part3_電腦視覺應用](https://reader034.vdocuments.mx/reader034/viewer/2022051709/5878c9281a28ab26728b696d/html5/thumbnails/43.jpg)
Instructor Information
Mark Chang
• Email: ckmarkoh at gmail dot com
• Blog: http://cpmarkchang.logdown.com
• Github: https://github.com/ckmarkoh
• Facebook: https://www.facebook.com/ckmarkoh.chang
• Slideshare: http://www.slideshare.net/ckmarkohchang
• Linkedin: https://www.linkedin.com/pub/mark-chang/85/25b/847