convnet report

Post on 12-Sep-2015

  • Introduction to convolutional networks (ConvNets)

    Introduction

    Architecture and construction of ConvNets

    Motivation and the model of a single neuron

    The field of neural networks was originally developed with the goal of modeling biological neural systems, but it has since diverged and become an engineering discipline that delivers good results and shows much promise in machine learning. The basic computational unit of the brain is the neuron. Approximately 86 billion neurons are found in the human nervous system, and they are connected by approximately $10^{14}$ to $10^{15}$ synapses. Figure 1 shows the structure of a biological neuron, and Figure 2 shows its commonly used mathematical model. Each neuron receives input signals from its dendrites and produces an output signal along a single axon. The axon then branches out and connects through synapses to the dendrites of other neurons.

    In the computational model of a neuron, the signals that travel along the axons ($x_0$) interact multiplicatively ($w_0 x_0$) with the dendrites of the other neuron, based on the synaptic strength at that synapse ($w_0$). The synaptic strengths (the weights $w$) are learnable and control the strength and direction of one neuron's influence on another: excitatory (positive weight) or inhibitory (negative weight). In the basic model, the dendrites carry the signals to the cell body, where they are summed. If the sum exceeds a certain threshold, the neuron fires, sending a signal along its axon. In the computational model we assume that the precise timing of the spikes does not matter and that only the firing frequency carries information. Based on this rate-code interpretation, we model the firing rate of the neuron with an activation function $f$, which represents the frequency of the spikes along the axon. Historically, the most common choice of activation function has been the sigmoid function $\sigma$, which takes a real-valued input (the result of the sum) and squashes it into the range between 0 and 1. In other words, each neuron computes the dot product of its inputs and its weights, adds a bias, and applies a non-linearity (the activation function), which in this case is the sigmoid $\sigma(x) = 1/(1 + e^{-x})$.

    A real difficulty with the biological neuron is the complexity of the system: there are many different types of neurons, each with different properties, and the dendrites of a neuron perform complex non-linear computations. A synapse is not just a single weight; each synapse is a complex non-linear dynamical system. The computational model above is therefore only a coarse approximation.

    Figure 1. Structure of a biological neuron

    Figure 2. Mathematical model of a neuron
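    The forward computation of the mathematical neuron model (dot product of inputs and weights, plus a bias, passed through the sigmoid) can be sketched as follows. This is a minimal illustration, not an implementation from the report; the function names and the numeric values of x, w, and b are hypothetical.

```python
import math


def sigmoid(z):
    """Logistic sigmoid: squashes a real number into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def neuron_forward(x, w, b):
    """Forward pass of one neuron: dot product, plus bias, then sigmoid."""
    z = sum(wi * xi for wi, xi in zip(w, x)) + b
    return sigmoid(z)


# Hypothetical inputs, weights and bias, chosen only for illustration.
x = [1.0, -2.0, 0.5]
w = [0.4, 0.3, -0.2]
b = 0.3

activation = neuron_forward(x, w, b)
print(activation)  # a value strictly between 0 and 1
```

    With these values the weighted sum is approximately zero, so the activation sits near the midpoint of the sigmoid's range.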

    Model of a single neuron: a linear classifier

    The mathematical form of the neuron's forward computation has the same structure as the model introduced above. A neuron can respond strongly (activation near 1) or weakly (activation near 0) to particular regions of its input space. Hence, with an appropriate loss function on its output, a single neuron can be turned into a linear classifier. Two possible choices here are the binary Softmax classifier and the binary SVM classifier.
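    As a rough sketch of how a loss function turns one neuron into a binary linear classifier, the two choices named above can be written as losses on the neuron's raw score. The function names, the label conventions (0/1 for Softmax, -1/+1 for SVM), and the margin of 1 are assumptions made for this illustration.

```python
import math


def score(x, w, b):
    """Raw linear score of the neuron before any activation."""
    return sum(wi * xi for wi, xi in zip(w, x)) + b


def logistic_loss(s, y):
    """Binary Softmax (logistic) loss; y is the true label, 0 or 1."""
    p = 1.0 / (1.0 + math.exp(-s))  # sigmoid output read as P(class 1)
    return -math.log(p) if y == 1 else -math.log(1.0 - p)


def hinge_loss(s, y):
    """Binary SVM (hinge) loss; y is the true label, -1 or +1."""
    return max(0.0, 1.0 - y * s)


# Hypothetical example: one input, weight vector and bias.
s = score([1.0, 2.0], [0.5, 0.75], 0.0)  # 2.0
print(logistic_loss(s, 1), hinge_loss(s, 1), hinge_loss(s, -1))
```

    Both losses are small when the score agrees with the label and grow as the score moves to the wrong side of the decision boundary.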

    Commonly used activation functions

    Every activation function (or non-linearity) takes a single number as input and performs a fixed mathematical operation on it. Some frequently used functions are described below.

    Sigmoid

    The sigmoid non-linearity has the form $\sigma(x) = 1/(1 + e^{-x})$. It was used for a long time because of its convenient bounded output range. However, the sigmoid has fallen out of favor and is now rarely used. It has two main drawbacks:

    1. Sigmoids saturate and kill gradients. An undesirable property of the sigmoid neuron is that when its activation saturates at either 0 or 1, the gradient in these regions is almost zero. Recall that during backpropagation this local gradient is multiplied by the gradient of the overall objective with respect to the neuron's output. Therefore, if the local gradient is very small, it effectively kills the gradient: almost no signal flows through the neuron, and its weights are not updated. Likewise, if the weights are too large, most neurons become saturated and the network barely learns.

    2. Sigmoid outputs are not zero-centered. The sigmoid curve is symmetric about the point where y = 0.5, so its output is always greater than 0. This affects the data received by later layers: they always receive positive inputs, so during backpropagation the gradients on the weights w are either all positive or all negative. This can cause undesirable zig-zagging in the weight updates. However, once the gradients are summed over a batch of data, the final weight update can have mixed signs, so this problem is not too severe.
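    Both drawbacks can be checked numerically. The sketch below uses the standard identity $\sigma'(z) = \sigma(z)(1 - \sigma(z))$ for the local gradient; the sample points are arbitrary and the function names are chosen for this illustration.

```python
import math


def sigmoid(z):
    """Logistic sigmoid."""
    return 1.0 / (1.0 + math.exp(-z))


def sigmoid_grad(z):
    """Local gradient of the sigmoid: sigma(z) * (1 - sigma(z))."""
    s = sigmoid(z)
    return s * (1.0 - s)


# Arbitrary sample points: the gradient peaks at z = 0 and
# shrinks toward zero as the activation saturates near 0 or 1.
for z in (-10.0, -2.0, 0.0, 2.0, 10.0):
    print(z, sigmoid(z), sigmoid_grad(z))
```

    The gradient is largest at z = 0, where it equals 0.25, and is below 0.0001 at z = ±10. Every output is positive, which is the non-zero-centered behavior described in the second item.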

    Tanh

    The tanh function is a non-linearity whose output lies in the range [-1, 1]. Like the sigmoid, it saturates, but unlike the sigmoid its output is zero-centered. In practice, tanh is therefore preferred over the sigmoid.

    ReLU (Rectified Linear Unit)

    This function has become popular in recent years. It is defined as $f(x) = \max(0, x)$, a simple threshold at 0. Its advantages are that it works well in practice, it does not saturate for positive inputs, and it is cheaper to compute than tanh or sigmoid. Its drawback is that ReLU units are fragile and can "die" during training. For example, a large gradient flowing through a ReLU neuron can update its weights so that its output is always 0, and the neuron is then never activated again. Because its output is 0, the neurons connected to it receive an input of 0, and the corresponding product wx = 0 contributes nothing to their activation; the gradient through the dead unit is also 0, so its weights stop changing. In practice, the higher the learning rate, the larger the fraction of dead neurons. To reduce this effect, choose a smaller learning rate. Another remedy is to use Leaky ReLU.

    Multi-layer neural network architecture

    Layer-wise organization

    A neural network is modeled as a collection of neurons connected in an acyclic graph: the output of one neuron can become the input of another. Cycles are not allowed, because a cycle would cause an infinite loop as data moves through the network. Neurons are usually organized into distinct layers. An ordinary neural network uses fully connected layers, meaning that neurons in two adjacent layers are connected to each other, while neurons within the same layer share no connections.

    Output layer

    Unlike the other layers, the neurons in the output layer have no activation function. Each neuron in this layer represents one value of the classification result.

    Layers used in a ConvNet

    There are three main types of layers in the ConvNet architecture:

    Convolution layer (CONV). Pooling layer (POOL). Fully connected layer (FC).

  • A simple example of a ConvNet architecture is [INPUT - CONV - RELU - POOL - FC]:

    INPUT [32x32x3]: the input is an image of size 32x32 with 3 color channels (RGB).

    CONV: the convolution layer computes its output from small regions of the whole image. For example, each neuron is connected to a 4x4 region of the image and computes the convolution of that region with the neuron's weights.

    RELU: applies the ReLU activation function to every output of the previous layer.

    POOL: uses downsampling to reduce the size of the data.

    FC: builds the classifier, as in an ordinary fully connected network.

    Figure 3. A model
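    The elementwise step performed by the RELU layer, together with the Leaky ReLU variant mentioned as a remedy for dying units, can be sketched as below. The leak slope alpha = 0.01 is an assumed, commonly used value and not one given in the report; the function names are also chosen for this illustration.

```python
def relu(x):
    """ReLU: threshold at zero."""
    return max(0.0, x)


def leaky_relu(x, alpha=0.01):
    """Leaky ReLU: a small slope (assumed alpha) for negative inputs."""
    return x if x > 0 else alpha * x


def relu_layer(values):
    """Apply ReLU independently to every output of the previous layer."""
    return [relu(v) for v in values]


print(relu_layer([-2.0, -0.5, 0.0, 1.5]))  # [0.0, 0.0, 0.0, 1.5]
print(leaky_relu(-2.0))                    # small negative value, not zero
```

    Because Leaky ReLU keeps a non-zero slope for negative inputs, a unit that lands in the negative region still receives a gradient.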

    Convolution layer

    This layer consists of a set of filters with learnable parameters. Each filter is relatively small compared with the input image. During the computation, each filter (corresponding to one neuron) slides over the image with a stride of S, the number of pixels it moves at each step. At every position, the layer computes the convolution of the input with the filter's weights.

    Local connectivity

    When working with high-dimensional data such as images, connecting a neuron to all of the input data is impractical because the number of weights becomes very large (for example, each neuron would have 32x32 = 1024 weights if fully connected to a single 32x32 image). Instead, each neuron is connected to only a small region of the image. Note that along the depth dimension each neuron is still fully connected.

  • Example: given an image of size 32x32x3 and 5 neurons in the CONV layer, each of size 5x5x3, every neuron has 5 x 5 x 3 = 75 weights (compared with 32 x 32 x 3 = 3072 weights for a fully connected neuron).

    The CONV layer has 3 hyperparameters: the number of neurons, the stride, and the zero-padding.

    Number of neurons (D): each neuron slides over the whole image, so the number of neurons determines the depth of this layer's output. Example: for an image m of size 32x32x3 and 5 neurons with a 5x5x3 filter, the region m[1:5,1:5,0:3] is processed 5 times, once for each of the 5 neurons.

    Stride (S): the number of pixel rows/columns the filter moves each time it slides over the image. If S is large, neighboring regions overlap less; if S is small, they overlap more. The value of S strongly affects the output size of this layer.

    Zero-padding (P): the number of rows/columns of zeros added around the border of the image.

    Figure 4. Example of hyperparameter configurations

    Example: in Figure 4, the input is a vector x = [1, 2, -1, 1, -3] and the filter is f = [1, 0, -1]. Since P = 1, one zero is added at each end of the vector. The first computation uses S = 1 and the second uses S = 2. The number of elements in the output changes markedly: 5 elements with S = 1, but only 3 with S = 2.
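    The example above can be reproduced with a short sketch. The function name conv1d is chosen for this illustration. As is usual in ConvNets, the filter is applied without flipping (strictly speaking, a cross-correlation).

```python
def conv1d(x, f, stride=1, pad=0):
    """Slide filter f over the zero-padded vector x with the given stride."""
    padded = [0] * pad + list(x) + [0] * pad
    n_out = (len(padded) - len(f)) // stride + 1
    out = []
    for i in range(n_out):
        start = i * stride
        window = padded[start:start + len(f)]
        out.append(sum(a * b for a, b in zip(window, f)))
    return out


x = [1, 2, -1, 1, -3]
f = [1, 0, -1]

print(conv1d(x, f, stride=1, pad=1))  # [-2, 2, 1, 2, 1]
print(conv1d(x, f, stride=2, pad=1))  # [-2, 1, 1]
```

    With S = 2 the filter visits every other position, so the output keeps only the first, third, and fifth values of the S = 1 result.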

    In summary, for the Conv layer:

    Input size: $W_1 \times H_1 \times D_1$.

    Hyperparameters:

    Number of neurons (filters): $K$. Filter size: $F \times F$. Stride: $S$. Zero-padding: $P$.

  • Output size of the Conv layer: $W_2 = (W_1 - F + 2P)/S + 1$, $H_2 = (H_1 - F + 2P)/S + 1$, $D_2 = K$.

    Each neuron has $F \cdot F \cdot D_1$ weights. The total number of weights in this layer is $(F \cdot F \cdot D_1) \cdot K$. The $d$-th depth slice of the output volume is the result of the $d$-th filter (neuron). A commonly used configuration is $F = 3$, $S = 1$, $P = 1$.
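    The size and weight-count formulas of the Conv layer can be checked with a small helper. This is a sketch: the function names are hypothetical, biases are not counted (the summary counts weights only), and the check that the filter tiles the input evenly is an added assumption.

```python
def conv_output_shape(w1, h1, d1, k, f, s, p):
    """Output volume (W2, H2, D2) of a Conv layer from the summary formulas."""
    if (w1 - f + 2 * p) % s != 0 or (h1 - f + 2 * p) % s != 0:
        raise ValueError("filter does not tile the padded input evenly")
    w2 = (w1 - f + 2 * p) // s + 1
    h2 = (h1 - f + 2 * p) // s + 1
    return w2, h2, k


def conv_weight_count(d1, k, f):
    """Weights per neuron (F*F*D1) and in the whole layer (F*F*D1*K)."""
    per_neuron = f * f * d1
    return per_neuron, per_neuron * k


# 32x32x3 image, 5 neurons of size 5x5x3, stride 1, no padding.
print(conv_output_shape(32, 32, 3, k=5, f=5, s=1, p=0))  # (28, 28, 5)
print(conv_weight_count(d1=3, k=5, f=5))                 # (75, 375)

# The common configuration F = 3, S = 1, P = 1 keeps width and height.
print(conv_output_shape(32, 32, 3, k=5, f=3, s=1, p=1))  # (32, 32, 5)
```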

    Figure 5. Example of a Conv layer

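    Figure 5 shows an example of a simple Conv layer. The input image m has size 5x5x3. The Conv layer has 2 neurons, $w_0$ and $w_1$, each of size 3x3x3, with S = 2 and P = 1. The output volume has size 3x3x2, since (5 - 3 + 2 x 1)/2 + 1 = 3 along both the width and the height, and the depth equals the number of neurons.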
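    A naive forward pass for this configuration (5x5x3 input, two 3x3x3 filters, S = 2, P = 1) can be sketched with nested lists. The image and filter values below are hypothetical placeholders, not the numbers from Figure 5, and the sketch omits biases.

```python
def conv_forward(image, filters, stride, pad):
    """Naive Conv layer forward pass (no bias, no activation).

    image:   H x W x D nested lists
    filters: K filters, each F x F x D nested lists
    returns: H2 x W2 x K nested lists
    """
    h, w, d = len(image), len(image[0]), len(image[0][0])
    f = len(filters[0])
    # Zero-pad `pad` rows and columns around the border.
    padded = [[[0.0] * d for _ in range(w + 2 * pad)]
              for _ in range(h + 2 * pad)]
    for r in range(h):
        for c in range(w):
            padded[r + pad][c + pad] = list(image[r][c])
    h2 = (h - f + 2 * pad) // stride + 1
    w2 = (w - f + 2 * pad) // stride + 1
    out = [[[0.0] * len(filters) for _ in range(w2)] for _ in range(h2)]
    for k, filt in enumerate(filters):
        for r in range(h2):
            for c in range(w2):
                total = 0.0
                for i in range(f):
                    for j in range(f):
                        for ch in range(d):
                            total += (padded[r * stride + i][c * stride + j][ch]
                                      * filt[i][j][ch])
                out[r][c][k] = total  # depth slice k comes from filter k
    return out


# Hypothetical data: an all-ones image and two simple filters.
image = [[[1.0] * 3 for _ in range(5)] for _ in range(5)]
w0 = [[[1.0] * 3 for _ in range(3)] for _ in range(3)]  # sums its window
w1 = [[[0.0] * 3 for _ in range(3)] for _ in range(3)]  # all zeros

out = conv_forward(image, [w0, w1], stride=2, pad=1)
print(len(out), len(out[0]), len(out[0][0]))  # 3 3 2
```

    For the all-ones image, the first filter yields 27 at the center (a full 3x3x3 window) and 12 at each corner, where the zero padding covers part of the window.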

  • Pooling layer

    Figure 6. Example of a Pooling layer (MAX operator)

    A Pooling layer is usually placed after a Conv layer. Its main purpose is to reduce the spatial size of the representation, and with it the number of parameters and the amount of computation in the network, which also helps control overfitting. The Pooling layer operates independently on each depth slice of its input. In effect, each neuron in the Pooling layer is also a filter of fixed size that applies a reduction operator (min, max, or average). The most common configuration is a 2x2 filter with the MAX operator, applied with a stride of 2, which discards 75% of the activations. The width and height change, but the depth stays the same.

    Pooling layer configuration

    Input size: $W_1 \times H_1 \times D_1$.

    Hyperparameters:

    o Filter size $F$.
    o Stride $S$.

    Output size: $W_2 \times H_2 \times D_2$, where

    o $W_2 = (W_1 - F)/S + 1$
    o $H_2 = (H_1 - F)/S + 1$
    o $D_2 = D_1$

    In practice, the Pooling layer usually does not use zero-padding.
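    Max pooling on a single depth slice can be sketched as below, using a 2x2 filter with stride 2 and no zero-padding. The function name and the 4x4 input values are hypothetical and chosen only for illustration.

```python
def max_pool2d(slice2d, f=2, stride=2):
    """Max pooling over one depth slice (a list of rows), no zero-padding."""
    h, w = len(slice2d), len(slice2d[0])
    h2 = (h - f) // stride + 1
    w2 = (w - f) // stride + 1
    return [
        [
            max(slice2d[r * stride + i][c * stride + j]
                for i in range(f) for j in range(f))
            for c in range(w2)
        ]
        for r in range(h2)
    ]


# Hypothetical 4x4 depth slice.
x = [[1, 1, 2, 4],
     [5, 6, 7, 8],
     [3, 2, 1, 0],
     [1, 2, 3, 4]]

print(max_pool2d(x))  # [[6, 8], [3, 4]]
```

    The 16 input values are reduced to 4, so 75% of the activations are discarded. In a full layer the same operation is applied to every depth slice, which is why the depth is unchanged.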

    Normalization layer

    Many types of normalization layers have been proposed for ConvNet architectures. Recently, however, these layers have been dropped because their contribution to the results is quite small.

    Fully connected layer

    Neurons in a fully connected layer are connected to all elements of the previous layer, as in an ordinary neural network. The main role of this layer is to train the classifier for the system.
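    Since every neuron in this layer sees every element of the previous layer, the forward pass reduces to one dot product per neuron. The sketch below assumes the previous layer's output has already been flattened into a single vector; the function name and all numeric values are hypothetical.

```python
def fc_forward(inputs, weights, biases):
    """Fully connected layer: one dot product plus bias per output neuron.

    inputs:  flattened outputs of the previous layer
    weights: one row of weights per output neuron
    biases:  one bias per output neuron
    """
    return [
        sum(w * x for w, x in zip(row, inputs)) + b
        for row, b in zip(weights, biases)
    ]


# Hypothetical example: 3 inputs feeding 2 output neurons (class scores).
inputs = [1.0, 2.0, 3.0]
weights = [[1.0, 0.0, 0.0],
           [0.0, 1.0, 1.0]]
biases = [0.5, -1.0]

print(fc_forward(inputs, weights, biases))  # [1.5, 4.0]
```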

  • Notable works

    1. Region with CNN features [1].

    This is currently a state-of-the-art system for object detection, built on a rich set of features computed by a convolutional network.

    2. DeepFace [2]

    The authors build a convolutional network tailored to the face recognition problem, achieving high accuracy that comes close to human-level performance.

    3. Deep Reinforcement Learning [3]

    A ConvNet that enables a computer to play Atari video games.

    4. Deep Visual-semantic Alignments for Generating Image Descriptions [4]

  • A ConvNet that generates a description of an image; it can also retrieve images that match a given descriptive sentence.

    Popular libraries

    1. Caffe 2. Torch7 3. Cuda-convnet 4. Matconvnet 5. Theano 6. Deeplearning4j

    The cuDNN library

    cuDNN (NVIDIA CUDA Deep Neural Network library) is a GPU-accelerated library of basic building blocks for deep learning networks. The library emphasizes performance, ease of use, and low memory overhead. cuDNN is designed to integrate with high-level machine learning frameworks such as Caffe, Theano, and Torch. Its simple, easy-to-use design lets developers focus on designing and implementing network models rather than on performance tuning, while still getting the desired results from parallel computing hardware.

  • Key features

    Implementations of the basic building blocks of neural networks, for both forward and backward propagation.

    Optimized for the latest NVIDIA GPU architectures.

    Built-in implementations of neural network components: activation functions (ReLU, Sigmoid, Tanh), pooling functions (max, min, average), and cost functions.

    Support for the Windows, Linux, and MacOS operating systems, as well as the Kepler, Maxwell, and Tegra GPU architectures.

    References

    1. Girshick, Ross, et al. "Rich feature hierarchies for accurate object detection and semantic segmentation." Computer Vision and Pattern Recognition (CVPR), 2014 IEEE Conference on. IEEE, 2014.

    2. Taigman, Yaniv, et al. "Deepface: Closing the gap to human-level performance in face verification." Computer Vision and Pattern Recognition (CVPR), 2014 IEEE Conference on. IEEE, 2014.

    3. Mnih, Volodymyr, et al. "Playing atari with deep reinforcement learning." arXiv preprint arXiv:1312.5602 (2013).

    4. Karpathy, Andrej, and Li Fei-Fei. "Deep visual-semantic alignments for generating image descriptions." arXiv preprint arXiv:1412.2306 (2014).

    5. http://vision.stanford.edu/teaching/cs231n/syllabus.html

    6. https://developer.nvidia.com/cuDNN
