[DL Reading Group] Generating Wikipedia by Summarizing Long Sequences
TRANSCRIPT
DEEP LEARNING JP [DL Papers]
http://deeplearning.jp/
Generating Wikipedia by Summarizing Long Sequences (ICLR 2018)
Toru Fujino, scalab, UTokyo
Paper Information
• Authors: Peter J. Liu, Mohammad Saleh, Etienne Pot, Ben Goodrich, Ryan Sepassi, Łukasz Kaiser, Noam Shazeer (Google Brain)
• Published at ICLR 2018
• Generated samples: goo.gl/wSuuS9
• Generates English Wikipedia articles by summarizing the documents the article cites together with web search results
Overview
• Frames Wikipedia article generation as a multi-document summarization problem: many source documents in, one article out
• Proposes a two-stage approach: an extractive stage selects relevant passages, then an abstractive stage generates the text
• Introduces T-DMCA, a decoder-only Transformer with memory-compressed attention that can handle very long input sequences
Related Work: Neural Abstractive Summarization
• Neural attention models were first applied to sentence-level summarization 1)
• Extended to longer news articles with sequence-to-sequence RNNs 2)
• Inputs and outputs in these tasks are still relatively short (a sentence to a single article)
1) Rush et al. “A Neural Attention Model for Sentence Summarization”, EMNLP 2015
2) Nallapati et al. “Abstractive Text Summarization using Sequence-to-Sequence RNNs and Beyond”, CoNLL 2016
• Previous datasets (e.g., Gigaword, CNN/Daily Mail) pair a single short document with a headline or a few sentences 2)
• This paper scales summarization to much longer, multi-document inputs
WikiSum Dataset
• Input: the documents cited in a Wikipedia article, plus the top 10 web search results for the article title
• Output: the lead section of the Wikipedia article
• Much larger, and with far longer inputs, than existing summarization datasets
Example
• Target article: https://en.wikipedia.org/wiki/Deep_learning
• Generated samples: goo.gl/wSuuS9
[Slides: generated examples 1-4]
Proposed Method
• Stage 1 (extractive): rank paragraphs from the source documents and keep only the most relevant ones; tf-idf similarity to the article title worked best among the extractors compared (see the sketch below)
• Stage 2 (abstractive): concatenate the extracted text with the title into a single sequence and generate the article with a decoder-only Transformer 3)
• Long sequences are made tractable with local attention, memory-compressed attention, and a mixture-of-experts layer 4)
3) A. Vaswani et al. “Attention is All You Need”, NIPS 2017
4) N. Shazeer et al. “Outrageously Large Neural Networks: The Sparsely-Gated Mixture-of-Experts Layer”, ICLR 2017
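A minimal sketch of the extractive stage, assuming the tf-idf ranker named above; the function name extract_paragraphs and the top_k cutoff are illustrative, not from the paper:

```python
# Rank source paragraphs by tf-idf cosine similarity to the article
# title and keep the top-ranked ones (illustrative sketch).
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

def extract_paragraphs(title, paragraphs, top_k=5):
    vectorizer = TfidfVectorizer()
    # Fit one shared vocabulary over the paragraphs and the title query.
    matrix = vectorizer.fit_transform(paragraphs + [title])
    para_vecs, title_vec = matrix[:-1], matrix[-1]
    scores = cosine_similarity(para_vecs, title_vec).ravel()
    ranked = sorted(zip(scores, paragraphs), key=lambda p: p[0], reverse=True)
    return [p for _, p in ranked[:top_k]]
```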
Transformer 3)
• Sequence model built entirely from attention and feed-forward layers, with no recurrence or convolutions
• Every position can attend to every other position, which shortens the path for long-range dependencies
Attention 5)
• Given encoder hidden states $h^e = [h^e_1, h^e_2, \ldots, h^e_{n_e}]$ and decoder hidden states $h^d = [h^d_1, h^d_2, \ldots, h^d_{n_d}]$
• Each decoder step attends over all encoder states to build a context vector
5) M.-T. Luong et al. “Effective Approaches to Attention-based Neural Machine Translation”, EMNLP 2015
• A score function compares the current decoder state with each encoder state
• Three score variants are proposed: dot, general, and concat 5)
• The attention weights are a softmax over the scores; the context vector is the weighted sum of encoder states, as written below
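For reference, Luong-style global attention from 5), written in the notation defined above:

```latex
% Attention weights: softmax over scores between the decoder state h^d_t
% and every encoder state h^e_s; the context vector c_t is the weighted sum.
\[
a_t(s) = \frac{\exp\bigl(\mathrm{score}(h^d_t, h^e_s)\bigr)}
              {\sum_{s'=1}^{n_e} \exp\bigl(\mathrm{score}(h^d_t, h^e_{s'})\bigr)},
\qquad
c_t = \sum_{s=1}^{n_e} a_t(s)\, h^e_s
\]
% The three score variants compared in 5).
\[
\mathrm{score}(h^d_t, h^e_s) =
\begin{cases}
(h^d_t)^{\top} h^e_s & \text{(dot)} \\
(h^d_t)^{\top} W_a\, h^e_s & \text{(general)} \\
v_a^{\top} \tanh\bigl(W_a [h^d_t; h^e_s]\bigr) & \text{(concat)}
\end{cases}
\]
```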
Self-Attention (Scaled Dot-Product Attention)
• Queries Q, keys K, and values V all come from the same sequence
• Each position attends to all positions via query-key dot products; the output is a weighted sum of the values (sketch below)
• Scaling by $\sqrt{d_k}$ keeps the dot products in a range where the softmax has useful gradients
• $\mathrm{Attention}(Q, K, V) = \mathrm{softmax}\!\left(\frac{QK^{\top}}{\sqrt{d_k}}\right) V$
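A minimal NumPy sketch of the scaled dot-product attention equation above (shapes and names are illustrative):

```python
import numpy as np

def scaled_dot_product_attention(q, k, v):
    # q, k: (n, d_k); v: (n, d_v). Scores are scaled query-key dot products.
    scores = q @ k.T / np.sqrt(q.shape[-1])
    # Numerically stable softmax over the key dimension.
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ v  # weighted sum of the values
```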
•
) (
• )
•/(
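A minimal sketch of the sparsely-gated routing idea from 4); moe_forward, the single-token shapes, and k=2 are simplifying assumptions:

```python
import numpy as np

def moe_forward(x, expert_weights, gate_weights, k=2):
    # x: (d,) one token; expert_weights: list of (d, d) expert matrices;
    # gate_weights: (num_experts, d). The gate scores every expert...
    logits = gate_weights @ x
    top = np.argsort(logits)[-k:]               # ...but only the top-k run.
    gates = np.exp(logits[top] - logits[top].max())
    gates /= gates.sum()                        # renormalized softmax gates
    return sum(g * (expert_weights[i] @ x) for g, i in zip(gates, top))
```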
T-DMCA: Attention for Long Sequences
• Local attention: split the sequence into blocks (256 tokens in the paper) and apply self-attention within each block independently
• Memory-compressed attention: shrink the keys and values with a strided convolution (kernel size and stride 3), cutting attention cost while keeping global information (sketch below)
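A sketch of memory-compressed attention, with average pooling standing in for the paper's learned strided convolution:

```python
import numpy as np

def compress(x, stride=3):
    # Shorten the sequence by averaging each group of `stride` states
    # (the paper learns a strided convolution; pooling is a stand-in).
    n = (x.shape[0] // stride) * stride
    return x[:n].reshape(-1, stride, x.shape[1]).mean(axis=1)

def memory_compressed_attention(q, k, v, stride=3):
    k_c, v_c = compress(k, stride), compress(v, stride)
    scores = q @ k_c.T / np.sqrt(q.shape[-1])   # fewer keys, cheaper attention
    w = np.exp(scores - scores.max(axis=-1, keepdims=True))
    w /= w.sum(axis=-1, keepdims=True)
    return w @ v_c
```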
Experiments
• Models are compared by perplexity and ROUGE-L on WikiSum
• T-DMCA can consume far more input tokens than seq2seq with attention or the Transformer encoder-decoder, and achieves the best scores
• Among the extractive methods, tf-idf ranking performs best
• Human evaluation of linguistic quality also favors the proposed model
Conclusion
• Generating the lead section of Wikipedia articles can be treated as multi-document summarization of the cited references and web search results
• A two-stage extractive-abstractive pipeline with a decoder-only Transformer (T-DMCA) makes very long inputs tractable
• The model generates fluent, coherent multi-sentence articles