[DL輪読会] Generating Wikipedia by Summarizing Long Sequences


Upload: deep-learning-jp

Post on 15-Mar-2018


TRANSCRIPT

Page 1: [DL輪読会]Generating Wikipedia by Summarizing Long Sequences


DEEP LEARNING JP [DL Papers]

http://deeplearning.jp/

Generating Wikipedia by Summarizing Long Sequences (ICLR 2018)

Toru Fujino, scalab, UTokyo

Page 2: [DL輪読会]Generating Wikipedia by Summarizing Long Sequences

• goo.gl/wSuuS9

Page 3: [DL輪読会]Generating Wikipedia by Summarizing Long Sequences


Page 4: [DL輪読会]Generating Wikipedia by Summarizing Long Sequences


1) Rush et al. “A Neural Attention Model for Sentence Summarization”, EMNLP 2015

2) Nallapati et al. “Abstractive Text Summarization using Sequence-to-Sequence RNNs and Beyond”, CoNLL 2016

Page 5: [DL輪読会]Generating Wikipedia by Summarizing Long Sequences


2) Nallapati et al. “Abstractive Text Summarization using Sequence-to-Sequence RNNs and Beyond”, CoNLL 2016


Page 6: [DL輪読会]Generating Wikipedia by Summarizing Long Sequences


Page 7: [DL輪読会]Generating Wikipedia by Summarizing Long Sequences


https://en.wikipedia.org/wiki/Deep_learning

Page 8: [DL輪読会]Generating Wikipedia by Summarizing Long Sequences

• goo.gl/wSuuS9

Page 9: [DL輪読会]Generating Wikipedia by Summarizing Long Sequences


Page 10: [DL輪読会]Generating Wikipedia by Summarizing Long Sequences


Page 11: [DL輪読会]Generating Wikipedia by Summarizing Long Sequences

3) A. Vaswani et al. “Attention is All You Need”, NIPS 2017

4) N. Shazeer et al. “Outrageously Large Neural Networks: The Sparsely-Gated Mixture-of-Experts Layer”, ICLR 2017

Page 12: [DL輪読会]Generating Wikipedia by Summarizing Long Sequences


3) A. Vaswani et al. “Attention is All You Need”, NIPS 2017

Page 13: [DL輪読会]Generating Wikipedia by Summarizing Long Sequences

• $x^{s} = [x^{s}_{1}, x^{s}_{2}, \ldots, x^{s}_{n_s}]$
• $x^{t} = [x^{t}_{1}, x^{t}_{2}, \ldots, x^{t}_{n_t}]$

5) M.-T. Luong et al. “Effective Approaches to Attention-based Neural Machine Translation”, EMNLP 2015
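The attention mechanism in the paper cited on this slide can be illustrated with a minimal sketch. This assumes the simplest "dot" scoring variant from Luong et al.; the function name `dot_attention` and its arguments are hypothetical, chosen only for illustration, and the code is not taken from the slides or from either paper's implementation.

```python
import math


def dot_attention(query, keys):
    """Dot-product attention for one decoder state (illustrative sketch).

    query: decoder hidden state, a list of floats of length d.
    keys:  encoder hidden states, a list of lists, each of length d.
    Returns (weights, context).
    """
    # Alignment score: dot product between the query and each encoder state.
    scores = [sum(q * k for q, k in zip(query, key)) for key in keys]
    # Numerically stable softmax over the source positions.
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]
    # Context vector: attention-weighted sum of the encoder states.
    dim = len(keys[0])
    context = [sum(w * key[i] for w, key in zip(weights, keys)) for i in range(dim)]
    return weights, context


# Toy example: the query matches the first and third source states equally.
weights, context = dot_attention([1.0, 0.0], [[1.0, 0.0], [0.0, 1.0], [1.0, 0.0]])
```

The weights sum to one, and the source positions whose states align with the query receive the largest share. The cost grows with the number of source positions, which is why very long inputs are a concern for attention-based summarization.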


Page 14: [DL輪読会]Generating Wikipedia by Summarizing Long Sequences


Page 15: [DL輪読会]Generating Wikipedia by Summarizing Long Sequences


Page 16: [DL輪読会]Generating Wikipedia by Summarizing Long Sequences


Page 17: [DL輪読会]Generating Wikipedia by Summarizing Long Sequences


Page 18: [DL輪読会]Generating Wikipedia by Summarizing Long Sequences


4) N. Shazeer et al. “Outrageously Large Neural Networks: The Sparsely-Gated Mixture-of-Experts Layer”, ICLR 2017


Page 19: [DL輪読会]Generating Wikipedia by Summarizing Long Sequences


Page 20: [DL輪読会]Generating Wikipedia by Summarizing Long Sequences


Page 21: [DL輪読会]Generating Wikipedia by Summarizing Long Sequences


Page 22: [DL輪読会]Generating Wikipedia by Summarizing Long Sequences


Page 23: [DL輪読会]Generating Wikipedia by Summarizing Long Sequences
Page 24: [DL輪読会]Generating Wikipedia by Summarizing Long Sequences


Page 25: [DL輪読会]Generating Wikipedia by Summarizing Long Sequences
Page 26: [DL輪読会]Generating Wikipedia by Summarizing Long Sequences


Page 27: [DL輪読会]Generating Wikipedia by Summarizing Long Sequences
