[DL Reading Group] Generating Wikipedia by Summarizing Long Sequences
TRANSCRIPT
DEEP LEARNING JP [DL Papers]
http://deeplearning.jp/
Generating Wikipedia by Summarizing Long Sequences (ICLR 2018)
Toru Fujino, scalab, UTokyo
Paper Information
• Authors: Peter J. Liu, Mohammad Saleh, Etienne Pot, Ben Goodrich, Ryan Sepassi, Łukasz Kaiser, Noam Shazeer (Google Brain)
• Published at ICLR 2018
• Generated samples: goo.gl/wSuuS9
• Generates English Wikipedia articles by summarizing the documents the article cites together with web search results
Overview
• Frames Wikipedia article generation as a multi-document summarization problem: many source documents in, one article out
• Proposes a two-stage approach: an extractive stage selects relevant passages, then an abstractive stage generates the text
• Introduces T-DMCA, a decoder-only Transformer with memory-compressed attention that can handle very long input sequences
Related Work: Neural Abstractive Summarization
• Neural attention models were first applied to sentence-level summarization 1)
• Extended to longer news articles with sequence-to-sequence RNNs 2)
• Inputs and outputs in these tasks are still relatively short (a sentence to a single article)
1) Rush et al. “A Neural Attention Model for Sentence Summarization”, EMNLP 2015
2) Nallapati et al. “Abstractive Text Summarization using Sequence-to-Sequence RNNs and Beyond”, CoNLL 2016
• Previous datasets (e.g., Gigaword, CNN/Daily Mail) pair a single short document with a headline or a few sentences 2)
• This paper scales summarization to much longer, multi-document inputs
WikiSum Dataset
• Input: the documents cited in a Wikipedia article, plus the top 10 web search results for the article title
• Output: the lead section of the Wikipedia article
• Much larger, and with far longer inputs, than existing summarization datasets
Example
• Target article: https://en.wikipedia.org/wiki/Deep_learning
• Generated samples: goo.gl/wSuuS9
[Slides: generated examples 1-4]
Proposed Method
• Stage 1 (extractive): rank paragraphs from the source documents and keep only the most relevant ones; tf-idf similarity to the article title worked best among the extractors compared (see the sketch below)
• Stage 2 (abstractive): concatenate the extracted text with the title into a single sequence and generate the article with a decoder-only Transformer 3)
• Long sequences are made tractable with local attention, memory-compressed attention, and a mixture-of-experts layer 4)
3) A. Vaswani et al. “Attention is All You Need”, NIPS 2017
4) N. Shazeer et al. “Outrageously Large Neural Networks: The Sparsely-Gated Mixture-of-Experts Layer”, ICLR 2017
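A minimal sketch of the extractive stage, assuming the tf-idf ranker named above; the function name extract_paragraphs and the top_k cutoff are illustrative, not from the paper:

```python
# Rank source paragraphs by tf-idf cosine similarity to the article
# title and keep the top-ranked ones (illustrative sketch).
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

def extract_paragraphs(title, paragraphs, top_k=5):
    vectorizer = TfidfVectorizer()
    # Fit one shared vocabulary over the paragraphs and the title query.
    matrix = vectorizer.fit_transform(paragraphs + [title])
    para_vecs, title_vec = matrix[:-1], matrix[-1]
    scores = cosine_similarity(para_vecs, title_vec).ravel()
    ranked = sorted(zip(scores, paragraphs), key=lambda p: p[0], reverse=True)
    return [p for _, p in ranked[:top_k]]
```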
Transformer 3)
• Sequence model built entirely from attention and feed-forward layers, with no recurrence or convolutions
• Every position can attend to every other position, which shortens the path for long-range dependencies
Attention 5)
• Given encoder hidden states $h^e = [h^e_1, h^e_2, \ldots, h^e_{n_e}]$ and decoder hidden states $h^d = [h^d_1, h^d_2, \ldots, h^d_{n_d}]$
• Each decoder step attends over all encoder states to build a context vector
5) M.-T. Luong et al. “Effective Approaches to Attention-based Neural Machine Translation”, EMNLP 2015
• A score function compares the current decoder state with each encoder state
• Three score variants are proposed: dot, general, and concat 5)
• The attention weights are a softmax over the scores; the context vector is the weighted sum of encoder states, as written below
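For reference, Luong-style global attention from 5), written in the notation defined above:

```latex
% Attention weights: softmax over scores between the decoder state h^d_t
% and every encoder state h^e_s; the context vector c_t is the weighted sum.
\[
a_t(s) = \frac{\exp\bigl(\mathrm{score}(h^d_t, h^e_s)\bigr)}
              {\sum_{s'=1}^{n_e} \exp\bigl(\mathrm{score}(h^d_t, h^e_{s'})\bigr)},
\qquad
c_t = \sum_{s=1}^{n_e} a_t(s)\, h^e_s
\]
% The three score variants compared in 5).
\[
\mathrm{score}(h^d_t, h^e_s) =
\begin{cases}
(h^d_t)^{\top} h^e_s & \text{(dot)} \\
(h^d_t)^{\top} W_a\, h^e_s & \text{(general)} \\
v_a^{\top} \tanh\bigl(W_a [h^d_t; h^e_s]\bigr) & \text{(concat)}
\end{cases}
\]
```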
Self-Attention (Scaled Dot-Product Attention)
• Queries Q, keys K, and values V all come from the same sequence
• Each position attends to all positions via query-key dot products; the output is a weighted sum of the values (sketch below)
• Scaling by $\sqrt{d_k}$ keeps the dot products in a range where the softmax has useful gradients
• $\mathrm{Attention}(Q, K, V) = \mathrm{softmax}\!\left(\frac{QK^{\top}}{\sqrt{d_k}}\right) V$
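A minimal NumPy sketch of the scaled dot-product attention equation above (shapes and names are illustrative):

```python
import numpy as np

def scaled_dot_product_attention(q, k, v):
    # q, k: (n, d_k); v: (n, d_v). Scores are scaled query-key dot products.
    scores = q @ k.T / np.sqrt(q.shape[-1])
    # Numerically stable softmax over the key dimension.
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ v  # weighted sum of the values
```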
•
) (
• )
•/(
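A minimal sketch of the sparsely-gated routing idea from 4); moe_forward, the single-token shapes, and k=2 are simplifying assumptions:

```python
import numpy as np

def moe_forward(x, expert_weights, gate_weights, k=2):
    # x: (d,) one token; expert_weights: list of (d, d) expert matrices;
    # gate_weights: (num_experts, d). The gate scores every expert...
    logits = gate_weights @ x
    top = np.argsort(logits)[-k:]               # ...but only the top-k run.
    gates = np.exp(logits[top] - logits[top].max())
    gates /= gates.sum()                        # renormalized softmax gates
    return sum(g * (expert_weights[i] @ x) for g, i in zip(gates, top))
```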
T-DMCA: Attention for Long Sequences
• Local attention: split the sequence into blocks (256 tokens in the paper) and apply self-attention within each block independently
• Memory-compressed attention: shrink the keys and values with a strided convolution (kernel size and stride 3), cutting attention cost while keeping global information (sketch below)
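A sketch of memory-compressed attention, with average pooling standing in for the paper's learned strided convolution:

```python
import numpy as np

def compress(x, stride=3):
    # Shorten the sequence by averaging each group of `stride` states
    # (the paper learns a strided convolution; pooling is a stand-in).
    n = (x.shape[0] // stride) * stride
    return x[:n].reshape(-1, stride, x.shape[1]).mean(axis=1)

def memory_compressed_attention(q, k, v, stride=3):
    k_c, v_c = compress(k, stride), compress(v, stride)
    scores = q @ k_c.T / np.sqrt(q.shape[-1])   # fewer keys, cheaper attention
    w = np.exp(scores - scores.max(axis=-1, keepdims=True))
    w /= w.sum(axis=-1, keepdims=True)
    return w @ v_c
```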
Experiments
• Models are compared by perplexity and ROUGE-L on WikiSum
• T-DMCA can consume far more input tokens than seq2seq with attention or the Transformer encoder-decoder, and achieves the best scores
• Among the extractive methods, tf-idf ranking performs best
• Human evaluation of linguistic quality also favors the proposed model
Conclusion
• Generating the lead section of Wikipedia articles can be treated as multi-document summarization of the cited references and web search results
• A two-stage extractive-abstractive pipeline with a decoder-only Transformer (T-DMCA) makes very long inputs tractable
• The model generates fluent, coherent multi-sentence articles