towards betterudparsing:deep contextualized word...

1

Wanxiang Che · Yijia Liu · Yuxuan Wang · Bo Zheng · Ting Liu Harbin Institute of Technology Towards Better UD Parsing: Deep Contextualized Word Embeddings, Ensemble, and Treebank Concatenation Contributions of Each Technique § ELMo: +0.84 § Ensemble: +0.55 § Treebank Concat.: +0.42 (estimated on Dev set.) Training Details for ELMo § Supporting Unicode range § Training with sample softmax § use 8,192 surrounding words as negative samples § more stable training and better performance § Training one language takes 3 days on Nvidia P100 OOV Rate against ELMo Improvements Post-contest Evaluation on ELMo § On gold segmentation 95 95.5 96 96.5 97 UPOS Baseline +ELMo 84 84.5 85 85.5 86 LAS Baseline +ELMo 0.9 1.1 OOV Words Visualization Results Systems Word Seg. UPOS LAS MLAS BLEX ours 98.12 90.19 75.84 59.78 65.33 Uppsala 98.18 90.91 72.37 59.20 32.09 UDPipe F. 97.04 89.37 73.11 61.25 64.49 TurkuNLP 97.42 89.81 73.28 60.99 66.09 Conclusions § Several enhancement to Dozat et al. (2017) including: ELMo, ensemble, and treebank concatenation. § ELMo improves the OOV via better abstraction. Other Helpful Techniques § Improve POS tagger with ELMo § Improve tokenization with ELMo and semi-supervised features (zh_gsd: +6.6, ja_gsd: +4.1, vi_vtb: +7.2) ELMo on IV and OOV § ELMo improves more on OOV than IV 85 90 95 100 IV OOV Basline +ELMo 76 81 86 IV OOV Basline +ELMo Effect of Treebank Concatenation Dutch Swedish apino lassysmall lines talbanken Single 91.87 86.82 84.64 86.39 Concate. 92.08 89.34 85.76 86.77 Korean Italian gsd kaist isdt postwita Single 82.05 87.83 92.01 80.79 Concate. 83.73 87.61 91.80 82.54 Released ELMo @ https://github.com/HIT-SCIR/ELMoForManyLangs Final Evaluation of Shared Task § Rank 1 st according to LAS (+2.6 than the 2 nd ) Extensions to Dozat et al. (2017) § Using ELMo as additional input § Using ensemble on both the tagger and parser ELMo ELMo Word2vec

Upload: others

Post on 18-Jul-2020

2 views

Category:

Documents

0 download

Report

Download

Embed Size (px):

TRANSCRIPT

Page 1: Towards BetterUDParsing:Deep Contextualized Word ...yjliu.net/cv/res/2018-10-12-conll-poster.compressed.pdf · WanxiangChe · Yijia Liu·Yuxuan Wang· BoZheng· TingLiu Harbin Institute

Wanxiang Che · Yijia Liu · Yuxuan Wang · Bo Zheng · Ting Liu

Harbin Institute of Technology

Towards Better UD Parsing: Deep Contextualized Word Embeddings,

Ensemble, and Treebank Concatenation

Contributions of Each Technique

§ ELMo: +0.84

§ Ensemble: +0.55

§ Treebank Concat.: +0.42 (estimated on Dev set.)

Training Details for ELMo

§ Supporting Unicode range

§ Training with sample softmax

§ use 8,192 surrounding words as negative samples

§ more stable training and better performance

§ Training one language takes 3 days on Nvidia P100

OOV Rate against ELMo Improvements

Post-contest Evaluation on ELMo

§ On gold segmentation

95

95.5

96

96.5

97

UPOS

Baseline +ELMo

84

84.5

85

85.5

86

LAS

Baseline +ELMo

0.9 1.1

OOV Words Visualization

Results

Systems Word Seg. UPOS LAS MLAS BLEX

ours 98.12 90.19 75.84 59.78 65.33

Uppsala 98.18 90.91 72.37 59.20 32.09

UDPipe F. 97.04 89.37 73.11 61.25 64.49

TurkuNLP 97.42 89.81 73.28 60.99 66.09

Conclusions

§ Several enhancement to Dozat et al. (2017) including:

ELMo, ensemble, and treebank concatenation.

§ ELMo improves the OOV via better abstraction.

Other Helpful Techniques

§ Improve POS tagger with ELMo

§ Improve tokenization with ELMo and semi-supervised

features (zh_gsd: +6.6, ja_gsd: +4.1, vi_vtb: +7.2)

ELMo on IV and OOV

§ ELMo improves more on OOV than IV

85

90

95

100

IV OOV

Basline +ELMo

76

81

86

IV OOV

Basline +ELMo

Effect of Treebank Concatenation

Dutch Swedish

apino lassysmall lines talbanken

Single 91.87 86.82 84.64 86.39

Concate. 92.08 89.34 85.76 86.77

Korean Italian

gsd kaist isdt postwita

Single 82.05 87.83 92.01 80.79

Concate. 83.73 87.61 91.80 82.54

Released ELMo @ https://github.com/HIT-SCIR/ELMoForManyLangs

Final Evaluation of Shared Task

§ Rank 1st according to LAS (+2.6 than the 2nd)

Extensions to Dozat et al. (2017)

§ Using ELMo as additional input

§ Using ensemble on both the tagger and parser

ELMo

ELMo Word2vec

https://github.com/HIT-SCIR/ELMoForManyLangs

RoSeq: Robust Sequence LabelingNER—88.07% on CoNLL-2002 Dutch, 87.33% on CoNLL-2002 Spanish, 52.94% on WNUT-2016 Twitter, and 43.03% on WNUT-2017 Twitter without using the additional

Chris Dyer - 2017 - CoNLL Invited Talk: Should Neural Network Architecture Reflect Linguistic Structure?

CoNLL-SIGMORPHON 2017 Shared Task: Universal Morphological ...jason/papers/cotterell+al.conll17.pdf · CoNLL-SIGMORPHON 2017 Shared Task: Universal Morphological Reinﬂection in

Reishi Brochure zh - YIJIA · 2019. 7. 31. · Title: Reishi_Brochure_zh Created Date: 6/18/2019 5:15:33 PM

Homework Yijia Liu - Kansas State University

Parsing de Dependencias CoNLL 2006 y 2007 Guillermo Nebot Troyano

Xin Li , Yijia He , Jinlong Lin , Xiao Liu · 2020-04-28 · Xin Li 1; 2, Yijia He , Jinlong Lin , Xiao Liu2 Abstract—With monocular Visual-Inertial Odometry (VIO) system, 3D point

Proceedings of the 23rd Conference on Computational Natural Language Learning · 2019. 11. 2. · Introduction The 2019 Conference on Computational Natural Language Learning (CoNLL)

Fast Nonparametric Machine Learning Algorithms …tingliu/proposal/proposal.pdfFast Nonparametric Machine Learning Algorithms for High-dimensional Massive Data and Applications | a

HSC Outcomes · 2020. 3. 12. · HSC OUTCOMES PAGE 1. College Dux and HSC Dux Eva Yijia Wang It is a pleasure to announce Yijia (Eva) Wang as College Dux and HSC Dux in 2018, with

1 Search Engine Statistics Beyond the n-gram: Application to Noun Compound Bracketing CoNLL-2005 Preslav Nakov EECS, Computer Science Division University

CoNLL 2018 Shared Task: Multilingual Parsing from Raw …

The Ecophysiological Significance of Leaf Movements in ... · THE ECOPHYSIOLOGICAL SIGNIFICANCE OF LEAF MOVEMENTS IN RHODODENDRON MAXIMUM1 YIJIA BAO AND ERIK T. NILSEN2 Biology Department,

CoNLL-2010: Shared Task Fourteenth Conference on

Named-Entity Recognition with Character-Level Models Dan Klein, Joseph Smarr, Huy Nguyen, and Christopher D. Manning Stanford University CoNLL-2003: Seventh

Logo Software as a service (Saas) Group D Fong Hui Yun Kyung Jung Yijia Li Roxana Hernandez UC-Berkeley Strategic Computing and Communications Technology

Xin Li , Yijia He , Jinlong Lin , Xiao Liu · Leveraging Planar Regularities for Point Line Visual-Inertial Odometry Xin Li 1; 2, Yijia He , Jinlong Lin , Xiao Liu2 Abstract—With

Mortality Regimes and Pricing Samuel H. Cox University of Manitoba Yijia Lin University of Nebraska - Lincoln Andreas Milidonis University of Cyprus &

Exploring Segment Representations for Neural Segmentation … · 2016-06-28 · Exploring Segment Representations for Neural Segmentation Models Yijia Liu, Wanxiang Che ⇤, Jiang

The operation of an Economy (1) Sun yijia. 14.03.2016 what is most important to you when choosing 1.…

SECURITIZATION OF MORTALITY RISKS IN LIFE ANNUITIES YIJIA LIN AND SAMUEL H. COX Доклад подготовила студентка 61УРАМ Ящук М

CONLL 2014 Omer Levy and Yoav Goldberg. Linguistic Regularities in Sparse and Explicit Word Representations

Zhang yijia 611692 final journal

Market Research (1) Sun Yijia. Margaret Thatcher 1. margaret thatcher is the daughter of a local businessman. Her family operated a grocery store and

Pension Risk Management in the Enterprise Risk Management Framework Yijia Lin University of Nebraska - Lincoln Richard MacMinn Illinois State University

This file has been cleaned of potential threats. If you confirm that …cse.iitkgp.ac.in/~sourangshu/coursefiles/IR18A/chap16... · 2018-11-12 · CoNLL-2002 and CoNLL-2003 (British

shodhganga.inflibnet.ac.inshodhganga.inflibnet.ac.in/bitstream/10603/43484/7/phd … · Web viewSo far, many such formats have been devised, for instance; CONLL-X (see Table. 2)

Reishi Brochure - YIJIA · Title: Reishi_Brochure_ Created Date: 6/18/2019 5:14:53 PM

Signcomplex - italtras.us · 3528 SMD LED 5050 SMD LED P1 Tel:+86-755-27608650 Fax:+86-755-27608651 [email protected] Yijia Industrial Park

Costas Arkolakis teaching assistant: Yijia Luka265/teaching/UndergradFinance/Spr11/Slides... · Costas Arkolakis teaching assistant: Yijia Lu Economics 407, ... Exchange Rate is the

BankCapitalandLending: EvidencefromSyndicated …...BankCapitalandLending: EvidencefromSyndicated Loans Yongqiang Chu, Donghang Zhang, and Yijia Zhao∗ This Version: June, 2014 Abstract

Fast Nonparametric Machine Learning Algorithms for …tingliu/thesis/tingliu_thesis.pdfI started to be interested in working on nearest-neighbor algorithms when I worked on a phamathutical

1. Go to 2. Click on Log In...C a yijia.ca MKT Events YIJIA I ENG I Tiéng Viet u I Log in Apps Contact us I MIRIKEL Balance Five is available now! YIJIA NORTH AMERICA PromotionsApps

Proceedings of the CoNLL 2018 Shared Task: Multilingual ...universaldependencies.org/conll18/proceedings/K18-2.pdf · CoNLL 2018 Shared Task: Multilingual Parsing from Raw Text to

Foshan Baitian Building Materials Co.,Ltd - TradeKeyimgusr.tradekey.com/images/uploadedimages/brochures/9/0/4911018... · Office add: Yingke Yijia Ceramic & Sanitary Ware Center,