![Page 1: Representations Natural Language and Challengeslxmls.it.pt/2019/Natural_Language_Representations... · Natural Language Representations ... Natural Questions - Kwiatkowski et al.,](https://reader030.vdocuments.mx/reader030/viewer/2022041009/5eb4f5791ee4b16a4826ad36/html5/thumbnails/1.jpg)
Slav Petrov, on behalf of the Language Team @ Google Research
Natural Language Representations and Challenges
![Page 2: Representations Natural Language and Challengeslxmls.it.pt/2019/Natural_Language_Representations... · Natural Language Representations ... Natural Questions - Kwiatkowski et al.,](https://reader030.vdocuments.mx/reader030/viewer/2022041009/5eb4f5791ee4b16a4826ad36/html5/thumbnails/2.jpg)
Outline
●○○○
●○○○
![Page 3: Representations Natural Language and Challengeslxmls.it.pt/2019/Natural_Language_Representations... · Natural Language Representations ... Natural Questions - Kwiatkowski et al.,](https://reader030.vdocuments.mx/reader030/viewer/2022041009/5eb4f5791ee4b16a4826ad36/html5/thumbnails/3.jpg)
State-of-the-art in Natural Language Understanding in 2017
→ → Custom (Recurrent) Architectures
![Page 4: Representations Natural Language and Challengeslxmls.it.pt/2019/Natural_Language_Representations... · Natural Language Representations ... Natural Questions - Kwiatkowski et al.,](https://reader030.vdocuments.mx/reader030/viewer/2022041009/5eb4f5791ee4b16a4826ad36/html5/thumbnails/4.jpg)
Oct. 2018: One Model with Task-specific Tuning in Minutes
![Page 5: Representations Natural Language and Challengeslxmls.it.pt/2019/Natural_Language_Representations... · Natural Language Representations ... Natural Questions - Kwiatkowski et al.,](https://reader030.vdocuments.mx/reader030/viewer/2022041009/5eb4f5791ee4b16a4826ad36/html5/thumbnails/5.jpg)
Question Answering (SQuAD 1.1)
![Page 6: Representations Natural Language and Challengeslxmls.it.pt/2019/Natural_Language_Representations... · Natural Language Representations ... Natural Questions - Kwiatkowski et al.,](https://reader030.vdocuments.mx/reader030/viewer/2022041009/5eb4f5791ee4b16a4826ad36/html5/thumbnails/6.jpg)
Pre-Training in NLP
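● [Figure: word embeddings. king → [-0.5, -0.9, 1.4, …] and queen → [-0.6, -0.8, -0.2, …], each combined by inner product with its context: “king wore a crown”, “queen wore a crown”.]
● [Figure: a single context-free vector [0.3, 0.2, -0.8, …] shown for both “open a bank account” and “on the river bank”.]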
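The limitation of a context-free lookup can be sketched in a few lines. This is a toy illustration, not the slide's model: the table is hypothetical, and treating [0.3, 0.2, -0.8] as the vector for “bank” is an assumption based on where it appears on the slide.

```python
import numpy as np

# Toy embedding table. The number prefixes echo the slide; assigning the
# third vector to "bank" is an assumption made for this illustration.
EMBEDDINGS = {
    "king": np.array([-0.5, -0.9, 1.4]),
    "queen": np.array([-0.6, -0.8, -0.2]),
    "bank": np.array([0.3, 0.2, -0.8]),
}

def embed(sentence, table=EMBEDDINGS, dim=3):
    """Look up each word independently; unknown words get a zero vector."""
    return [table.get(word, np.zeros(dim)) for word in sentence.split()]

def bank_vector(sentence):
    """Return the vector assigned to the word "bank" in this sentence."""
    words = sentence.split()
    return embed(sentence)[words.index("bank")]

financial = bank_vector("open a bank account")
river = bank_vector("on the river bank")

# The lookup ignores the neighbouring words, so both senses collapse
# onto one vector.
same = bool(np.array_equal(financial, river))
```

Because `embed` never looks at neighbouring words, `same` is true regardless of the sentence, which is the gap contextual representations aim to close.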
![Page 7: Representations Natural Language and Challengeslxmls.it.pt/2019/Natural_Language_Representations... · Natural Language Representations ... Natural Questions - Kwiatkowski et al.,](https://reader030.vdocuments.mx/reader030/viewer/2022041009/5eb4f5791ee4b16a4826ad36/html5/thumbnails/7.jpg)
History of Contextual Representations
● Train LSTM Language Model
  [Figure: a chain of LSTM cells reads “<s> open a” and predicts “open a bank”.]
● Fine-tune on Classification Task
  [Figure: the same LSTM chain reads “very funny movie” and outputs POSITIVE.]
![Page 8: Representations Natural Language and Challengeslxmls.it.pt/2019/Natural_Language_Representations... · Natural Language Representations ... Natural Questions - Kwiatkowski et al.,](https://reader030.vdocuments.mx/reader030/viewer/2022041009/5eb4f5791ee4b16a4826ad36/html5/thumbnails/8.jpg)
History of Contextual Representations
● Train Separate Left-to-Right and Right-to-Left LMs
  [Figure: one LSTM chain reads “<s> open a” and predicts “open a bank”; a second LSTM chain runs over the same words in the opposite direction.]
● Apply as “Pre-trained Embeddings” to Existing Model Architecture
![Page 9: Representations Natural Language and Challengeslxmls.it.pt/2019/Natural_Language_Representations... · Natural Language Representations ... Natural Questions - Kwiatkowski et al.,](https://reader030.vdocuments.mx/reader030/viewer/2022041009/5eb4f5791ee4b16a4826ad36/html5/thumbnails/9.jpg)
History of Contextual Representations
● Train Deep (12-layer) Transformer LM
  [Figure: stacked Transformer blocks read “<s> open a” and predict “open a bank”.]
● Fine-tune on Classification Task
  [Figure: the same Transformer stack reads “<s> open a” and outputs POSITIVE.]
![Page 10: Representations Natural Language and Challengeslxmls.it.pt/2019/Natural_Language_Representations... · Natural Language Representations ... Natural Questions - Kwiatkowski et al.,](https://reader030.vdocuments.mx/reader030/viewer/2022041009/5eb4f5791ee4b16a4826ad36/html5/thumbnails/10.jpg)
Unidirectional vs. Bidirectional Models
● Unidirectional context: Build representation incrementally
● Bidirectional context: Words can “see themselves”
[Figure: two stacks of “Layer 2” cells over the inputs “<s> open a”, each predicting “open a bank”; one is wired left-to-right only, the other in both directions.]
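One way to make the contrast concrete is as an attention mask. This is a sketch under the assumption that the layers are attention layers, which the slide's generic “Layer 2” boxes do not state; the function name is hypothetical.

```python
import numpy as np

def attention_mask(seq_len, bidirectional):
    """mask[i, j] == 1 means position i may look at position j."""
    full = np.ones((seq_len, seq_len), dtype=int)
    # Unidirectional: keep only the lower triangle (current and earlier positions).
    return full if bidirectional else np.tril(full)

tokens = ["<s>", "open", "a", "bank"]
uni = attention_mask(len(tokens), bidirectional=False)
bi = attention_mask(len(tokens), bidirectional=True)
```

With the bidirectional mask, the position that has to predict a word can reach that word through later positions, which is the “see themselves” problem that masking the input is meant to avoid.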
![Page 11: Representations Natural Language and Challengeslxmls.it.pt/2019/Natural_Language_Representations... · Natural Language Representations ... Natural Questions - Kwiatkowski et al.,](https://reader030.vdocuments.mx/reader030/viewer/2022041009/5eb4f5791ee4b16a4826ad36/html5/thumbnails/11.jpg)
Masked Language Model (Fill-in-the-blank)
● Solution: Mask out k% of the input words, and then predict the masked words
  ○ We always use k = 15%
● Too little masking: Too expensive to train
● Too much masking: Not enough context
Input: the man went to the [MASK] to buy a [MASK] of milk
Predictions: store, gallon
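The masking step can be sketched as below. This is a simplified illustration, assuming whitespace tokens and uniform random choice of positions; `mask_tokens` and its parameters are hypothetical names, not the released BERT code.

```python
import random

MASK = "[MASK]"

def mask_tokens(tokens, k=0.15, seed=0):
    """Replace roughly a fraction k of tokens with [MASK].

    Returns the masked sequence and a dict mapping each masked
    position to the original token the model should predict.
    """
    rng = random.Random(seed)
    n_mask = max(1, round(k * len(tokens)))
    positions = sorted(rng.sample(range(len(tokens)), n_mask))
    masked = list(tokens)
    targets = {}
    for pos in positions:
        targets[pos] = tokens[pos]
        masked[pos] = MASK
    return masked, targets

sentence = "the man went to the store to buy a gallon of milk".split()
masked, targets = mask_tokens(sentence)
```

For this 12-token sentence, k = 15% rounds to two masked positions, the same number as in the slide's example; which two are chosen depends on the seed.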
![Page 12: Representations Natural Language and Challengeslxmls.it.pt/2019/Natural_Language_Representations... · Natural Language Representations ... Natural Questions - Kwiatkowski et al.,](https://reader030.vdocuments.mx/reader030/viewer/2022041009/5eb4f5791ee4b16a4826ad36/html5/thumbnails/12.jpg)
Next Sentence Prediction
● To learn relationships between sentences, predict whether Sentence B is the actual sentence that follows Sentence A, or a random sentence
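Building training pairs for this task might look like the following sketch. The 50/50 split between true and random successors is an assumption here, and the four-sentence corpus is invented for illustration.

```python
import random

def make_nsp_example(sentences, index, rng):
    """Return (sentence_a, sentence_b, is_next) for sentences[index]."""
    sentence_a = sentences[index]
    if rng.random() < 0.5:
        # Positive example: the sentence that actually comes next.
        return sentence_a, sentences[index + 1], True
    # Negative example: a random sentence that is not the true successor.
    candidates = [i for i in range(len(sentences)) if i not in (index, index + 1)]
    return sentence_a, sentences[rng.choice(candidates)], False

# Invented toy corpus; any ordered list of distinct sentences works.
corpus = [
    "the man went to the store",
    "he bought a gallon of milk",
    "penguins are flightless birds",
    "they live in the southern hemisphere",
]
rng = random.Random(0)
examples = [make_nsp_example(corpus, i, rng) for i in range(len(corpus) - 1)]
```

Each tuple in `examples` carries the binary label the classifier is trained to predict.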
![Page 13: Representations Natural Language and Challengeslxmls.it.pt/2019/Natural_Language_Representations... · Natural Language Representations ... Natural Questions - Kwiatkowski et al.,](https://reader030.vdocuments.mx/reader030/viewer/2022041009/5eb4f5791ee4b16a4826ad36/html5/thumbnails/13.jpg)
Next Sentence Prediction
● Use 30,000 WordPiece vocabulary on input
● Each token is sum of three embeddings
● Single sequence is much more efficient
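The “sum of three embeddings” can be sketched as three table lookups added together. The slide does not name the three; this sketch assumes they are token, segment, and position embeddings. The hidden size, maximum length, and token ids are made up; only the 30,000 vocabulary size comes from the slide.

```python
import numpy as np

rng = np.random.default_rng(0)
# 30,000 is from the slide; the other sizes are illustrative assumptions.
VOCAB_SIZE, MAX_LEN, NUM_SEGMENTS, HIDDEN = 30000, 512, 2, 8

token_table = rng.normal(size=(VOCAB_SIZE, HIDDEN))
segment_table = rng.normal(size=(NUM_SEGMENTS, HIDDEN))
position_table = rng.normal(size=(MAX_LEN, HIDDEN))

def embed_inputs(token_ids, segment_ids):
    """Sum token, segment, and position embeddings for one packed sequence."""
    positions = np.arange(len(token_ids))
    return (
        token_table[token_ids]
        + segment_table[segment_ids]
        + position_table[positions]
    )

# Hypothetical ids for a sentence pair packed into a single sequence.
token_ids = np.array([101, 2023, 102, 2003, 102])
segment_ids = np.array([0, 0, 0, 1, 1])
x = embed_inputs(token_ids, segment_ids)
```

Packing both sentences into one sequence, distinguished only by `segment_ids`, is what lets a single encoder pass handle the pair.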
![Page 14: Representations Natural Language and Challengeslxmls.it.pt/2019/Natural_Language_Representations... · Natural Language Representations ... Natural Questions - Kwiatkowski et al.,](https://reader030.vdocuments.mx/reader030/viewer/2022041009/5eb4f5791ee4b16a4826ad36/html5/thumbnails/14.jpg)
Transformer Architecture
● Multi-headed self attention
● Feed-forward layers
● Layer norm and residuals
● Positional embeddings
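To show how the feed-forward, residual, and layer-norm pieces fit around a sublayer, here is a minimal sketch. It assumes a ReLU activation, normalization applied after the residual sum, and no learned gain or bias in the layer norm; these are simplifications, not details taken from the slides.

```python
import numpy as np

def layer_norm(x, eps=1e-6):
    """Normalize each position's vector to zero mean and unit variance."""
    mean = x.mean(axis=-1, keepdims=True)
    std = x.std(axis=-1, keepdims=True)
    return (x - mean) / (std + eps)

def feed_forward(x, w1, b1, w2, b2):
    """Position-wise two-layer network (ReLU assumed)."""
    return np.maximum(0.0, x @ w1 + b1) @ w2 + b2

def residual_block(x, sublayer):
    """Add the sublayer output back to its input, then normalize."""
    return layer_norm(x + sublayer(x))

rng = np.random.default_rng(0)
hidden, inner, seq_len = 8, 32, 4
x = rng.normal(size=(seq_len, hidden))
w1, b1 = rng.normal(size=(hidden, inner)), np.zeros(inner)
w2, b2 = rng.normal(size=(inner, hidden)), np.zeros(hidden)
y = residual_block(x, lambda h: feed_forward(h, w1, b1, w2, b2))
```

The same `residual_block` wrapper would be applied around the self-attention sublayer as well.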
![Page 15: Representations Natural Language and Challengeslxmls.it.pt/2019/Natural_Language_Representations... · Natural Language Representations ... Natural Questions - Kwiatkowski et al.,](https://reader030.vdocuments.mx/reader030/viewer/2022041009/5eb4f5791ee4b16a4826ad36/html5/thumbnails/15.jpg)
From One-Hot Vectors to Word Embeddings & Self-Attention
[Figure: the words “on ... river bank” shown as one-hot vectors, each mapped to a dense embedding row (1.4 … 3.7; 4.9 … 6.4; 2.5 … 8.0).]
See also: The Annotated Transformer, The Illustrated Transformer, The Illustrated BERT
![Page 17: Representations Natural Language and Challengeslxmls.it.pt/2019/Natural_Language_Representations... · Natural Language Representations ... Natural Questions - Kwiatkowski et al.,](https://reader030.vdocuments.mx/reader030/viewer/2022041009/5eb4f5791ee4b16a4826ad36/html5/thumbnails/17.jpg)
From One-Hot Vectors to Word Embeddings & Self-Attention
[Figure: the one-hot → embedding diagram for “on ... river bank”, now annotated with query, key, value.]
![Page 18: Representations Natural Language and Challengeslxmls.it.pt/2019/Natural_Language_Representations... · Natural Language Representations ... Natural Questions - Kwiatkowski et al.,](https://reader030.vdocuments.mx/reader030/viewer/2022041009/5eb4f5791ee4b16a4826ad36/html5/thumbnails/18.jpg)
From One-Hot Vectors to Word Embeddings & Self-Attention
[Figure: the same diagram with query, key, value and (self-)attention weights 0.1, 0.2, 0.7 over the three words “on ... river bank”.]
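The query, key, value computation behind weights like 0.1, 0.2, 0.7 can be sketched as single-head scaled dot-product attention. The dimensions and random weight matrices are placeholders, so the resulting weights will not match the slide's numbers.

```python
import numpy as np

def softmax(scores):
    """Row-wise softmax, shifted for numerical stability."""
    shifted = scores - scores.max(axis=-1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=-1, keepdims=True)

def self_attention(x, w_q, w_k, w_v):
    """Single-head scaled dot-product self-attention over one sequence."""
    q, k, v = x @ w_q, x @ w_k, x @ w_v
    weights = softmax(q @ k.T / np.sqrt(k.shape[-1]))
    # Each output row is a weighted average of the value vectors.
    return weights @ v, weights

rng = np.random.default_rng(0)
seq_len, dim = 3, 4  # three word embeddings, e.g. "on", "river", "bank"
x = rng.normal(size=(seq_len, dim))
w_q, w_k, w_v = (rng.normal(size=(dim, dim)) for _ in range(3))
out, weights = self_attention(x, w_q, w_k, w_v)
```

Each row of `weights` is non-negative and sums to 1, like the 0.1, 0.2, 0.7 column on the slide.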
![Page 20: Representations Natural Language and Challengeslxmls.it.pt/2019/Natural_Language_Representations... · Natural Language Representations ... Natural Questions - Kwiatkowski et al.,](https://reader030.vdocuments.mx/reader030/viewer/2022041009/5eb4f5791ee4b16a4826ad36/html5/thumbnails/20.jpg)
Transformer vs LSTM
● Self-attention == no locality bias
  ○ Long-distance context has “equal opportunity”
● Single multiplication per layer == efficiency on TPU
  ○ Effective batch size is number of words, not sequences
[Figure: Transformer vs. LSTM, each showing the inputs X_0_0 … X_1_3 (two sequences of four words) multiplied by W.]
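The efficiency point can be illustrated with shapes alone. In this sketch the recurrent update is a deliberately simplified stand-in (a single tanh step, not a real LSTM cell), and all sizes are chosen to mirror the 2 × 4 grid of inputs on the slide.

```python
import numpy as np

rng = np.random.default_rng(0)
batch, seq_len, dim = 2, 4, 3  # two sequences of four words
x = rng.normal(size=(batch, seq_len, dim))
w = rng.normal(size=(dim, dim))

# Transformer-style projection: every word of every sequence goes through
# one multiplication, so the effective batch is batch * seq_len rows.
flat = x.reshape(batch * seq_len, dim)
transformer_out = (flat @ w).reshape(batch, seq_len, dim)

# Recurrent-style update (simplified, not a full LSTM): each step needs the
# previous hidden state, so only `batch` rows can be multiplied at a time.
h = np.zeros((batch, dim))
recurrent_steps = 0
for t in range(seq_len):
    h = np.tanh(x[:, t, :] @ w + h)
    recurrent_steps += 1
```

Here the first path issues one multiplication over 8 rows, while the second issues 4 sequential multiplications over 2 rows each.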
![Page 21: Representations Natural Language and Challengeslxmls.it.pt/2019/Natural_Language_Representations... · Natural Language Representations ... Natural Questions - Kwiatkowski et al.,](https://reader030.vdocuments.mx/reader030/viewer/2022041009/5eb4f5791ee4b16a4826ad36/html5/thumbnails/21.jpg)
Basic BERT Recipe
![Page 24: Representations Natural Language and Challengeslxmls.it.pt/2019/Natural_Language_Representations... · Natural Language Representations ... Natural Questions - Kwiatkowski et al.,](https://reader030.vdocuments.mx/reader030/viewer/2022041009/5eb4f5791ee4b16a4826ad36/html5/thumbnails/24.jpg)
GLUE Benchmark
MultiNLI
Premise: Hills and mountains are especially sanctified in Jainism.
Hypothesis: Jainism hates nature.
Label: Contradiction
CoLA
Sentence: The wagon rumbled down the road.
Label: Acceptable
Sentence: The car honked down the road.
Label: Unacceptable
![Page 25: Representations Natural Language and Challengeslxmls.it.pt/2019/Natural_Language_Representations... · Natural Language Representations ... Natural Questions - Kwiatkowski et al.,](https://reader030.vdocuments.mx/reader030/viewer/2022041009/5eb4f5791ee4b16a4826ad36/html5/thumbnails/25.jpg)
SWAG: Zellers, Bisk, Schwartz, Choi, EMNLP 2018
SQuAD: Rajpurkar, Zhang, Lopyrev, Liang, EMNLP 2016
Results: Commonsense Reasoning and Question Answering
![Page 26: Representations Natural Language and Challengeslxmls.it.pt/2019/Natural_Language_Representations... · Natural Language Representations ... Natural Questions - Kwiatkowski et al.,](https://reader030.vdocuments.mx/reader030/viewer/2022041009/5eb4f5791ee4b16a4826ad36/html5/thumbnails/26.jpg)
Ablation Experiments
● Masked LM (compared to left-to-right LM) is very important on some tasks; Next Sentence Prediction is important on other tasks.
● The left-to-right model does very poorly on the word-level task (SQuAD), although this is mitigated by adding a BiLSTM.
![Page 27: Representations Natural Language and Challengeslxmls.it.pt/2019/Natural_Language_Representations... · Natural Language Representations ... Natural Questions - Kwiatkowski et al.,](https://reader030.vdocuments.mx/reader030/viewer/2022041009/5eb4f5791ee4b16a4826ad36/html5/thumbnails/27.jpg)
More is Better
● More data (and training longer) helps => not yet asymptoted
● Bigger model helps a lot
![Page 28: Representations Natural Language and Challengeslxmls.it.pt/2019/Natural_Language_Representations... · Natural Language Representations ... Natural Questions - Kwiatkowski et al.,](https://reader030.vdocuments.mx/reader030/viewer/2022041009/5eb4f5791ee4b16a4826ad36/html5/thumbnails/28.jpg)
Try It Out, Get Faster Training with TPUs
![Page 29: Representations Natural Language and Challengeslxmls.it.pt/2019/Natural_Language_Representations... · Natural Language Representations ... Natural Questions - Kwiatkowski et al.,](https://reader030.vdocuments.mx/reader030/viewer/2022041009/5eb4f5791ee4b16a4826ad36/html5/thumbnails/29.jpg)
Do I Need Full BERT Models for All My Tasks?
Houlsby, Giurgiu, Jastrzebski, Morrone, de Laroussilhe, Gesmundo, Attariyan, Gelly, arXiv, Feb 2019
![Page 30: Representations Natural Language and Challengeslxmls.it.pt/2019/Natural_Language_Representations... · Natural Language Representations ... Natural Questions - Kwiatkowski et al.,](https://reader030.vdocuments.mx/reader030/viewer/2022041009/5eb4f5791ee4b16a4826ad36/html5/thumbnails/30.jpg)
Recently: Even Better Pretraining
![Page 31: Representations Natural Language and Challengeslxmls.it.pt/2019/Natural_Language_Representations... · Natural Language Representations ... Natural Questions - Kwiatkowski et al.,](https://reader030.vdocuments.mx/reader030/viewer/2022041009/5eb4f5791ee4b16a4826ad36/html5/thumbnails/31.jpg)
BERT vs. XLNet
| | BERT | XLNet |
|---|---|---|
| Objective | Masked LM + Next Sentence | Autoregressive LM |
| Masking | Random 15% | Last ⅙ in permutation order |
| Position Encoding | Absolute | Relative |
| Data | 13 GB of text | 126 GB of text |
![Page 32: Representations Natural Language and Challengeslxmls.it.pt/2019/Natural_Language_Representations... · Natural Language Representations ... Natural Questions - Kwiatkowski et al.,](https://reader030.vdocuments.mx/reader030/viewer/2022041009/5eb4f5791ee4b16a4826ad36/html5/thumbnails/32.jpg)
Other BERT-inspired work
What does BERT learn - Tenney et al., ACL 2019
Relation learning - Baldini-Soares et al., ACL 2019
Passage representations - Lee et al., ACL 2019
![Page 33: Representations Natural Language and Challengeslxmls.it.pt/2019/Natural_Language_Representations... · Natural Language Representations ... Natural Questions - Kwiatkowski et al.,](https://reader030.vdocuments.mx/reader030/viewer/2022041009/5eb4f5791ee4b16a4826ad36/html5/thumbnails/33.jpg)
What does BERT know about language?
Tenney et al., ICLR 2019, ACL 2019
● What linguistic relationships?
● Where in the model are they computed?
Classifier-based probing: project model activations into space of linguistic annotations (graph edges)
![Page 34: Representations Natural Language and Challengeslxmls.it.pt/2019/Natural_Language_Representations... · Natural Language Representations ... Natural Questions - Kwiatkowski et al.,](https://reader030.vdocuments.mx/reader030/viewer/2022041009/5eb4f5791ee4b16a4826ad36/html5/thumbnails/34.jpg)
BERT Rediscovers the Classical NLP Pipeline
Probing Representations
High weights for POS in lower layers, then constituents, dependencies, and SRL, followed by entities and coreference as we move up the stack!
BERT improves coref over ELMo (84->91, or 39% relative)
We can trace hypotheses on individual sentences!
“he smoked toronto in the playoffs with six hits, seven walks …”
![Page 35: Representations Natural Language and Challengeslxmls.it.pt/2019/Natural_Language_Representations... · Natural Language Representations ... Natural Questions - Kwiatkowski et al.,](https://reader030.vdocuments.mx/reader030/viewer/2022041009/5eb4f5791ee4b16a4826ad36/html5/thumbnails/35.jpg)
Representing Entities and Relations
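[Figure: relation statements encoded by BERT, some with entity mentions replaced by BLANK, and compared as similar (≈) or dissimilar (≉).]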
Baldini-Soares et al., ACL 2019
![Page 36: Representations Natural Language and Challengeslxmls.it.pt/2019/Natural_Language_Representations... · Natural Language Representations ... Natural Questions - Kwiatkowski et al.,](https://reader030.vdocuments.mx/reader030/viewer/2022041009/5eb4f5791ee4b16a4826ad36/html5/thumbnails/36.jpg)
Matching The Blanks Results
few-shot relation extraction
Baldini-Soares et al., ACL 2019
![Page 37: Representations Natural Language and Challengeslxmls.it.pt/2019/Natural_Language_Representations... · Natural Language Representations ... Natural Questions - Kwiatkowski et al.,](https://reader030.vdocuments.mx/reader030/viewer/2022041009/5eb4f5791ee4b16a4826ad36/html5/thumbnails/37.jpg)
Open Retrieval Question Answering
[Figure: Input → Latent Retrieval (Wikipedia) → Output. “What does the zip in zip code stand for?” → “Zone Improvement Plan”; “How many districts are in Alabama?” → “7”.]
Goal: Learn to efficiently read Wikipedia without any retrieval data.
Motivation: Best known recipe for latent retrieval is TF-IDF filtering + brute force. We can do better.
Key Insight: Pre-train an unsupervised ScaM neural retriever. This enables efficient end-to-end fine-tuning with standard latent variable learning methods.
Lee et al., ACL 2019
![Page 38: Representations Natural Language and Challengeslxmls.it.pt/2019/Natural_Language_Representations... · Natural Language Representations ... Natural Questions - Kwiatkowski et al.,](https://reader030.vdocuments.mx/reader030/viewer/2022041009/5eb4f5791ee4b16a4826ad36/html5/thumbnails/38.jpg)
Open Retrieval Question Answering
Inverse Cloze Task (ICT): Given a sentence (pseudo-query), predict the context (pseudo-evidence)
Passage: Zebras have four gaits: walk, trot, canter and gallop. They are generally slower than horses, but their great stamina helps them outrun predators. When chased, a zebra will zig-zag from side to side, making it more difficult for the predator to attack...
Pseudo Query: They are generally slower than horses, but their great stamina helps them outrun predators.
Pseudo Evidence: Zebras have four gaits: walk, trot, canter and gallop. When chased, a zebra will zig-zag from side to side, making it more difficult for the predator to attack...
In-batch negative example: Poe capitalized on the success of "The Raven" by following it up with his essay "The Philosophy of Composition" (1846), in which he detailed the poem's creation…
In-batch negative example: Gagarin was further selected for an elite training group known as the Sochi Six, from which the first cosmonauts of the Vostok programme would be chosen...
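Constructing one ICT training example from a passage can be sketched as follows. It assumes the passage is already split into sentences; `make_ict_example` is a hypothetical helper, and the two negative passages are abbreviated from the slide.

```python
def make_ict_example(sentences, query_index):
    """Remove one sentence as the pseudo-query; the rest is the pseudo-evidence."""
    query = sentences[query_index]
    evidence = " ".join(s for i, s in enumerate(sentences) if i != query_index)
    return query, evidence

passage = [
    "Zebras have four gaits: walk, trot, canter and gallop.",
    "They are generally slower than horses, but their great stamina "
    "helps them outrun predators.",
    "When chased, a zebra will zig-zag from side to side, making it "
    "more difficult for the predator to attack...",
]
pseudo_query, pseudo_evidence = make_ict_example(passage, query_index=1)

# Evidence blocks belonging to other examples in the same batch act as negatives.
negatives = [
    'Poe capitalized on the success of "The Raven" by following it up with his essay...',
    "Gagarin was further selected for an elite training group known as the Sochi Six...",
]
candidates = [pseudo_evidence] + negatives
label = 0  # index of the true evidence among the candidates
```

A retriever trained on such examples scores `pseudo_query` against every entry in `candidates` and is rewarded for ranking index `label` first.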
Can you be charged for the same crime in two different states?
Wikipedia ?
Progress towards one of the hardest binary text classification tasks today:
![Page 39: Representations Natural Language and Challengeslxmls.it.pt/2019/Natural_Language_Representations... · Natural Language Representations ... Natural Questions - Kwiatkowski et al.,](https://reader030.vdocuments.mx/reader030/viewer/2022041009/5eb4f5791ee4b16a4826ad36/html5/thumbnails/39.jpg)
Realistic Challenge Sets
Natural Questions - Kwiatkowski et al., TACL 2019
Yes/No Questions - Clark et al., NAACL 2019
Identifying Commands - Elkahky et al., EMNLP 2018
![Page 40: Representations Natural Language and Challengeslxmls.it.pt/2019/Natural_Language_Representations... · Natural Language Representations ... Natural Questions - Kwiatkowski et al.,](https://reader030.vdocuments.mx/reader030/viewer/2022041009/5eb4f5791ee4b16a4826ad36/html5/thumbnails/40.jpg)
Natural Questions Motivation
Kwiatkowski et al. TACL 2019
SQuAD dataset
Question: The success of the Britain Can Make It exhibition led to the planning of what exhibition in 1951?
Evidence: ... The success of this exhibition led to the planning of the Festival of Britain (1951). By 1948 most of the collections had been returned to the museum.
Answer: Festival of Britain
Natural Questions
Question: Can you make and receive calls in airplane mode?
Evidence: Airplane mode, … suspends radio-frequency signal transmission by the device, thereby disabling Bluetooth, telephony, and Wi-Fi. GPS may or may not be disabled, because it does not involve transmitting radio waves.
Answer: No
Goal: Provide academia with the first question answering dataset that represents a real question answering problem.
Previous question answering datasets are contrived. E.g. SQuAD's questions often paraphrase evidence text.
Answering real user queries requires much deeper language understanding and world knowledge.
Many questions have multiple acceptable answers: “last hurricane in Massachusetts” has a formal meaning (eye of the storm in MA) and a different colloquial meaning (hurricane force winds in MA). NQ embraces this acceptable variability. Solutions should model the full distribution of possible answers.
![Page 41: Representations Natural Language and Challengeslxmls.it.pt/2019/Natural_Language_Representations... · Natural Language Representations ... Natural Questions - Kwiatkowski et al.,](https://reader030.vdocuments.mx/reader030/viewer/2022041009/5eb4f5791ee4b16a4826ad36/html5/thumbnails/41.jpg)
Challenge: Many Correct Answers
NQ annotators are encouraged to pick the first good answer.
In practice we sometimes get many different answer locations for the same question.
Question: name the substance used to make the filament of bulb
![Page 42: Representations Natural Language and Challengeslxmls.it.pt/2019/Natural_Language_Representations... · Natural Language Representations ... Natural Questions - Kwiatkowski et al.,](https://reader030.vdocuments.mx/reader030/viewer/2022041009/5eb4f5791ee4b16a4826ad36/html5/thumbnails/42.jpg)
Defining Correctness
Wrong annotations are often the result of annotators trying to find an answer when the evidence isn't sufficient.
When all annotators agree that there is enough evidence available to answer a question, the annotations are overwhelmingly correct.
[Figure: two panels, Long answers and Short answers.]
X-axis: proportion of annotations that are non-null for question.
Y-axis: expectation that a non-null annotation's question is in this bucket.
Also broken down, conditioned on annotation being: Correct (C); Correct but debatable (C_d); or Wrong (W).
![Page 43: Representations Natural Language and Challengeslxmls.it.pt/2019/Natural_Language_Representations... · Natural Language Representations ... Natural Questions - Kwiatkowski et al.,](https://reader030.vdocuments.mx/reader030/viewer/2022041009/5eb4f5791ee4b16a4826ad36/html5/thumbnails/43.jpg)
Natural Questions - Status
● First ever release of Google queries.
● 300k training items, 16k for evaluation. Upper bound of 87% on long answers, 76% on short.
● The leaderboard is seeing good activity; the task is quite a bit harder than SQuAD.
![Page 44: Representations Natural Language and Challengeslxmls.it.pt/2019/Natural_Language_Representations... · Natural Language Representations ... Natural Questions - Kwiatkowski et al.,](https://reader030.vdocuments.mx/reader030/viewer/2022041009/5eb4f5791ee4b16a4826ad36/html5/thumbnails/44.jpg)
Williams, Nangia, Bowman, 2018
Prior Approaches to Testing Inference/Reasoning Abilities
![Page 45: Representations Natural Language and Challengeslxmls.it.pt/2019/Natural_Language_Representations... · Natural Language Representations ... Natural Questions - Kwiatkowski et al.,](https://reader030.vdocuments.mx/reader030/viewer/2022041009/5eb4f5791ee4b16a4826ad36/html5/thumbnails/45.jpg)
Clark, Lee, Chang, Kwiatkowski, Collins, Toutanova, NAACL 2019
BoolQ: Naturally Occurring Yes-No Questions
![Page 46: Representations Natural Language and Challengeslxmls.it.pt/2019/Natural_Language_Representations... · Natural Language Representations ... Natural Questions - Kwiatkowski et al.,](https://reader030.vdocuments.mx/reader030/viewer/2022041009/5eb4f5791ee4b16a4826ad36/html5/thumbnails/46.jpg)
Real Problems that Naturally Require Inference to Solve
Question
Passage
Answer
![Page 47: Representations Natural Language and Challengeslxmls.it.pt/2019/Natural_Language_Representations... · Natural Language Representations ... Natural Questions - Kwiatkowski et al.,](https://reader030.vdocuments.mx/reader030/viewer/2022041009/5eb4f5791ee4b16a4826ad36/html5/thumbnails/47.jpg)
Collecting Passages
BoolQ
Document Selection → Paragraph Selection → Answer Selection
Are there blue whales in the Atlantic Ocean?
Yes / No
Pipeline from Natural Questions (Kwiatkowski et al., 2019)
![Page 48: Representations Natural Language and Challengeslxmls.it.pt/2019/Natural_Language_Representations... · Natural Language Representations ... Natural Questions - Kwiatkowski et al.,](https://reader030.vdocuments.mx/reader030/viewer/2022041009/5eb4f5791ee4b16a4826ad36/html5/thumbnails/48.jpg)
Test Set Results
![Page 49: Representations Natural Language and Challengeslxmls.it.pt/2019/Natural_Language_Representations... · Natural Language Representations ... Natural Questions - Kwiatkowski et al.,](https://reader030.vdocuments.mx/reader030/viewer/2022041009/5eb4f5791ee4b16a4826ad36/html5/thumbnails/49.jpg)
Sample Efficiency: MultiNLI > BERT for Small Data
![Page 50: Representations Natural Language and Challengeslxmls.it.pt/2019/Natural_Language_Representations... · Natural Language Representations ... Natural Questions - Kwiatkowski et al.,](https://reader030.vdocuments.mx/reader030/viewer/2022041009/5eb4f5791ee4b16a4826ad36/html5/thumbnails/50.jpg)
Noun-Verb Ambiguity
“lives” / Noun → /laɪvz/
“lives” / Verb → /lɪvz/
flies: NOUN
Mark: VERB
Elkahky, Webster, Andor, Pitler, EMNLP 2018
![Page 51: Representations Natural Language and Challengeslxmls.it.pt/2019/Natural_Language_Representations... · Natural Language Representations ... Natural Questions - Kwiatkowski et al.,](https://reader030.vdocuments.mx/reader030/viewer/2022041009/5eb4f5791ee4b16a4826ad36/html5/thumbnails/51.jpg)
Certain insects can damage plumerias, such as mites, flies, or aphids. NOUN
Mark which area you want to distress. VERB
“A Challenge Set and Methods for Noun-Verb Ambiguity”, EMNLP 2018
![Page 52: Representations Natural Language and Challengeslxmls.it.pt/2019/Natural_Language_Representations... · Natural Language Representations ... Natural Questions - Kwiatkowski et al.,](https://reader030.vdocuments.mx/reader030/viewer/2022041009/5eb4f5791ee4b16a4826ad36/html5/thumbnails/52.jpg)
Accuracy on Noun-Verb Disambiguation
![Page 53: Representations Natural Language and Challengeslxmls.it.pt/2019/Natural_Language_Representations... · Natural Language Representations ... Natural Questions - Kwiatkowski et al.,](https://reader030.vdocuments.mx/reader030/viewer/2022041009/5eb4f5791ee4b16a4826ad36/html5/thumbnails/53.jpg)
Pronunciation of Homographs Accuracy
![Page 54: Representations Natural Language and Challengeslxmls.it.pt/2019/Natural_Language_Representations... · Natural Language Representations ... Natural Questions - Kwiatkowski et al.,](https://reader030.vdocuments.mx/reader030/viewer/2022041009/5eb4f5791ee4b16a4826ad36/html5/thumbnails/54.jpg)
Webster, Recasens, Axelrod, Baldridge, TACL 2019
Kwiatkowski, Palomaki, Redfield, Collins, Parikh, Alberti, Epstein, Polosukhin, Kelcey, Devlin, Lee, Toutanova, Jones, Chang, Dai, Uszkoreit, Le, Petrov, TACL 2019
Released Datasets with “In-the-Wild” Natural Challenges
Mark VERB
Are there blue whales in the Atlantic Ocean? YES
![Page 55: Representations Natural Language and Challengeslxmls.it.pt/2019/Natural_Language_Representations... · Natural Language Representations ... Natural Questions - Kwiatkowski et al.,](https://reader030.vdocuments.mx/reader030/viewer/2022041009/5eb4f5791ee4b16a4826ad36/html5/thumbnails/55.jpg)
Summary