Computational linguistics and NLP: How far from generic linguistics?

Andrey Kutuzov
University of Oslo, Language Technology Group
with thanks to Joachim Nivre and Abigail See

January 17, 2018

Contents

1 What is NLP?

2 Case 1: Redefining parts of speech

3 Case 2: Tracing diachronic semantic shifts

Defining the field

- Computational Linguistics (CL);
- Natural Language Processing (NLP);
- Natural Language Understanding (NLU);
- More or less the same academic field:
  - scientific study of language from a computational perspective.
- Dates back probably to medieval mystics looking for regularities in sacred texts;
- In the modern sense of the word, starts in the XX century:
  - George Zipf (studied statistics of natural language);
  - Noam Chomsky (introduced transformational grammar);
  - machine translation hype in the 1950s.

NLP/CL is booming

[Figure: the number of submissions to the annual Association for Computational Linguistics (ACL) conference]

Recent boost

- In the last 20 years, NLP has seen an incredible boost.
- The main reason is information: ‘oil of the XXI century’;
  - Business wants to process information (especially IT companies);
  - Information very often occurs in the form of (digital) texts.
- Important: it is both an academic and an industrial field!
  - people drift from universities to companies and back.
- Computational linguists contribute to many working systems:
  - machine translation
  - speech recognition
  - web search engines
  - grammar and spell checking
  - virtual personal assistants (Siri, Alexa, Cortana)
  - etc.

Is it linguistics at all?

Differences from ‘traditional’ or ‘generic’ linguistics

- Traditional linguistics usually describes and compares languages.
- NLP is closer to mathematics and engineering: we calculate.
- Building computational models of linguistic phenomena:
  1. ‘rule-based’ (‘hand-crafted’);
  2. ‘data-driven’ (statistical).
- Statistics is at the core of today’s NLP.
- We run experiments to test hypotheses:
  - ‘there are 10 parts of speech in this language’,
  - ‘word co-occurrence information improves document classification’.
- Replicability (the same experiment must always yield the same result);
- Reproducibility (similar experiments should yield comparable results).
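
The contrast between ‘rule-based’ and ‘data-driven’ models can be illustrated with a toy part-of-speech tagger. The sketch below is only an illustration and is not taken from the slides: the tiny corpus, the tag names and the function names (`rule_based_tag`, `train_most_frequent_tag`, `data_driven_tag`) are all assumptions made up for this example.

```python
from collections import Counter, defaultdict

# Toy annotated corpus, invented for illustration: (word, tag) pairs.
TRAIN = [
    ("the", "DET"), ("dog", "NOUN"), ("barks", "VERB"),
    ("a", "DET"), ("cat", "NOUN"), ("sleeps", "VERB"),
    ("the", "DET"), ("walk", "NOUN"), ("walks", "VERB"),
]


def rule_based_tag(word):
    """Rule-based (hand-crafted): the rules are written down by a person."""
    if word in {"the", "a", "an"}:
        return "DET"
    if word.endswith("s"):
        return "VERB"
    return "NOUN"


def train_most_frequent_tag(pairs):
    """Data-driven (statistical): count which tag each word gets most often."""
    counts = defaultdict(Counter)
    for word, tag in pairs:
        counts[word][tag] += 1
    return {word: c.most_common(1)[0][0] for word, c in counts.items()}


def data_driven_tag(model, word, default="NOUN"):
    """Look the word up in the learned table; fall back to a default tag."""
    return model.get(word, default)


model = train_most_frequent_tag(TRAIN)
```

With these assumed rules, `rule_based_tag("dogs")` returns `"VERB"`, because the suffix rule cannot tell a plural noun from a verb form, while the data-driven tagger only knows the words it has counted and falls back to the default tag for anything else. Both weaknesses are the kind of thing the experiments mentioned on the slide are meant to measure.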

Stress on practice

- Research should be practical.
- ‘Show me your code!’
- ‘Show me the scores of your system!’
- Empirical evaluation on particular problems.
- Test data sets.
- Shared tasks (competitions).
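
As a rough illustration of what ‘the scores of your system’ on a test data set can mean, the sketch below computes accuracy for two hypothetical systems against the same gold labels. The labels, the system outputs and the `accuracy` helper are invented for this example; real shared tasks define their own data and metrics.

```python
def accuracy(predictions, gold):
    """Share of test items for which the predicted label equals the gold label."""
    if len(predictions) != len(gold):
        raise ValueError("predictions and gold must have the same length")
    if not gold:
        raise ValueError("the test set must not be empty")
    correct = sum(p == g for p, g in zip(predictions, gold))
    return correct / len(gold)


# Hypothetical held-out test set: gold tags for five tokens.
GOLD = ["DET", "NOUN", "VERB", "DET", "NOUN"]

# Outputs of two hypothetical systems on the same five tokens.
SYSTEM_A = ["DET", "NOUN", "VERB", "DET", "VERB"]
SYSTEM_B = ["DET", "NOUN", "NOUN", "NOUN", "VERB"]

scores = {
    "A": accuracy(SYSTEM_A, GOLD),
    "B": accuracy(SYSTEM_B, GOLD),
}

for name, score in sorted(scores.items()):
    print(f"system {name}: accuracy = {score:.2f}")
```

Scoring every system on the same fixed test set is what makes the numbers comparable, which is the idea behind shared tasks.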

Publishing activities

- Conferences:
  - ACL
  - EMNLP
  - EACL
  - NAACL
  - COLING
  - LREC...
- Journals:
  - ‘Computational Linguistics’ (CL);
  - ‘Transactions of the Association for Computational Linguistics’ (TACL).
- Unlike in other fields, journals are not that important.

7

Page 54: Computational linguistics and NLP: How far from generic ... · Computational linguistics and NLP: How far from generic linguistics? Andrey Kutuzov University of Oslo Language Technology

Publishing activities

- Most of the papers can be found in the Association for Computational Linguistics (ACL) Anthology:
  - https://aclanthology.info/
- Double blind peer review almost everywhere...
- ...recent years: open preprints published online:
  - https://arxiv.org/list/cs.CL/recent

Page 59: Computational linguistics and NLP: How far from generic ... · Computational linguistics and NLP: How far from generic linguistics? Andrey Kutuzov University of Oslo Language Technology

Machine learning

- NLP is now being rapidly transformed by other fields:
  - data science and machine learning.
- Some problems are so complex that we can't formulate exact algorithms for them.
- To solve such problems, one can use machine learning:
  - programs which learn to make correct decisions on some training material and improve with experience;
  - thus, we train our systems on linguistic data (usually large text collections: corpora).
- Artificial neural networks are one of the popular machine learning approaches for language modeling.
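To make ‘programs which learn to make correct decisions on some training material’ concrete, here is a minimal sketch of a Naive Bayes classifier with add-one smoothing that learns to tell English from Norwegian. Naive Bayes is swapped in here purely for brevity (it is not a neural network), and the class name `NaiveBayes` and the four-sentence ‘corpus’ are hypothetical toy assumptions; real systems are trained on far larger corpora.

```python
import math
from collections import Counter, defaultdict


class NaiveBayes:
    """Bag-of-words Naive Bayes classifier with add-one smoothing."""

    def __init__(self):
        self.word_counts = defaultdict(Counter)  # label -> word frequencies
        self.label_counts = Counter()            # label -> number of examples
        self.vocab = set()

    def train(self, examples):
        """Collect word and label statistics from (text, label) pairs."""
        for text, label in examples:
            tokens = text.lower().split()
            self.label_counts[label] += 1
            self.word_counts[label].update(tokens)
            self.vocab.update(tokens)

    def predict(self, text):
        """Return the label with the highest log-probability for the text."""
        tokens = text.lower().split()
        total = sum(self.label_counts.values())
        best_label, best_score = None, -math.inf
        for label in sorted(self.label_counts):
            score = math.log(self.label_counts[label] / total)
            denom = sum(self.word_counts[label].values()) + len(self.vocab)
            for tok in tokens:
                score += math.log((self.word_counts[label][tok] + 1) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Hypothetical toy training corpus: English ("en") vs. Norwegian ("no").
training_corpus = [
    ("the cat sits on the mat", "en"),
    ("a dog runs in the park", "en"),
    ("katten sitter på matten", "no"),
    ("en hund løper i parken", "no"),
]

model = NaiveBayes()
model.train(training_corpus)
print(model.predict("the dog sits"))
print(model.predict("hund i parken"))
```

Nobody wrote a rule saying that ‘hund’ is Norwegian: the decision comes entirely from counts gathered on the training material, and adding more examples would refine those counts.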

Deep learning renaissance

- ‘Deep learning’ is training and using multi-layered artificial neural networks.
- After a long ‘winter’ (since the 60s and 70s), it is now popular again.
- Deep neural approaches are very effective in NLP.
- ‘Do we need anything except neural networks now?’
- Another reason for the recent boost of interest in our discipline.
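A rough sketch of what ‘multi-layered’ means: the tiny network below has one hidden layer and one output layer and computes XOR, a function that a single linear layer cannot represent. The weights are set by hand for illustration (an assumption of this sketch); in actual deep learning they would be learned from data, and the names `layer` and `xor_net` are hypothetical.

```python
def relu(x):
    """Rectified linear unit: pass positive values, zero out negatives."""
    return max(0.0, x)


def layer(inputs, weights, biases, activation):
    """One fully connected layer: weighted sums followed by an activation."""
    return [activation(sum(w * x for w, x in zip(row, inputs)) + b)
            for row, b in zip(weights, biases)]


# Hand-set (not trained) parameters for a 2-2-1 network.
HIDDEN_W = [[1.0, 1.0], [1.0, 1.0]]
HIDDEN_B = [0.0, -1.0]
OUT_W = [[1.0, -2.0]]
OUT_B = [0.0]


def xor_net(x1, x2):
    """Forward pass: inputs -> hidden layer (ReLU) -> linear output."""
    hidden = layer([x1, x2], HIDDEN_W, HIDDEN_B, relu)
    output = layer(hidden, OUT_W, OUT_B, lambda v: v)
    return output[0]


for a, b in [(0, 0), (0, 1), (1, 0), (1, 1)]:
    print(a, b, xor_net(a, b))
```

The hidden layer re-represents the input so that the output layer can separate the cases; stacking more such layers is what makes a network ‘deep’.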

Page 71: Computational linguistics and NLP: How far from generic ... · Computational linguistics and NLP: How far from generic linguistics? Andrey Kutuzov University of Oslo Language Technology

Problems and challenges

NLP has its problems

- equity and diversity;
- traditional reviewing schemes conflicting with ArXiv:
  - how to preserve anonymity?
  - preprint publishing is good for disseminating science and making it open, but...
  - ...people can use ArXiv for flag-planting, and to simply circumvent the peer-review process.
- machine learning models amplifying biases and discrimination in data [Zhao et al., 2017]
- sometimes research success depends on computational power:
  - ‘...do we have enough GPUs?’

Page 79: Computational linguistics and NLP: How far from generic ... · Computational linguistics and NLP: How far from generic linguistics? Andrey Kutuzov University of Oslo Language Technology

Science?

- People wonder:
  - ‘What are your research questions?’
  - ‘Just lots of numbers with very small differences?’
- Is it a science or an engineering discipline?
- Or maybe CL is a science and NLP is its application to empirical problems?
- Motivation for research can be different:
  1. trying to provide a computational explanation for a linguistic or psycholinguistic phenomenon;
  2. trying to provide a working component of a speech or natural language system.
- Do our top-tier conferences belong to CL or to NLP then?
- The overwhelming majority of papers are empirical today.
- No final answer yet.

Page 88: Computational linguistics and NLP: How far from generic ... · Computational linguistics and NLP: How far from generic linguistics? Andrey Kutuzov University of Oslo Language Technology

Language IS complicated

‘...human language is magnificent, and complex, and challenging. It has tons of nuances, and corners, and oddities, and surprises. While natural language processing researchers, and natural language generation researchers—and linguists! who do a lot of the heavy lifting—made some impressive advances towards our understanding of language and how to process it, we are still just barely scratching the surface on this.’


Interaction with traditional linguistics

Linguistics is back
- NLP is re-embracing linguistic structure now;
- Even the strongest proponents of purely data-driven approaches acknowledge it;
- Linguistic structures induced into machine learning systems reduce search space, bringing improvements [Dyer, 2017];
- Language is not just sequences of words / characters / bytes.

- But what can NLP give to traditional linguistics?
- Or to humanities in general?
- I will now outline 2 case studies from my own research.


Contents

1 What is NLP?

2 Case 1: Redefining parts of speech

3 Case 2: Tracing diachronic semantic shifts


‘Redefining parts of speech with word embeddings’

(Presented at CoNLL 2016, [Kutuzov et al., 2016])

‘Grammatical categories exist along a continuum which does not exhibit sharp boundaries between the categories’ [Houston, 1985]

- In natural languages, part-of-speech boundaries are flexible:
  - Participles in English are in many respects both verbs and adjectives
  - Determiners and possessive pronouns overlap
- Finding groups of words ‘on the verge’ between different PoS can reveal inconsistencies in corpus annotation.
- For that, we employed word embeddings (as in word2vec).


Brief recap on distributional semantic models

- based on distributions of word co-occurrences in large training corpora;
- represent word meaning as dense lexical vectors (word embeddings);
- words occurring in similar contexts have similar vectors;
- vector representations are continuous:
  - words are in common vector space and can be more or less close to each other.
- one can find nearest semantic neighbors of a given word by calculating cosine similarity between vectors.
- more on word embeddings elsewhere:
  - https://www.academia.edu/35685709/Teaching_computers_what_words_mean_modern_word_embedding_models
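The nearest-neighbor lookup mentioned in the recap can be sketched in a few lines of plain Python. This is only an illustration: the three-dimensional vectors below are invented, the helper names `cosine_similarity` and `nearest_neighbors` are hypothetical, and real models use hundreds of dimensions learned from a corpus.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def nearest_neighbors(word, vectors, topn=3):
    """Rank all other words by cosine similarity to `word`, highest first."""
    query = vectors[word]
    scored = [
        (other, cosine_similarity(query, vec))
        for other, vec in vectors.items()
        if other != word
    ]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)[:topn]


# Toy vectors, invented for illustration only (not taken from any real model).
toy_vectors = {
    "love_VERB": [0.9, 0.1, 0.0],
    "adore_VERB": [0.8, 0.2, 0.1],
    "love_NOUN": [0.5, 0.5, 0.1],
    "table_NOUN": [0.0, 0.9, 0.4],
}

neighbors = nearest_neighbors("love_VERB", toy_vectors)
for other, score in neighbors:
    print(f"{other}\t{score:.3f}")
```

Cosine similarity ignores vector length and compares only direction, which is why it is the usual choice for comparing embeddings of words with very different frequencies.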


Try yourself!

Word embedding models for English and Norwegian online
You can try our WebVectors demo service:

http://vectors.nlpl.eu/explore/embeddings/

(mobile-friendly)


Case 1: Redefining parts of speech

General idea
- PoS can be inferred from word embeddings;
- Classification problem: word vector as input, PoS as output;
- Words with incorrect predictions are ‘outliers’: their distributional patterns differ from other words in the same class.

Data
British National Corpus (BNC):
- About 89M words;
- We replaced words with their lemmas and Universal PoS tags:
  - ‘love_VERB’
  - ‘love_NOUN’

Tags used:
ADJ, ADP, ADV, AUX, CONJ, DET, INTJ, NOUN, NUM, PART, PRON, PROPN, SCONJ, SYM, VERB, X
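The ‘lemma_TAG’ token format described above could be produced with a helper along these lines. This is a minimal sketch: the function names `make_token` and `split_token` are hypothetical, and the slides do not say how the actual preprocessing was implemented.

```python
# The tag inventory listed on the slide.
UPOS_TAGS = {
    "ADJ", "ADP", "ADV", "AUX", "CONJ", "DET", "INTJ", "NOUN",
    "NUM", "PART", "PRON", "PROPN", "SCONJ", "SYM", "VERB", "X",
}


def make_token(lemma, upos):
    """Join a lemma and its Universal PoS tag into one vocabulary item."""
    if upos not in UPOS_TAGS:
        raise ValueError(f"unknown tag: {upos}")
    return f"{lemma}_{upos}"


def split_token(token):
    """Recover (lemma, tag); split on the last underscore only."""
    lemma, _, upos = token.rpartition("_")
    return lemma, upos


print(make_token("love", "VERB"))
print(split_token("love_NOUN"))
```

Keeping the tag inside the token means that ‘love_VERB’ and ‘love_NOUN’ get separate embeddings, so each vector can later be classified on its own.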


Workflow
- Continuous Skipgram model [Mikolov et al., 2013] to learn word embeddings from the BNC;
- Logistic regression multinomial classifier trained to predict PoS based on embeddings:
  - The training set: the 10 000 most frequent words in the BNC;
  - The test set: the next 17 000 most frequent words;
  - The 2nd test set: tokens from the Universal Dependencies Treebank
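The classification step of this workflow can be illustrated with a small multinomial logistic regression written from scratch. Everything below is an assumption made for illustration: the two-dimensional ‘embeddings’ are synthetic clusters rather than Skipgram vectors, only three tags are used, and the training loop is plain batch gradient descent, not the implementation used in the paper.

```python
import math
import random

TAGS = ["NOUN", "VERB", "ADJ"]


def softmax(scores):
    """Turn raw class scores into probabilities that sum to 1."""
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def class_scores(W, b, x):
    """One linear score per class for input vector x."""
    return [sum(w * xi for w, xi in zip(row, x)) + bias for row, bias in zip(W, b)]


def train(X, y, n_classes, epochs=100, lr=0.5):
    """Fit multinomial logistic regression with batch gradient descent."""
    dim, n = len(X[0]), len(X)
    W = [[0.0] * dim for _ in range(n_classes)]
    b = [0.0] * n_classes
    for _ in range(epochs):
        grad_W = [[0.0] * dim for _ in range(n_classes)]
        grad_b = [0.0] * n_classes
        for x, label in zip(X, y):
            probs = softmax(class_scores(W, b, x))
            for k in range(n_classes):
                err = probs[k] - (1.0 if k == label else 0.0)
                grad_b[k] += err
                for j in range(dim):
                    grad_W[k][j] += err * x[j]
        for k in range(n_classes):
            b[k] -= lr * grad_b[k] / n
            for j in range(dim):
                W[k][j] -= lr * grad_W[k][j] / n
    return W, b


def predict(W, b, x):
    """Index of the highest-scoring class."""
    scores = class_scores(W, b, x)
    return scores.index(max(scores))


# Synthetic stand-in for word vectors: one noisy cluster per tag.
rng = random.Random(0)
centers = [(2.0, 0.0), (0.0, 2.0), (-2.0, -2.0)]
X, y = [], []
for i in range(90):
    label = i % 3
    cx, cy = centers[label]
    X.append([rng.gauss(cx, 0.3), rng.gauss(cy, 0.3)])
    y.append(label)

# Mimic the split by frequency rank: earlier items train, later items test.
X_train, y_train = X[:60], y[:60]
X_test, y_test = X[60:], y[60:]

W, b = train(X_train, y_train, n_classes=len(TAGS))
predictions = [predict(W, b, x) for x in X_test]
accuracy = sum(p == g for p, g in zip(predictions, y_test)) / len(y_test)

# Misclassified items are the 'outliers' worth inspecting by hand.
outliers = [
    (i, TAGS[g], TAGS[p])
    for i, (p, g) in enumerate(zip(predictions, y_test))
    if p != g
]
print(f"accuracy: {accuracy:.2f}, outliers: {len(outliers)}")
```

With real embeddings the interesting output is the list of outliers rather than the accuracy figure, since the misclassified words are what the later slides analyze.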


Classification results

Classifier F-score (training set)

Training set
  Baseline 1-feature classifier               0.22
  Logistic regression                         0.98

Test set
  Logistic regression                         0.91

UD Treebank test set
  Logistic regression (OOV words omitted)     0.99
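For readers unfamiliar with the metric, an F-score over several PoS classes can be computed as below. The slides do not say how the per-class scores were averaged, so the macro-averaging here is an assumption, and the gold and predicted tags are toy data, not results from the paper.

```python
def macro_f1(gold, predicted):
    """Unweighted mean of per-class F1 scores."""
    labels = sorted(set(gold) | set(predicted))
    per_class = []
    for label in labels:
        pairs = list(zip(gold, predicted))
        tp = sum(1 for g, p in pairs if g == label and p == label)
        fp = sum(1 for g, p in pairs if g != label and p == label)
        fn = sum(1 for g, p in pairs if g == label and p != label)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        if precision + recall:
            per_class.append(2 * precision * recall / (precision + recall))
        else:
            per_class.append(0.0)
    return sum(per_class) / len(per_class)


# Toy example, invented for illustration: one VERB is predicted as ADJ.
gold = ["NOUN", "NOUN", "VERB", "VERB", "ADJ", "ADJ"]
predicted = ["NOUN", "NOUN", "VERB", "ADJ", "ADJ", "ADJ"]
score = macro_f1(gold, predicted)
print(f"macro F1: {score:.3f}")
```

Here NOUN scores 1.0, VERB 2/3 and ADJ 0.8, so the macro average is 37/45, roughly 0.822.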


Not from this crowd: analyzing outliers

Interesting errors with high frequency
- VERB → ADJ: set of verbs dominantly used in passive: ‘to intertwine’, ‘to disillusion’;
- NOUN → NUM reveals amounts and percentages (‘£70’, ‘33%’, ‘$1’) tagged as nouns: a controversial decision;
- NUM → NOUN is mostly years and decades (‘the sixties’) tagged in the BNC as numerals;
  - In the BNC, ‘£50’ is a noun, but ‘1776’ a numeral!
  - Possible minor inconsistencies in the annotation strategy;
  - similar problem in the Penn Treebank [Manning, 2011].
- ADV → ADJ systematic error: adjectives like ‘plain’, ‘clear’ or ‘sharp’ erroneously tagged in the corpus as adverbs.
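Error types such as ‘NOUN → NUM’ can be surfaced by grouping misclassified words by their (corpus tag, predicted tag) pair. The sketch below assumes a list of `(word, corpus_tag, predicted_tag)` records; the records themselves are hypothetical and only loosely echo the examples on the slide, and `group_outliers` is an illustrative name.

```python
from collections import Counter, defaultdict


def group_outliers(records):
    """Count misclassifications per (corpus tag, predicted tag) pair."""
    counts = Counter()
    examples = defaultdict(list)
    for word, corpus_tag, predicted_tag in records:
        if corpus_tag != predicted_tag:
            pair = (corpus_tag, predicted_tag)
            counts[pair] += 1
            examples[pair].append(word)
    return counts, examples


# Hypothetical classifier output, for illustration only.
records = [
    ("£70", "NOUN", "NUM"),
    ("33%", "NOUN", "NUM"),
    ("$1", "NOUN", "NUM"),
    ("sixties", "NUM", "NOUN"),
    ("plain", "ADV", "ADJ"),
    ("seven", "NUM", "NUM"),
    ("table", "NOUN", "NOUN"),
]

counts, examples = group_outliers(records)
for (corpus_tag, predicted_tag), n in counts.most_common():
    words = ", ".join(examples[(corpus_tag, predicted_tag)])
    print(f"{corpus_tag} -> {predicted_tag}: {n} ({words})")
```

Sorting the pairs by count puts the most frequent error types first, which is where systematic annotation decisions are most likely to show up.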


Not from this crowd: analyzing outliers

Interesting errors with high coverage
• SCONJ → ADV: ‘seeing’ and ‘immediately’. Clear tagging errors, mostly in initial positions: ‘Immediately, she lowered the gun’;
• ADP → ADJ: a separate word group: ‘cross’, ‘pre’ and ‘pro’ (‘Did anyone encounter any trouble from Hibs fans in Edinburgh preseason?’). Closer to adjectives or adverbs than to prepositions?

Intermediate findings

1. ‘Boundary cases’ detected by classifying embeddings reveal sub-classes of words on the verge between different PoS.
2. We can quickly discover systematic errors or inconsistencies in PoS annotations, whether they be automatic or manual.
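A minimal sketch of how such outliers could be surfaced: compare the tag in the corpus with the tag predicted by the embedding-based classifier and count the (corpus tag → predicted tag) pairs. The function name `confusion_pairs` and the toy word lists are illustrative assumptions, not the BNC data or the authors' code.

```python
from collections import Counter, defaultdict


def confusion_pairs(gold_tags, predicted_tags):
    """Count (gold -> predicted) pairs for every word the classifier disagrees on."""
    errors = Counter()
    examples = defaultdict(list)
    for word, gold in gold_tags.items():
        predicted = predicted_tags.get(word)
        if predicted is not None and predicted != gold:
            errors[(gold, predicted)] += 1
            examples[(gold, predicted)].append(word)
    return errors, examples


# Hypothetical toy annotations (corpus tags) and classifier predictions.
gold = {"£70": "NOUN", "33%": "NOUN", "sixties": "NUM",
        "plain": "ADV", "dog": "NOUN", "run": "VERB"}
pred = {"£70": "NUM", "33%": "NUM", "sixties": "NOUN",
        "plain": "ADJ", "dog": "NOUN", "run": "VERB"}

errors, examples = confusion_pairs(gold, pred)
# Most frequent error types first: these are the candidates for manual inspection.
for (gold_tag, predicted_tag), count in errors.most_common():
    print(f"{gold_tag} -> {predicted_tag}: {count} {examples[(gold_tag, predicted_tag)]}")
```

Sorting the pairs by frequency gives a short list of error types to inspect by hand, which is where annotation inconsistencies of the kind described above would be expected to show up.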


Embeddings as PoS predictors

What if we employ a KNN classifier instead of logistic regression?
• Worse: accuracy 0.913 on the training set, 0.81 on the test set;
• K-neighbors fails to separate important features from all the others and uses all the dimensions equally;
• logistic regression learns to find the relevant features.

How many features are really important for the classifier?
• We ranked all embedding components (features, vector dimensions) by their correlation to PoS class;
• Then, trained classifiers on more and more of the top-ranked features;
• Measured their accuracy on the training set.
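The ranking procedure above can be sketched on toy data. Everything here is an assumption made for illustration: a binary class instead of the full PoS tag set, absolute Pearson correlation as the ranking score, synthetic 6-dimensional vectors, and a nearest-centroid classifier standing in for logistic regression.

```python
import math
import random


def pearson(xs, ys):
    """Pearson correlation between two equally long numeric sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y) if var_x and var_y else 0.0


def rank_features(vectors, labels):
    """Return dimension indices sorted by |correlation| with the class label."""
    dims = len(vectors[0])
    scores = [abs(pearson([v[d] for v in vectors], labels)) for d in range(dims)]
    return sorted(range(dims), key=lambda d: scores[d], reverse=True)


def centroid_accuracy(vectors, labels, features):
    """Training-set accuracy of a nearest-centroid classifier using only `features`."""
    centroids = {}
    for cls in set(labels):
        rows = [v for v, y in zip(vectors, labels) if y == cls]
        centroids[cls] = [sum(r[d] for r in rows) / len(rows) for d in features]
    correct = 0
    for v, gold in zip(vectors, labels):
        best = min(centroids, key=lambda c: sum(
            (v[d] - m) ** 2 for d, m in zip(features, centroids[c])))
        correct += best == gold
    return correct / len(labels)


# Synthetic data: dimensions 0 and 1 carry the class signal, the rest are noise.
rng = random.Random(0)
labels = [i % 2 for i in range(200)]
vectors = []
for y in labels:
    sign = 1.0 if y else -1.0
    vectors.append([sign + rng.gauss(0, 0.5), 0.8 * sign + rng.gauss(0, 0.5)]
                   + [rng.gauss(0, 1) for _ in range(4)])

ranking = rank_features(vectors, labels)
# Accuracy as a function of the number of top-ranked features used.
for k in range(1, len(ranking) + 1):
    print(k, round(centroid_accuracy(vectors, labels, ranking[:k]), 3))
```

In this toy setting the signal is concentrated in two dimensions by construction; the slides report the opposite situation for real embeddings, where accuracy keeps growing as more components are added.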


Embeddings as PoS predictors

[Figure: classifier accuracy depending on the number of vector components used]

Part-of-speech affiliation is distributed among many components of the word embeddings: it is not concentrated in a few features.


Case 1: Redefining parts of speech

Summary for Case 1
• Word co-occurrences yield robust data about part-of-speech word clusters;
• This is precisely because part-of-speech boundaries are a continuum;
• PoS is rather a non-categorical linguistic phenomenon;
• This knowledge is distributed among many vector components (at least 100 in our case of a 300-dimensional model).



Armed conflicts in time

‘Temporal dynamics of semantic relations in word embeddings: an application to predicting armed conflict participants’ (presented at EMNLP 2017, [Kutuzov et al., 2017])

General overview
We employed diachronic word embedding models in the task of temporal analogical reasoning for armed conflict relations, spanning over 16 years (1994–2010).

UCDP data


Armed conflicts in time

Gold standard
• Ground truth: the UCDP/PRIO Armed Conflict Dataset (http://ucdp.uu.se/) maintained by the Uppsala Conflict Data Program and the Peace Research Institute Oslo.
• A manually annotated geographical and temporal dataset with information on armed conflicts all over the world from 1946 to the present [Gleditsch et al., 2002].
• Particularly, the UCDP Conflict Termination dataset [Kreutz, 2010]:
  • starting and ending dates of armed conflicts between years 1994 and 2010;
  • 2 sides in each: sideA is a government, sideB is an insurgent group.
• Resulting test set of 673 conflicts, with 137 unique Location–Insurgent pairs.
• We know what armed groups were active and when;
• ...now we can try to extract the same data directly from texts.


Our task: temporal analogical reasoning

Diachronic cultural shifts and one-to-many armed conflict relations between typed named entities.

• Locations:
  1. India (2003)
  2. India (2003)
  3. Uganda (2003)
  4. Iraq (2004)
• Armed groups:
  1. Kashmir Liberation Front (2003)
  2. ULFA (2003)
  3. Lord’s Resistance Army (2003)
  4. ???

(the correct answer(s): Ansar al-Islam, al-Mahdi Army and Islamic State).


Case 2: Tracing diachronic semantic shifts

Incremental diachronic word embeddings

[Figure: yearly corpora for 1994, 1995 and 1996, each passed through CBOW training to produce model 1994, model 1995 and model 1996]


Case 2: Tracing diachronic semantic shifts

The essence of the approach
• Diachronic CBOW word embedding models [Mikolov et al., 2013];
• trained incrementally on the English Gigaword news corpus [Parker et al., 2011];
• years 1994–2010 (yearly subcorpora of about 250–320M content words each);
• learned linear projections from the embeddings of locations to the embeddings of armed groups in each year;
• projections (transformation matrices) are applied to the model from the next year.


Case 2: Tracing diachronic semantic shifts

Location–insurgent relations (‘semantic directions’) in t-SNE

[Figure: three t-SNE panels: Year 2000 model, Year 2001 model, Year 2002 model]


Case 2: Tracing diachronic semantic shifts

Linear projections
• Linear regression minimizing the error in transforming one set of vectors into another:

| Input vectors     | Learning transformation  | Target vectors       | Result                                    |
| Location 1 (1994) | Solving normal equations | Armed group 1 (1994) | Linear transformation matrix (projection) |
| Location 2 (1994) |                          | Armed group 2 (1994) |                                           |
| ...               |                          | ...                  |                                           |
| Location n (1994) |                          | Armed group n (1994) |                                           |

• The learned transformation matrix from 1994 can predict an armed group vector from a location vector for 1995, etc.;
• can be trained either on all the conflicts from the past and present years (up-to-now)...
• ...or only on the salient conflicts: active in the last year (previous).
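To make the "solving normal equations" step concrete, here is a minimal pure-Python sketch on 2-dimensional toy vectors. It computes W = (XᵀX)⁻¹XᵀY for location vectors X and armed-group vectors Y, then applies W to a new location vector. The helper names (`learn_projection`, `project`) and the toy numbers are assumptions for illustration; real embeddings have hundreds of dimensions, and this sketch uses no regularization, which may differ from the authors' setup.

```python
def transpose(m):
    return [list(col) for col in zip(*m)]


def matmul(a, b):
    bt = transpose(b)
    return [[sum(x * y for x, y in zip(row, col)) for col in bt] for row in a]


def solve(a, b):
    """Solve A·W = B for W by Gauss-Jordan elimination with partial pivoting."""
    n = len(a)
    aug = [list(a[i]) + list(b[i]) for i in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("X^T X is singular; more training pairs are needed")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [x / p for x in aug[col]]
        for r in range(n):
            if r != col:
                f = aug[r][col]
                aug[r] = [x - f * y for x, y in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def learn_projection(locations, groups):
    """Least-squares W minimizing ||X·W - Y||, via the normal equations."""
    xt = transpose(locations)
    return solve(matmul(xt, locations), matmul(xt, groups))


def project(vector, w):
    """Predict an armed-group vector from a location vector."""
    return matmul([vector], w)[0]


# Hypothetical "year t" training pairs: each group vector is the location
# vector rotated by 90 degrees, so the true projection is known exactly.
locations_t = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [2.0, 1.0]]
groups_t = [[0.0, 1.0], [-1.0, 0.0], [-1.0, 1.0], [-1.0, 2.0]]

w = learn_projection(locations_t, groups_t)
# Apply the year-t projection to a location vector from the "year t+1" model.
predicted_group = project([3.0, 2.0], w)
print(w, predicted_group)
```

In the setting described above, the predicted vector would then be compared with the vectors of candidate armed groups in the next year's model, and the nearest ones taken as the answer.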


Case 2: Tracing diachronic semantic shifts

Evaluation scores

                      Only in-vocabulary pairs              All pairs, including OOV
                      up-to-now         previous          up-to-now         previous
Training mode         @1    @5    @10   @1    @5    @10   @1    @5    @10   @1    @5    @10
Separate              0.0   0.7   2.1   0.5   1.1   2.4   0.0   0.5   1.6   0.4   0.8   1.8
Cumulative            1.7   8.3   13.8  2.9   9.6   15.2  1.5   7.4   12.2  2.5   8.5   13.4
Incremental static    54.9  82.8  90.1  60.4  79.6  84.8  20.8  31.5  34.2  23.0  30.3  32.2
Incremental dynamic   32.5  64.5  72.2  42.6  64.8  71.5  28.1  56.1  62.9  37.3  56.7  62.6

• Average accuracies of predicting next-year armed groups from locations.

• 3 baselines and the proposed incremental dynamic approach.
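One plausible way to compute the @1, @5 and @10 columns is sketched below. It assumes that a prediction counts as correct at k when the true armed group is among the k candidates closest to the projected vector by cosine similarity; the slide does not spell this out, so treat the metric definition, the function name `accuracy_at_k` and the toy vectors as assumptions.

```python
import numpy as np


def accuracy_at_k(predicted, candidates, gold_indices, k):
    """Share of predictions whose gold item is among the k nearest candidates.

    predicted: (n, dim) projected vectors.
    candidates: (m, dim) vectors of all candidate target items.
    gold_indices: for each prediction, the row of the correct candidate.
    """
    # Cosine similarity = dot product of L2-normalized vectors.
    p = predicted / np.linalg.norm(predicted, axis=1, keepdims=True)
    c = candidates / np.linalg.norm(candidates, axis=1, keepdims=True)
    sims = p @ c.T
    # Indices of the k most similar candidates for every prediction.
    top_k = np.argsort(-sims, axis=1)[:, :k]
    hits = [gold in row for gold, row in zip(gold_indices, top_k)]
    return float(np.mean(hits))


# Hypothetical toy example: 3 candidate vectors, 2 predictions.
candidates = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
predicted = np.array([[0.9, 0.1], [0.6, 0.5]])
gold = [0, 1]

for k in (1, 2, 3):
    print(k, accuracy_at_k(predicted, candidates, gold, k))
```

In the toy data the first prediction is closest to its gold candidate, while the second ranks its gold candidate last, so the score only reaches 1.0 once k covers all three candidates.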


Case 2: Tracing diachronic semantic shifts

Summary for Case 2

1. Word embeddings can be used to trace temporal dynamics of semantic relations in word pairs.
   • This can help people in political science and peace studies to automate mining their data.

2. The necessary prerequisites:
   • incremental updating of the models with new textual data (not training from scratch);
   • expanding the models’ vocabulary.

Now you can decide for yourself whether NLP/CL is a science or an engineering discipline :-)


Q&A

Thank you for your attention!
Questions are welcome.

Computational linguistics and NLP: How far are they from generic linguistics?

http://vectors.nlpl.eu/explore/embeddings/

Andrey Kutuzov ([email protected])
Language Technology Group
University of Oslo


References I

Dyer, C. (2017). Should neural network architecture reflect linguistic structure? In Proceedings of the 21st Conference on Computational Natural Language Learning (CoNLL 2017), page 1. Association for Computational Linguistics.

Gleditsch, N. P., Wallensteen, P., Eriksson, M., Sollenberg, M., and Strand, H. (2002). Armed conflict 1946-2001: A new dataset. Journal of Peace Research, 39(5):615–637.

Houston, A. C. (1985). Continuity and change in English morphology: The variable (ING). PhD thesis, University of Pennsylvania.


References II

Kreutz, J. (2010). How and when armed conflicts end: Introducing the UCDP conflict termination dataset. Journal of Peace Research, 47(2):243–250.

Kutuzov, A., Velldal, E., and Øvrelid, L. (2016). Redefining part-of-speech classes with distributional semantic models. In Proceedings of The 20th SIGNLL Conference on Computational Natural Language Learning, pages 115–125. Association for Computational Linguistics.


References III

Kutuzov, A., Velldal, E., and Øvrelid, L. (2017). Temporal dynamics of semantic relations in word embeddings: an application to predicting armed conflict participants. In Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, pages 1824–1829. Association for Computational Linguistics.

Manning, C. D. (2011). Part-of-speech tagging from 97% to 100%: is it time for some linguistics? In Computational Linguistics and Intelligent Text Processing, pages 171–189. Springer.


References IV

Mikolov, T., Sutskever, I., Chen, K., Corrado, G. S., and Dean, J. (2013). Distributed representations of words and phrases and their compositionality. Advances in Neural Information Processing Systems 26.

Parker, R., Graff, D., Kong, J., Chen, K., and Maeda, K. (2011). English Gigaword Fifth Edition LDC2011T07. Technical report, Linguistic Data Consortium, Philadelphia.

Zhao, J., Wang, T., Yatskar, M., Ordonez, V., and Chang, K.-W. (2017). Men also like shopping: Reducing gender bias amplification using corpus-level constraints. In Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, pages 2979–2989. Association for Computational Linguistics.
