ANNIS: Search and Visualization in Multilayer Linguistic Corpora
Carolin Odebrecht & Florian Zipser
Humboldt-Universität zu Berlin
ANNIS workshop
2014-08-26
A brief introduction
● Search and Visualization in Multilayer Linguistic Corpora
  – Imports existing corpora
    ● Corpora must already be annotated; ANNIS only uses what is there
    ● No NLP!
A brief introduction
● Search and Visualization in Multilayer Linguistic Corpora
  – Makes corpora searchable
    ● One query language for all corpora (AQL)
    ● Abstraction over linguistic data necessary
    ● But: Corpora have different annotations → query has to match the annotations
A brief introduction
● Search and Visualization in Multilayer Linguistic Corpora
  – Displays corpora
    ● Many visualizations available
    ● Corresponding to type of annotation (syntactic trees, phrase trees (RST), grids, coreferences ...)
A brief introduction
● What ANNIS cannot do
  – Does not know how to speak natural language
    → so you have to learn AQL
  – ANNIS does not know any semantics
    → "NN", "NP", "sentence", "word", "my favorite annotation" … are just sequences of characters
  – You need to be exact
    → e.g. "POS" != "pos" and "NN" != "NN " (note the trailing blank)
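The exactness requirement can be illustrated outside ANNIS with plain string comparison. The Python sketch below is only an analogy and says nothing about how ANNIS is implemented; the helper name `same_annotation` is hypothetical.

```python
def same_annotation(query_value: str, corpus_value: str) -> bool:
    """Return True only if both strings are identical, character for character."""
    return query_value == corpus_value


print(same_annotation("POS", "pos"))  # False: upper vs. lower case
print(same_annotation("NN", "NN "))   # False: the second value has a trailing blank
print(same_annotation("NN", "NN"))    # True
```

Neither letter case nor whitespace is normalized, so a query has to reproduce the annotation name and value exactly as they are stored in the corpus.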
ANNIS basics
[Screenshots of the ANNIS web interface; the labelled elements are:]
● Enter query
● Corpus list
● Previous queries
● Virtual keyboard (e.g. Arabic)
● Sample queries (corresponding to corpus)
● Query result
● Visualizations
● Corpus metadata (corpus metadata window)
● Document metadata (document metadata window)
ANNIS basics
● Basic principles of AQL (ANNIS Query Language)
  – Attributes and values
    ● Searching for exact character sequences
    ● Searching for patterns
  – Combinatory search
Demo corpus
● Corpus for demonstration: pcc2 (a subcorpus of pcc)
  https://korpling.german.hu-berlin.de/annis3/#_c=cGNjMg
● Potsdam Commentary Corpus
  – German newspaper commentaries from the 'Märkische Allgemeine Zeitung'
    https://www.ling.uni-potsdam.de/acl-lab/Forsch/pcc/pcc.html
  – Multiple annotations
ANNIS basics
● Different types of annotations
  – Token annotation
  – Span annotation
  – Pointing relation
  – Hierarchy annotation (trees)
[Figure: schematic of tokens, spans, a node, an edge and annotation keys]
Exact word forms
● Token annotation
  – Exact sequence: searching for a word form
      "Jugendlichen"   3 hits
      "jugendlichen"   0 hits
    → "jugendlichen" is shorthand for tok="jugendlichen"
Exact token annotation
● Token annotation
  – Exact sequence: searching for an exact part-of-speech tag
      pos = "NN"   (attribute = value)
  – Attributes can have more than one value
  – Searching for all values of an attribute
Exact token annotation
● Token annotation
  – Exact sequence: searching for an exact part-of-speech tag
      pos="NN"     62 hits
      pos="ADJA"   18 hits
  – Searching for all values of an attribute
      pos          399 hits
Exact span annotation
● Span annotation
  – Exact sequence: searching for sentences
      Sent="s"   28 hits
Metadata
● Sent="s"   28 hits
  – It is necessary to know which annotations are in a corpus
Pattern
● Token annotation
  – Patterns
      .   matches any single character
      *   zero or more of the preceding element
    searching for the beginning of a word
      /Jugend.*/   5 hits ("Jugendlichen": 3 hits), e.g. Jugendlichen, Jugendliche
      /jugend.*/   0 hits ("jugendlichen": 0 hits)
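The behaviour of `.` and `*` can be tried out with Python's `re` module. This is a sketch under two assumptions: that a pattern has to match the complete annotation value (modelled here with `re.fullmatch`), and that Python's regex flavour is close enough to the one ANNIS uses for these simple patterns. The token list is made up and is not taken from pcc2.

```python
import re

# Made-up token list for illustration; not taken from pcc2.
tokens = ["Jugendlichen", "Jugendliche", "Jugend", "jugendlich", "Tugend"]


def search(pattern: str, values: list[str]) -> list[str]:
    """Return the values that the pattern matches completely."""
    return [v for v in values if re.fullmatch(pattern, v)]


print(search(r"Jugend.*", tokens))  # ['Jugendlichen', 'Jugendliche', 'Jugend']
print(search(r"jugend.*", tokens))  # ['jugendlich']
print(search(r"Jugend", tokens))    # ['Jugend']
```

Under the whole-value assumption, the trailing `.*` is what turns "exactly Jugend" into "anything beginning with Jugend". Patterns stay case-sensitive, just like exact searches.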
Pattern
● Token annotation
  – Patterns
    searching for all nouns
      pos=/N./     includes NN & NE       73 hits (pos="NN": 62 hits)
    searching for all adjectives
      pos=/ADJ./   includes ADJA & ADJD   32 hits (pos="ADJA": 18 hits)
Relations between annotations
● Span annotation
  searching for all NPs
    cat="NP"   41 hits (pos="NN": 62 hits)
    e.g. Die Jugendlichen in Zossen
Relations between annotations
● Relations between attributes
  searching for all NPs which contain a preposition
    cat="NP"     41 hits
    pos="APPR"   19 hits
    e.g. Die Jugendlichen in Zossen
  → no relation between the two pieces of information!
Relations between annotations
● Relations between attributes
  searching for all NPs which contain a preposition
    cat="NP"     (first search term, referenced as #1)
    pos="APPR"   (second search term, referenced as #2)
    e.g. Die Jugendlichen in Zossen
  → NP includes APPR
Relations between annotations
● Relations between attributes
  searching for all NPs which contain a preposition
    cat="NP" & pos="APPR" & #1 _i_ #2
    e.g. Die Jugendlichen in Zossen
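One way to picture the inclusion relation is to treat every annotation as covering a range of token positions. The Python sketch below is a toy model, not ANNIS code: the `Span` type, the `includes` helper and the token indices are all assumptions made for illustration.

```python
from typing import NamedTuple


class Span(NamedTuple):
    label: str
    start: int  # index of the first covered token
    end: int    # index of the last covered token (inclusive)


def includes(a: Span, b: Span) -> bool:
    """True if every token covered by b is also covered by a."""
    return a.start <= b.start and b.end <= a.end


# Toy indices for "Die Jugendlichen in Zossen" (tokens 0..3).
np_span = Span('cat="NP"', 0, 3)
appr_inside = Span('pos="APPR"', 2, 2)   # "in"
appr_outside = Span('pos="APPR"', 7, 7)  # a preposition elsewhere in the text

print(includes(np_span, appr_inside))   # True
print(includes(np_span, appr_outside))  # False
```

Searching for the NP and the preposition separately would return both prepositions. Only the added relation restricts the result to the one that lies inside the NP.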
Hierarchy relations
● Relations between attributes
  searching for all NPs which are objects
    cat="NP"
    e.g. Die Jugendlichen in Zossen → subject!
Hierarchy relations
● Relations between attributes
  searching for all NPs which are objects
  – NP → node annotation
  – OA → edge annotation
[Figure: schematic of three tokens, a span, a node and an edge]
Hierarchy relations
● Relations between attributes
  searching for all NPs which are objects
    cat="NP"
  the syntactic function in the tree
    func="OA"
  → Note: there are at least two elements which are related to each other!
Hierarchy relations
● Relations between attributes
  searching for all NPs which are objects
    node & cat="NP" & #1 >[func="OA"] #2
    e.g. ein Musikcafé → object!
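The difference between a node annotation (cat) and an edge annotation (func) can be sketched as a small labelled tree. Everything in the Python example below is invented for illustration: the node ids, the tree shape and the "SB" label on the subject edge are assumptions, and only direct dominance is modelled.

```python
from typing import NamedTuple


class Edge(NamedTuple):
    parent: str  # id of the dominating node
    child: str   # id of the dominated node
    func: str    # edge annotation (syntactic function)


# Toy tree, invented for illustration: one sentence node and two NPs.
cat = {"s1": "S", "np1": "NP", "np2": "NP"}
edges = [
    Edge("s1", "np1", "SB"),  # assumed subject edge
    Edge("s1", "np2", "OA"),  # assumed object edge
]


def dominated_nps(func: str) -> list[str]:
    """NP nodes that some node directly dominates via an edge labelled func."""
    return [e.child for e in edges if e.func == func and cat[e.child] == "NP"]


print(dominated_nps("OA"))  # ['np2']
print(dominated_nps("SB"))  # ['np1']
```

Both NPs carry the same node annotation, so the edge label alone separates the object from the subject. This is why the query needs a second element (`node`) for the NP to be related to.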
Used Relations
● Relations we used:
    A _i_ B             A includes B
    A > B               A dominates B
    A >[func="OA"] B    A dominates B and B is an object (the edge is labelled func="OA")
The full list of relations can be found in ANNIS
What's new in ANNIS version 3.1.7
● Simplified syntax (AQL)
● Frequency analysis (visualization)
● Expand match context (visualization)
● Equality and inequality (AQL)
● Variables (AQL)
● Complex OR expression (AQL)
● Document browser (visualization)
● CSV export (visualization)
● Tooltip for corpus names (visualization)
● Report problem (visualization)
Simplified syntax
● Question:
  "Die" followed by "Jugendlichen", both dominated by a noun phrase which is dominated by a sentence
  So far:
    cat="S" & cat="NP" & "Die" & "Jugendlichen" & #1 > #2 & #2 > #3 & #2 > #4 & #3 . #4
  Simplified:
    cat="S" > cat="NP" > "Die" . "Jugendlichen" & #2 > #4
Frequency analysis
● Questions:
  – How many words tagged as "NN", "ADJA" or "ADV" does a corpus contain?
  – What are the most frequent part-of-speech tags followed by a noun?
  – What are the most frequent part-of-speech tags in a prepositional phrase which is in a sentence?
  – ...
Attention: A frequency analysis has to be bound to a query!
Frequency analysis
● What are the most frequent part-of-speech tags followed by a noun?
    pos . pos="NN"
● What are the most frequent part-of-speech tags in a prepositional phrase, which is in a sentence?
    cat="S" > cat="PP" > pos
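What "bound to a query" means can be sketched in Python: first collect the matches of the query, then count the values of one matched element. The tag sequence below is made up, and the code mimics only the idea behind `pos . pos="NN"`, not the ANNIS frequency tool itself.

```python
from collections import Counter

# Made-up sequence of part-of-speech tags, one per token.
tags = ["ART", "NN", "ART", "ADJA", "NN", "APPR", "ART", "NN"]

# Step 1, the query: every pair of adjacent tokens whose second token is a noun.
matches = [(first, second) for first, second in zip(tags, tags[1:]) if second == "NN"]

# Step 2, the frequency analysis: count the tag of the first element of each match.
frequency = Counter(first for first, _ in matches)

print(frequency.most_common())  # [('ART', 2), ('ADJA', 1)]
```

Without step 1 there would be nothing to count, which is why the analysis always starts from a query result.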
Expand match context
● Sometimes the context is too small
● Even more than 25 is possible; it's a free text field
Equality and Inequality
● Equality "==" and inequality "!=" for attributes
● Question (inequality):
  two different part-of-speech tags, one directly following the other
    pos . pos & #1 != #2
Equality and Inequality
● Equality "==" and inequality "!=" for attributes
● Question (equality):
  two identical part-of-speech tags, one directly following the other
    pos . pos & #1 == #2
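The two comparison operators split the set of adjacent tag pairs into two groups. The Python sketch below works on a made-up tag sequence and illustrates only the comparison of annotation values, not how ANNIS evaluates the query.

```python
# Made-up sequence of part-of-speech tags, one per token.
tags = ["ART", "ADJA", "ADJA", "NN", "NN", "VVFIN"]

# "pos . pos": every pair of directly adjacent tokens.
pairs = list(zip(tags, tags[1:]))

equal = [p for p in pairs if p[0] == p[1]]      # corresponds to #1 == #2
different = [p for p in pairs if p[0] != p[1]]  # corresponds to #1 != #2

print(equal)      # [('ADJA', 'ADJA'), ('NN', 'NN')]
print(different)  # [('ART', 'ADJA'), ('ADJA', 'NN'), ('NN', 'VVFIN')]
```

Every adjacent pair lands in exactly one of the two lists, so the hit counts of the equality and the inequality query add up to the hits of `pos . pos` alone.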
Variables
● Question:
  "Die" followed by "Jugendlichen", both dominated by a noun phrase which is dominated by a sentence
  Simplified:
    cat="S" > cat="NP" > "Die" . "Jugendlichen" & #2 > #4
Variables
● Question:
  "Die" followed by "Jugendlichen", both dominated by a noun phrase which is dominated by a sentence
  Simplified (with variables):
    cat="S" > np#cat="NP" > "Die" . jug#"Jugendlichen" & #np > #jug
  Variables and numbers can be mixed:
    cat="S" > np#cat="NP" > "Die" . "Jugendlichen" & #np > #4
Complex OR expression
● Question (simple OR):
A part-of-speech tag which is a noun, an attributive adjective or an article
pos=/(NN)|(ADJA)|(ART)/ (in pattern search)
● OR for expressions
pos="NN" | pos="ADJA" | pos="ART"
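The pattern search and the explicit OR are meant to select the same three tags. A minimal Python sketch of that equivalence follows. It assumes that an AQL pattern has to match the whole annotation value, which is why it uses `re.fullmatch`; the tag list is a made-up sample, not corpus data.

```python
import re

# Same alternation as in the AQL pattern search.
PATTERN = re.compile(r"(NN)|(ADJA)|(ART)")
# Same values as in the explicit OR expression.
EXPLICIT = {"NN", "ADJA", "ART"}

tags = ["NN", "NE", "ADJA", "ADJD", "ART", "VVFIN"]  # hypothetical sample

by_regex = [t for t in tags if PATTERN.fullmatch(t)]
by_or = [t for t in tags if t in EXPLICIT]

print(by_regex)
print(by_regex == by_or)
```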
Complex OR expression
● Question (complex OR):
A prepositional phrase which is dominated by a sentence, or just a noun phrase
(cat="S" > cat="PP") | cat="NP"
Complex OR expression
● Question (nested OR):
A prepositional phrase which dominates a noun, an attributive adjective or an article
a#cat="PP" & (b#pos="NN" | b#pos="ADJA" | b#pos="ART") & #a > #b
Attention: All expressions in brackets have to use the same variable
… & (b#pos="NN" | b#pos="ADJA" | b#pos="ART") & ...
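One way to see why the alternatives in brackets must share a variable is to expand the nested OR by hand into one full query per alternative. The Python sketch below does this with plain string formatting. It is only an illustration, and `expand` is a hypothetical helper rather than anything ANNIS provides.

```python
ALTERNATIVES = ['b#pos="NN"', 'b#pos="ADJA"', 'b#pos="ART"']


def expand(prefix, alternatives, suffix):
    """Distribute `prefix & (alt1 | alt2 | ...) & suffix` over the OR."""
    return [f"{prefix} & {alt} & {suffix}" for alt in alternatives]


expanded = expand('a#cat="PP"', ALTERNATIVES, "#a > #b")
for query in expanded:
    print(query)
```

Every expanded alternative still ends in `#a > #b`, so each one needs a term named `b`. If one alternative used a different name, `#b` would have nothing to refer to in that alternative.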
Document browser
● Displays the entire text of a document
CSV export
● Export data for further processing
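As a rough idea of what further processing could look like, the Python sketch below counts the values of one column in an exported file. The column names (`match_id`, `token`, `pos`), the comma delimiter and the sample rows are all hypothetical; the actual layout depends on the exporter and its settings, so check a real export before reusing this.

```python
import csv
import io
from collections import Counter

# Hypothetical export content; real column names and delimiter may differ.
SAMPLE = "match_id,token,pos\n1,Die,ART\n2,Jugendlichen,NN\n3,Die,ART\n"


def count_column(csv_text, column):
    """Count how often each value occurs in the given column."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return Counter(row[column] for row in reader)


pos_counts = count_column(SAMPLE, "pos")
print(pos_counts.most_common())
```

For a file on disk, the same function can be fed the file's text content, with `delimiter` passed to `csv.DictReader` if the export is not comma-separated.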
Tooltips for corpus names
● Sometimes corpus names can get very long
Report problem
Get ANNIS
● ANNIS comes in two flavors
– A server version
– A desktop version (ANNIS kickstarter)
– Both are downloadable at: http://www.sfb632.uni-potsdam.de/annis/
● ANNIS is open source (Apache license 2.0) and hosted on GitHub
– https://github.com/korpling/ANNIS
Thanks for your attention! Any questions?