FrameNet development for Latvian
Normunds Grūzītis, Guntis Bārzdiņš
University of Latvia, Institute of Mathematics and Computer Science; National information agency LETA
2nd International FrameNet Workshop, Juiz de Fora, Brazil, 8-9 October 2016
Latvian
• Member of the Baltic language group
• Official language of the European Union
• Around 2M speakers
• Typically classified as an under-resourced language
  The situation is rapidly improving in several directions of NLP
    o Automatic speech recognition
    o Machine translation
    o Natural language understanding
    o Natural language generation
Latvian FrameNet: a pilot
• Application/domain-specific (LETA)
  Facilitates the semi-automatic information extraction process for media monitoring needs
    o For populating and updating profiles of public persons and organizations
• Covers 26 Berkeley FrameNet frames: Being_born, Being_employed, Change_of_leadership, Earnings_and_losses, Education_teaching, Hiring, Personal_relationship, Residence, Win_prize, etc.
• Nearly 5000 annotated sentences
FrameNet ontology: LETA frames
FrameNet annotations on top of dependency heads
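To illustrate what annotating "on top of dependency heads" can mean, here is a minimal sketch: each frame element points at a single head token, and its full span is recovered as that token's dependency subtree. The class names, the English example sentence and its parse are assumptions made for illustration, not the actual Latvian FrameNet data format. Being_born is one of the frames listed for the pilot, and Child and Place are frame elements of that frame in Berkeley FrameNet.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Token:
    form: str
    head: int  # index of the syntactic head token, -1 for the root


@dataclass
class FrameAnnotation:
    frame: str
    target: int               # index of the frame-evoking token
    elements: Dict[str, int]  # FE name -> index of the FE's head token


def subtree(tokens: List[Token], head: int) -> List[int]:
    """Return the sorted indices of `head` and all of its descendants."""
    span = {head}
    changed = True
    while changed:
        changed = False
        for i, tok in enumerate(tokens):
            if tok.head in span and i not in span:
                span.add(i)
                changed = True
    return sorted(span)


def element_text(tokens: List[Token], ann: FrameAnnotation, fe: str) -> str:
    """Recover the surface span of a frame element from its head token."""
    return " ".join(tokens[i].form for i in subtree(tokens, ann.elements[fe]))


# Hypothetical parse of "John was born in Riga" (UD-style attachment assumed).
tokens = [
    Token("John", 2),
    Token("was", 2),
    Token("born", -1),
    Token("in", 4),
    Token("Riga", 2),
]
ann = FrameAnnotation("Being_born", target=2, elements={"Child": 0, "Place": 4})

print(element_text(tokens, ann, "Child"))
print(element_text(tokens, ann, "Place"))
```

Storing only the head index keeps the semantic layer tied to the syntactic layer, so the span follows from the tree and does not have to be annotated separately.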
Accuracy of automatic SRL

                            Frame identification        FE identification
Parser / Year / Dataset     Precision  Recall  F1       Precision  Recall  F1
C6.0 / 2014 / LETA          63.5       62.7    63.1     65.9       76.8    70.9
C6.0 / 2014 / BFN 1.3       77.1       53.7    63.3     47.3       47.0    47.1
SEMAFOR / 2014 / BFN 1.3    69.7       54.9    61.4     58.1       38.8    46.5
LTH / 2007 / BFN 1.3        68.9       53.6    60.3     51.6       35.4    42.0
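The F1 columns are the harmonic mean of precision and recall, F1 = 2PR / (P + R). The short sketch below recomputes F1 from the precision and recall values reported above and compares the result with the reported F1. The function and variable names are illustrative only.

```python
def f1(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (0.0 if both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# (precision, recall, reported F1), copied from the table above.
frame_identification = {
    "C6.0 / 2014 / LETA": (63.5, 62.7, 63.1),
    "C6.0 / 2014 / BFN 1.3": (77.1, 53.7, 63.3),
    "SEMAFOR / 2014 / BFN 1.3": (69.7, 54.9, 61.4),
    "LTH / 2007 / BFN 1.3": (68.9, 53.6, 60.3),
}
fe_identification = {
    "C6.0 / 2014 / LETA": (65.9, 76.8, 70.9),
    "C6.0 / 2014 / BFN 1.3": (47.3, 47.0, 47.1),
    "SEMAFOR / 2014 / BFN 1.3": (58.1, 38.8, 46.5),
    "LTH / 2007 / BFN 1.3": (51.6, 35.4, 42.0),
}

for task, rows in [("Frame identification", frame_identification),
                   ("FE identification", fe_identification)]:
    for system, (p, r, reported) in rows.items():
        print(f"{task:22} {system:26} computed={f1(p, r):.1f} reported={reported}")
```

On BFN 1.3 the C6.0 parser has the highest frame identification precision of the three systems, while its recall is comparable to the others, which is why the F1 gap is smaller than the precision gap.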
http://c60.ailab.lv
Exhaustive search binary classifier
Used to parse the entire LETA news archive (12M articles)
LETA IE and KB population system
Scalable Understanding of Multilingual MediA
Discover trends, emerging events, crucial new stories
H2020 grant No. 688139
Event-based summarization
Storyline highlights across a set of related articles
Multilingual / Cross-lingual apps
Full stack of language resources for NLU and NLG [in Latvian]
GF for implementing multilingual frames and constructions
• FrameNet – semantic abstraction
  BFN frames reused across languages
  Representation of valence patterns varies a lot
  FNs as such are semi-formal/computational
• GF – syntactic abstraction
  Grammar formalism and resource grammar library
  Towards a computational implementation of FNs
    o In some aspects; for multilingual NLG
  Unified method to compare valence patterns across FNs
Latvian FrameNet++
• Integrated: a part of a multi-layered corpus
• Balanced
  We anticipate that the corpus will represent at least 2000 common verbs, with at least 10 examples for each of the 1000 most common verbs
• Manually verified at all layers
  Instead of adding e.g. the syntactic layer afterwards with an error-prone probabilistic parser
• Computationally oriented
• Accessible (open data)
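The balance target above (at least 10 examples for each of the 1000 most common verbs) can be monitored during annotation with a simple coverage check. This is a minimal sketch under assumed inputs: a frequency-ranked list of verb lemmas and one lemma per annotated example. The function name and the placeholder lemmas are hypothetical.

```python
from collections import Counter
from typing import Iterable, List


def under_covered(ranked_verbs: List[str],
                  annotated_lemmas: Iterable[str],
                  top_n: int = 1000,
                  min_examples: int = 10) -> List[str]:
    """Return the top-N verbs that still have fewer than `min_examples` examples."""
    counts = Counter(annotated_lemmas)
    return [verb for verb in ranked_verbs[:top_n] if counts[verb] < min_examples]


# Toy data with placeholder lemmas, most frequent verb first.
ranked = ["verb_a", "verb_b", "verb_c"]
annotated = ["verb_a"] * 10 + ["verb_b"] * 3

print(under_covered(ranked, annotated, top_n=2))
print(under_covered(ranked, annotated, top_n=3))
```

Verbs returned by the check are the ones that still need more annotated sentences before the target is met.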