Exploring the "Betrothed Lovers" and other literary works
Slides by Andrea Bolioli and Riccardo Tasso, presented at the DH Workshop in memory of Emanuele Pianta, Trento, 10 December 2013: a one-day workshop on "Digital Humanities: Current state and Future challenges", held as part of the activities of the Digital Humanities group at FBK.
Exploring the “Betrothed Lovers”
and other literary works
Andrea Bolioli, Riccardo Tasso
“If you enjoy it, you understand it”
Our company: Cross Library
Spin-off of FBK (Trento) and CELI (Torino)
Digital Humanities and School
Our claim: If you enjoy it, you understand it!
Our product: the "crunched" book
www.cross-library.com
A prototype for literature: I promessi sposi 2.0
«The Betrothed», by Alessandro Manzoni
www.crunchedbook.com
Exploring literary works
NARRATIVE SEQUENCES
CHARACTERS SOCIAL NETWORKS
LOCATIONS
A research project: Sèduco
Sharing Educational Content
www.seduco.it
Partners: Cross Library,
OpenContent,
FBK, IPRASE
and 4 high schools
«Exploring the Betrothed Lovers»,
A. Bolioli, M. Casu, M. Lana, R. Roda,
Computational Models of Narrative workshop CMN 2013,
Hamburg, 4-6 August 2013
HLT tasks for literature processing
• Automatic text segmentation: narrative sequences, quoted speech, other text units
• Entity mention annotation: speakers, mentions of characters (agents) and locations (not only GPEs, e.g. "castello dell'Innominato", the castle of the Unnamed, or "osteria della Luna piena", the tavern of the Full Moon)
• Quoted speech attribution
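As a rough illustration of the quoted speech segmentation task, here is a minimal sketch. It assumes that speech is delimited by guillemets («...»), which is only one possible convention and not necessarily the one handled by the authors' pipeline; the function name and the sample sentence are made up for the example.

```python
import re
from typing import List, Tuple

# Assumption: quoted speech is delimited by guillemets « ... » and is not nested.
QUOTE_RE = re.compile(r"«([^«»]*)»")


def quoted_speech_spans(text: str) -> List[Tuple[int, int, str]]:
    """Return (begin, end, content) for every quoted-speech span in text.

    begin/end are character offsets that include the delimiters.
    """
    return [
        (m.start(), m.end(), m.group(1).strip())
        for m in QUOTE_RE.finditer(text)
    ]


# Invented sample sentence, not taken from any of the books.
sample = "Renzo disse: «Buongiorno». Lucia rispose: «Addio»."
spans = quoted_speech_spans(sample)
for begin, end, content in spans:
    print(begin, end, content)
```

Each span comes back with its offsets, so it can be turned directly into an annotation; attributing the speech to a speaker is a separate step that this sketch does not attempt.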
The Annotation Framework
Our Annotation Model
An annotation is a span of text characterized by
a <begin, end>
Our Annotation Model
An annotation may have attributes:
Our Annotation Model
An annotation may be classified:
Our Annotation Model
An annotation may be related:
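The four statements about the annotation model can be summarized in a small data structure. This is only a sketch: the class name, field names and the `relate`/`contains` helpers are assumptions made for illustration, not the framework's actual API.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Annotation:
    """A span of text <begin, end> with optional class, attributes and relations."""
    begin: int                      # offset of the first character of the span
    end: int                        # offset just past the last character
    cls: str = "Annotation"         # classification, e.g. "Sequence" or "Fragment"
    attributes: Dict[str, str] = field(default_factory=dict)
    relations: Dict[str, List[str]] = field(default_factory=dict)

    def relate(self, name: str, target: str) -> None:
        """Add a named relation (e.g. 'actor') pointing at a target identifier."""
        self.relations.setdefault(name, []).append(target)

    def contains(self, other: "Annotation") -> bool:
        """True if the other annotation's span lies inside this one."""
        return self.begin <= other.begin and other.end <= self.end


# Hypothetical example: a narrative sequence containing a speech fragment.
sequence = Annotation(0, 120, "Sequence")
sequence.relate("actor", "pinocchio")
speech = Annotation(10, 40, "Fragment", {"type": "speech"})
print(sequence.contains(speech))
```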
Object Store
An annotation is persisted:
“A graph database stores data in a graph, the
most generic of data structures, capable of
elegantly representing any kind of data in a
highly accessible way”
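To make the idea of persisting annotations in a graph concrete, here is a tiny in-memory stand-in for a graph database. The `GraphStore` class and its methods are hypothetical and written only for this illustration; a real graph database offers a much richer API.

```python
from collections import defaultdict
from typing import Dict, Set


class GraphStore:
    """Minimal property graph: nodes with properties, labeled directed edges."""

    def __init__(self) -> None:
        self.nodes: Dict[str, dict] = {}
        # edges[src][label] -> set of destination node ids
        self.edges = defaultdict(lambda: defaultdict(set))

    def add_node(self, node_id: str, **props) -> None:
        self.nodes[node_id] = dict(props)

    def add_edge(self, src: str, label: str, dst: str) -> None:
        if src not in self.nodes or dst not in self.nodes:
            raise KeyError("both endpoints must exist before adding an edge")
        self.edges[src][label].add(dst)

    def out(self, src: str, label: str) -> Set[str]:
        """Ids of the nodes reached from src by an outgoing edge with this label."""
        return set(self.edges.get(src, {}).get(label, ()))


store = GraphStore()
store.add_node("seq1", cls="Sequence", begin=0, end=120)
store.add_node("pinocchio", cls="Character")
store.add_edge("seq1", "actor", "pinocchio")
print(store.out("seq1", "actor"))
```

Annotations become nodes, their attributes become node properties, and relations such as "actor" become labeled edges.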
Text Store
Annotations, annotations, annotations... But what about text?
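Since an annotation only carries a <begin, end> pair, the text itself has to live somewhere. The sketch below assumes the simplest possible design, where each document is stored once and spans are resolved by slicing; the `TextStore` name and its methods are illustrative assumptions, not the framework's API.

```python
from typing import Dict


class TextStore:
    """Keeps each document's text once and resolves spans by offset."""

    def __init__(self) -> None:
        self._texts: Dict[str, str] = {}

    def put(self, doc_id: str, text: str) -> None:
        self._texts[doc_id] = text

    def span(self, doc_id: str, begin: int, end: int) -> str:
        """Return the text covered by the half-open span [begin, end)."""
        text = self._texts[doc_id]
        if not 0 <= begin <= end <= len(text):
            raise ValueError("span is outside the document")
        return text[begin:end]


texts = TextStore()
line = "C'era una volta un pezzo di legno."
texts.put("pinocchio", line)

# Offsets are computed rather than hard-coded to keep the example robust.
begin = line.index("pezzo")
end = begin + len("pezzo")
print(texts.span("pinocchio", begin, end))
```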
The annotation query engine
And (finally) you can search and find annotations
The annotation query engine
Choose a MAIN annotation filter:
{ "main": { "@class": "Sequence" } }
Returns all the annotations whose class is Sequence.
The annotation query engine
Specify annotation's attributes:
{ "main": { "@class": "Fragment", "type": "speech" } }
Returns all the annotations whose class is Fragment and whose (sub)type is "speech".
The annotation query engine
Specify annotation's relations:
{ "main": { "@class": "Sequence", "out('actor')": "pinocchio", "out('place')": "paese_balocchi" } }
Returns all the annotations whose class is Sequence, with an actor relation to "pinocchio" and a place relation to "paese_balocchi".
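The "main" filter can be read as a conjunction of conditions on class, attributes and relations. Below is a minimal sketch of an evaluator for it over an in-memory list of annotations. The dictionary layout of the annotations and the function names are assumptions made for the example; only the query string itself comes from the slides.

```python
import json
import re
from typing import Dict, List

# Keys such as out('actor') denote an outgoing relation named 'actor'.
OUT_RE = re.compile(r"^out\('(\w+)'\)$")


def matches(annotation: Dict, conditions: Dict) -> bool:
    """True if the annotation satisfies every condition of a filter."""
    for key, expected in conditions.items():
        relation = OUT_RE.match(key)
        if relation:
            targets = annotation.get("relations", {}).get(relation.group(1), [])
            if expected not in targets:
                return False
        elif key == "@class":
            if annotation.get("class") != expected:
                return False
        elif annotation.get("attributes", {}).get(key) != expected:
            return False
    return True


def run_query(annotations: List[Dict], query_json: str) -> List[Dict]:
    """Return the annotations that match the query's main filter."""
    query = json.loads(query_json)
    return [a for a in annotations if matches(a, query["main"])]


# Hypothetical annotations, invented for the example.
annotations = [
    {"id": "s1", "class": "Sequence",
     "relations": {"actor": ["pinocchio"], "place": ["paese_balocchi"]}},
    {"id": "s2", "class": "Sequence",
     "relations": {"actor": ["pinocchio"], "place": ["casa_geppetto"]}},
    {"id": "f1", "class": "Fragment", "attributes": {"type": "speech"}},
]

query = """{ "main": { "@class": "Sequence", "out('actor')": "pinocchio", "out('place')": "paese_balocchi" } }"""
result = run_query(annotations, query)
print([a["id"] for a in result])
```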
The annotation query engine
Choose second level filter:
{ "main": { "@class": "Sequence" }, "filter": { "@class": "Fragment", "type": "speech" } }
Returns all the annotations whose class is Sequence and which CONTAIN a given annotation (here, a speech Fragment).
The annotation query engine
Full text search:
{ "main": { "@class": "Sequence", "out('actor')": "pinocchio" }, "@text": "storia" }
Returns all the annotations whose class is Sequence, with an actor relation to "pinocchio", whose text contains the keyword "storia".
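The second level filter and the full text search can be sketched on top of the main filter. The code below assumes that "contains" means span containment and that "@text" is a case-insensitive substring match; a real engine would more likely use a full-text index. All names, the annotation layout and the sample text are invented for illustration.

```python
import re
from typing import Dict, List

OUT_RE = re.compile(r"^out\('(\w+)'\)$")


def matches(annotation: Dict, conditions: Dict) -> bool:
    """True if the annotation satisfies every class/attribute/relation condition."""
    for key, expected in conditions.items():
        relation = OUT_RE.match(key)
        if relation:
            if expected not in annotation.get("relations", {}).get(relation.group(1), []):
                return False
        elif key == "@class":
            if annotation.get("class") != expected:
                return False
        elif annotation.get("attributes", {}).get(key) != expected:
            return False
    return True


def contains(outer: Dict, inner: Dict) -> bool:
    """True if inner's span lies within outer's span."""
    return outer["begin"] <= inner["begin"] and inner["end"] <= outer["end"]


def search(text: str, annotations: List[Dict], query: Dict) -> List[Dict]:
    """Apply the main filter, then the optional 'filter' and '@text' conditions."""
    inner_filter = query.get("filter")
    keyword = query.get("@text")
    results = []
    for ann in annotations:
        if not matches(ann, query["main"]):
            continue
        if inner_filter is not None and not any(
            other is not ann and matches(other, inner_filter) and contains(ann, other)
            for other in annotations
        ):
            continue
        if keyword is not None and keyword.lower() not in text[ann["begin"]:ann["end"]].lower():
            continue
        results.append(ann)
    return results


# Invented sample text and annotations.
first = "Pinocchio racconta la sua storia."
text = first + " Geppetto tace."
annotations = [
    {"id": "s1", "class": "Sequence", "begin": 0, "end": len(first),
     "relations": {"actor": ["pinocchio"]}},
    {"id": "s2", "class": "Sequence", "begin": len(first) + 1, "end": len(text),
     "relations": {"actor": ["geppetto"]}},
    {"id": "f1", "class": "Fragment", "begin": 10, "end": 20,
     "attributes": {"type": "speech"}},
]

with_speech = search(text, annotations, {
    "main": {"@class": "Sequence"},
    "filter": {"@class": "Fragment", "type": "speech"},
})
with_keyword = search(text, annotations, {
    "main": {"@class": "Sequence", "out('actor')": "pinocchio"},
    "@text": "storia",
})
print([a["id"] for a in with_speech], [a["id"] for a in with_keyword])
```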
Crunched Book SNA
Actors Graph
Pinocchio Actors (1)
Pinocchio Actors (2)
Speakers Graph
Promessi Sposi Speakers
Pinocchio Speakers
Romeo and Juliet
Crunched Book SNA (speakers)
Metric                  Promessi Sposi  Pinocchio  Romeo & Juliet
nodes                   86              62         35
edges                   182             104        236
diameter                6               6          3
density                 0.061           0.055      0.397
connected components    1               1          1
communities             6               11         3
clustering coefficient  0.528           0.614      0.813
avg. path length        2.814           2.395      1.64
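For readers unfamiliar with these measures, the sketch below computes density, diameter and average path length for a small undirected graph in plain Python. It assumes the usual undirected density formula 2E / (N(N-1)); this reproduces the Pinocchio and Romeo & Juliet density values from their node and edge counts, while for the Promessi Sposi column it gives about 0.050 rather than 0.061, so the original tool may have counted nodes or edges differently. The toy graph and function names are invented for the example.

```python
from collections import deque
from typing import Dict, List, Tuple


def density(n_nodes: int, n_edges: int) -> float:
    """Density of an undirected simple graph: 2E / (N * (N - 1))."""
    return 2 * n_edges / (n_nodes * (n_nodes - 1))


def shortest_path_lengths(adj: Dict[str, List[str]], source: str) -> Dict[str, int]:
    """Breadth-first search distances from source to every reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for neighbour in adj[node]:
            if neighbour not in dist:
                dist[neighbour] = dist[node] + 1
                queue.append(neighbour)
    return dist


def path_metrics(adj: Dict[str, List[str]]) -> Tuple[int, float]:
    """Return (diameter, average shortest path length) of a connected graph."""
    lengths = [
        d
        for source in adj
        for target, d in shortest_path_lengths(adj, source).items()
        if target != source
    ]
    return max(lengths), sum(lengths) / len(lengths)


# Toy graph: a simple path a - b - c - d.
toy = {"a": ["b"], "b": ["a", "c"], "c": ["b", "d"], "d": ["c"]}
diameter, avg_path = path_metrics(toy)
print(density(4, 3), diameter, round(avg_path, 3))

# Density recomputed from the table's node and edge counts.
print(round(density(62, 104), 3), round(density(35, 236), 3))
```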
Future work
Other crunched books (in January):
«Le avventure di Pinocchio», «Romeo and Juliet»
Next DH projects:
• Annotating and visualizing ancient places in Latin literature
• A multilingual work (Latin, English, Italian and Chinese)
Thank You!
@CrossLib
http://www.cross-library.com