Information-Analytical System “Manuscript”: technologies and tools of creation of electronic collections of ancient and medieval documents

Victor BARANOV
• Linguistics Department, Izhevsk State Technical University
• Laboratory of Computer-Aided Philological Research, Udmurtia State University



Dagstuhl, December, 2006 Digital Historical Corpora 2

Title page of the portal of IAS “Manuscript”


Model of hierarchies and subnets of manuscript and text units


Net of linguistic relationships

Example sentence: <…> се быша дроузи мои . <…>

Fragments shown in the diagram: се быша дроузи мои . / се быша дроузи мои / быша дроузи мои / се быша дроузи / быша дроузи / дроузи мои / се / . / с е б ы ш а д р оу з и м о и (character by character)

Unit types: Text; Predicate part; Syntactic group; Word-combination; Word-form

Relationship labels: Co-ordination; Dependence

Legend: Relationship; End of the “single” relationship; End of the “multiple” relationship; Means of relationship


Model of the Manuscript system


Editor OldEd: main panels


Editor OldEd: Text input and editing


Editor OldEd: Fragmentation of the manuscript texts into units and relationships with the dictionary units

Panels shown: Dictionary of fragments; Properties of fragments; Fragments


Editor OldEd: Visualization of unit relationships

Symbol

Geometric hierarchy: line, page

Linguistic hierarchy: word-form, normalized forms

Dictionary: lemma

Dictionary: word-forms of texts

Properties and values of the lemma
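
The geometric hierarchy and the dictionary link named on this slide could be modelled roughly as follows. This is a minimal sketch in Python, not the system's actual schema: the class names `Lemma` and `WordForm`, their fields, the page and line numbers, and the sample lemma assignment are all assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Lemma:
    """Dictionary unit with its properties and values (hypothetical structure)."""
    headword: str
    properties: Dict[str, str] = field(default_factory=dict)


@dataclass
class WordForm:
    """A word-form placed in both hierarchies (hypothetical structure)."""
    text: str                         # spelling as it appears in the manuscript
    page: int                         # geometric hierarchy: page
    line: int                         # geometric hierarchy: line within the page
    normalized: Optional[str] = None  # normalized form, if one was assigned
    lemma: Optional[Lemma] = None     # link to the dictionary unit, if established


def forms_on_line(forms: List[WordForm], page: int, line: int) -> List[str]:
    """Follow the geometric hierarchy: word-forms on one line of one page."""
    return [f.text for f in forms if f.page == page and f.line == line]


def forms_of_lemma(forms: List[WordForm], lemma: Lemma) -> List[str]:
    """Follow the dictionary link: word-forms attached to one lemma."""
    return [f.text for f in forms if f.lemma is lemma]


# Illustrative data only: positions and the lemma assignment are assumed.
drug = Lemma("дроугъ", {"part of speech": "noun"})
forms = [
    WordForm("се", page=1, line=1),
    WordForm("быша", page=1, line=1),
    WordForm("дроузи", page=1, line=2, lemma=drug),
    WordForm("мои", page=1, line=2),
]
```

Keeping the page and line on each word-form, separately from the lemma link, lets the same unit be reached either through the layout of the manuscript or through the dictionary.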


Editor OldEd: Page layout


Result of creation of the layout on the site

Marginalia


Automated lemmatization and establishing relationships between words and lemmas
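
A dictionary-driven version of this step might look like the sketch below. It assumes that lemmatization is done by lookup in a table from word-forms to lemmas; the slides do not describe the analyzer's algorithm. The function name `link_to_lemmas` and the two table entries are illustrative, not taken from the system.

```python
from typing import Dict, List, Tuple


def link_to_lemmas(
    word_forms: List[str], form_to_lemma: Dict[str, str]
) -> Tuple[Dict[str, str], List[str]]:
    """Link each word-form to a lemma where the table has an entry.

    Returns the established links and the list of forms left unresolved,
    which would need manual treatment.
    """
    linked: Dict[str, str] = {}
    unresolved: List[str] = []
    for form in word_forms:
        lemma = form_to_lemma.get(form)
        if lemma is None:
            unresolved.append(form)
        else:
            linked[form] = lemma
    return linked, unresolved


# Toy lookup table (assumed entries, for illustration only).
table = {"быша": "быти", "дроузи": "дроугъ"}
linked, unresolved = link_to_lemmas("се быша дроузи мои".split(), table)
```

Returning the unresolved forms separately keeps the automatic pass from hiding the cases an editor still has to decide.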


Electronic edition: search page

Search criteria

Collections & Manuscripts

Search result


Search result: word index and concordance
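
How a word index and a concordance can be derived from a tokenized text is sketched below. This is a generic illustration in Python, assuming whitespace tokenization and a fixed context window; the function names and the `width` parameter are hypothetical and do not describe the portal's implementation.

```python
from collections import defaultdict
from typing import Dict, List, Tuple


def build_word_index(tokens: List[str]) -> Dict[str, List[int]]:
    """Map every word-form to the positions where it occurs in the text."""
    index: Dict[str, List[int]] = defaultdict(list)
    for position, token in enumerate(tokens):
        index[token].append(position)
    return dict(index)


def concordance(
    tokens: List[str], target: str, width: int = 2
) -> List[Tuple[str, str, str]]:
    """Return (left context, word, right context) for each occurrence of target."""
    rows = []
    for position, token in enumerate(tokens):
        if token == target:
            left = " ".join(tokens[max(0, position - width):position])
            right = " ".join(tokens[position + 1:position + 1 + width])
            rows.append((left, token, right))
    return rows


# The sample sentence is the one used on the linguistic relationships slide.
tokens = "се быша дроузи мои".split()
index = build_word_index(tokens)
rows = concordance(tokens, "дроузи", width=1)
```

The index answers where a form occurs, while the concordance shows each occurrence together with its neighbours.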


Module of retrievals: selection of the text


Module of retrievals: selection of the unit


Module of retrievals: setting the unit properties and values


Module of retrievals: saving the query


Module of retrievals: specifying the composition of the query result


Comparative index of the word forms


Comparative index of the fragments


Grammar dictionaries

Grammar dictionary of the modern Russian language

Grammar dictionary of the Old Russian language

Grammar dictionary of the Old Slavonic language

Grammar dictionary pseudo-elements

Text 1, Text 2, Text 3, Text 4, Text 5, Text 6, ..., Text N


Grammar dictionaries: retrieval form


Grammar dictionaries: reducing Old Russian word-forms to the lemma


Grammar dictionaries: obtaining the paradigm of a lemma


Electronic editions


Electronic edition: reverse index of word-forms and context
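
Assuming that "reverse index" here means an a tergo index, in which word-forms are ordered by their endings rather than their beginnings, the ordering can be sketched as follows. The function name is hypothetical, and plain code-point order stands in for a proper Old Cyrillic collation, which the slides do not specify.

```python
from typing import List


def reverse_index(word_forms: List[str]) -> List[str]:
    """Sort distinct word-forms by their spelling read from right to left.

    Note: reversal is character by character, so the digraph "оу" is split;
    a real index would likely treat it as one unit.
    """
    return sorted(set(word_forms), key=lambda form: form[::-1])


tokens = "се быша дроузи мои".split()
a_tergo = reverse_index(tokens)
```

Sorting on the reversed spelling groups forms that share an ending, which is what makes such an index useful for studying inflection.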


Acknowledgment

The work on the creation of IAS “Manuscript” is being carried out with the support of the Russian Foundation for Basic Research (Grant # 05-07-90217в).

The work on the creation of the automated morphological analyzer is being carried out with the support of the Russian Foundation for the Humanities (Grant # 05-04-12408в).


Contacts

Victor Baranov - [email protected]

http://manuscripts.ru/index_en.html

Laboratory of Computer-Aided Philological Research, Udmurtia State University

Linguistics Department, Izhevsk State Technical University, Izhevsk, Russia