Information-Analytical System “Manuscript”: technologies and tools for creating electronic collections of ancient and medieval documents

Victor BARANOV

• Linguistics Department, Izhevsk State Technical University

• Laboratory of Computer-Aided Philological Research, Udmurtia State University

Dagstuhl, December, 2006 Digital Historical Corpora 2

Title page of the portal of IAS “Manuscript”

Model of hierarchies and subnets of manuscript and text units

Net of linguistic relationships

Example: <…> се быша дроузи мои . <…>

Text spans shown in the figure: се быша дроузи мои . / се быша дроузи мои / се быша дроузи / быша дроузи мои / быша дроузи / дроузи мои / се / . / с е б ы ш а д р оу з и м о и

Unit types: Text, Predicate part, Syntactic group, Word-combination, Word-form

Legend: Relationship; Means of relationship; End of the “single” relationship; End of the “multiple” relationship

Labels on relationships: Co-ordination, Dependence

Model of the Manuscript system

Editor OldEd: main panels

Editor OldEd: Text input and editing

Editor OldEd: Fragmentation of the manuscript texts into units and relationships with the dictionary units

Dictionary of fragments

Properties of fragments

Fragments

Editor OldEd: Visualization of unit relationships

Symbol

Geometric hierarchy: Line, Page

Linguistic hierarchy: word-form, normalized forms

Dictionary: Lemma

Dictionary: word-forms of texts

Properties and values of the Lemma

Editor OldEd: Page layout

Resulting page layout as displayed on the site

Marginalia

Automated lemmatization and establishing relationships between words and lemmas
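
As a rough illustration of what linking word-forms to lemmas involves, the following Python sketch pairs each word-form of the sample phrase with a lemma taken from a small lookup table. The table, the function name lemmatize, and the variable names are assumptions made for this example only; they are not taken from IAS “Manuscript” or its grammar dictionaries. Forms missing from the table stay unresolved and would need an editor's attention.

```python
# Toy form-to-lemma table. The entries are illustrative assumptions,
# not data from the Manuscript grammar dictionaries.
FORM_TO_LEMMA = {
    "быша": "быти",
    "дроузи": "дроугъ",
    "мои": "мои",
}


def lemmatize(tokens, table):
    """Pair each word-form with its lemma, or None if it is not in the table."""
    return [(tok, table.get(tok)) for tok in tokens]


# Sample phrase from the "Net of linguistic relationships" slide.
links = lemmatize("се быша дроузи мои".split(), FORM_TO_LEMMA)

# Word-forms with no lemma yet; these would be left for manual linking.
unresolved = [form for form, lemma in links if lemma is None]
```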

Electronic edition: search page

Search criteria

Collections & Manuscripts

Search result

Search result: word index and concordance
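
The two kinds of search result named here can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions: the function names build_word_index and concordance, the width parameter, and the whitespace tokenization are hypothetical and do not describe how the electronic edition itself computes its results.

```python
from collections import defaultdict


def build_word_index(tokens):
    """Map each word-form to the list of positions where it occurs."""
    index = defaultdict(list)
    for pos, tok in enumerate(tokens):
        index[tok].append(pos)
    return dict(index)


def concordance(tokens, word, width=2):
    """Return (left context, keyword, right context) for each occurrence."""
    lines = []
    for pos, tok in enumerate(tokens):
        if tok == word:
            left = " ".join(tokens[max(0, pos - width):pos])
            right = " ".join(tokens[pos + 1:pos + 1 + width])
            lines.append((left, tok, right))
    return lines


# Sample phrase from the "Net of linguistic relationships" slide.
tokens = "се быша дроузи мои".split()
word_index = build_word_index(tokens)
kwic = concordance(tokens, "дроузи", width=1)
```

The word index answers "where does this form occur", while the concordance shows each occurrence with its neighbouring words.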

Module of retrievals: selection of the text

Module of retrievals: selection of the unit

Module of retrievals: setting the unit properties and values

Module of retrievals: saving the query

Module of retrievals: specifying the composition of the query result

Comparative index of the word forms

Comparative index of the fragments

Grammar dictionaries

Grammar dictionary of the modern Russian language

Grammar dictionary of the Old Russian language

Grammar dictionary of the Old Slavonic language

Grammar dictionary pseudo-elements

Text 1, Text 2, Text 3, Text 4, Text 5, Text 6, …, Text N

Grammar dictionaries: retrieval form

Grammar dictionaries: reducing Old Russian word-forms to the lemma

Grammar dictionaries: obtaining the paradigm of a lemma

Electronic editions

Electronic edition: reverse index of word-forms and context
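
A reverse index lists word-forms ordered by their endings rather than their beginnings, which groups forms that share an inflection. The Python sketch below shows the general technique only. The function name reverse_index is hypothetical, and the ordering uses plain Unicode code points as a stand-in, which is an assumption and not a proper collation for Old Cyrillic orthography.

```python
def reverse_index(word_forms):
    """Sort unique word-forms by their spelling read right to left."""
    # Assumption: code-point order is used as a simple stand-in for
    # an alphabet-aware collation of Old Cyrillic letters.
    return sorted(set(word_forms), key=lambda w: w[::-1])


# Word-forms of the sample phrase from the "Net of linguistic relationships" slide.
forms = ["се", "быша", "дроузи", "мои"]
ordered = reverse_index(forms)
```

With this ordering the two forms ending in "и" end up next to each other.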

Acknowledgment

The work on the creation of IAS “Manuscript” is being carried out with support from the Russian Foundation for Basic Research (Grant # 05-07-90217в).

The work on the creation of the automated morphological analyzer is being carried out with support from the Russian Foundation for the Humanities (Grant # 05-04-12408в).

Contacts

Victor Baranov - baranov@udm.ru

http://manuscripts.ru/index_en.html

Laboratory of Computer-Aided Philological Research, Udmurtia State University

Linguistics Department, Izhevsk State Technical University, Izhevsk, Russia
