XML and General Dutch Dictionary (ANW)

Van der Kamp, Lexical databases and digital tools, april 29th, 2005, 1

Peter van der Kamp

www.inl.nl

[email protected]

Topics

• Characteristics

• Schema

• XML Dictionary Editor

• Problems to be solved

Characteristics

Online dictionary, no printed version

Dutch language (incl. Flanders) from 1970 - 2018

Based on a corpus of 100 million words

Elaborate microstructure

XML

Schema characteristics

• Divided into 12 subschemas

• Currently all elements: zero or more occurrences except headword

• Currently 186 atomic elements

• Many enumerations (378, to be used as controlled vocabulary)

• Some elements allowed at different levels
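The two schema rules above (only the headword is required, and enumerations act as a controlled vocabulary) can be illustrated with a minimal Python sketch. It is not the ANW schema: the element names `entry`, `headword` and `pos`, the vocabulary values, and the assumption that the headword occurs exactly once are all hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical controlled vocabulary: element name -> allowed values.
# The real schema has many enumerations; these names and values are invented.
CONTROLLED_VOCABULARY = {
    "pos": {"noun", "verb", "adjective"},
}


def check_entry(entry):
    """Return a list of problems found in one <entry> element."""
    problems = []
    # Assumption: the headword is the only required element and occurs once.
    if len(entry.findall("headword")) != 1:
        problems.append("entry must contain exactly one <headword>")
    # Enumerated elements may appear at different levels, so search the whole tree.
    for name, allowed in CONTROLLED_VOCABULARY.items():
        for element in entry.iter(name):
            value = (element.text or "").strip()
            if value not in allowed:
                problems.append(
                    f"<{name}> has value {value!r} outside its predefined list"
                )
    return problems


good = ET.fromstring("<entry><headword>fiets</headword><pos>noun</pos></entry>")
bad = ET.fromstring("<entry><pos>nuon</pos></entry>")
print(check_entry(good))
print(check_entry(bad))
```

In practice a schema-aware editor enforces these rules itself; the sketch only makes the two constraints concrete.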

Schema

[Figure: schema diagram with Entry, PoS and Sense elements]

XML Dictionary Editor

User requirements:

• Don’t want to work with tags

• Tags invisible

XML Dictionary Editor (cont’d)

User requirements:

• Form-like input

• Use of predefined lists (controlled vocabulary)
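A minimal Python sketch of what form-like input with predefined lists amounts to: the editor supplies plain field values, a fixed list constrains one of them, and the tags are produced behind the scenes. The function `form_to_entry`, the list `POS_CHOICES` and all element names are hypothetical and are not taken from the ANW editor.

```python
import xml.etree.ElementTree as ET

# Hypothetical predefined list, as it might back a drop-down in the form.
POS_CHOICES = ("noun", "verb", "adjective")


def form_to_entry(headword, pos, definitions):
    """Build an <entry> element from plain form values; no tags are typed."""
    if pos not in POS_CHOICES:
        raise ValueError(f"pos must be one of {POS_CHOICES}")
    entry = ET.Element("entry")
    ET.SubElement(entry, "headword").text = headword
    ET.SubElement(entry, "pos").text = pos
    # Each definition typed into the form becomes its own <sense>.
    for text in definitions:
        sense = ET.SubElement(entry, "sense")
        ET.SubElement(sense, "definition").text = text
    return entry


entry = form_to_entry("fiets", "noun", ["two-wheeled vehicle"])
xml_text = ET.tostring(entry, encoding="unicode")
print(xml_text)
```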

XML Dictionary Editor (cont’d)

User requirements:

• Inserting, adding and removing elements must be easy

XML Dictionary Editor (cont’d)

User requirements:

• Hide/show elements

Technical requirements:

• Subschema enabled

XML Dictionary Editor (cont’d)

XML editor, but…

…which one?

XMLWriter

XML Dictionary Editor (cont’d)

Currently the best possible solution:

Authentic (free XML content editor from Altova)

StyleVision (e-forms and stylesheet designer from Altova)

(http://www.altova.com)

XML Dictionary Editor: problems

Problem: hiding an element = deleting the element

Hiding elements is important due to the size of an entry

Solution (to be implemented):

• Extra element <hide> in schema

• Checkbox as ‘data entry device’

• When unchecked: perform hide

Disadvantage: <hide> is noise in the dictionary entry
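The idea behind the <hide> workaround can be sketched in Python. This is an illustration only, not the Authentic/StyleVision implementation: it assumes a hypothetical encoding in which a child `<hide>true</hide>` marks its parent as hidden, and the element names `entry`, `sense` and `definition` are invented. The first function produces a view without hidden elements while leaving the stored entry untouched; the second removes the <hide> noise, for example before publication.

```python
import copy
import xml.etree.ElementTree as ET


def visible_view(entry):
    """Return a copy of the entry without elements flagged as hidden."""
    view = copy.deepcopy(entry)
    # Materialise the lists first so removal does not disturb iteration.
    for parent in list(view.iter()):
        for child in list(parent):
            # Assumption: <hide>true</hide> as a direct child means "hidden".
            if child.findtext("hide") == "true":
                parent.remove(child)
    return view


def strip_hide_flags(entry):
    """Return a copy of the entry with every <hide> element removed."""
    clean = copy.deepcopy(entry)
    for parent in list(clean.iter()):
        for child in list(parent):
            if child.tag == "hide":
                parent.remove(child)
    return clean


entry = ET.fromstring(
    "<entry><headword>fiets</headword>"
    "<sense><hide>true</hide><definition>x</definition></sense>"
    "<sense><hide>false</hide><definition>y</definition></sense>"
    "</entry>"
)
view = visible_view(entry)
clean = strip_hide_flags(entry)
print(len(view.findall("sense")), len(entry.findall("sense")))
```

Because both functions work on copies, hiding never deletes data from the stored entry, which is the distinction the problem statement asks for.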

XML Dictionary Editor: problems

Problem: visualize difference between container elements and atomic elements.

Current implementation requires some schema knowledge
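One way to make the distinction explicit without schema knowledge is to derive it from the entry itself. The Python sketch below is a heuristic under stated assumptions, not the editor's approach: it treats any element that has child elements as a container and any other element as atomic, and the element names are hypothetical. A container that happens to be empty in a given entry would be misclassified, so the schema remains the authoritative source.

```python
import xml.etree.ElementTree as ET


def classify(entry):
    """Map each element tag to 'container' or 'atomic' (instance-based guess)."""
    kinds = {}
    for element in entry.iter():
        # len(element) is the number of child elements.
        # Once a tag has been seen with children it stays a container.
        if len(element) or kinds.get(element.tag) == "container":
            kinds[element.tag] = "container"
        else:
            kinds[element.tag] = "atomic"
    return kinds


sample = ET.fromstring(
    "<entry><headword>fiets</headword>"
    "<sense><definition>x</definition></sense></entry>"
)
kinds = classify(sample)
print(kinds)
```

A form could then render the two kinds differently, for instance a framed group for containers and a plain input field for atomic elements.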

Conclusion / future work

Developing forms is easy

Current implementation is satisfactory

Database solution (relational vs. XML)

Retrieval

Easy use of (X)Query language
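As a rough illustration of what retrieval over XML entries looks like, the Python sketch below uses the limited XPath subset of the standard library's ElementTree as a stand-in for a full XQuery engine, which the slides leave open. The `dictionary` wrapper and the element names are hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical mini-dictionary; the element names are invented.
root = ET.fromstring(
    "<dictionary>"
    "<entry><headword>fiets</headword><pos>noun</pos></entry>"
    "<entry><headword>fietsen</headword><pos>verb</pos></entry>"
    "</dictionary>"
)

# Select the headwords of all entries whose <pos> child has the text 'noun'.
nouns = [h.text for h in root.findall("./entry[pos='noun']/headword")]
print(nouns)
```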
