D-Square
Digital Databases and Digital Tools for WBD and WLD
Folkert de Vriend
17-05-06
Outline
• Digitisation project (briefly)
• Plans and ideas for papers
  A. Data-driven clustering
  B. Open Language Resources
  C. Cartography
People
CLST: Lou Boves, Henk van den Heuvel, Folkert de Vriend
CLS: Roeland van Hout, Joep Kruijsen, Jos Swanenberg
Polderland: Theo van de Heuvel
WBD page ->
Data conversion overview ->
[Diagram: data conversion overview. Column headings: Analog, Digital, Digital, Analog. Roles: Editors / Management; Users. Box labels:]
• Questionnaires Nijmegen and Leuven
• Questionnaires (chiefly) Meertens
• Filing cards
• Raw data: Vol. I + II, Vol. III
• Raw data (FileM Pro)
• Vol. III (FileM Pro)
• (parts of) Vol. I+II (MacWrite)
• (parts of) Vol. I+II (MS-Word)
• Vol. III (MS-Word)
• Edited data
• Edited data (XML)
• Enriched data (XML)
• Online DB WBD (Polderland)
• SGV on CD (Polderland)
• Website WBD/WLD with tools for searching and cartography
• Specialized print editions (dialect atlas or local dictionary)
Web access
• Taxonomic access to data
• Search interface
Research ideas and plans
A: Data-driven clustering
Human interpretation of patterns vs. computational clustering based on distances (lexical or phonetic).
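As a rough illustration of what "computational clustering based on distances" could look like, the sketch below groups localities by lexical distance. It is not the project's method: the distance measure (share of concepts with differing variants), the average-linkage strategy, and all names and data (`records`, `loc_A`, `concept_1`, ...) are assumptions made up for the example.

```python
from itertools import combinations

# Hypothetical toy data: the lexical variant recorded per concept in each locality.
# Locality names and variants are placeholders, not taken from the WBD/WLD.
records = {
    "loc_A": {"concept_1": "x", "concept_2": "p", "concept_3": "m"},
    "loc_B": {"concept_1": "x", "concept_2": "p", "concept_3": "n"},
    "loc_C": {"concept_1": "y", "concept_2": "q", "concept_3": "n"},
    "loc_D": {"concept_1": "y", "concept_2": "q", "concept_3": "m"},
}


def lexical_distance(a, b):
    """Share of shared concepts for which two localities use different variants."""
    shared = set(a) & set(b)
    if not shared:
        return 1.0  # assumption: no overlap counts as maximally distant
    return sum(a[c] != b[c] for c in shared) / len(shared)


def cluster(records, n_clusters):
    """Average-linkage agglomerative clustering down to n_clusters groups."""
    dist = {
        frozenset((p, q)): lexical_distance(records[p], records[q])
        for p, q in combinations(records, 2)
    }
    clusters = [[name] for name in sorted(records)]

    def linkage(c1, c2):
        # Mean pairwise distance between the members of two clusters.
        total = sum(dist[frozenset((p, q))] for p in c1 for q in c2)
        return total / (len(c1) * len(c2))

    while len(clusters) > n_clusters:
        # Merge the two closest clusters (i < j, so deleting j is safe).
        i, j = min(
            combinations(range(len(clusters)), 2),
            key=lambda ij: linkage(clusters[ij[0]], clusters[ij[1]]),
        )
        clusters[i] = sorted(clusters[i] + clusters[j])
        del clusters[j]
    return clusters


groups = cluster(records, 2)
print(groups)
```

A phonetic variant of the same idea would replace `lexical_distance` with a string distance over transcriptions; the resulting groups could then be compared with the patterns a human interpreter sees on the map.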
B: Open Language Resources
• “Wikipedia-style” LR
• Digitisation is not the end of the evolution of an LR
• The Web seems to be evolving towards “Social Computing”
• Think of railroads -> cars
Policing
• How to automate police activities regarding open (language) resources?
• Maybe “distant” entries/edits are more suspicious. When distant -> notify police.
C: Cartography
• Cartography as a tool, not just for illustration -> Google Earth.
• Advantages:
  • Different views on the data.
  • Easy to link different resources (also for the end user).
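To make the Google Earth idea concrete, the sketch below writes dictionary entries as KML placemarks, the file format Google Earth opens. It only illustrates the general approach: the `build_kml` helper, the tuple layout, the placeholder descriptions, and the approximate coordinates are assumptions, not part of the WBD/WLD tooling.

```python
import xml.etree.ElementTree as ET

KML_NS = "http://www.opengis.net/kml/2.2"


def build_kml(points):
    """Return a KML string with one Placemark per (name, description, lon, lat)."""
    ET.register_namespace("", KML_NS)  # emit KML as the default namespace
    kml = ET.Element(f"{{{KML_NS}}}kml")
    doc = ET.SubElement(kml, f"{{{KML_NS}}}Document")
    for name, description, lon, lat in points:
        placemark = ET.SubElement(doc, f"{{{KML_NS}}}Placemark")
        ET.SubElement(placemark, f"{{{KML_NS}}}name").text = name
        ET.SubElement(placemark, f"{{{KML_NS}}}description").text = description
        point = ET.SubElement(placemark, f"{{{KML_NS}}}Point")
        # KML coordinates are written longitude first, then latitude, then altitude.
        ET.SubElement(point, f"{{{KML_NS}}}coordinates").text = f"{lon},{lat},0"
    return ET.tostring(kml, encoding="unicode")


# Hypothetical entries: approximate coordinates, placeholder descriptions.
points = [
    ("Nijmegen", "placeholder: variant recorded here", 5.85, 51.84),
    ("Leuven", "placeholder: variant recorded here", 4.70, 50.88),
]
kml_text = build_kml(points)
print(kml_text)
```

Saving `kml_text` to a `.kml` file gives a layer that can be opened alongside other layers, which is one way the "different views" and "easy to link different resources" advantages could be realised.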
Implementation
• Short term: paper on data-driven clustering; paper on cartography.
• Longer term: paper(s) on Open LR / Social Computing.