![Page 1: OLAC: Accessing the World’s Language Resourcesolac.ldc.upenn.edu/documents/iasa.pdf · Searching for “kayardild” using the OLAC service at LDC. What is the current coverage](https://reader033.vdocuments.mx/reader033/viewer/2022060812/6090c5e1dfdb3c19b510311f/html5/thumbnails/1.jpg)
OLAC:Accessing the World’s Language Resources
Steven Bird CSSE, University of Melbourne LDC, University of Pennsylvania
Gary Simons SIL International Graduate Institute of Applied Linguistics
![Page 2: OLAC: Accessing the World’s Language Resourcesolac.ldc.upenn.edu/documents/iasa.pdf · Searching for “kayardild” using the OLAC service at LDC. What is the current coverage](https://reader033.vdocuments.mx/reader033/viewer/2022060812/6090c5e1dfdb3c19b510311f/html5/thumbnails/2.jpg)
What is OLAC?
• OLAC is an international partnership of institutions and individuals who are creating a world-wide virtual library of language resources by:
• Developing consensus on best current practice for the digital archiving of language resources
• Developing a network of interoperating repositories and services for housing and accessing such resources
• Founded in December 2000
• now has 34 participating archives
• hosted at www.language-archives.org
![Page 3: OLAC: Accessing the World’s Language Resourcesolac.ldc.upenn.edu/documents/iasa.pdf · Searching for “kayardild” using the OLAC service at LDC. What is the current coverage](https://reader033.vdocuments.mx/reader033/viewer/2022060812/6090c5e1dfdb3c19b510311f/html5/thumbnails/3.jpg)
Who is involved in OLAC?
Aboriginal Studies Electronic Data Archive 1.Academia Sinica2.Alaska Native Language Center3.Archive of Indigenous Languages of Latin America 4.ATILF Resources5.Berkeley Language Center6.Centre de Ressources pour la Description de l'Oral7.CHILDES Data Repository8.Comparative Corpus of Spoken Portuguese9.Cornell Language Acquisition Laboratory10.Dictionnaire Universel Boiste 181211.DOBES catalogue (MPI, Nijmegen)12.Ethnologue: Languages of the World13.European Language Resources Association14.Laboratoire Parole et Langage15.Linguistic Data Consortium Corpus Catalog16.LINGUIST List Language Resources17.Natural Language Software Registry18.Online Database of Interlinear Text (ODIN)19.Oxford Text Archive20.PARADISEC21.Perseus Digital Library22.Research Papers in Computational Linguistics
23.Rosetta Project 1000 Language Archive24.SIL Language and Culture Archives25.Surrey Morphology Group Databases26.Survey for California and Other Indian Languages27.TalkBank 28.Tibetan and Himalayan Digital Library29.TRACTOR30.Typological Database Project31.University of Bielefeld Language Archive32.University of Queensland Flint Archive33.Virtual Kayardild Archive (Melbourne)
![Page 4: OLAC: Accessing the World’s Language Resourcesolac.ldc.upenn.edu/documents/iasa.pdf · Searching for “kayardild” using the OLAC service at LDC. What is the current coverage](https://reader033.vdocuments.mx/reader033/viewer/2022060812/6090c5e1dfdb3c19b510311f/html5/thumbnails/4.jpg)
How does OLAC work?
1. Archives submit catalogs in a standard format
2.Central index is updated every 8 hours
3.Accessed via search services
![Page 5: OLAC: Accessing the World’s Language Resourcesolac.ldc.upenn.edu/documents/iasa.pdf · Searching for “kayardild” using the OLAC service at LDC. What is the current coverage](https://reader033.vdocuments.mx/reader033/viewer/2022060812/6090c5e1dfdb3c19b510311f/html5/thumbnails/5.jpg)
How does OLAC Metadata relate to Dublin Core?
• OLAC has extended Dublin Core metadata by providing additional descriptors tailored to language resources:
• Subject language
• Linguistic type
• Linguistic field
• Discourse type
• Role
![Page 6: OLAC: Accessing the World’s Language Resourcesolac.ldc.upenn.edu/documents/iasa.pdf · Searching for “kayardild” using the OLAC service at LDC. What is the current coverage](https://reader033.vdocuments.mx/reader033/viewer/2022060812/6090c5e1dfdb3c19b510311f/html5/thumbnails/6.jpg)
Searching for “kayardild” using the OLAC service at LDC
![Page 7: OLAC: Accessing the World’s Language Resourcesolac.ldc.upenn.edu/documents/iasa.pdf · Searching for “kayardild” using the OLAC service at LDC. What is the current coverage](https://reader033.vdocuments.mx/reader033/viewer/2022060812/6090c5e1dfdb3c19b510311f/html5/thumbnails/7.jpg)
What is the current coverage of OLAC?
All archives Excluding Ethnologue
items 36,161 28,892
online items 21,579 (60%) 14,310 (50%)
ISO 639-3 coverage
7,334 3556
![Page 8: OLAC: Accessing the World’s Language Resourcesolac.ldc.upenn.edu/documents/iasa.pdf · Searching for “kayardild” using the OLAC service at LDC. What is the current coverage](https://reader033.vdocuments.mx/reader033/viewer/2022060812/6090c5e1dfdb3c19b510311f/html5/thumbnails/8.jpg)
OLAC Coverage in Relation to Language Size
![Page 9: OLAC: Accessing the World’s Language Resourcesolac.ldc.upenn.edu/documents/iasa.pdf · Searching for “kayardild” using the OLAC service at LDC. What is the current coverage](https://reader033.vdocuments.mx/reader033/viewer/2022060812/6090c5e1dfdb3c19b510311f/html5/thumbnails/9.jpg)
Online Resources in Relation to Language Size
![Page 10: OLAC: Accessing the World’s Language Resourcesolac.ldc.upenn.edu/documents/iasa.pdf · Searching for “kayardild” using the OLAC service at LDC. What is the current coverage](https://reader033.vdocuments.mx/reader033/viewer/2022060812/6090c5e1dfdb3c19b510311f/html5/thumbnails/10.jpg)
Current Issues
• Three shortcomings:
• metadata quality
• participation
• digitisation projects
• Broader Challenges
• searching for language resources on the web
• library catalogues and the deep web
• discovering OLAC
![Page 11: OLAC: Accessing the World’s Language Resourcesolac.ldc.upenn.edu/documents/iasa.pdf · Searching for “kayardild” using the OLAC service at LDC. What is the current coverage](https://reader033.vdocuments.mx/reader033/viewer/2022060812/6090c5e1dfdb3c19b510311f/html5/thumbnails/11.jpg)
NSF Project
• improve access to resources in language archives:
• All OLAC repositories should have up-to-date catalogs that contain metadata conforming to best practice.
• All major language archives should be participating in OLAC.
• All OLAC repositories should conform to current best practices for the long-term curation of their holdings.
• improve access to language resources on the web:
• Low-density language materials identified by linguistic web mining should be reliably categorized with OLAC vocabularies.
• Language resources held in libraries and digital repositories should be indexed in OLAC through services that crosswalk and enrich existing catalog records.
• Web search engines should index all OLAC records, so that users who discover language resources using a conventional web search quickly find OLAC records and are drawn to the OLAC site for more precise searching.
![Page 12: OLAC: Accessing the World’s Language Resourcesolac.ldc.upenn.edu/documents/iasa.pdf · Searching for “kayardild” using the OLAC service at LDC. What is the current coverage](https://reader033.vdocuments.mx/reader033/viewer/2022060812/6090c5e1dfdb3c19b510311f/html5/thumbnails/12.jpg)
Conclusion
• OLAC has a functioning infrastructure that allows our community to index and discover endangered language documentation
• We need to encourage every institution with endangered language resources to participate so that the catalog can be complete
• We could enhance standards for resource description to support metrics for summarising the extent of documentation for a language
![Page 13: OLAC: Accessing the World’s Language Resourcesolac.ldc.upenn.edu/documents/iasa.pdf · Searching for “kayardild” using the OLAC service at LDC. What is the current coverage](https://reader033.vdocuments.mx/reader033/viewer/2022060812/6090c5e1dfdb3c19b510311f/html5/thumbnails/13.jpg)
OLAC Website: Documents
![Page 14: OLAC: Accessing the World’s Language Resourcesolac.ldc.upenn.edu/documents/iasa.pdf · Searching for “kayardild” using the OLAC service at LDC. What is the current coverage](https://reader033.vdocuments.mx/reader033/viewer/2022060812/6090c5e1dfdb3c19b510311f/html5/thumbnails/14.jpg)
OLAC Website: Participating Archives
![Page 15: OLAC: Accessing the World’s Language Resourcesolac.ldc.upenn.edu/documents/iasa.pdf · Searching for “kayardild” using the OLAC service at LDC. What is the current coverage](https://reader033.vdocuments.mx/reader033/viewer/2022060812/6090c5e1dfdb3c19b510311f/html5/thumbnails/15.jpg)
Archive Details: PARADISEC
![Page 16: OLAC: Accessing the World’s Language Resourcesolac.ldc.upenn.edu/documents/iasa.pdf · Searching for “kayardild” using the OLAC service at LDC. What is the current coverage](https://reader033.vdocuments.mx/reader033/viewer/2022060812/6090c5e1dfdb3c19b510311f/html5/thumbnails/16.jpg)
Archive Metrics: PARADISEC
![Page 17: OLAC: Accessing the World’s Language Resourcesolac.ldc.upenn.edu/documents/iasa.pdf · Searching for “kayardild” using the OLAC service at LDC. What is the current coverage](https://reader033.vdocuments.mx/reader033/viewer/2022060812/6090c5e1dfdb3c19b510311f/html5/thumbnails/17.jpg)
Archive Metrics: PARADISEC
![Page 18: OLAC: Accessing the World’s Language Resourcesolac.ldc.upenn.edu/documents/iasa.pdf · Searching for “kayardild” using the OLAC service at LDC. What is the current coverage](https://reader033.vdocuments.mx/reader033/viewer/2022060812/6090c5e1dfdb3c19b510311f/html5/thumbnails/18.jpg)
XML Format of OLAC Record
![Page 19: OLAC: Accessing the World’s Language Resourcesolac.ldc.upenn.edu/documents/iasa.pdf · Searching for “kayardild” using the OLAC service at LDC. What is the current coverage](https://reader033.vdocuments.mx/reader033/viewer/2022060812/6090c5e1dfdb3c19b510311f/html5/thumbnails/19.jpg)
Display Format for OLAC Record
![Page 20: OLAC: Accessing the World’s Language Resourcesolac.ldc.upenn.edu/documents/iasa.pdf · Searching for “kayardild” using the OLAC service at LDC. What is the current coverage](https://reader033.vdocuments.mx/reader033/viewer/2022060812/6090c5e1dfdb3c19b510311f/html5/thumbnails/20.jpg)
Metadata Quality Analysis for OLAC Record
![Page 21: OLAC: Accessing the World’s Language Resourcesolac.ldc.upenn.edu/documents/iasa.pdf · Searching for “kayardild” using the OLAC service at LDC. What is the current coverage](https://reader033.vdocuments.mx/reader033/viewer/2022060812/6090c5e1dfdb3c19b510311f/html5/thumbnails/21.jpg)
Comparative Archive Metrics