Carl Burnett: Searching the Corpus of Contemporary American English
Searching the Corpus of Contemporary American English (COCA)
Carl Burnett
1 April 2012
A UW iSchool info skills workshop
What is it?
A large (425 million-word) collection of electronic texts, with some very fancy search options
Compiled by Mark Davies, a computational linguist at Brigham Young University
What’s it good for?
Answering language-based questions in an objective way: using data.
◦ Who is more likely to use the word "contemptible", fiction writers or journalists?
◦ What are some synonyms of the word "research"?
◦ What nouns are most likely to be described as "glittering"?
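The fiction-versus-journalism question comes down to comparing how often a word occurs per million words in each genre. Here is a minimal sketch of that idea in Python. It does not call COCA. The toy corpus, the function names, and the simple tokenizer are all assumptions made up for illustration.

```python
import re
from collections import Counter

# Hypothetical toy corpus (NOT COCA data): genre -> list of texts.
TOY_CORPUS = {
    "fiction": [
        "He gave a contemptible little laugh.",
        "What a contemptible man, she thought.",
    ],
    "newspaper": [
        "The council called the delay regrettable.",
        "Officials described the attack as contemptible.",
    ],
}


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def rate_per_million(word, texts):
    """Occurrences of `word` per million tokens in `texts`."""
    tokens = [tok for text in texts for tok in tokenize(text)]
    if not tokens:
        return 0.0
    return Counter(tokens)[word] / len(tokens) * 1_000_000


def compare_genres(word, corpus):
    """Map each genre to the word's normalized frequency in that genre."""
    return {genre: rate_per_million(word, texts) for genre, texts in corpus.items()}


if __name__ == "__main__":
    for genre, rate in compare_genres("contemptible", TOY_CORPUS).items():
        print(f"{genre}: {rate:,.0f} per million words")
```

Rates are normalized by genre size because the genres hold different amounts of text, so raw counts would not be comparable.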
Why not just use Google?
More search features:
◦ Wildcards
◦ Proximity searching (collocates)
◦ Parts of speech
Categorized source material:
◦ Academic writing
◦ Fiction
◦ Spoken text
◦ Newspaper
◦ Magazine
Search by time period (1990-2011)
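To make two of these features concrete, here is a rough Python sketch of what a wildcard search and a collocate (proximity) search compute. It is not COCA's query syntax or its implementation. The sample sentences and function names are invented for illustration, and part-of-speech filtering is left out because it would need a tagger.

```python
import re
from collections import Counter
from fnmatch import fnmatchcase

# Hypothetical sample sentences (NOT COCA data).
TOY_TEXTS = [
    "She wore a glittering necklace to the glittering gala.",
    "The glittering prizes went to a glittering career in law.",
    "Snow glittered on the roof.",
]


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def wildcard_matches(pattern, texts):
    """Count tokens matching a shell-style pattern such as 'glitter*'."""
    counts = Counter()
    for text in texts:
        counts.update(tok for tok in tokenize(text) if fnmatchcase(tok, pattern))
    return counts


def collocates(node, texts, window=1):
    """Count words found within `window` tokens on either side of `node`."""
    counts = Counter()
    for text in texts:
        tokens = tokenize(text)
        for i, tok in enumerate(tokens):
            if tok == node:
                left = tokens[max(0, i - window):i]
                right = tokens[i + 1:i + 1 + window]
                counts.update(left + right)
    return counts


if __name__ == "__main__":
    print(wildcard_matches("glitter*", TOY_TEXTS))
    print(collocates("glittering", TOY_TEXTS, window=1).most_common())
```

Without a part-of-speech filter, the collocate list mixes nouns like "necklace" with function words like "the". Restricting results to nouns is what lets a corpus tool answer the "glittering" question directly.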
Let’s play around with it!
Free access (create an account for full features) at http://corpus.byu.edu/coca/