Digital Research and Big Data: Is the Tail Wagging the Dog?


Upload: eric-meyer

Post on 14-Jun-2015


DESCRIPTION

Poster presented at Digital Research 2012, 10-12 September 2012, digital-research.oerc.ox.ac.uk

TRANSCRIPT


Digital Research and Big Data: Is the Tail Wagging the Dog?

Ralph Schroeder & Eric T. Meyer
Oxford Internet Institute
University of Oxford
[ralph.schroeder],[eric.meyer]@oii.ox.ac.uk

Source: Leonard John Matthews, CC-BY-SA (http://www.flickr.com/photos/mythoto/3033590171)

Big data are data that are unprecedented in scale and scope in relation to a given phenomenon. They are often streams of data (rather than fixed datasets), accumulating large volumes, often at high velocity.

Is the tail of the availability of big data and computational methods wagging the dog of good research questions?

If not, how do big data advance research? What are the opportunities and challenges?

Case 1: Search Engine Behaviour

Waller’s [1] analysis of Australian Google users

Key findings:
- Mainly leisure
- > 2% contemporary issues
- No perceptible ‘class’ differences

Novel advance:
- Unprecedented insight into what people search for

Challenge:
- Replicability
- Securing access to commercial data

Case 2: Large-Scale Text Analysis

Michel et al.’s [7] ‘culturomic’ analysis of 5 million digitized Google Books and Heuser & Le-Khac’s [8] analysis of 2779 19th-century British novels

Key findings:
- Patterns of key terms
- Industrialization tied to shift from abstract to concrete words

Novel advance:
- Replicability, extension to other areas, systematic analysis of cultural materials

Challenge:
- Data quality

Case 3: Social Network or News?

Kwak et al.’s [17] analysis of Twitter

Key findings:
- 1.47 billion social relations
- 2/3 of users are not followers or not followed by any of their followings
- Celebrities, politicians and news are among top 20 being followed

Novel advance:
- Volume of relations and topics

Challenge:
- News or social network needs to be contextualized in media ecology
- Securing access to commercial data

Conclusions

Savage and Burrows? [6], who ask are commercial data outpacing social science?

Boyd and Crawford? [18], who ask if big data raise epistemological conundrums?

... No ...

The connection between research technologies and the advance of knowledge

The threats and opportunities represented by unprecedented windows into people’s minds and thoughts

Does this lead to more ‘scientific’ (i.e. cumulative) social sciences and humanities?

References

[1] V. Waller, “Not Just Information: Who Searches for What on the Search Engine Google?”, Journal of the American Society for Information Science and Technology, 62(4): 761-75, 2011.
[2] E. Segev and N. Ahituv, “Popular Searches in Google and Yahoo!: A ‘Digital Divide’ in Information Uses?” The Information Society 26 (1): 17-37, 2010.
[3] M. Hindman, The Myth of Digital Democracy. Princeton: Princeton University Press, 2010.
[4] B. Tancer, Click: What Millions of People are Doing Online and Why It Matters. New York: Harper Collins, 2009.
[5] W. H. Dutton and G. Blank, Next Generation Users: The Internet in Britain. Oxford Internet Survey 2011. Oxford Internet Institute, University of Oxford. Available at http://www.oii.ox.ac.uk/events/?id=453 (last accessed April 16, 2012).
[6] M. Savage and R. Burrows, “The Coming Crisis of Empirical Sociology”, Sociology 41(5): 885-899, 2011.
[7] J. Michel et al., “Quantitative Analysis of Culture Using Millions of Digitized Books”, Science, Vol. 331, no. 6014, pp. 176-182, 2010.
[8] R. Heuser and L. Le-Khac, “Learning to Read Data: Bringing out the Humanistic in the Digital Humanities,” Victorian Studies 54.1: 79-86, 2011.
[9] F. Moretti, “Conjectures on World Literature”, New Left Review, 1, p. 54-68, 2000.
[10] F. Moretti, Graphs, Maps, Trees: Abstract Models for Literary History. London: Verso, 2005.
[11] A. Stauffer, “Introduction: Searching Engines, Readings Machines”. Victorian Studies 54.1, 63-68, 2011.
[12] P. Duguid, “Inheritance and loss? A brief survey of Google Books”. First Monday 12(8), 2007. Online http://firstmonday.org/htbin/cgiwrap/bin/ojs/index.php/fm/article/view/1972/1847
[13] G. Nunberg, “Google’s Book Search: A Disaster for Scholars.” The Chronicle Review, August 31, 2009. Online http://chronicle.com/article/Googles-Book-Search-A/48245/.
[14] S. Fish, “Mind Your P’s and B’s: The Digital Humanities and Interpretation”. The New York Times Opinionator [Online Commentary]. January 23, 2012. Online http://opinionator.blogs.nytimes.com/2012/01/23/mind-your-ps-and-bs-the-digital-humanities-and-interpretation/?hp
[15] T. Porter, “Statistics and Statistical Methods”, in T. Porter and D. Ross, eds. The Modern Social Sciences. Cambridge: Cambridge University Press, 238-50, 2008.
[16] J. Beniger, The Control Revolution: Technological and Economic Origins of the Information Society. Cambridge, MA: Harvard University Press, 1996.
[17] Kwak, H. et al. (2010). ‘What is Twitter, a Social Network or a News Media?’ Proceedings of the 19th International World Wide Web (WWW) Conference, April 26-30, 2010, Raleigh NC.
[18] boyd, D. and Crawford, K. (2012). ‘Critical Questions for big data: Provocations for a cultural, technological and scholarly phenomenon’, Information, Communication and Society, 15(5), 662-79.