What is Webometrics?
Mike Thelwall
Statistical Cybermetrics Research Group
University of Wolverhampton, UK
Virtual Knowledge Studio (VKS)
Information Studies
1. Introduction
□Webometrics is concerned with gathering data on, and measuring aspects of, the Web
  □web sites
  □web pages
  □hyperlinks
  □web search engine results
  □YouTube video commenter networks
  □MySpace Friend networks
□…for very varied social science purposes
New problems: Web-based phenomena
□Webometrics can be applied to understanding web-based phenomena
  □Why do web sites interlink?
  □Which web sites interlink?
  □What interlinking patterns exist?
  □What topics are frequently blogged about?
Old problems: Offline phenomena reflected online
□Some offline phenomena have measurable online reflections
  □International communication
  □Inter-university collaboration
  □University-business collaboration
  □The impact or spread of ideas
  □Public opinion
2. Examples
Blog searching - blogpulse.com
Example: Identifying and tracking public science concerns in blogs
□Over 100,000 blogs and other sources tracked daily via RSS feeds
□Objective: to identify and track public concerns about science
□E.g., “Schiavo” identified and tracked as a potential public science concern
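As a rough illustration of tracking a term through RSS feeds, the sketch below counts how many feed items mention a keyword on each day. It is only a minimal sketch, not the method used in the project described here: the function name `daily_mentions` and the feed content in `SAMPLE_RSS` are hypothetical and invented for the example, and a real study would fetch many live feeds rather than parse a fixed string.

```python
import xml.etree.ElementTree as ET
from collections import Counter
from email.utils import parsedate_to_datetime

# Hypothetical RSS 2.0 content, invented purely for illustration.
SAMPLE_RSS = """<rss version="2.0"><channel>
<title>Example blog</title>
<item><title>Schiavo and science</title><description>A post</description>
<pubDate>Mon, 21 Mar 2005 10:00:00 GMT</pubDate></item>
<item><title>Another post</title><description>More on the Schiavo case</description>
<pubDate>Mon, 21 Mar 2005 18:30:00 GMT</pubDate></item>
<item><title>Weekend plans</title><description>Nothing relevant here</description>
<pubDate>Tue, 22 Mar 2005 09:00:00 GMT</pubDate></item>
<item><title>Follow-up</title><description>schiavo again</description>
<pubDate>Tue, 22 Mar 2005 12:00:00 GMT</pubDate></item>
</channel></rss>"""


def daily_mentions(rss_text, keyword):
    """Count feed items per day whose title or description contains keyword."""
    root = ET.fromstring(rss_text)
    counts = Counter()
    needle = keyword.lower()
    for item in root.iter("item"):
        title = item.findtext("title") or ""
        description = item.findtext("description") or ""
        pub_date = item.findtext("pubDate")
        if pub_date is None:
            continue  # an undated item cannot be placed on a timeline
        if needle in (title + " " + description).lower():
            day = parsedate_to_datetime(pub_date).date().isoformat()
            counts[day] += 1
    return dict(counts)


mentions = daily_mentions(SAMPLE_RSS, "schiavo")
for day in sorted(mentions):
    print(day, mentions[day])
```

Plotting such daily counts over time is one simple way to see when a topic rises and falls in the tracked sources. Case-insensitive substring matching is a deliberate simplification and would need refining for ambiguous terms.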
Example: The online impact of research groups (NetReAct)
Example: Links between EU universities
(Figure: network diagram of normalised linking, smallest countries removed. Labels: Geopolitical, connected. Countries shown: Sweden, Finland, Norway, UK, Germany, Austria, Switzerland, Poland, Italy, Belgium, Spain, France, NL.)
International biofuels research network
Example: MySpace age profiles
Percentage of profiles containing swearing

Group              Moderate  Strong  Very strong  Sample size
US males 16-19     10%       47%     2%           1,530
US females 16-19   11%       38%     2%           1,287
UK males 16-19     33%       33%     8%           171
UK females 16-19   18%       38%     3%           130

(Typical sample size for non-web swearing research: 20-148)
Emphatic adverb/adjective OR adverbial booster OR premodifying intensifying negative adjective (36% of swearing)
□and we r guna go to town again n make a ryt fuckin nyt of it again lol
□see look i'm fucking commenting u back
□lol and stop fucking tickleing me!!
□Thanks for the party last night it was fucking good and you are great hosts.
□That 50's rock and roll weekender was fucking mint!
□Fuckin my space, my arse
□1/2 d ppl cudnt even speak fuckin english!
□yeah so me and sarah broke up and everythings fucking shit
YouTube – Video poster ages
YouTube friend network
Online impact - Keywords in web pages mentioning IWRM
Data Gathering/Processing Tools
□Blogpulse.com – blog network diagrams
□LexiURL Searcher – links, web text, YouTube, Flickr, Technorati
□Issue Crawler, Google TouchGraph - links
Discussion points for online data
□Validity – is the underlying meaning of the text/video/picture readily apparent to the researcher?
  □Possibly not to any great degree for teenagers’ MySpace comments or very personal YouTube videos
□Reliability – are search engines accurate/good at returning the correct results?
  □Google blog search shows unreliability – very variable over time
  □Researchers can triangulate across similar search engines, or over time, to test reliability
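One simple way to put a number on "very variable over time" is the coefficient of variation (standard deviation divided by mean) of the hit counts a search engine reports for the same query on different dates. The sketch below is only an illustration under stated assumptions: the engine names and hit counts are hypothetical placeholders, not measurements, and the coefficient of variation is one possible measure rather than the one used in the work described here.

```python
from statistics import mean, pstdev


def coefficient_of_variation(counts):
    """Return standard deviation / mean for a series of hit counts."""
    average = mean(counts)
    if average == 0:
        return 0.0  # no hits at all: treat as no measurable variation
    return pstdev(counts) / average


# Hypothetical hit counts for one query on four dates, invented for illustration.
hit_counts = {
    "engine_a": [100, 100, 100, 100],
    "engine_b": [50, 150, 50, 150],
}

variability = {
    name: coefficient_of_variation(counts)
    for name, counts in hit_counts.items()
}

for name in sorted(variability):
    print(f"{name}: {variability[name]:.2f}")
```

A value near zero suggests stable results for that query, while a large value suggests the engine's counts should be treated with caution or cross-checked against another source.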
Discussion points for online data
□Coverage – to what extent are all the phenomena of interest covered by the source (e.g., search engine) used?
□Sample bias – are certain types of people over-represented? (e.g., the more literate, the more vocal, the more politically active, youth, educated, creative types…)
Summary
□The web contains a wide variety of interesting web and “web 2.0” content posted by many different people in many different formats
□Webometric methods can give insights into this data
Books
□Thelwall, M. (2009). Introduction to webometrics: Quantitative web research for the social sciences. New York: Morgan & Claypool.
□Rogers, R. (2005). Information politics on the Web. Cambridge, MA: MIT Press.
□http://lexiurl.wlv.ac.uk
□http://webometrics.wlv.ac.uk
□http://www.issuecrawler.net
Important considerations
□Data accuracy
□Data cleaning
□Context to help interpret results
□Report results carefully
Example: Analysis of the accuracy of search engine results
Live Search results analysis