A webometric analysis of online health information: sponsorship, platform type and link structures



Darja Groselj
Oxford Internet Institute, University of Oxford, Oxford, UK

Abstract

Purpose – This study aims to map the information landscape as it unfolds to users when they search for health topics on general search engines. Website sponsorship, platform type and linking patterns were analysed in order to advance the understanding of the provision of health information online.

Design/methodology/approach – The landscape was sampled by ten very different search queries and crawled with VOSON software. Drawing on Rogers’ framework of information politics on the web, the landscape is described on two levels. The front-end is examined qualitatively by assessing website sponsorship and platform type. On the back-end, linking patterns are analysed using hyperlink network analysis.

Findings – A vast majority of the websites have commercial and organisational sponsorship. The analysis of the platform type shows that health information is provided mainly on static homepages, informational portals and general news sites. A comparison of ten different health domains revealed substantial differences in their landscapes, related to domain-specific characteristics.

Research limitations/implications – The size and properties of the web crawl were shaped by using third party software, and the generalisability of the results is limited by the selected search queries. Further research exploring how specific characteristics of different health domains shape provision of information online is suggested.

Practical implications – The demonstrated method can be used by organisations to discern the characteristics of the online information landscape in which they operate and to inform their business strategies.

Originality/value – The study examines health information landscapes on a large scale and makes an original contribution by comparing them across ten different health domains.

Keywords Search engines, Sponsorship, Webometrics, Health information, Hyperlink analysis, Platform type

Paper type Research paper

Online Information Review, Vol. 38 No. 2, 2014, pp. 209-231. © Emerald Group Publishing Limited, 1468-4527. DOI 10.1108/OIR-01-2013-0011. Received 16 January 2013; first revision approved 7 May 2013. The current issue and full text archive of this journal is available at www.emeraldinsight.com/1468-4527.htm

The author would like to thank Dr Sandra Gonzalez-Bailon and Prof. Ralph Schroeder for their help and guidance throughout the course of this research. The author would also like to thank Prof. John Powell and two anonymous reviewers for their helpful comments and suggestions, from which the final version of the paper has benefited greatly. This research was supported by the Slovene Human Resources Development and Scholarship Fund which covered the author’s tuition fees.

Introduction
The diffusion of internet technologies and advancement of online information services have dramatically changed the ways in which people seek and consume health information (Sundar et al., 2011). Research shows that the internet is “the de facto second opinion” people rely on in addition to doctors (Szokan, 2011) and that it strongly affects how people manage their own or someone else’s health (Fox and Jones, 2009; Sadasivam et al., 2013). The internet has altered the landscape of health information, particularly in terms of expanding the range of available resources, improving access and empowering patients (Powell et al., 2011; Cline and Haynes, 2001). It makes it easier for people to seek health information themselves, to become more involved in their own health and to become exposed to a wider array of health information (Rice, 2006), ranging from organisation-sponsored information websites to interactive applications such as support forums, online consultations and social media sites (van der Vaart et al., 2013; Thackeray et al., 2013; Eysenbach, 2008).

The use of the internet for health-related issues is extensive. In the US four out of five internet users seek health information online, this activity being the third most popular after emailing and use of search engines (Fox, 2011), and eight in ten online health enquiries begin by using a general search engine (Fox and Duggan, 2013). However it is not entirely clear what kind of sources people are most likely to encounter when searching for health information on general search engines. There is a need for a more substantial understanding of the online health information landscape across different health domains. Thus this study addresses the following questions: what information sources are present and most prominent in the online health realm, how do they relate to one another, and what do they offer to end-users?

“Information landscape” is not a fully defined term, but scholars often use it to refer to the constellation of sites providing information on the web (Delamothe, 2001; Eng, 2001; O’Day and Jeffries, 1993). Previous investigations of the health information landscape differ in scope and in how the entry point to the landscape was determined. Some authors provide detailed analyses of a handful of websites selected from exogenously created databases of health-related websites (e.g. Rice et al., 2001; West and Miller, 2009; Grossman and Zerilli, 2013). While such studies offer insight into specific websites’ sponsorship, content and features, they do not assess their prominence or how they relate to other websites in the larger health-related web. Studies of search results for medical queries are often limited by a small selection of topic-specific search queries and focus primarily on assessing the quality of provided information (e.g. Kaimal et al., 2008; Reichow et al., 2012; Quinn et al., 2012; Scullard et al., 2010; Black and Penson, 2006). There are three notable exceptions: Laurent and Vickers (2009) examined several thousand queries for different medical conditions, but they focussed exclusively on Wikipedia’s position in search results; Bowler et al. (2011) assessed the visibility of online health information portals for teens, but did so for only six websites; and Kitchens et al. (2012) studied the first pages of search results for over 2,000 health terms, but they focussed on website quality alone.

The present study approaches the health information landscape on a large scale, taking into account different health domains and thousands of websites for each. The analysis is twofold, following Rogers’ (2004) framework of information politics on the web where sources are in constant competition for the privilege of providing information. Rogers distinguishes between the front- and back-end of the web. The front-end addresses the issues of inclusivity, fairness and scope of representation, whereas the back-end deals with information retrieval. This study sets out to analyse the front- and back-ends of the health information landscape. First on the visible end, sponsorship and platform type of the most prominent medical websites are assessed. Second the back-end of hyperlink connections between websites is analysed in order to understand how different entities relate to one another, and what informational paths they create.


Unravelling the structure of the health information landscape on the front- and back-end is important for two reasons: first it provides insight into what types of websites health information seekers are most likely to encounter online, and second it reveals what linking paths these different informational venues create. The twofold approach forms a basis for understanding how provision of information and navigational paths are shaped by the interplay of different interests in the online health information realm.

Front-end health information
The visible end of the health information landscape has most often been described through a lens of sponsorship where the underlying question concerns a website’s organisational origins. Referring to the general distinction between commercial, organisational, educational and governmental websites, studies explain how different health-related websites have different features, audiences and motivations (Eng, 2001; Rice et al., 2001; West and Miller, 2009). Knowing about a website’s sponsorship is important, since it may affect its content provision (Rice et al., 2001) and how users appraise the site (Morahan-Martin, 2004). In the light of today’s social and participatory web, standard sponsorship categorisation is no longer sufficient for capturing the variety of the health information landscape. Authors thus introduce categories such as “individual person’s site” (Scullard et al., 2010; Reichow et al., 2012) and “blog/other interactive media” (Bowler et al., 2011). While such categories represent individuals’ presence on the web, they are less suitable for categorising social media sites where individuals as well as companies, organisations and other entities are normally present.

Knowledge about a website’s sponsorship does not convey much about its services and features. From the lay users’ perspective it is important to distinguish between a website’s sponsorship and the type of platform it provides, that is, what it offers to users in terms of content, services and features (Rice et al., 2001). Sponsorship and platform type are two separate dimensions of health-related websites. In previous research these two dimensions are often conflated by being included in a single categorisation scheme. For example Reichow et al. (2012) propose ten categories, e.g. “government”, “organisation”, “online information website” and “a collection of links to other websites” to describe a website’s purpose. These categories are not mutually exclusive as both government and non-profit organisations can provide “online information websites” or “collections of links”. Many existing categorisation schemes are also subject-specific. For example Kaimal et al. (2008) develop categories specific to obstetrics topics. Such twofold subject-specific schemes cannot be applied to the investigation of the health-related web on a large scale. Therefore a general scheme for assessing websites’ platform types – what information, features and services they offer to users – is developed to complement the sponsorship dimension of the landscape.

When entering health information landscapes through general search engines lay users should be aware of the interplay of various interests driving the provision of content and services. Thus the aim of defining websites’ sponsorship and platform type is to describe what types of informational venues users are likely to encounter online.

Back-end health information
Information on the web has a fundamental network structure, with hyperlinks connecting websites in informational networks. By analysing links we can understand how websites are related, detect different communities and identify the most prominent websites (Kleinberg and Easley, 2010). Essentially links play two roles on the web. They serve as navigation tools, connecting users to information, while also determining websites’ visibility and centrality (Gonzalez-Bailon, 2009). Even though links can be made to other sites for numerous reasons (Bar-Ilan, 2005) researchers often see them as proxies for a site’s quality and importance (Brin and Page, 1998; Thelwall, 2004). Incoming links increase a website’s likelihood of being encountered: more links translate to more paths leading to the site, which is positively correlated with the amount of traffic it receives (Hindman, 2008). Several investigations of the web showed that inlinks and consequently traffic follow a power law distribution (Kleinberg and Easley, 2010), where a vast majority of websites receive very few links and a small minority accumulates many more (Thelwall, 2004). By studying hyperlink connections between different types of websites, we can predict what informational venues receive the majority of traffic. The number of links a website receives influences its visibility, i.e. its position in search engine results (Hindman, 2008; Kleinberg and Easley, 2010; Thelwall, 2004). Since general search engines are the starting point in searches for health information for the majority of internet users across cultures (Fox and Duggan, 2013; Hargittai and Young, 2012; Hansen et al., 2003; Mager, 2012) the distribution of links significantly influences users’ experience of the information landscape.
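The link-counting logic described above can be illustrated with a minimal sketch. It counts inlinks (indegree) per site from a list of hyperlinks and reports the share of all inlinks captured by the most-linked sites, which is one simple way to see a skewed, few-sites-get-most-links distribution. The helper names `indegree` and `top_share`, the site names and the toy edge list are hypothetical and are not taken from the study's data or from VOSON.

```python
from collections import Counter


def indegree(edges):
    """Count inbound links per target site, ignoring self-loops."""
    return Counter(target for source, target in edges if source != target)


def top_share(degrees, k):
    """Fraction of all inbound links received by the k most-linked sites."""
    total = sum(degrees.values())
    if total == 0:
        return 0.0
    top = sum(count for _, count in degrees.most_common(k))
    return top / total


# Toy hyperlink list (source site, target site); purely illustrative.
edges = [
    ("a.org", "hub.com"),
    ("b.org", "hub.com"),
    ("c.org", "hub.com"),
    ("d.org", "hub.com"),
    ("a.org", "b.org"),
    ("hub.com", "hub.com"),  # self-loop, excluded from the count
]

degrees = indegree(edges)
share_of_top_site = top_share(degrees, 1)
```

In this toy network one site receives four of the five counted inlinks, so a user following links at random is far more likely to reach it than any other site.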

Complementing qualitative analysis of health-related websites with quantitative examination of their link structures is crucial for understanding the characteristics of the online health information landscape (Seale, 2005). To the best of the author’s knowledge, this approach has not been widely applied in the internet medical information-seeking research. According to Thelwall (2010, p. 713), health and medicine are “particularly fertile areas for a range of webometrics research”, but they lack a connection with information science. The present study aims to contribute towards bridging this gap.

Methodology
Search query selection
This study focuses on health-related topics that are of most interest to internet users. Interests on the web are often measured in terms of search volumes. Among various lists of most searched health topics, a list of the ten most popular health queries worldwide in 2010 on Yahoo! search was selected (Yahoo!, 2010). These are (starting with the most frequently searched query): pregnancy, diabetes, herpes, shingles, lupus, depression, breast cancer, gall bladder, HIV and fibromyalgia. Yahoo! search queries were selected to ensure consistency throughout the study since VOSON web-crawling software uses Yahoo! API to collect inlinks data.

The selection of search queries crucially determines the hyperlink data collection, in terms of which portion of the web will be sampled. Thus the sampling is based on popular medical queries to reconstruct landscapes that health information seekers are most likely to navigate online. Another concern is that using simple one-word queries might not reflect how lay users actually search for medical information. However users’ search techniques have been shown to be somewhat “suboptimal”, meaning people tend to use too general, misspelled or one-word search queries (Eysenbach and Kohler, 2002, p. 575; van der Vaart et al., 2013). The selected search queries also echo the fact that internet users mostly look for information about specific diseases or medical problems (Fox, 2011).


Web crawling and data cleaning
To retrieve seed sites and to collect hyperlink data the VOSON (Ackland, 2011) web-crawling and webometric analysis tool was used. VOSON allows the crawl to start on a specific webpage. Preserving specific webpages is important, as it assures more precise sampling of the landscape. Using first-order domains as seeds could result in a big overlap between ten topical landscapes.

The crawl was carried out in May 2011. For each selected query the 100 top-ranked URLs in Yahoo! search results were selected as seeds. The threshold for inclusion was set to the 100th webpage in order to ensure that all pages which might appear in any user’s top search results are included in the sample. Put differently, general search engines employ algorithms which produce custom-tailored search results based on a user’s search history, interests, context and other similar factors (Micarelli et al., 2007; Goldman, 2008); thus each user’s top search results for the same query might be different, but it is assumed that the set of webpages displayed among the first 100 results is more or less the same. The scale of the crawl was constrained by the software’s pre-set boundaries (Ackland, 2011). The crawling procedure was repeated for each selected search query.

For each health domain two types of hyperlink networks were generated automatically: one where the unit of analysis is a webpage and one where the unit of analysis is a website, meaning that all webpages with the same root URL are joined under the same node. All website-based networks were inspected and cleaned to ensure that grouping was conceptually correct, where each organisation is represented by a single node. To ensure conceptual similarity and proportional sizes of nodes, blogging and social networking site profiles (e.g. Blogger.com, Twitter.com) were grouped by platforms. By contrast commercially hosted websites (e.g. websites hosted by 50webs.com: 53 such services were found) were not grouped. Websites that could not be processed, links which served only for navigation (shortened or broken links, e.g. Bit.ly) or pointed to software providers (they could appear as the most central nodes in the network even though they are not part of it conceptually, e.g. Adobe.com; Ackland and Antony, 2007) were also identified. Such websites (26 in total) were deleted from each network.
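The step from a webpage-based to a website-based network can be sketched roughly as follows. This is only an illustration under stated assumptions, not VOSON's actual procedure: the website is approximated here by the URL's hostname with a leading "www." removed, and the helper names `site_of` and `collapse`, the example URLs and the exclusion list are hypothetical. The manual cleaning described above (grouping profiles by platform, keeping commercially hosted sites apart) is not reproduced.

```python
from urllib.parse import urlsplit


def site_of(url):
    """Approximate a page's website by its hostname (assumption: no 'www.')."""
    host = urlsplit(url).hostname or ""
    return host[4:] if host.startswith("www.") else host


def collapse(page_edges, excluded=frozenset()):
    """Turn page-to-page links into a set of site-to-site links.

    Links touching an excluded site (e.g. a software provider or a
    link shortener) are dropped, mirroring the cleaning step in spirit.
    """
    site_edges = set()
    for source_url, target_url in page_edges:
        source, target = site_of(source_url), site_of(target_url)
        if not source or not target:
            continue  # URL could not be processed
        if source in excluded or target in excluded:
            continue
        site_edges.add((source, target))
    return site_edges


# Hypothetical page-level links.
page_edges = [
    ("http://www.example.org/a", "http://health.example.com/x"),
    ("http://example.org/b", "http://health.example.com/y"),
    ("http://example.org/b", "http://www.adobe.com/reader"),
]

site_edges = collapse(page_edges, excluded={"adobe.com"})
```

Because many pages map onto one site, the website-based network is necessarily smaller than the webpage-based one, which is the relationship reported for each health domain in Table I.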

Website categorisation
With seed websites plus the most important sites by indegree (this approach is discussed in the results section), 641 unique websites were categorised. Coding was done manually by the author. Two questions were asked for each website: who provides or sponsors it, and what are its purpose and its distinctive features. A set of heuristic rules was followed in inspecting domain names, “about us” pages, and available features (Bowler et al., 2011). The initial coding schemes were developed by drawing on previous literature. They were refined during the actual analysis.

A website’s sponsorship was determined based on the type of entity responsible for the site. The initial set of general categories was adopted from Vaughan et al. (2007): commercial, organisational (non-profit entities), governmental, educational (hospitals affiliated with universities were also coded as educational) and personal. The scheme was amended by introducing the hybrid category. It includes venues such as social media sites (e.g. Twitter) where pages can be created by lay users or any other entity, and websites where sponsorship is shared. Broken links, sites under construction or in languages other than English were coded as n.a.


A new coding scheme for describing the platform type was developed. It is primarily grounded in Sundar et al. (2011) and Hu and Sundar’s (2010) work where they describe certain features of the online health-related landscape such as access to and provision of content, connectivity, community, commerce, care, and gatekeeping. A total of 11 categories were developed: blog (personal diaries), business: non-medical (selling products or services not related to health care), business: medical (selling drugs, treatments, and other health-related products), homepage (the purpose of a site is representing the sponsor and its activities), informational portal (systematically providing health information), information and community (providing medical information as well as a community space), news (general news sites), search engines (search engines and directories), social networking site (non-medical social networking sites), wiki and forum (sites using wiki technology and discussion forums), and n.a.

To ensure that the author’s coding was reliable two external coders each categorised a different set of 100 randomly selected websites. Each coder’s categorisation was compared to the author’s categorisation. Cohen’s (1960) kappa was used to evaluate the reliability of the coding. The average kappa values between the coders and the author were 0.79 for sponsorship and 0.74 for the platform type dimension. According to Landis and Koch (1977) these are “substantial” levels of inter-coder reliability.
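For readers unfamiliar with the statistic, Cohen's kappa compares the observed agreement between two coders with the agreement expected by chance from each coder's category frequencies: kappa = (p_o - p_e) / (1 - p_e). The following is a minimal sketch of that formula; the function name `cohens_kappa` and the four example codings are hypothetical and do not reproduce the study's coding data.

```python
from collections import Counter


def cohens_kappa(coder_a, coder_b):
    """Cohen's kappa for two equally long lists of category labels."""
    if len(coder_a) != len(coder_b) or not coder_a:
        raise ValueError("both coders must rate the same non-empty set of items")
    n = len(coder_a)
    # Observed agreement: share of items given the same label by both coders.
    observed = sum(a == b for a, b in zip(coder_a, coder_b)) / n
    # Chance agreement: product of the coders' marginal label proportions.
    freq_a, freq_b = Counter(coder_a), Counter(coder_b)
    expected = sum(freq_a[label] * freq_b[label] for label in freq_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both coders used a single identical label throughout
    return (observed - expected) / (1 - expected)


# Hypothetical sponsorship codes for four websites.
author = ["commercial", "commercial", "organisational", "organisational"]
coder = ["commercial", "commercial", "organisational", "commercial"]

kappa = cohens_kappa(author, coder)
```

Here the coders agree on three of four sites (p_o = 0.75) while chance agreement is p_e = 0.5, so kappa is 0.5, well below the 0.79 and 0.74 reported above.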

Results
Landscapes overview
Two types of networks were obtained for each of the ten health domains: a network of unique URLs (webpage-based) which was then collapsed to a website-based network. Therefore the initial networks are much bigger than the networks of unique websites (see Table I).

The differences in network sizes across health domains are striking (see Table I, columns 2-5). The biggest webpage-based network pertains to depression with almost 100,000 unique webpages, whereas the smallest – gall bladder – has only 16,769. This suggests that depression as well as shingles, HIV, diabetes, and pregnancy – all with over 80,000 URLs – are topics which extend further into the web. Across health domains, a similar number of top-ranked websites generates a very different amount of inbound and outbound links.

The variety of topical landscapes is captured in the sizes of website-based networks. The most diverse is the shingles landscape, which has 13 times more unique websites than the least diverse gall bladder, herpes and breast cancer landscapes. For some topics users can access more diverse informational venues. The ratios between webpage- and website-based networks indicate that the more URLs there are in the initial network, the bigger the number of unique stakeholders intertwined in a landscape.

Of the total 250,176 websites in ten topical landscapes, almost 60 per cent are unique. Only 179 websites are present in all landscapes and over 103,000 websites appear in only one topical landscape. Such a small overlap between health domains indicates that topical landscapes are composed of disparate stakeholders.
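The overlap figures in the preceding paragraph are the kind of counts that can be derived from per-landscape site lists, as in the sketch below. The function name `overlap_summary` and the three toy landscapes are hypothetical; only the final ratio uses figures reported in the paper (147,498 unique websites out of 250,176).

```python
from collections import Counter


def overlap_summary(landscapes):
    """Summarise how websites are shared across topical landscapes.

    `landscapes` maps a health domain to the websites found in it.
    """
    per_site = Counter(
        site for sites in landscapes.values() for site in set(sites)
    )
    n_landscapes = len(landscapes)
    return {
        "total": sum(len(set(sites)) for sites in landscapes.values()),
        "unique": len(per_site),
        "in_all": sum(1 for c in per_site.values() if c == n_landscapes),
        "in_one": sum(1 for c in per_site.values() if c == 1),
    }


# Toy landscapes; site names are illustrative only.
toy = {
    "herpes": ["portal.example", "clinic-a.example", "news.example"],
    "lupus": ["portal.example", "charity-b.example"],
    "hiv": ["portal.example", "news.example", "agency-c.example"],
}
summary = overlap_summary(toy)

# Share of unique websites implied by the reported totals.
reported_unique_share = round(100 * 147498 / 250176)
```

The reported totals give a unique share of about 59 per cent, consistent with "almost 60 per cent" in the text.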

Landscapes at the front-end: sponsorship and platform type
The following analyses were performed on the networks of prominent websites. These are websites that health information seekers are most likely to encounter when entering health information landscapes via search engines and navigating the top-ranked websites. Each network of prominent websites is constructed from seeds (the most visible websites) and websites with the highest indegree (most central websites). As many of the most central websites as were needed to construct the list of the 100 prominent websites were added to the initial list of seed websites. All websites with the same indegree as the 100th site were included, resulting in slightly different sizes of prominent networks across health domains (see Table I, columns 5-6). The minimum inclusion indegrees were low, ranging from four to eight. This indicates that links sent outside the networks of prominent sites are highly dispersed and thus a user is less likely to access sites on the periphery. This justifies the selection of the prominent websites.
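The selection rule for prominent websites can be sketched as follows: start from the seed sites, add the highest-indegree non-seed sites until the target size is reached, and keep every site tied with the last one admitted. This is one reading of the procedure described above, not the author's code; the function name `prominent_sites`, the site names and the small target of four (instead of 100) are hypothetical.

```python
def prominent_sites(seeds, indegree, target=100):
    """Seeds plus the most central non-seed sites, keeping ties at the cutoff."""
    selected = set(seeds)
    candidates = sorted(
        (site for site in indegree if site not in selected),
        key=lambda site: (-indegree[site], site),
    )
    needed = target - len(selected)
    if needed <= 0 or not candidates:
        return selected
    # Indegree of the last site needed to reach the target size.
    cutoff = indegree[candidates[min(needed, len(candidates)) - 1]]
    selected.update(site for site in candidates if indegree[site] >= cutoff)
    return selected


# Toy example: two seeds, target of four prominent sites.
seeds = {"seed1.example", "seed2.example"}
indegree = {
    "a.example": 9,
    "b.example": 7,
    "c.example": 7,  # tied with b.example at the cutoff, so also kept
    "d.example": 3,
    "seed1.example": 5,
}
prominent = prominent_sites(seeds, indegree, target=4)
```

The tie rule is why the resulting networks can be slightly larger than the target, as with the 101 to 116 prominent sites per domain reported in Table I.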

Across ten networks of prominent sites 641 unique websites were categorised for sponsorship and platform type dimensions (see Table II). The overall landscape is dominated by commercial (54 per cent) and organisational (28 per cent) websites. Governmental, educational, hybrid and personal entities are substantially less present. The analysis of the platform type provides a clearer picture of the landscape. Almost

Table I. Overview of network data across landscapes

               Networks overall                                       Networks of prominent websites
Health domain  No. of nodes    No. of nodes    Websites/     No. of   No. of      No. of           No. of  No. of  Self-loops/
               (websites) (b)  (webpages) (c)  pages (a) (%) edges    seed sites  prominent sites  nodes   edges   edges (%)
Breast cancer  3,317           22,166          15            4,350    76          103              13      662     18
Depression     41,253          99,722          41            58,007   71          104              12      1,702   25
Diabetes       40,198          91,619          44            57,444   79          102              13      1,749   16
Fibromyalgia   10,172          32,531          31            11,953   81          104              14      654     13
Gall bladder   3,172           16,769          19            3,013    79          101              14      314     11
Herpes         3,178           27,309          12            3,350    77          108              13      512     19
HIV            39,374          94,825          42            61,246   76          104              11      2,230   28
Lupus          30,779          69,706          44            41,478   68          116              13      1,429   23
Pregnancy      36,473          80,136          46            46,042   71          110              11      1,094   17
Shingles       42,260          96,988          44            61,676   80          101              11      1,372   21
Overall        250,176                         n.a.          n.a.     758         1,058            14      11,718  21
  unique (d)   147,498                                                543         641

Note: (a) The size of a website-based network as a percentage of a webpage-based network; (b) Website-based network; (c) Webpage-based network; (d) Number of unique websites in the overall network
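The "Websites/pages (%)" column of Table I is defined in the table note as the size of the website-based network as a percentage of the webpage-based network. A short sketch of that arithmetic, using three pairs of node counts reported in Table I, is given below; the function name `website_page_ratio` is hypothetical and rounding to the nearest whole per cent is an assumption about how the column was computed.

```python
def website_page_ratio(websites, webpages):
    """Website-based network size as a whole-number percentage of the
    webpage-based network size (rounding is an assumption)."""
    return round(100 * websites / webpages)


# (website-based nodes, webpage-based nodes) as reported in Table I.
table_i_nodes = {
    "Breast cancer": (3317, 22166),
    "Gall bladder": (3172, 16769),
    "Herpes": (3178, 27309),
}

ratios = {
    domain: website_page_ratio(websites, webpages)
    for domain, (websites, webpages) in table_i_nodes.items()
}
```

A low ratio means many crawled pages belong to comparatively few websites, which is how the text reads the less diverse landscapes.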


Com

mer

cial

Ed

uca

tion

alG

over

nm

enta

lH

yb

rid

n.a

.O

rgan

isat

ion

alP

erso

nal

Tot

aln

%n

%n

%n

%n

%n

%n

%n

%

Overalllandscape

Blo

g0

00

20

16

91

Bu

sin

ess:

med

ical

550

00

00

055

9B

usi

nes

s:n

on-m

edic

al12

00

00

00

122

Hom

epag

e5

2835

10

120

419

330

Info

rmat

ion

and

com

mu

nit

y53

04

10

111

7011

Info

rmat

ion

alp

orta

l84

210

30

362

137

21n

.a.

00

00

71

08

1N

ews

129

00

00

40

133

21S

earc

hen

gin

e4

00

00

00

41

Soc

ial

net

wor

kin

gsi

te3

00

90

00

122

Wik

ian

dfo

rum

20

00

06

08

1T

otal

347

5430

549

816

27

117

928

132

641

Breastcancer

Blo

g0

00

10

01

1B

usi

nes

s:m

edic

al5

00

00

05

5B

usi

nes

s:n

on-m

edic

al1

00

00

01

1H

omep

age

06

50

022

3332

Info

rmat

ion

and

com

mu

nit

y6

00

00

17

7In

form

atio

nal

por

tal

81

31

06

1918

n.a

.0

00

01

01

1N

ews

270

00

01

2827

Sea

rch

eng

ine

20

00

00

22

Soc

ial

net

wor

kin

gsi

te0

00

40

04

4W

iki

and

foru

m1

00

00

12

2T

otal

5049

77

88

66

11

3130

010

3

Depression

Blo

g0

00

10

0B

usi

nes

s:m

edic

al2

00

00

22

Bu

sin

ess:

non

-med

ical

00

00

01

1H

omep

age

11

70

1221

20In

form

atio

nan

dco

mm

un

ity

100

00

010

9

(continued

)

Table II.Categorisation ofprominent websites forsponsorship and platformtype

OIR38,2

216

Page 9: A webometric analysis of online health information: sponsorship, platform type and link structures

Com

mer

cial

Ed

uca

tion

alG

over

nm

enta

lH

yb

rid

n.a

.O

rgan

isat

ion

alP

erso

nal

Tot

aln

%n

%n

%n

%n

%n

%n

%n

%

Info

rmat

ion

alp

orta

l9

00

16

1615

n.a

.0

00

00

0N

ews

420

00

042

40S

earc

hen

gin

e3

00

00

33

Soc

ial

net

wor

kin

gsi

te0

00

60

66

Wik

ian

dfo

rum

00

00

33

3T

otal

6763

11

77

80

2120

010

6

Diabetes

Blo

g0

00

10

11

Bu

sin

ess:

med

ical

60

00

06

6B

usi

nes

s:n

on-m

edic

al1

00

00

11

Hom

epag

e1

66

018

3130

Info

rmat

ion

and

com

mu

nit

y12

00

02

1414

Info

rmat

ion

alp

orta

l13

02

05

2020

n.a

.0

New

s19

00

00

1919

Sea

rch

eng

ine

20

00

02

2S

ocia

ln

etw

ork

ing

site

10

05

06

6W

iki

and

foru

m1

00

01

22

Tot

al56

556

68

86

60

2625

010

2

Fibromyalgia

Blo

g0

00

10

00

11

Bu

sin

ess:

med

ical

140

00

00

014

13B

usi

nes

s:n

on-m

edic

al2

00

00

00

22

Hom

epag

e0

35

00

202

3029

Info

rmat

ion

and

com

mu

nit

y9

00

00

20

1111

Info

rmat

ion

alp

orta

l19

02

10

70

2928

n.a

.0

00

03

00

33

New

s6

00

00

00

66

Sea

rch

eng

ine

20

00

00

02

2S

ocia

ln

etw

ork

ing

site

00

04

00

04

4W

iki

and

foru

m1

00

00

10

22

(continued

)

Table II.

Online healthinformation

217

Page 10: A webometric analysis of online health information: sponsorship, platform type and link structures

Com

mer

cial

Ed

uca

tion

alG

over

nm

enta

lH

yb

rid

n.a

.O

rgan

isat

ion

alP

erso

nal

Tot

aln

%n

%n

%n

%n

%n

%n

%n

%

Tot

al53

513

37

76

63

330

292

210

4

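Gallbladder

| Platform type | Commercial | Educational | Governmental | Hybrid | n.a. | Organisational | Personal | Total n (%) |
|---|---|---|---|---|---|---|---|---|
| Blog | 0 | 0 | 0 | 1 | 0 | 1 | 4 | 6 (6%) |
| Business: medical | 21 | 0 | 0 | 0 | 0 | 0 | 0 | 21 (21%) |
| Business: non-medical | 4 | 0 | 0 | 0 | 0 | 0 | 0 | 4 (4%) |
| Homepage | 2 | 5 | 5 | 0 | 0 | 4 | 1 | 17 (17%) |
| Information and community | 10 | 0 | 0 | 0 | 0 | 0 | 0 | 10 (10%) |
| Informational portal | 15 | 0 | 0 | 0 | 0 | 5 | 1 | 21 (21%) |
| n.a. | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 1 (1%) |
| News | 8 | 0 | 0 | 0 | 0 | 1 | 0 | 9 (9%) |
| Search engine | 3 | 0 | 0 | 0 | 0 | 0 | 0 | 3 (3%) |
| Social networking site | 0 | 0 | 0 | 7 | 0 | 0 | 0 | 7 (7%) |
| Wiki and forum | 1 | 0 | 0 | 0 | 0 | 1 | 0 | 2 (2%) |
| Total | 64 (63%) | 5 (5%) | 5 (5%) | 8 (8%) | 1 (1%) | 12 (12%) | 6 (6%) | 101 |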

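Herpes

| Platform type | Commercial | Educational | Governmental | Hybrid | n.a. | Organisational | Personal | Total n (%) |
|---|---|---|---|---|---|---|---|---|
| Blog | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 1 (1%) |
| Business: medical | 6 | 0 | 0 | 0 | 0 | 0 | 0 | 6 (6%) |
| Business: non-medical | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 1 (1%) |
| Homepage | 0 | 5 | 10 | 1 | 0 | 13 | 0 | 29 (27%) |
| Information and community | 5 | 0 | 2 | 0 | 0 | 3 | 0 | 10 (9%) |
| Informational portal | 15 | 0 | 4 | 0 | 0 | 11 | 1 | 31 (29%) |
| n.a. | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| News | 23 | 0 | 0 | 0 | 0 | 0 | 0 | 23 (21%) |
| Search engine | 2 | 0 | 0 | 0 | 0 | 0 | 0 | 2 (2%) |
| Social networking site | 0 | 0 | 0 | 3 | 0 | 0 | 0 | 3 (3%) |
| Wiki and forum | 1 | 0 | 0 | 0 | 0 | 1 | 0 | 2 (2%) |
| Total | 53 (49%) | 5 (5%) | 16 (15%) | 5 (5%) | 0 | 28 (26%) | 1 (1%) | 108 |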

HIV

| Platform type | Commercial | Educational | Governmental | Hybrid | n.a. | Organisational | Personal | Total n (%) |
|---|---|---|---|---|---|---|---|---|
| Blog | 0 | 0 | 0 | 2 | 0 | 0 | 0 | 2 (2%) |
| Business: medical | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 1 (1%) |
| Business: non-medical | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| Homepage | 0 | 3 | 10 | 0 | 0 | 13 | 0 | 26 (25%) |
| Information and community | 3 | 0 | 2 | 0 | 0 | 0 | 0 | 5 (5%) |
| Informational portal | 9 | 0 | 3 | 0 | 0 | 2 | 0 | 14 (13%) |
| n.a. | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| News | 45 | 0 | 0 | 0 | 0 | 2 | 0 | 47 (45%) |
| Search engine | 2 | 0 | 0 | 0 | 0 | 0 | 0 | 2 (2%) |
| Social networking site | 1 | 0 | 0 | 5 | 0 | 0 | 0 | 6 (6%) |
| Wiki and forum | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 1 (1%) |
| Total | 61 (59%) | 3 (3%) | 15 (14%) | 7 (7%) | 0 | 18 (17%) | 0 | 104 |

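Lupus

| Platform type | Commercial | Educational | Governmental | Hybrid | n.a. | Organisational | Personal | Total n (%) |
|---|---|---|---|---|---|---|---|---|
| Blog | 0 | 0 | 0 | 0 | 0 | 0 | 2 | 2 (2%) |
| Business: medical | 2 | 0 | 0 | 0 | 0 | 0 | 0 | 2 (2%) |
| Business: non-medical | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| Homepage | 1 | 7 | 14 | 0 | 0 | 30 | 1 | 53 (46%) |
| Information and community | 4 | 0 | 1 | 0 | 0 | 1 | 1 | 7 (6%) |
| Informational portal | 8 | 1 | 4 | 0 | 0 | 7 | 0 | 20 (17%) |
| n.a. | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 1 (1%) |
| News | 17 | 0 | 0 | 0 | 0 | 0 | 0 | 17 (15%) |
| Search engine | 2 | 0 | 0 | 0 | 0 | 0 | 0 | 2 (2%) |
| Social networking site | 1 | 0 | 0 | 7 | 0 | 0 | 0 | 8 (7%) |
| Wiki and forum | 1 | 0 | 0 | 0 | 0 | 3 | 0 | 4 (3%) |
| Total | 36 (31%) | 8 (7%) | 19 (16%) | 7 (6%) | 0 | 42 (36%) | 4 (3%) | 116 |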

Pregnancy

| Platform type | Commercial | Educational | Governmental | Hybrid | n.a. | Organisational | Personal | Total n (%) |
|---|---|---|---|---|---|---|---|---|
| Blog | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| Business: medical | 2 | 0 | 0 | 0 | 0 | 0 | 0 | 2 (2%) |
| Business: non-medical | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| Homepage | 0 | 7 | 10 | 0 | 0 | 11 | 0 | 28 (25%) |
| Information and community | 23 | 0 | 0 | 1 | 0 | 1 | 0 | 25 (23%) |
| Informational portal | 31 | 0 | 3 | 0 | 0 | 4 | 0 | 38 (35%) |
| n.a. | 0 | 0 | 0 | 0 | 2 | 0 | 0 | 2 (2%) |
| News | 4 | 0 | 0 | 0 | 0 | 0 | 0 | 4 (4%) |
| Search engine | 2 | 0 | 0 | 0 | 0 | 0 | 0 | 2 (2%) |
| Social networking site | 0 | 0 | 0 | 5 | 0 | 0 | 0 | 5 (5%) |
| Wiki and forum | 2 | 0 | 0 | 0 | 0 | 2 | 0 | 4 (4%) |
| Total | 64 (58%) | 7 (6%) | 13 (12%) | 6 (5%) | 2 (2%) | 18 (16%) | 0 | 110 |

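Shingles

| Platform type | Commercial | Educational | Governmental | Hybrid | n.a. | Organisational | Personal | Total n (%) |
|---|---|---|---|---|---|---|---|---|
| Blog | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| Business: medical | 8 | 0 | 0 | 0 | 0 | 0 | 0 | 8 (8%) |
| Business: non-medical | 5 | 0 | 0 | 0 | 0 | 0 | 0 | 5 (5%) |
| Homepage | 0 | 3 | 11 | 0 | 0 | 7 | 0 | 21 (21%) |
| Information and community | 7 | 0 | 0 | 0 | 0 | 1 | 0 | 8 (8%) |
| Informational portal | 11 | 0 | 3 | 2 | 0 | 7 | 0 | 23 (23%) |
| n.a. | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| News | 26 | 0 | 0 | 0 | 0 | 0 | 0 | 26 (26%) |
| Search engine | 2 | 0 | 0 | 0 | 0 | 0 | 0 | 2 (2%) |
| Social networking site | 2 | 0 | 0 | 4 | 0 | 0 | 0 | 6 (6%) |
| Wiki and forum | 1 | 0 | 0 | 0 | 0 | 1 | 0 | 2 (2%) |
| Total | 62 (61%) | 3 (3%) | 14 (14%) | 6 (6%) | 0 | 16 (16%) | 0 | 101 |

Table II.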

one third of sites are homepages, of which the majority are maintained by non-profit organisations. Every fifth website in the landscape is an informational portal, of which the majority are for-profit. The third biggest group of websites is news sites. Social venues, such as online communities, blogs, social networking sites and forums, altogether account for only 15 per cent of the overall landscape.

Collapsing the overall landscape into topical landscapes uncovers important differences in their front-ends (see Table II). From the sponsorship standpoint, organisational and governmental websites are most strongly present in the lupus (52 per cent) and herpes (41 per cent) landscapes. In the case of lupus, 72 per cent of those are homepages that mostly aim to raise awareness and funds. This is not surprising: in the US, non-profits related to chronic conditions are among the largest fundraisers in the health care category (Philanthropy 400, 2012). In the herpes landscape about half of the non-profit websites are homepages, whereas one third of them are informational portals. A high proportion of informational websites may potentially be explained by the high worldwide prevalence of the Herpes simplex types 1 and 2 viruses (Smith and Robinson, 2002). In terms of the platform type, the most diverse are the gall bladder and shingles landscapes. Since shingles has the biggest and gall bladder the smallest website-based network, it follows that the size of the topical landscape is not related to the diversity of prominent websites at the front-end. Commercial news sites occupy over 40 per cent of the HIV and depression front-ends. Traquina (2007) argues that understanding news as “event-oriented” (e.g. coverage of new scientific discoveries) or as “stories” (e.g. a news story of a worldwide epidemic) may help explain the media’s interest in HIV/AIDS-related topics. The same explanation may apply to the news coverage of depression-related topics.
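The sponsorship shares quoted in this paragraph can be checked against the Total rows of Table II. The following Python sketch is illustrative only: the counts are those read from the herpes Total row, while the function name `share` and the dictionary layout are assumptions introduced for the example.

```python
# Sponsorship counts for the herpes landscape, as read from the Total row of Table II.
herpes_totals = {
    "Commercial": 53,
    "Educational": 5,
    "Governmental": 16,
    "Hybrid": 5,
    "n.a.": 0,
    "Organisational": 28,
    "Personal": 1,
}


def share(counts, categories):
    """Percentage of all websites that fall into the given sponsorship categories."""
    total = sum(counts.values())
    return 100 * sum(counts[c] for c in categories) / total


# Organisational and governmental websites together: 44 of 108 websites.
non_profit = share(herpes_totals, ["Organisational", "Governmental"])
print(f"{non_profit:.0f} per cent")  # prints "41 per cent"
```

The same helper could be applied to any other Total row; shares obtained by adding already rounded column percentages may differ from it by a percentage point.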

Landscapes at the back-end: hyperlink structures

To gain insight into how the most prominent websites relate to one another and what informational paths they create, hyperlink networks were analysed. In the network diagrams nodes represent the types of health-related websites, while edges are links that websites send to one another. Nodes were grouped by the platform type dimension (and by sponsorship for the homepage category) so that networks reveal more about how different types of websites are connected. The sizes of nodes are proportional to the number of websites included in each category and are normalised across networks. The position of nodes is fixed across networks for ease of comparison. The edge width reflects the proportion of links that two nodes generate in relation to all links in the network. Hyperlink networks are directed, meaning that links have a “clear origin and destination” (Hansen et al., 2010, p. 34). Accordingly, the size of the arrows represents the proportion of links incoming to a node. Links can also be sent between websites in the same category. Such linking, called self-loops, is represented by the colour of nodes: the darker the node, the higher the number of links that stay within that category. In Figure 1 the intensity of linking within a node is represented only by its colour and it is not considered in visualising the links sent between different categories of websites. The total number of edges was normalised across the ten networks in order to make the general observed patterns meaningful (the actual total numbers of links are reported in Table I).
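The grouping of websites into category-level nodes can be sketched as follows. This is a minimal illustration under stated assumptions, not the procedure implemented in the software used for the study: the site names, the link list and the helper functions `aggregate`, `edge_shares` and `self_loop_share` are all hypothetical.

```python
from collections import Counter

# Hypothetical site-level data: each website mapped to a platform-type category.
site_category = {
    "news-a.example": "News",
    "news-b.example": "News",
    "portal-a.example": "Informational portal",
    "charity-a.example": "Homepage: organisational",
}

# Hypothetical directed hyperlinks as (source site, target site) pairs.
site_links = [
    ("news-a.example", "news-b.example"),
    ("news-b.example", "news-a.example"),
    ("news-a.example", "portal-a.example"),
    ("portal-a.example", "charity-a.example"),
    ("charity-a.example", "portal-a.example"),
]


def aggregate(links, categories):
    """Collapse site-level links into weighted, directed category-level edges."""
    return Counter((categories[s], categories[t]) for s, t in links)


def edge_shares(category_edges):
    """Share of all links carried by each category pair (a basis for edge width)."""
    total = sum(category_edges.values())
    return {pair: n / total for pair, n in category_edges.items()}


def self_loop_share(category_edges):
    """Proportion of links that stay within one category (self-loops)."""
    total = sum(category_edges.values())
    loops = sum(n for (s, t), n in category_edges.items() if s == t)
    return loops / total


edges = aggregate(site_links, site_category)
print(edges[("News", "News")])   # links that stay within the news category
print(self_loop_share(edges))    # share of all links that are self-loops
```

In this toy data two of the five links stay within the news category, so the self-loop share is 0.4; in a diagram of the kind described above such links would darken the node rather than add an edge.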

Figure 1 exhibits linking relationships between 641 unique prominent websites across all health domains. There are 11,718 edges in the network. Three types of websites – informational portals, news, and organisational homepages – stand out by the magnitude of their presence and the number of self-loops. The majority (1,208) of self-loops are within the news category. News sites are the major link generator (2,517) and receiver (3,104). The second and the third most frequent receivers as well as senders of links are informational portals and organisational homepages, respectively, where both types of websites receive more links than they send. In general the overall landscape of the prominent websites is well connected. Out of 14 categories, ten receive links from all other nodes. Almost 90 per cent of possible edges are present in the network (graph density: 0.89), which indicates active linking across different types of websites. Hyperlink networks for each health domain are presented in Figure 2.

Figure 1. Hyperlink structure of the overall landscape

Figure 2. Hyperlink structures of the topical landscapes

First, differences in node sizes show once again that different entities are interested in different health domains. HIV and depression have a strong media presence, with news sites accounting for about 40 per cent of their landscapes. Gall bladder and pregnancy offer the most promising business opportunities, both having 55 per cent of the landscape filled with commercial entities (excluding commercial news portals). In the gall bladder landscape, commercial interests are mainly represented by business medical sites and informational portals, whereas in the pregnancy landscape there are many commercial informational portals, and information and community sites (see Figure 2 and Table II). By contrast governmental homepages are most strongly present in the lupus and shingles landscapes. Second, even though the sizes of the networks are relatively similar, the number of links websites send to one another differs substantially (see Table I). The least active is the gall bladder network, where 101 websites create only 314 links, whereas 104 HIV-related websites share 2,230 links. The most interconnected networks, however, are shingles, HIV and diabetes (graph density: 0.85, 0.80, and 0.76 respectively). Compared to other landscapes, these landscapes enable users to navigate to more diverse websites, regardless of the entry point into the landscape. The pregnancy landscape is the least intertwined, with only 42 per cent of all possible connections present. Of these the majority flow between informational portals, information and community, social networking, and wiki and forum sites. This indicates the interactive and social aspect of pregnancy-related topics.
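Graph density, as reported for these networks, is the share of possible directed edges that are actually present. The sketch below is a minimal illustration, assuming that self-loops and repeated links are left out of the count so that a network of n nodes has n(n - 1) possible edges; the convention applied by the study's software is not stated here, and the node names and function names are hypothetical.

```python
def possible_directed_edges(n):
    """Number of ordered pairs of distinct nodes in a network of n nodes."""
    return n * (n - 1)


def directed_density(nodes, links):
    """Distinct directed edges between different nodes, divided by n(n - 1)."""
    present = {(s, t) for s, t in links if s != t and s in nodes and t in nodes}
    possible = possible_directed_edges(len(nodes))
    return len(present) / possible if possible else 0.0


# Hypothetical category-level network with one self-loop and one repeated link.
nodes = {"News", "Informational portal", "Homepage"}
links = [
    ("News", "Informational portal"),
    ("Informational portal", "News"),
    ("News", "Homepage"),
    ("News", "News"),                  # self-loop, ignored for density
    ("News", "Informational portal"),  # repeated link, counted once
]

print(directed_density(nodes, links))  # 3 of 6 possible edges
```

Under the same assumption, the 14 categories of the overall landscape allow 14 × 13 = 182 directed edges, so a density of 0.89 would correspond to roughly 162 of them.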

Across all networks informational portals are at the crossroads of informational highways. While they extensively link to organisational and governmental homepages as well as to information and community sites, other types of websites often refer to them. This indicates that informational portals are likely to be venues with the most diverse information, referencing many different websites. Indeed the majority of studied informational portals provide information about a range of medical topics rather than specialising in only one. Finally, news sites, informational portals, information and community sites and organisational homepages send a substantial number of links to social networking sites. Those sites have increasingly become a place of health information dissemination.

Discussion

Front-end: beyond sponsorship to the platform type

The dichotomy between commercial and organisational websites has been most commonly used to describe online health information providers from the sponsorship perspective. However this dichotomy is not sufficient for describing today’s web, which is characterised by social features. Some authors outside the medical realm describe this part of the web as “personal” (Vaughan et al., 2007) or as having individual gatekeeping (Hu and Sundar, 2010). The results of this study, however, suggest that this is not an accurate description of the use of social media venues for health information. While social networking sites are defined as venues where, among other things, individuals can construct profiles (Boyd and Ellison, 2007), formal entities use them as well. In the realm of health communication Steele (2011, p. 188) identified five types of interactions taking place on online social media, one of them being “corporate-patient interactions”. Classifying social networking sites as personal may therefore be inaccurate. Thus the hybrid sponsorship category was introduced in order to indicate that pages on social media platforms can be created either by individuals or formal entities. This approach was validated in the analysis of the hyperlink networks, which showed that many formal entities link to social networking sites. A subsequent qualitative inspection of these links confirmed that a vast majority point to their own profiles.

This study also contributes to the understanding of the online health realm by introducing the platform type dimension to the website analysis. Platform type pertains to a website’s form, features and utility. The dimension proved extremely useful in disentangling the commercial part of the landscape. Describing a website as commercial does not reveal much about its assistance to health information seekers. As the majority of the overall landscape has commercial and organisational sponsorship, the platform type dimension sheds light on some, otherwise hidden, landscape characteristics (see Table II):

• apart from e-commerce websites, commercial interests are also present on informational portals, communities, and news sites;

• the health-related web is populated by informational portals providing diverse information;

• some informational sites also offer a community space for people to interact and connect;

• static homepages pertain mostly to non-profit organisations primarily active offline; and

• news sites are strongly present in some of the landscapes.

The analysis of the overall landscape’s platform type (see Table II) tells two important stories. First, the health-related web is rather rigid, populated by portals providing information about a variety of medical conditions and websites representing offline entities. Second, no previous study has acknowledged the importance of news sites in online health information landscapes. This study shows that news media, which “serves as a key intermediary, communicating information between policy makers and the public as a whole” (Brodie et al., 2003, p. 928), has a strong presence in the online health information realm. Nevertheless news sites are often left out of discussions about online health information sources.

Back-end: hyperlinks and differences in topical landscapes

The links between websites determine navigational paths and traffic distribution (Hindman, 2008). By exploring linking patterns this study tried to uncover what kind of websites users might be more likely to navigate to in the health information landscapes. Across networks several distinct patterns emerged.

First, hyperlink networks are characterised by a considerable proportion of self-loops (see Table I). Self-loops can act like black holes from which it is harder to navigate to other parts of the landscape. In the overall landscape 21 per cent of the edges are created between websites of the same type. Awareness of self-loops can help information seekers to navigate more consciously between different providers of medical information. In such a way they may obtain more diverse health information.

Second, across all networks social networking sites have been identified as significant link-receivers. These links are usually sent from information providers’ websites to their own profiles on social networking sites. This suggests that information providers are increasingly trying to communicate and engage with users of social media. By directing a user to a social networking site and inviting them to “like” or “follow” their profile, information providers are creating “fan bases”, which they can more easily communicate with in the future. As such social networking sites should be considered as sources of health information as well as health communication.

Third, websites across health domains are notably intertwined, with websites of various types linking to one another. One general pattern can be observed: commercial informational portals such as Medscape.com and eMedicineHealth are likely to redirect their visitors to governmental and organisational homepages, whereas linking in the reverse direction is not common. Informational portals can thus be considered as venues where information from various sources is aggregated and can serve as good entry points into discovering medical information. The same, but to a lesser extent, is true for information and community sites such as WebMD and EverydayHealth.com.

Lastly, a closer look at the distinct medical domains uncovers significant differences between them. Topical landscapes differ in terms of their size, most strongly present entities and forms of provided information. Those differences might be explained by differences among the studied conditions. Some health domains are influenced by patients’ need for social support, fundraising and awareness-raising activities, which are reflected in their online informational landscapes. Landscapes of chronic medical conditions are characterised by a strong presence of organisational and governmental entities (e.g. lupus and herpes, see Table II) or news sites (e.g. HIV and depression, see Table II). Similarly topics such as gall bladder and pregnancy have distinct landscapes which reflect commercial interests in those health domains (see Table II; these two landscapes have the highest proportion of commercial sponsorship, news sites being excluded from the analysis).

Differences in the topical hyperlink networks in terms of their sizes and linking activities might also be explained by the model of social explanation for the emergence of global health issues, which draws on the paradigm of social constructionism (Shiffman, 2009). Shiffman (2009, p. 608) suggests that differences in levels of attention to global health issues are due to the ways in which the “policy community – the network of individuals and organisations concerned with the problem – comes to understand and portray the issue and establishes institutions that can sustain this portrayal”. He suggests four determinants of attention in global health: actor power (collective power of individuals and organisations in the network), ideas about the portrayal of the issue, issue characteristics, and political context (Shiffman, 2010). Thus differences in hyperlink networks also reflect the sociological aspect of health topics, where certain policy communities use the web for the portrayal of global health issues. For instance in the case of HIV, news sites are the biggest and most active node in the hyperlink network, whereas in the lupus landscape this role is taken by non-profit organisations. There may also be other factors determining landscape composition such as characteristics of the audience interested in a particular health domain, which should be considered for future research.

Limitations

Methodologically the study is shaped by using third-party web-crawling software. This influenced some subsequent decisions, such as the size of the crawl and the use of Yahoo! search engine rankings. The qualitative part of the study is limited by the fact that the coding of the websites was done by one researcher only. However the categorisation of the subset of websites was tested by two external coders and the obtained inter-coder reliability scores were satisfactory, which supports the reliability of the results.
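An inter-coder reliability check of this kind can be sketched as follows. The sketch assumes that agreement is measured with Cohen's kappa (Cohen, 1960, appears in the reference list, but the statistic is not named in this section); the category labels and codings are hypothetical and the function `cohens_kappa` is introduced only for illustration.

```python
from collections import Counter


def cohens_kappa(coder_a, coder_b):
    """Cohen's kappa for two coders assigning one category to each item."""
    if len(coder_a) != len(coder_b) or not coder_a:
        raise ValueError("both coders must rate the same non-empty set of items")
    n = len(coder_a)
    # Observed agreement: share of items given the same category by both coders.
    observed = sum(a == b for a, b in zip(coder_a, coder_b)) / n
    # Chance agreement: product of the coders' marginal proportions per category.
    freq_a, freq_b = Counter(coder_a), Counter(coder_b)
    expected = sum(freq_a[c] * freq_b[c] for c in freq_a) / (n * n)
    if expected == 1:
        return 1.0
    return (observed - expected) / (1 - expected)


# Hypothetical platform-type codings of six websites by two coders.
coder_a = ["News", "News", "Homepage", "Homepage",
           "Informational portal", "Informational portal"]
coder_b = ["News", "News", "Homepage", "Informational portal",
           "Informational portal", "Informational portal"]

kappa = cohens_kappa(coder_a, coder_b)
print(round(kappa, 2))  # prints 0.75
```

Here the coders agree on five of six websites (observed agreement 5/6) against a chance agreement of 1/3, which gives kappa = 0.75.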

The generalisability of the results is limited by using ten very different search queries. To construct more comprehensive landscapes each should be sampled with several search queries pertaining to the same health domain. In addition the results are confined to the English-speaking part of the web. To test whether the same characteristics appear in health landscapes in other languages, a comparative study across languages is suggested as future research.

The findings are also shaped by sampling the landscape through a general search engine. They shed light on the structure of the information landscape, but it is not clear what proportion of information seekers would actually search with such search queries (Yahoo! does not reveal any absolute search volumes), which information sources they would choose from search results and how subsequent navigation would be carried out. These processes are shaped by the underlying link structures; thus it would be worthwhile to combine webometrics with observational methods. Since the landscape was sampled through search results, search engine biases such as systematic exclusion of certain types of websites (Halavais, 2009; Vaidhyanathan, 2011) are also reflected in the results. These limitations should be addressed in future research on health information landscapes.

Implications and conclusions

The study aimed to deepen our understanding of the provision of health information on the internet. While it expands on previous research in describing the visible aspects of health information landscapes, its contribution is also in uncovering their invisible ends. To the author’s knowledge this is the first study where the most prominent sources of health information were defined by search engine rankings and the number of inbound links on such a scale.

Various social and practical implications follow from the presented results. Across landscapes websites are in constant competition for the privilege of providing information (Rogers, 2004). This study demonstrates that the most prominent websites for disparate health domains create very different networks with distinct structures and informational paths. The landscapes are shaped by many specific interests and entities, which determine the types of information people obtain on the internet. Thus when using the internet to look for health information, information seekers have to pay attention to the distinctive characteristics of the topics of interest as they might significantly influence the landscape of the prominent websites they are most likely to encounter as well as navigational paths between them. Awareness-raising campaigns would be appropriate to educate the general public about the domain-specific politics of online health information.

Information seekers’ attention is shaped through linking patterns which affect the ranking, i.e. visibility, of websites in search results. A higher rank in search results adds to an information provider’s prominence which, in turn, may influence the user’s perception of information quality. Domain-specific characteristics constitute the visibility and diversity of information sources people are most likely to access via general search engines. Since the majority of internet users seek health-related information via general search engines (Fox and Duggan, 2013), the politics of online health information deserves more attention from policy makers as well, as it might significantly determine the types of health information people obtain online and subsequently how people manage their or someone else’s health.

The methodological approach demonstrated in this paper can be used by businesses and other information providers to map the domain-specific online information landscape they operate in. In this way they can identify important stakeholders, the most visible websites and informational paths determined by linking patterns on a larger scale. Such information may inform their online marketing and business strategies. Similarly non-profit organisations could use this approach to learn about their supporters in order to engage them more actively in their fundraising campaigns. The mapping of linking patterns in the online health information landscapes could be used by health professionals to identify the most visible providers of health information and to investigate where health information seekers are most likely to navigate to. Then those venues could be inspected and encouraged to reference and send links to reliable sources of health information.

The demonstrated twofold approach to describing the provision of health information online provides rich insight into the most visible and central informational sources and could be employed to advance future research in the medical information-seeking realm.

References

Ackland, R. (2011), “The virtual observatory for the study of online networks: VOSON”, available at: http://voson.anu.edu.au/ (accessed 15 January 2013).

Ackland, R. and Antony, J. (2007), “Developing e-research tools for the analysis of large-scale web crawl data”, paper presented at 3rd International e-Social Science Conference, 7-9 October, Ann Arbor, MI, available at: http://voson.anu.edu.au/papers/Ackland_Antony.pdf (accessed 23 August 2013).

Bar-Ilan, J. (2005), “What do we know about links and linking? A framework for studying links in academic environments”, Information Processing and Management, Vol. 41 No. 4, pp. 973-986.

Black, P.C. and Penson, D.F. (2006), “Prostate cancer on the internet – information or misinformation?”, The Journal of Urology, Vol. 175 No. 5, pp. 1836-1842.

Bowler, L., Hong, W.-Y. and He, D. (2011), “The visibility of health web portals for teens: a hyperlink analysis”, Online Information Review, Vol. 35 No. 3, pp. 443-470.

Boyd, D.M. and Ellison, N.B. (2007), “Social network sites: definition, history, and scholarship”, Journal of Computer-Mediated Communication, Vol. 13 No. 1, p. 11.

Brin, S. and Page, L. (1998), “The anatomy of a large-scale hypertextual web search engine”, paper presented at Seventh International World-Wide Web Conference, WWW 1998, 14-18 April, Brisbane, Australia, available at: http://infolab.stanford.edu/~backrub/google.html (accessed 23 August 2013).

Brodie, M., Hamel, E.C., Altman, D.E., Blendon, R.J. and Benson, J.M. (2003), “Health news and the American public, 1996-2002”, Journal of Health Politics, Policy and Law, Vol. 28 No. 5, pp. 927-950.

Cline, R.J.W. and Haynes, K.M. (2001), “Consumer health information seeking on the internet: the state of the art”, Health Education Research: Theory and Practice, Vol. 16 No. 6, pp. 671-692.

Cohen, J. (1960), “A coefficient of agreement for nominal scales”, Educational and Psychological Measurement, Vol. 20 No. 1, pp. 37-46.

Delamothe, T. (2001), “Navigating across medicine’s electronic landscape, stopping at places with Pub or Central in their names”, British Medical Journal, Vol. 323 No. 7321, pp. 1120-1122.

Eng, T.R. (2001), The E-Health Landscape: A Terrain Map of Emerging Information and Communication Technologies in Health and Health Care, The Robert Wood Johnson Foundation, Princeton, NJ.

Eysenbach, G. (2008), “Medicine 2.0: social networking, collaboration, participation, apomediation, and openness”, Journal of Medical Internet Research, Vol. 10 No. 3, p. 22.

Eysenbach, G. and Kohler, C. (2002), “How do consumers search for and appraise health information on the world wide web? Qualitative study using focus groups, usability tests, and in-depth interviews”, British Medical Journal, Vol. 324 No. 7337, pp. 573-577.

Fox, S. (2011), “Health topics”, available at: http://pewinternet.org/Reports/2011/HealthTopics.aspx (accessed 15 December 2012).

Fox, S. and Duggan, M. (2013), “Health online 2013”, available at: http://pewinternet.org/Reports/2013/Health-online.aspx (accessed 15 January 2013).

Fox, S. and Jones, S. (2009), “The social life of health information: Americans’ pursuit of health takes place within a widening network of both online and offline sources”, available at: www.pewinternet.org/~/media//Files/Reports/2009/PIP_Health_2009.pdf (accessed 15 December 2012).

Goldman, E. (2008), “Search engine bias and the demise of search engine utopianism”, in Spink, A. and Zimmer, M. (Eds), Web Search: Multidisciplinary Perspectives, Springer-Verlag, Berlin.

Gonzalez-Bailon, S. (2009), “Opening the black box of link formation: social factors underlying the structure of the web”, Social Networks, Vol. 31 No. 2009, pp. 271-280.

Grossman, S. and Zerilli, T. (2013), “Health and medication information resources on the world wide web”, Journal of Pharmacy Practice, Vol. 26 No. 2, pp. 85-94.

Halavais, A. (2009), Search Engine Society, Polity Press, Cambridge.

Hansen, D.L., Shneiderman, B. and Smith, M.A. (2010), Analyzing Social Media Networks with NodeXL: Insights from a Connected World, Morgan Kaufmann/Elsevier, Burlington, MA.

Hansen, D.L., Derry, H.A., Resnick, P.J. and Richardson, C.R. (2003), “Adolescents searching for health information on the internet: an observational study”, Journal of Medical Internet Research, Vol. 5 No. 4, p. 25.

Hargittai, E. and Young, H. (2012), “Searching for a ‘plan b’: young adults’ strategies for finding information about emergency contraception online”, Policy and Internet, Vol. 4 No. 2, p. 4.

Hindman, M. (2008), The Myth of Digital Democracy, Princeton University Press, Princeton, NJ.

Hu, Y. and Sundar, S.S. (2010), “Effects of online health sources on credibility and behavioral intentions”, Communication Research, Vol. 37 No. 1, pp. 105-132.

Kaimal, A.J., Cheng, Y.W., Bryant, A.S., Norton, M.E., Shaffer, B.L. and Caughey, A.B. (2008), “Google obstetrics: who is educating our patients?”, American Journal of Obstetrics and Gynecology, Vol. 198 No. 6, pp. 682e1-682e5.

Kitchens, B., Harle, C.A. and Li, S. (2012), “Quality of health-related online search results”, Decision Support Systems, DOI: 10.1016/j.dss.2012.10.050.

Kleinberg, J. and Easley, D. (2010), Networks, Crowds, and Markets: Reasoning about a Highly Connected World, Cambridge University Press, New York, NY.

Landis, J.R. and Koch, G.G. (1977), “The measurement of observer agreement for categorical data”, Biometrics, Vol. 33 No. 1, pp. 159-174.

Laurent, M.R. and Vickers, T.J. (2009), “Seeking health information online: does Wikipedia matter?”, Journal of the American Medical Informatics Association, Vol. 16 No. 4, pp. 471-479.

Mager, A. (2012), “Search engines matter: from educating users towards engaging with online health information practices”, Policy and Internet, Vol. 4 No. 2, p. 7.

Micarelli, A., Gasparetti, F., Sciarrone, F. and Gauch, S. (2007), “Personalized search on the world wide web”, in Brusilovsky, P., Kobsa, A. and Nejdl, W. (Eds), The Adaptive Web, Springer-Verlag, Berlin, pp. 195-230.

Morahan-Martin, J.M. (2004), “How internet users find, evaluate, and use online health information: a cross-cultural review”, CyberPsychology and Behavior, Vol. 7 No. 5, pp. 497-510.

O’Day, V.L. and Jeffries, R. (1993), “Orienteering in an information landscape: how information seekers get from here to there”, paper presented at CHI ‘93 Proceedings of the INTERACT ‘93 and CHI ‘93 Conference on Human Factors in Computing Systems, 24-29 April, New York, DOI: 10.1145/169059.169365.

Philanthropy 400 (2012), “The 400 largest nonprofits, by category”, available at: http://philanthropy.com/section/Philanthropy-400/237/ (accessed 3 May 2013).

Powell, J., Inglis, N. and Ronnie, J. (2011), “The characteristics and motivations of online healthinformation seekers: cross-sectional survey and qualitative interview study”, Journal ofMedical Internet Research, Vol. 13 No. 1, p. e20.

Quinn, E.M., Corrigan, M.A., Mchugh, S.M., Murphy, D., O’Mullane, J., Hill, A.D.K. and Redmond,H.P. (2012), “Breast cancer information on the internet: analysis of accessibility andaccuracy”, The Breast, Vol. 21 No. 4, pp. 514-517.

Reichow, B., Halpern, J.I., Steinhoff, T.B., Letsinger, N., Naples, A. and Volkmar, F.R. (2012),“Characteristics and quality of autism websites”, Journal of Autism DevelopmentalDisorders, Vol. 42 No. 6, pp. 1263-1274.

Rice, R.E. (2006), “Influences, usage, and outcomes of internet health information searching:multivariate results from the Pew surveys”, International Journal of Medical Informatics,Vol. 75 No. 1, pp. 8-28.

Rice, R.E., Peterson, M. and Christine, R. (2001), “A comparative features analysis of publicly accessible commercial and government health database web sites”, in Rice, R.E. and Katz, J.E. (Eds), The Internet and Health Communication: Experiences and Expectations, Sage Publications, Thousand Oaks, CA, pp. 213-231.

Rogers, R. (2004), Information Politics on the Web, The MIT Press, Cambridge, MA.

Sadasivam, R.S., Kinney, R.L., Lemon, S.C., Shimada, S.L., Allison, J.J. and Houston, T.K. (2013), “Internet health information seeking is a team sport: analysis of the Pew Internet Survey”, International Journal of Medical Informatics, Vol. 82 No. 3, pp. 193-200.

Scullard, P., Peacock, C. and Davies, P. (2010), “Googling children’s health: reliability of medical advice on the internet”, Archives of Disease in Childhood, Vol. 95 No. 8, pp. 580-582.

Seale, C. (2005), “New directions for critical internet health studies: representing cancer experience on the web”, Sociology of Health and Illness, Vol. 27 No. 4, pp. 515-540.

Shiffman, J. (2009), “A social explanation for the rise and fall of global health issues”, Bulletin of the World Health Organization, Vol. 87 No. 8, pp. 608-613.

Shiffman, J. (2010), “Issue attention in global health: the case of newborn survival”, Lancet, Vol. 375 No. 9730, pp. 2045-2049.

Smith, J.S. and Robinson, N.J. (2002), “Age-specific prevalence of infection with herpes simplex virus types 2 and 1: a global review”, Journal of Infectious Diseases, Vol. 186 Suppl. 1, 15 October, pp. S3-28.

Steele, R. (2011), “Social media, mobile devices and sensors: categorizing new techniques for health communication”, in Proceedings of the 5th International Conference on Sensing Technology (ICST 2011), IEEE, Los Alamitos, CA, pp. 187-192.

Sundar, S.S., Rice, R.E., Kim, H.-S. and Sciamanna, C.N. (2011), “Online health information: conceptual challenges and theoretical opportunities”, in Thompson, T., Parrott, R. and Nussbaum, J. (Eds), Handbook of Health Communication, 2nd ed., Routledge, London.

Szokan, N. (2011), “Health information remains high on the list of popular uses for the internet”, available at: www.washingtonpost.com/wp-dyn/content/article/2011/02/01/AR2011020106916.html (accessed 27 December 2012).

Thackeray, R., Crookston, B.T. and West, J.H. (2013), “Correlates of health-related social media use among adults”, Journal of Medical Internet Research, Vol. 15 No. 1, p. e21.

Thelwall, M. (2004), Link Analysis: An Information Science Approach, Elsevier, San Diego, CA.

Thelwall, M. (2010), “Webometrics: emergent or doomed?”, Information Research, Vol. 15 No. 4.

Traquina, N. (2007), “HIV/AIDS as news: a comparative case study analysis of the journalistic coverage of HIV/AIDS by an Angolan newspaper and two Portuguese newspapers”, International Communication Gazette, Vol. 69 No. 4, pp. 355-375.

Vaidhyanathan, S. (2011), The Googlization of Everything: (And Why We Should Worry), University of California Press, Berkeley, CA.

van der Vaart, R., Drossaert, C.H., De Heus, M., Taal, E. and Van De Laar, M.A. (2013), “Measuring actual Ehealth literacy among patients with rheumatic diseases: a qualitative analysis of problems encountered using Health 1.0 and Health 2.0 applications”, Journal of Medical Internet Research, Vol. 15 No. 2, p. e27.

Vaughan, L., Kipp, M. and Gao, Y. (2007), “Are co-linked business web sites really related? A link classification study”, Online Information Review, Vol. 31 No. 4, pp. 440-450.

West, D.M. and Miller, E.A. (2009), Digital Medicine: Health Care in the Internet Era, Brookings Institution Press, Washington, DC.

Yahoo! (2010), “Yahoo!’s year in review profiles a year of discovery and disaster”, available at: http://yhoo.client.shareholder.com/releasedetail.cfm?ReleaseID=533921 (accessed 20 December 2012).

About the author

Darja Groselj is a DPhil candidate and survey research assistant at the Oxford Internet Institute, University of Oxford. She is interested in human information behaviour, focusing on the use of mobile phones for general information seeking. Darja Groselj may be contacted at [email protected]
