Visceral and Cognitive Levels of Credibility Judgment in an Authorless Environment: A Factor Analysis of
the Influence of Visual Design
Jason Holmes (corresponding author), School of Library and Information Science, Kent State University, PO Box 5190,
Kent, OH 44242; Tel: 330-672-0007
David Robins, Information Architecture/Knowledge Management Program, Kent State
University, PO Box 5190, Kent, OH 44242; Tel: 330-672-5852
In thinking about the impact of social computing and Web 2.0 trends affecting information seekers (and the professionals who help them), the age-old problem of determining credibility in an authorless environment again comes to the fore. First impressions are key for web page content. Regardless of the quality or credibility of content, a poorly designed or aesthetically unappealing web page will likely produce a negative impression of credibility. This study compared credibility judgments for websites in which the visual design had been varied. A factor analysis showed patterns of higher credibility scores for higher visual design treatments. The importance of the findings presented here is that visual design has impact beyond decoration. It is a common (if latent) assumption that all serious web sites wish to be perceived as credible, believable, and trustworthy, especially in an authorless environment.
Introduction
In thinking about the impact of social computing and Web 2.0 trends affecting information
seekers (and the professionals who help them), the age-old problem of determining
credibility in an authorless environment again comes to the fore. First impressions are key
for web page content. Regardless of the quality or credibility of content, a poorly designed
or aesthetically unappealing web page will likely produce a negative impression of
credibility. In an environment such as the World Wide Web, where there are billions of
documents and thousands of pages on a given topic, it is critical to present information in
such a way that it does not produce a negative visceral judgment before the viewer even
has a chance to engage the content at the cognitive level. People are quick to abandon a
site and move on to one of any number of competing options. Lack of perceived credibility
is surely one of those reasons. If, at the visceral level, the design of a website suggests
non-credible information, the viewer might not stay with the site long enough for the
content credibility to be perceived and judged at the cognitive level.
The visual design of a web site is thought to impact user experience in ways ranging from
simple decoration to directing users to view important areas on a page. This study seeks to
show whether a relationship exists between visual design and user judgment of the
credibility of the information on a web site. We are interested in how the perception of
credibility is affected by a variation of the visual design on identical content. Furthermore,
we are interested in the factors that contribute to the perception of credibility in a website.
Credibility is defined in this study as the trustworthiness of information presented as
content on web-based information resources. In the broadest sense, credibility may include
considerations of security, privacy, and authority of sources. This study evaluates user
perception of the veracity of information presented on web sites that are informational in
nature, as opposed to e-commerce, health care provider, or other genres of website.
In other words, the web sites used for this study presented information about a specific
research topic of interest, and were not engaged in sales or the collection of private
information from their users. Evaluations of security and privacy were not studied.
This research focuses on how users perceive two types of aesthetic treatments applied to
the same content in a given scenario: low aesthetic treatment and high aesthetic
treatment. A low aesthetic treatment (LAT) is one in which content is simply placed on a
web site without professional graphic design. There may be graphical elements and some
page layout meant to help the reader comprehend the content, but the elements and layout
are crudely implemented. Our hypothesis is that this type of treatment creates a
"low-budget" impression in the user, and a concomitant feeling that the content in the site
is not credible.
A high aesthetic treatment (HAT) presents a professional look and feel appropriate to the
organization it represents. Sites employing HAT employ principles of layout to enhance
communication, and strategically and professionally use color and graphics to build brand
and concept. The pages in these sites convey professionalism and care in how they are
presented. These pages should immediately invoke confidence, enjoyment, or some other
positive emotion in users that makes them want to stay on the site. It is hypothesized that
this type of design will create a lucid impression of the site's intentions and invoke in users
a feeling that the content in the site is credible.
The purpose of this study is to perform a factor analysis to discover patterns of variation
affecting credibility judgments. In other words, "What factors emerge from credibility
judgments of the information on web sites with varying levels of aesthetic treatment?"
Related Research
Credibility, for the purpose of this research, is limited to the believability or trustworthiness
of information found in the World Wide Web. The focus on web-based information is
important because such information is often freely contributed to publicly available space
without being subjected to peer-review or editorial processes that, in general, improve its
veracity. This leaves a greater burden of credibility judgment on users (Warnick, 2004).
This study focuses on how visual design impacts these credibility judgments. Garrett (2003)
shows the surface plane to be visual design (including aesthetic consideration and
positioning of elements in a grid, typography and so forth) laid on top of the "skeleton" of
the site (the result of information architecture activities such as navigation design). The
surface plane is where branding takes place and branding is critical for making a positive
first impression that will grab users and hold them there (and potentially increase
conversion rates).
Although Rosenfeld and Morville (2002) and Wodtke (2003) concentrate mainly on what
Garrett would call the skeleton plane of user experience, they encourage designers to use
sketches to communicate to graphic designers their overall concept of what the site should
convey. This is similar to how architects in the physical world create preliminary sketches to
work out how a building should look and feel, even to the point of the design of happiness
(de Botton, 2006). Happiness, according to de Botton, can be experienced by living in
environments designed to reinforce positive aspects of humanity such as balance of
opposing elements and effortless grace.
Researchers at the Stanford Persuasive Technology Lab have conducted extensive studies
on the phenomenon of web credibility (Fogg et al., 2002), and were surprised to find the
extent to which the visual and information design of a web site mattered to users.
Comments elicited from users were categorized, and the largest category was "design and
look," indicated by 46.1% of respondents. The next highest category was of a similar nature:
28.5% indicated that the "information design" of a site contributed to their credibility
judgments. So nearly 75% of respondents reported making credibility judgments on the
basis of content presentation rather than evaluation of the content's/creator's authority,
trustworthiness, reputation, or expertise.
For this study, it is useful to borrow terminology from Norman (2004), who breaks down
reactions to design in three experience levels: visceral, behavioral and reflective. Both the
behavioral and reflective levels of experience are cognitive in nature. Visceral experience in
design is an immediate, powerful reaction to a design. In describing the bottle designs of
various brands of bottled water, he asks:
How does one brand of water distinguish itself from another? Packaging is one answer,
distinctive packaging that, in the case of water, means bottle design. Glass, plastic,
whatever the material, the design becomes the product. This is bottling that appeals to
the powerful visceral level of emotion, that causes an immediate visceral reaction:
"Wow, yes, I like it, I want it." It is, as one designer explained to me, the "wow" factor. (pp.
64-65)
The behavioral level is experienced in the use of a design. The presentation of the
experience is less important than the ease and practicality of use. Whereas visceral design
tries to immediately capture the user's attention, behavioral design seeks to hold the user
through ease of use and ease of learning. It may be, however, that users will use objects
that do not perform well because of some emotional attachment to the object. Reflective
design is highly analytic and cognitive. It represents an attempt to make a design better by
incorporating the experience of use and the knowledge of goals and objectives of the
product or service.
Norman, Ortony and Russell (2003) bolster the importance of emotion and enjoyment in
people's interactions with objects in everyday life. Tractinsky, Katz and Ikar (2000) found
that high aesthetic treatments on ATM interfaces positively influenced users' perceptions of
the device's usability. This confirms notions put forth by Dion, Berscheid and Walster
(1972), who found that people who are considered physically attractive are more likely to
be perceived as better mates, more successful, more competent, and overall more
desirable people.
Lindgaard, Fernandes, Dudek and Brown (2006) found that significant judgments about the
acceptability of a web site are made within 50 milliseconds. This is certainly not enough
time for cognitive processes to occur in an analytical or reflective manner. They also
demonstrated that "visual appeal" was the prime determiner of a positive reaction to a web
site. Wathen and Burkell (2002) present a similar notion in their model of the credibility
judgment process. They identify "surface credibility" (visceral) and "message credibility"
(cognitive). The former addresses appearance issues that are quickly addressed, and the
latter requires further analysis to evaluate more objective criteria such as expertise and
accuracy.
We have shown that researchers are beginning to pay attention to the visceral aspects of
design and that the visual design of an interface impacts these visceral or precognitive
experiences. It is the visual design element that has not been extensively tested with regard
to its impact on credibility judgments. We set up an experiment to test some of these
notions.
Research Design
The aims of this study required a means of determining the direction of the judgment
(credible or not credible) and the magnitude of the judgment. The study also required
stimuli (web sites) of varying design/aesthetic treatment that subjects could judge. The
design of the web site needed to be carefully controlled so that judgments could be
compared across users for the same stimuli. Consequently it was decided that a judgment
of the effects of design/aesthetic treatment would be more informative if the same content
was presented with different designs.
Three steps needed to be completed in order to carry out the study: selecting stimuli,
selecting subjects, and designing the procedure.
Selecting the Stimuli
A Google search on the terms "web accessibility" produced a large group of results. Stimuli
were chosen from the results on the basis of moderate to high aesthetic treatment in the
visual design. The determination of whether an aesthetic treatment was moderate to high
was made on the basis of researcher judgment, since there is no objective measure of the
degree of aesthetic treatment. The important thing to consider, however, is that the designs
found on the retrieved web sites were starting points for the study.
Most of these sites were informational in nature or were the sites of consultants offering
services in the area of accessible web design. The next task was to make two versions of
each site selected: one left as it was found on the web, and one with reduced aesthetic
design. To do this, each of the 20 selected landing pages was saved on a local computer,
opened in an html editor, and stripped of its visual enhancements. None of the content was
altered, only the visual design. This left us with 20 pairs of pages (i.e., 40 pages total). To
make these pages consistent, each of the 40 pages was opened in a web browser and
saved as an image file that could not be altered as easily as an html file. The image was
contextualized by also showing the browser window to give subjects the feeling they were
browsing the web. Of the web sites chosen for stimuli, eight were ".com," six were ".org,"
three were ".gov," three were ".edu," and one was ".net."
Finally, the images were arranged in random order. We were concerned that a fatigue effect
might skew the results, so we made two stimulus sets to show subjects: one of 40 from the
original randomization, and the other a reverse of the original randomization. Odd
numbered subjects were shown the original randomized set, and even numbered subjects
were shown the reversal of the randomized set.
Subjects
Twenty subjects (14 females, 6 males) were chosen for this study from a convenience sample.
Most were Library and Information Science graduate students; six were
undergraduates from a variety of majors. Since web accessibility had only been covered in
one unit in one Library and Information Science elective course, it was assumed that the
subjects would not have much knowledge about web accessibility. Even though the number
of subjects tested is small, the number of judgments was quite high (20 subjects x 40
stimuli = 800 judgments overall). A power analysis showed that 20 subjects is enough to
detect the phenomenon we are measuring (Power = .99).
Procedure
Subjects were shown each of these 40 images in sequence and asked to quickly judge the
site's credibility. We shuffled the order of the images so that the pairs of differently treated
content would not be shown side-by-side. We did not tell subjects that the purpose of the
study was to judge credibility on the basis of visual design, only that they were to judge each
site's credibility on first impressions. We also created two sets of the stimuli (set 1 and set
2): set 1 given to odd numbered subjects and set 2 to even numbered subjects in reverse
order of presentation. This step was taken to control for fatigue and ordering effects among
subjects.
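The ordering scheme described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the seed value, the function names, and the strategy of reshuffling until no two versions of the same site are adjacent are all hypothetical, since the text does not say how adjacency was avoided.

```python
import random

def make_stimuli(n_sites=20):
    """Build (site, treatment) pairs: one HAT and one LAT image per site."""
    return [(site, treatment)
            for site in range(1, n_sites + 1)
            for treatment in ("HAT", "LAT")]

def has_adjacent_pair(order):
    """True if two versions of the same site appear next to each other."""
    return any(a[0] == b[0] for a, b in zip(order, order[1:]))

def randomized_order(stimuli, seed=0):
    """Shuffle, reshuffling until no site's two versions are adjacent (assumed strategy)."""
    rng = random.Random(seed)
    order = list(stimuli)
    rng.shuffle(order)
    while has_adjacent_pair(order):
        rng.shuffle(order)
    return order

def order_for_subject(subject_number, set_1, set_2):
    """Odd-numbered subjects see set 1; even-numbered subjects see set 2."""
    return set_1 if subject_number % 2 == 1 else set_2

stimuli = make_stimuli()
set_1 = randomized_order(stimuli, seed=2006)  # seed is an arbitrary, hypothetical choice
set_2 = list(reversed(set_1))                 # reverse order of presentation
```

Reversing a sequence keeps the same neighbors, so set 2 satisfies the adjacency constraint whenever set 1 does.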
Each image in each set was separated by a white slide with centrally located cross hairs so
that subjects would not move directly from one image to another without a break. The cross
hair image was shown for 2 seconds, and then the next image was loaded. Images
remained visible until subjects indicated their credibility judgments.
Subjects indicated credibility judgments by moving a dial to the right for a positive judgment
and to the left for a negative judgment. The dial was programmed to register judgments on
a 14-point scale: 1 through 7 (right direction) for positive judgments and -1 through -7 (left
direction) for negative judgments. The dial device itself was built with 14 programmable
positions on the dial, each of which could be assigned a value.
The computer on which the study was performed was able to collect screen capture (video),
the time for each judgment, and the values registered by the dial. Each image was
displayed until the user moved the dial. The value that was assigned as the magnitude
estimate for each image was the last position of the dial in the direction turned by the
subject before he or she allowed the dial to return to the center position.
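The scoring rule can be illustrated with a short Python sketch. It assumes, hypothetically, that the dial is logged as a sequence of integer readings in which 0 stands for the center position; the text does not describe the logging format, so both the representation and the function name are illustrative only.

```python
def recorded_judgment(readings):
    """Return the last non-center dial position before the dial returns to center.

    `readings` is a hypothetical log of integer dial positions:
    0 = center, 1..7 = positive (right), -1..-7 = negative (left).
    Returns None if the dial never left the center position.
    """
    last = None
    for value in readings:
        if value == 0:
            if last is not None:
                return last  # dial has returned to center: judgment is final
            continue         # still resting at center before any turn
        if not -7 <= value <= 7:
            raise ValueError(f"dial position out of range: {value}")
        last = value
    return last              # log ended before the dial returned to center

example_positive = recorded_judgment([0, 1, 2, 3, 0])      # turned right, released at 3
example_negative = recorded_judgment([0, -1, -4, -3, 0])   # turned left, eased back to -3
```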
Results
General Time Observations
Overall, subjects clustered into three groups with respect to rating scores and judgment
time. Five of the subjects tended to give overall high credibility ratings, five subjects tended
toward low ratings and 10 subjects tended to hover near zero (see Figure 1).
Figure 1: Average credibility score as a function of judgment time with representative
subjects identified for each factor.
Those giving lower overall ratings were much quicker to do so (averaging 2.0 seconds to
make their judgments), while those who gave higher overall ratings were slower to make
judgments (averaging 4.3 seconds). The middle group had a mid-range average judgment
time of 3.7 seconds. These results indicate a potentially intriguing pattern, namely, that the
longer a subject spent looking at the stimulus websites, the more likely they were to judge
the site as more highly credible. These results, although not statistically significant, suggest
a trend on which further research should be based.
Factor Analysis
A factor analysis on the credibility scores was performed using principal components
analysis in the SPSS statistical software package. Using the Varimax rotation with Kaiser
normalization yielded components scored on six factors that accounted for 67.3% of the
variance. Factors 3-6 had weak loadings and only one or two subjects. For the purposes of
this study, the top 2 factors were selected for closer analysis. Five subjects loaded on Factor
1, which accounted for 22% of the variance. Three subjects loaded on Factor 2, which
accounted for 12% of the variance.
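To make the extraction step concrete, the following Python sketch computes unrotated principal components from a small, invented rating matrix. It is not a reproduction of the SPSS analysis: the Varimax rotation and Kaiser normalization are omitted, the ratings are hypothetical, and treating subjects as the variables (so that subjects "load" on components) is our reading of the description above. Eigenvectors are found by simple power iteration with deflation.

```python
import math

def pearson(x, y):
    """Pearson correlation between two equal-length rating lists."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def top_eigenpair(matrix, iterations=500):
    """Largest eigenvalue and eigenvector of a symmetric matrix by power iteration.

    Assumes the uniform start vector is not orthogonal to the leading eigenvector.
    """
    n = len(matrix)
    vec = [1.0 / math.sqrt(n)] * n
    for _ in range(iterations):
        nxt = [sum(matrix[i][j] * vec[j] for j in range(n)) for i in range(n)]
        norm = math.sqrt(sum(x * x for x in nxt))
        if norm == 0:
            break
        vec = [x / norm for x in nxt]
    value = sum(vec[i] * sum(matrix[i][j] * vec[j] for j in range(n))
                for i in range(n))
    return value, vec

def principal_components(ratings_by_subject, n_components=2):
    """Return (share of variance, loadings) for the first few unrotated components."""
    n = len(ratings_by_subject)
    corr = [[pearson(a, b) for b in ratings_by_subject] for a in ratings_by_subject]
    shares, loadings = [], []
    for _ in range(n_components):
        value, vec = top_eigenpair(corr)
        shares.append(value / n)  # trace of a correlation matrix equals n
        loadings.append([math.sqrt(max(value, 0.0)) * v for v in vec])
        # Deflate: remove the component just extracted before finding the next one.
        corr = [[corr[i][j] - value * vec[i] * vec[j] for j in range(n)]
                for i in range(n)]
    return shares, loadings

# Hypothetical ratings: three subjects, four stimuli, on the -7..+7 dial scale.
ratings_by_subject = [
    [-3, -1, 1, 3],
    [-6, -2, 2, 6],   # same pattern as the first subject, larger magnitudes
    [2, -2, -2, 2],   # pattern uncorrelated with the first two subjects
]
shares, loadings = principal_components(ratings_by_subject)
```

In this toy example the first two subjects load together on the first component and the third subject loads alone on the second, which mirrors, in miniature, how groups of subjects were identified with each factor.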
Factor 1 was identified by the researchers as a visceral factor based on the short time in
which credibility judgments were made. The average time for the subjects loading on the
factor to make a credibility judgment was 1.6 seconds. Based on an examination of
representative stimuli, there seems to be a positive correlation between higher aesthetic
treatment and higher perceived credibility. In fact, a t-test found a statistically significant
difference between judgments of credibility of HATs and LATs (p < .001). The mean overall
credibility rating was 1.05 for HATs and -0.55 for LATs (on a scale of +7 to -7).
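For readers unfamiliar with the comparison, a t statistic of this kind can be computed as in the Python sketch below. The text does not state which variant of the t-test was used; the sketch assumes a paired test on matched HAT and LAT versions of the same sites, and the ratings shown are invented for illustration and are not the study data.

```python
import math
from statistics import mean, stdev

def paired_t(first, second):
    """Paired-samples t statistic for two matched lists of ratings.

    Raises ZeroDivisionError if all pairwise differences are identical.
    """
    if len(first) != len(second) or len(first) < 2:
        raise ValueError("need two equal-length samples with at least two pairs")
    diffs = [a - b for a, b in zip(first, second)]
    return mean(diffs) / (stdev(diffs) / math.sqrt(len(diffs)))

# Hypothetical mean ratings for four sites under each treatment.
hat_means = [2.0, 1.0, 3.0, 0.0]
lat_means = [0.0, 0.0, 1.0, -1.0]
t_statistic = paired_t(hat_means, lat_means)  # degrees of freedom = number of pairs - 1
```

A positive statistic indicates that the HAT versions were rated higher on average than their LAT counterparts; the corresponding p value would be read from the t distribution with the stated degrees of freedom.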
Factor 2 was identified as a cognitive factor in credibility judgments, also based on average
credibility judgment time. The average time for subjects loading on this factor to make a
judgment was 5.6 seconds. The difference in average time to credibility judgment between
these two factors is 4 seconds. In these 4 seconds, it is our contention that the basis for
judgment moves from the visceral level, which is based on a gut-level, nearly instantaneous
judgment using aesthetic design alone, to a cognitive level of experience which is based on
higher level processes such as reading text, relevance judgments, and content analysis.
Furthermore, subjects who loaded on this factor gave HATs high and low credibility scores
without the pattern demonstrated by Factor 1. Instead, these subjects rated HATs both
extremely high and extremely low. This suggests that credibility judgments were not based
purely on design but were the result of more reflection.
In the interest of publication space, we have not included the representative stimuli in this
text. These images will be presented at the conference and/or can be made available on
our website.
Discussion and Conclusion
The two factors identified by this study suggest a relationship between credibility judgment
time, the criteria on which credibility judgments are made, and the impact of visual design.
In general, we found that the longer subjects looked at a stimulus, the more likely they were
to make a positive credibility judgment and the more likely they were to make that
judgment on cognitive criteria.
Interestingly, shorter judgment times resulted in more negative credibility judgments. This
finding suggests that in the short term, visceral judgments impacted by visual design
preempt the effects of content authority. That is, after about two seconds,
multidimensional cognitive processes outweigh the initial visceral reaction to visual design
with respect to credibility judgments.
Subject 04, a representative of Factor 2, represents an anomaly to the above assertion. This
subject made quick judgments that rated LATs low (in his case an average rating of -4.0)
and HATs high (1.7). This subject's LAT ratings were very low and the HAT ratings were only moderately
high. However, if we look specifically at representative stimuli for this factor, this subject
was right in line with the other representative subjects. More research needs to be done to
determine the exact nature of these complex reactions.
This study investigated the visceral and cognitive factors affecting the perceived credibility
of information on web sites with varying levels of aesthetic treatment. In general, our
findings were consistent with our expectations that high aesthetic treatment would produce
high judgments of credibility. We have also established a positive correlation between the
time spent looking at a web page and perceived credibility. The nature of this correlation
has to do with movement of the user's experience from the visceral level to the cognitive
level of judgment.
The importance of the findings presented here is that visual design has impact beyond
decoration. It is a common (if latent) assumption that all serious web sites wish to be
perceived as credible, believable, and trustworthy. The question remains concerning exactly
what features, elements or configurations of features and elements of design impact
credibility judgment in what way. The ultimate result of this line of research is meant to
isolate these features so that designers can project the image necessary to support their
aim--whether it be commercial, informational or educational.
This study has bearing on design considerations and the establishment of credibility in an
authorless environment. By and large, if users do not make a positive credibility judgment
quickly at a visceral level, at the very least, it will be more difficult to establish credibility
through content or authority. It may be possible that a poor visual design will cause users to
abandon a site before they have a chance to engage higher-level cognitive processes.
Moreover, if more time is needed to establish credibility when the author is unknown or
unidentified, the problem of establishing credibility is compounded. Therefore, this first
stage of the establishment of credibility, visual design, is a crucial design consideration.
References
de Botton, A. (2006). The architecture of happiness. New York: Pantheon Books.
Dion, K., Berscheid, E., & Walster, E. (1972). What is beautiful is good. Journal of Personality and Social Psychology, 24(5), 283-290.
Fogg, B. J., Marshall, J., Kameda, T., Solomon, J., Ragnekar, A., Boyd, J., et al. (2001).
Web credibility research: A method for online experiments and early study results.
Proceedings of CHI 2001 Extended Abstracts on Human Factors in Computing Systems, 61-68.
Garrett, J. J. (2003). The elements of user experience: User-centered design for the web.
Indianapolis, IN: New Riders.
Lindgaard, G., Fernandes, G., Dudek, C., & Brown, J. (2006). Attention web designers: You
have 50 milliseconds to make a good first impression! Behaviour & Information Technology, 25(2), 115-126.
Nielsen, J. (2006, April 17). F-shaped pattern for reading web content. Jakob Nielsen's Alertbox. Retrieved October 28, 2006, from
http://www.useit.com/alertbox/reading_pattern.html
Norman, D. A. (2004). Emotional design: Why we love (or hate) everyday things. New York: Basic Books.
Norman, D. A., Ortony, A., & Russell, D. M. (2003). Affect and machine design: Lessons
for the development of autonomous machines. IBM Systems Journal, 42(1), 38-44.
Rosenfeld, L., & Morville, P. (2002). Information architecture for the World Wide Web:
Designing large-scale web sites (2nd ed.). Sebastopol, CA: O'Reilly.
Tractinsky, N., Katz, A. S., & Ikar, D. (2000). What is beautiful is usable. Interacting with Computers, 13, 127-145.
Warnick, B. (2004). Online ethos: Source credibility in an "authorless" environment.
American Behavioral Scientist, 48(2), 256-265.
Wathen, C. N., & Burkell, J. (2002). Believe it or not: Factors influencing credibility on the
web. Journal of the American Society for Information Science and Technology, 53(2),
134-144.
Wodtke, C. (2003). Information architecture: Blueprints for the web. Indianapolis, IN: New Riders.