
Assessing the User Experience (UX) of Online Museum Collections: Perspectives from Design and Museum Professionals

Craig M. MacDonald, Ph.D., Pratt Institute, School of Information and Library Science

Paper | Museums and the Web 2015 | April 9, 2015

The Online Collection

A common feature of museum websites is the online collection. The idea: allow experts access to museum holdings without needing to be physically present.

Substantial time and effort have been invested in developing these online collections. Yet collections are routinely among the least-visited sections of the website.

Two Possible Explanations

1. Most people are completely uninterested in viewing museum objects through a computer screen.

2. People want to view digital museum objects but are deterred from doing so due to the poor experiences offered by existing online collection interfaces.

Beyond Usability

Museums understand the importance of a usable website. If a visitor can't find information about visiting the museum, they probably won't visit.

But usability alone is no longer sufficient. Museums cannot simply provide access to their digital materials; they must also create positive experiences for their users.

UX of Online Museum Collections

Overarching Research Question: How can the experience of using online museum collections be improved?

Related Questions:

1. What factors determine the UX of an online museum collection?

2. How can these factors be used to evaluate the UX of existing online museum collections?

Two Challenges

1. Evaluating interfaces is time-consuming and resource intensive. Even lightweight usability testing methods can be challenging.

2. UX is a complex concept that is difficult to evaluate well. The relevant UX factors of a mobile banking app are likely not the same as those of an online museum collection.

What's needed: an evaluation method that is easy to use, adaptable, and quick.

Assessment Rubrics

Defined as "criteria for assessing complicated things." Common in educational settings because they articulate gradations of quality for meaningful dimensions or criteria.

A rubric's basic structure is a grid of dimensions against scale levels:

              Scale Level 1   Scale Level 2   Scale Level 3
Dimension 1   description     description     description
Dimension 2   description     description     description
Dimension 3   description     description     description
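That grid maps naturally onto a small data structure. Below is a minimal Python sketch of the idea; the class and field names are illustrative, not from the paper:

```python
from dataclasses import dataclass

@dataclass
class Dimension:
    """One meaningful criterion, with one quality description per scale level."""
    name: str
    descriptions: list[str]  # gradations of quality, ordered low to high

@dataclass
class Rubric:
    scale_labels: list[str]  # e.g., ["Scale Level 1", "Scale Level 2", "Scale Level 3"]
    dimensions: list[Dimension]

    def description_for(self, dimension_name: str, rating: int) -> str:
        """Look up the quality description for a 1-indexed rating on a dimension."""
        dim = next(d for d in self.dimensions if d.name == dimension_name)
        return dim.descriptions[rating - 1]
```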

Benefits of Using Rubrics

Efficiency: Streamline assessment by reducing the need to explain why specific scores were given.

Transparency: Clearly define "quality" in objective and observable ways.

Reflectiveness: Don't directly prescribe specific fixes; instead, prompt reflection on why/how improvements can be made.

Ease of Use: As simple as completing a form, and the completed rubric is an effective tool for communicating results.

Rubric Creation Process

1. Identify purpose/goals
2. Choose rubric type
3. Identify the dimensions
4. Choose a rating scale
5. Write descriptions for each rating point

Step 1: What is this rubric for?

This is the most important step, as it will drive all subsequent decisions. Goal: to assess the UX quality of an online museum collection.

Step 2: What type of rubric?

Holistic rubrics: look at a product or performance as a whole; contain just one dimension (e.g., "overall quality").

Analytic rubrics: split a product or performance into its component parts; allow for feedback on multiple dimensions.

Step 3: What dimensions matter?

Requires breaking down the product being evaluated into components that are:

• Observable
• Important
• Precise

No prescribed way to do this; it just needs to be a process that can be explained and justified.

Step 3: Finding a starting point

Began with a literature search to see whether any UX criteria for online museums had already been established.

Starting point: Lin, Fernandez, and Gregor (2012) identified four design characteristics and five design principles associated with user enjoyment.

Characteristics: Novelty, Harmonization, No time constraint, Appropriate facilitation and association

Principles: Multisensory learning experiences, Creating a storyline, Mood building, Fun in learning, Establishing social connection

Step 3: Testing Lin et al.'s model

With a graduate assistant, reviewed 39 online museum collections with respect to these nine dimensions. This allowed for a bottom-up approach and ensured that the dimensions were reflective of what the museum community considers valuable.

Step 3: Finding Exemplars

The Rijksmuseum quickly emerged as an exemplar. But discussing how it excelled uncovered limitations in Lin et al.'s framework: many dimensions were actually describing multiple concepts, making them difficult to assess independently.

Step 3: Refining the dimensions

In response, we developed a parallel set of dimensions that were more observable and explicit, and that more closely matched our interpretation of Lin et al.'s framework.

This allowed us to:

• Improve the vocabulary to make it more accessible;
• Tighten the concepts to make them more distinguishable; and
• Evaluate the ability of each dimension to capture an important aspect of UX.

Step 3: Iterative testing

We iteratively tested the rubric with various museum collections to further refine and strengthen the dimensions. The goal was to make them less ambiguous and more observable.

• Ex: Harmonization and Mood building became Strength of Visual Content and Visual Aesthetics.

Finally, split the dimensions into three categories inspired by Don Norman's model of Emotional Design: Visceral, Behavioral, Reflective.

Step 4: Choosing a rating scale

Typical rubrics use between 2- and 5-point rating scales. Four rating scale points were chosen, and neutral, non-judgmental language was selected:

1. Incomplete
2. Beginning
3. Developing
4. Emerged

Step 5: Gradations of quality

Final step: writing clear and well-defined gradations of quality for each rubric dimension. A 4-point rating scale should describe quality ratings as:

• No
• No, but
• Yes, but
• Yes

Final Assessment Rubric

Visceral (immediate impact):
1. Strength of visual content
2. Visual aesthetics

Behavioral (immediate usage):
3. System reliability & performance
4. Usefulness of metadata
5. Interface usability
6. Support for casual & expert users

Reflective (long-term usage):
7. Uniqueness of virtual experience
8. Openness
9. Integration of social features
10. Personalization of experiences

Ex: Strength of Visual Content

Rating scale: 1 = Incomplete, 2 = Beginning, 3 = Developing, 4 = Emerged

Incomplete [No]: Artwork is a peripheral component of the collection, with text the dominant visual element. Images, when present, are too small and low quality. Text is a major distraction from the visual content.

Beginning [No, but]: Artwork is not emphasized throughout the collection, and images are rarely the dominant visual element. Some images are too small and/or low quality. At times, text is too dense and distracts from the visual content.

Developing [Yes, but]: Artwork is featured throughout the collection, but images are not always the dominant visual element. Most images are large and high quality. Text is used purposefully, but some is superfluous.

Emerged [Yes]: Artwork is presented as the primary focus of the collection, with images as the dominant visual element. All images are large and high quality. Text is used purposefully but sparingly to enhance the visual content.
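For illustration, the finished rubric's structure (scale labels, categories, and dimensions) can be captured as plain data, with a small helper for reporting an assessment. This is a sketch only; the constant names and the summarize helper are illustrative, not part of the paper:

```python
# Four-point rating scale with neutral, non-judgmental labels (from the rubric).
SCALE = {1: "Incomplete", 2: "Beginning", 3: "Developing", 4: "Emerged"}

# Ten dimensions grouped by the three categories from Norman's Emotional Design.
RUBRIC = {
    "Visceral (immediate impact)": [
        "Strength of visual content",
        "Visual aesthetics",
    ],
    "Behavioral (immediate usage)": [
        "System reliability & performance",
        "Usefulness of metadata",
        "Interface usability",
        "Support for casual & expert users",
    ],
    "Reflective (long-term usage)": [
        "Uniqueness of virtual experience",
        "Openness",
        "Integration of social features",
        "Personalization of experiences",
    ],
}

def summarize(ratings: dict[str, int]) -> None:
    """Print each dimension's 1-4 rating alongside its scale label."""
    for category, dimensions in RUBRIC.items():
        print(category)
        for dim in dimensions:
            rating = ratings[dim]
            print(f"  {dim}: {rating} ({SCALE[rating]})")
```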

Next Step: Rubric Quality

Four experts (two museum professionals and two UX professionals) were asked to apply the rubric to three online museum collections.

• Sessions took ~90 minutes to complete (approx. 20 minutes per museum)
• Held one-on-one (3 face-to-face, 1 remote)
• Completed in August/September 2014

Three aspects of rubric quality: Reliability, Validity, Utility

What is rubric reliability?

The extent to which using the rubric provides consistent ratings of quality. i.e., do different raters provide the same (or similar) ratings when applying the rubric to the same interface?

This is known as inter-rater reliability. A common measure is consensus agreement.

UX Rubric Reliability

Participants rated three museum collections on ten different dimensions, giving 30 potential opportunities for agreement.

Two estimates of agreement (both are simple to compute; see the sketch below):

• Conservative: all raters provide the same rating. Target: approximately 30% or higher.

• Liberal: all raters are within one rating point. Target: approximately 80% or higher.
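A minimal sketch of the two estimates, assuming each rating opportunity (one collection on one dimension) holds one 1-4 rating per rater; the function names are illustrative:

```python
def conservative_agreement(opportunities: list[list[int]]) -> float:
    """Fraction of opportunities where every rater gave the identical rating."""
    hits = sum(1 for ratings in opportunities if len(set(ratings)) == 1)
    return hits / len(opportunities)

def liberal_agreement(opportunities: list[list[int]]) -> float:
    """Fraction of opportunities where all ratings fall within one point."""
    hits = sum(1 for ratings in opportunities if max(ratings) - min(ratings) <= 1)
    return hits / len(opportunities)

# Example: an opportunity rated (3, 3, 4, 2) by four raters counts toward
# neither estimate; (3, 3, 4, 3) counts toward the liberal estimate only.
```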

Reliability: Results

Participant Type   Conservative      Liberal
All (4)            4/30 (13.3%)      19/30 (63.3%)
Museum (2)         14/30 (46.7%)     28/30 (93.3%)
UX (2)             9/30 (30.0%)      24/30 (80.0%)

Reliability: Discussion

Using the rubric was better than blind guessing, but there is room for improvement, especially when combining UX and museum experts.

Conclusion: Don't mix evaluators; they should all share a disciplinary background and professional focus.

What is rubric validity?

The extent to which using the rubric provides accurate measures of quality. There are many types of validity; for rubrics, two are common:

1) Content Validity
2) Construct Validity

UX Rubric Content Validity

Content validity refers to the extent to which the rubric measures things that actually matter, i.e., do the dimensions of the rubric make sense?

Ideally, content validity is demonstrated by soliciting feedback from subject matter experts during rubric creation. In this case, study participants were asked to rate the perceived relevance of each rubric dimension.

Content Validity: Results

[Slide shows participants' perceived-relevance ratings for each rubric dimension.]

Content Validity: Discussion

None of the experts proposed any other concepts or elements that should have been included.

Conclusion: The rubric has content validity, but the Reflective dimensions may need more refinement:

• Are social features or personalization options really the best way to engage online visitors?
• Can the challenges of providing an "open" collection be mitigated?

These are open research questions.

UX Rubric Construct Validity

Construct validity refers to whether the rubric actually measures the construct it is supposed to measure, i.e., is the UX rubric actually assessing UX?

Ideally, construct validity is demonstrated by showing a correlation between rubric scores and another accepted measure of quality (illustrated in the sketch below). But there is no accepted measure of UX quality, so study participants were instead asked to provide perceived levels of construct validity.
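As an illustration only: if an accepted external measure of UX quality did exist, the correlation check could be run as below. The sketch uses SciPy's Spearman rank correlation, and all of the scores are hypothetical:

```python
from scipy.stats import spearmanr

# Hypothetical data for five collections: overall rubric totals paired with
# scores from some accepted external UX measure (which, per the study, does
# not actually exist; participants' perceived validity was used instead).
rubric_scores = [32, 25, 18, 29, 21]
external_scores = [78, 64, 40, 70, 55]

rho, p_value = spearmanr(rubric_scores, external_scores)
print(f"Spearman's rho = {rho:.2f} (p = {p_value:.3f})")
```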

Construct Validity: Results

[Slide shows participants' perceived construct-validity ratings.]

Construct Validity: Discussion

All participants felt the rubric was an effective measure of UX, but museum-centric language was a perceived barrier for the UX experts.

Conclusion: The rubric has construct validity, but the language could be more accessible to non-museum experts.

What is rubric utility?

The actual impact of using the rubric as an assessment instrument, i.e., does using the rubric make a difference?

Arguably the most complex and most important quality of a rubric. But measuring actual impact is nearly impossible (too many confounding factors).

UX Rubric Utility

Instead, focus on perceived impact: evaluators need to think the rubric is valuable, otherwise they'll be unlikely to use it.

Need to demonstrate the extent to which evaluators believe the rubric is:

• Useful
• Easy to use
• Easy to learn

Utility: Results

[Slide shows participants' ratings of the rubric's usefulness, ease of use, and ease of learning.]

Utility: Discussion

All participants affirmed the utility of the rubric as an assessment instrument. The biggest benefit is to aid decision-making:

UX expert: the rubric seems like a great tool to "help museums figure out their digital budget."

How? By providing a snapshot of the assessment results.


Summary

Study results show that the rubric is a reliable, valid, and useful assessment instrument.

Future work:
• Clarify museum-specific language.
• Examine the Reflective dimensions more closely.
• Study the practicality of the rubric through an applied case study with a museum partner.

Conclusion: The rubric can provide valuable guidance for museums interested in improving their users' experience with online collections.

Thank you

Craig M. MacDonald, Ph.D.
[email protected]
@CraigMMacDonald
www.craigmacdonald.com