an exploratory study of users information search ... · satisfaction with intelligent assistants....
TRANSCRIPT
An Exploratory Study of Users’ Information Search Performance and Challenges in Using Intelligent
Conversational Assistants in Health-related Scenarios
Zhaopeng Xing
Practicum Presentation
PSM in BMHI
4/5/2018
Outlines
•Motivation • Exploratory study
• Methods• Observations• Wrap-up
• Introduction• Related studies• Purpose & scope
Motivation
Intelligent Conversational Assistants (ICAs)
A program that assists users to achieve certain results in a dialogic fashion (Dale, 2016; Jurafsky & Martin, 2014).
(Dale, 2016)
Potential but with challenges in healthcare
• Improving information accessibility (Bickmore, Utami, Matsuyama, & Paasche-Orlow, 2016)
• Personalized health information service (Alexa with WebMD)
• Long-tail health issues, such as drugs, alcohol & sex issues (Crutzen, Peters, Portugal, Fisser, & Grolleman, 2011)
• Demanding interactions (Schalkwyk et al., 2010)
• Limited capability of providing satisfying information (Miner et al., 2016)
Studies Topic categories
(Jiang et al., 2015) (Schalkwyk et al., 2010)
User’s intent and purposes (Preferred topics and scenarios)
(Guy, 2016) (Jeng, Jiang, & He, 2013, 2016)
(Schalkwyk et al., 2010)(Shokouhi, Jones, Ozertem, Raghunathan, & Diaz, 2014)(Lopatovska & Williams, 2018)
Query behaviors & strategies
(Trippas, Spina, Cavedon, Joho, & Sanderson, 2017, 2018) Information seeking models and search pattern
(Kiseleva & de Rijke, 2017) (Kiseleva, Williams, Hassan Awadallah, et al., 2016) (Kiseleva, Williams, Jiang, et al., 2016)(Jiang et al., 2015)(Zamora, 2017)
Users’ satisfaction, perception and expectations
(Trippas, Spina, Cavedon, & Sanderson, 2017) (Avula, Chadwick, Arguello, & Capra, 2018) Engagement of ICAs
Related studies
Study findings
• Difference of interaction modalities has effects on users’ search beahviors
• The use of voice search is sensitive to the scenario
• Technical issues impacts the search experience, e.g. recognition errors
Methods applied
• Observational studies using mixed method
• Compare with traditional search interface
• Query analysis is not appropriate to use in lab settings
Questions to Figure Out
1. What effects ICAs may have on users’ search performance in healthcare scenarios?• Completion time
• The number of query
• Perceived usefulness
2. What types of health search tasks and which part of search process may be challenging in using ICAs in contrast with textual search interface?
What types of health search tasks and which part of search experience users may feel challenging using ICAs in contrast with traditional web browser
• Fact-finding task and exploratory search task (Pang, Chang, Pearce, & Verspoor, 2014, 2016)
• Fact-finding task
• Exploratory search task
• Search subprocess (Marchionini & White, 2007): • Express the information needs • Examine results• Reformulation• Make a decision
What types of health search tasks and which part of search process users may feel challenging using ICAs in contrast with traditional web browser
Study
• 10 participants recruited at UNC-CH
• Native speakers (3 male and 7 female, Age mean = 23, SD = 3.23)
• (Screen) Have experience of using ICAs on weekly basis (Siri, Alexa, Cortana, Google Assistant)
• Google Chrome app (5 Typers) and Google Allo (5 Talkers)
• Iphone 6s plus, IOS 11.2.6
• Videotaped and Monitor the screen
• In study rooms in HSL at UNC
Methods – Participants & Apparatus
• Pre screen with a questionnaire
• Inform consent
• Orientation
• Each participant was assigned three task scripts in random order, one task at one time.
• Post task questionnaire after each task
• Interview
• Search and browse history were deleted from the mobile phone after every participant session
• Search without accounts and the search customization was disabled.
Methods – Procedure (45min)
A demo
Task Script
Task 1:Imagine that you recently have a dry mouth and reduced urinations and realize that you rarely have time for water for the busy schedule. You want to know the amount of water an adult should drink per day.
Task 2: Imagine that after finishing a crazy day in your office, you begin to sneeze and feel extremely fatigue. You feel you get a temperature and a mild headache with chills. You don't want to hang out with your friends as usual but sleep at home. Now you want to figure out what’s wrong with you and then look up which types of OTC medicine can relieve these symptoms.
Task 3: Imagine that you are taking a course, An introduction to Health Policy in U.S, in this semester.. You recently read a news article about American Health Care Act of 2017 (AHCA), which aims to partially repeal and replace Obamacare. You are very interested in this law and expect to learn more details. Find information about the current state of AHCA and then explore three interesting issues regarding AHCA for a group discussion next week
(Hendahewa & Shah, 2015); (Pang, Chang, Pearce, & Verspoor, 2014); (Wildemuth & Freund, 2012)
Measurements Quantitative Qualitative
Information search performance
Completion time (video review)
Semi-structured interview
The number of queries (Query and chat logs)
Users’ perceived usefulness of obtained information (5 point scale likert)
Perceived difficultiesOverall difficulties of each task (5 point scale likert)
Difficulties of going thru every single sub-process (5 point scale likert)
Data Collection
Observations – Completion Time
Talkers Typers Talkers Typers Talkers Typers
89.80 339.0 600.4 403.0270.8175.6
• (T1) Four talkers used information given by answer box
• (T3) Talkers had more visits to webpages (3.6 v.s 1.8 webpages)
• (T3) Back-and-forth check between task script and webpage than the typers(3.4 times vs 1.2 times)
Observations – Query number
4.6
1.6 1.6
Talkers Typers Talkers Typers Talkers Typers
5.6
2.6
1.8
• (T2) Talkers reissued query partially because of interruption or forgetting to tap speak button 6/23
• (T3) All talkers did slide the suggested query tags but only 1 talker tapped it.
• (T3) All typers used recommended queries at least once
• (T3) 4 typers reformulated their query by slightly modifying queries
How useful was the information you obtained using GA/GC
How difficult was it to complete this task using GA/GC
Sub-process in task 3
Talkers Typers
Initiate query 0 0
Examineresults
0 0
Reformulate 3 (difficult) 0
Make a decision
3 (2 difficult, 1 very difficult)
1(difficult)
The number of users rated it over 3
Talkers said:
• Prefer to use voice search in health fact-finding tasks (All takers)
• A preference for a list of results in exploratory search task (All talkers)
• Have to think about how to formulate query effectively (All talkers)
• Suggested query is useless (all talkers)
• Tend to avoid rewording queries in exploratory tasks (Talker1, 2, 4, 5)
• Seldom use voice search for health information (Talker 1, 3, 4, 5)
• Answer box is useful in quick judgement of information usefulness (Talker 1, 2, 5)
Some other interesting observations
• Didn’t notice copy-pasting behaviors on typers’ sessions.
• Speech recognition errors occurred 2 times overall but interruption ruined users search experience
• Humanoid moment excited users (“so sweet”)• - I am sneezing and tired with a headache
- don't try to talk if it hurts, but you can keep trying
• Funny moment made users laugh to cry• - My symptoms are mild headache chills, fatigue and sneezing
- Ok I will remember that
Conclusion
• Potential but with rooms for improvements
• Google assistant has great advantage in health fact-finding tasks and needs improvements in addressing exploratory search tasks on • facilitate users’ query reformulation process
• provide indications of useful results properly
Future study
• Research topics: query logs analysis, existing information behavior model still works
• Methods: Wizard of Oz, physiological measurement
• Subjective: the elder had more difficulties in finding health related information (Czaja, Sharit, & Nair, 2008; Sharit, Hernandez, Nair, Kuhn, & Czaja, 2011; Trewin et al., 2012)
• Different input modality: Alexa medical AI with WebMD
https://www.youtube.com/watch?v=19RXpg65CbA
Thanks
Citation• Avula, S., Chadwick, G., Arguello, J., & Capra, R. (2018). SearchBots: User Engagement with ChatBots during Collaborative Search. In
Proceedings of the 2018 Conference on Human Information Interaction&Retrieval' ' - CHIIR ’18 (pp. 52–61). New York, New York, USA: ACM Press. doi:10.1145/3176349.3176380
• Crestani, F., & Du, H. (2006). Written versus spoken queries: A qualitative and quantitative comparative analysis. Journal of the American Society for Information Science and Technology, 57(7), 881–890. doi:10.1002/asi.20350
• Guy, I. (2016). Searching by talking: analysis of voice queries on mobile web search. In Proceedings of the 39th International ACM SIGIR conference on Research and Development in Information Retrieval - SIGIR' ' ’16 (pp. 35–44). New York, New York, USA: ACM Press. doi:10.1145/2911451.2911525
• Jeng, W., Jiang, J., & He, D. (2016). Users’ Perceived Difficulties and Corresponding Reformulation Strategies in Google Voice Search. Journal of Library and Information Studies, 14(Number 1, 2016), 25–39(15).
• Jiang, J., Hassan Awadallah, A., Jones, R., Ozertem, U., Zitouni, I., Gurunath Kulkarni, R., & Khan, O. Z. (2015). Automatic online evaluation of intelligent assistants. In Proceedings of the 24th International Conference on World Wide Web - WWW' ' ’15 (pp. 506–516). New York, New York, USA: ACM Press. doi:10.1145/2736277.2741669
• Jiang, J., Jeng, W., & He, D. (2013). How do users respond to voice input errors? Lexical and phonetic query reformulation in voice search. In Proceedings of the 36th international ACM SIGIR conference on Research and development in information retrieval -SIGIR' ' ’13 (p. 143). New York, New York, USA: ACM Press. doi:10.1145/2484028.2484092
• Kiseleva, J., & de Rijke, M. (2017). Evaluating Personal Assistants on Mobile devices. arXiv.
• Kiseleva, J., Williams, K., Hassan Awadallah, A., Crook, A. C., Zitouni, I., & Anastasakos, T. (2016). Predicting User Satisfaction with Intelligent Assistants. In Proceedings of the 39th International ACM SIGIR conference on Research and Development in Information Retrieval - SIGIR' ' ’16 (pp. 45–54). New York, New York, USA: ACM Press. doi:10.1145/2911451.2911521
Citation• Kiseleva, J., Williams, K., Jiang, J., Hassan Awadallah, A., Crook, A. C., Zitouni, I., & Anastasakos, T. (2016). Understanding User
Satisfaction with Intelligent Assistants. In Proceedings of the 2016 ACM on Conference on Human Information Interaction and Retrieval - CHIIR' ' ’16 (pp. 121–130). New York, New York, USA: ACM Press. doi:10.1145/2854946.2854961
• Kotranza, A., Lok, B., Deladisma, A., Pugh, C. M., & Lind, D. S. (2009). Mixed reality humans: evaluating behavior, usability, and acceptability. IEEE Transactions on Visualization and Computer Graphics, 15(3), 369.
• Le Bigot, L., Jamet, E., & Rouet, J.-F. (2004). Searching information with a natural language dialogue system: a comparison of spoken vs. written modalities. Applied Ergonomics, 35(6), 557–564. doi:10.1016/j.apergo.2004.06.001
• Lopatovska, I., & Williams, H. (2018). Personification of the amazon alexa: BFF or a mindless companion. In Proceedings of the 2018 Conference on Human Information Interaction&Retrieval' ' - CHIIR ’18 (pp. 265–268). New York, New York, USA: ACM Press. doi:10.1145/3176349.3176868
• Lurie, N. H. (2004). Decision making in information-rich environments: The role of information structure. Journal of Consumer Research.
• Smith, A. (2015). US smartphone use in 2015. Pew Research Center, 1.
• Schalkwyk, J., Beeferman, D., Beaufays, F., Byrne, B., Chelba, C., Cohen, M., … Strope, B. (2010). your word is my command“: google search by voice: A case study. In A. Neustein (Ed.), Advances in speech recognition (pp. 61–90). Boston, MA: Springer US. doi:10.1007/978-1-4419-5951-5_4
• Shokouhi, M., Jones, R., Ozertem, U., Raghunathan, K., & Diaz, F. (2014). Mobile query reformulations. In Proceedings of the 37th
• international ACM SIGIR conference on Research & development in information retrieval - SIGIR' ' ’14 (pp. 1011–1014). New York, New York, USA: ACM Press. doi:10.1145/2600428.2609497
• Thomas, P., Czerwinski, M., McDuff, D., Craswell, N., & Mark, G. (2018). Style and Alignment in Information-Seeking Conversation. In Proceedings of the 2018 Conference on Human Information Interaction&Retrieval' ' - CHIIR ’18 (pp. 42–51). New York, New York, USA: ACM Press. doi:10.1145/3176349.3176388
• Trippas, J. R., Spina, D., Cavedon, L., Joho, H., & Sanderson, M. (2018). Informing the design of spoken conversational search: perspective paper. In Proceedings of the 2018 Conference on Human Information Interaction&Retrieval' ' - CHIIR ’18 (pp. 32–41). New York, New York, USA: ACM Press. doi:10.1145/3176349.3176387
• Trippas, J. R., Spina, D., Cavedon, L., & Sanderson, M. (2017). How Do People Interact in Conversational Speech-Only Search Tasks: A Preliminary Analysis. In Proceedings of the 2017 Conference on Conference Human Information Interaction and Retrieval - CHIIR' ' ’17(pp. 325–328). New York, New York, USA: ACM Press. doi:10.1145/3020165.3022144
• Ye-Yi Wang, Dong Yu, Yun-Cheng Ju, & Acero, A. (2008). An introduction to voice search. IEEE Signal Processing Magazine, 25(3), 28–38. doi:10.1109/MSP.2008.918411
• Yuan, X., Belkin, N., Jordan, C., & Dumas, C. (2011). Design of a study to evaluate the effectiveness of a spoken language interface to information systems. Proceedings of the American Society for Information Science and Technology, 48(1), 1–3. doi:10.1002/meet.2011.14504801239
• Zamora, J. (2017). I’m sorry, dave, i'm afraid I can't do that: chatbot perception and expectations. In Proceedings of the 5th International Conference on Human Agent Interaction' ' - HAI ’17 (pp. 253–260). New York, New York, USA: ACM Press. doi:10.1145/3125739.3125766
Citation