

Searching the “deep” web for medical information: reflections from a healthcare information researcher

Angela Heath
Palmer School of Library & Information Science, Long Island University, 720 Northern Blvd., Brookville, NY 11548
Email: [email protected]

As a researcher of medical equipment, my job is to provide informational support to engineers and architects who build healthcare facilities. Because the information requested is technical and specific, it is often not available through ordinary searches on the World Wide Web. To fulfill information requests, it is often necessary to search deeper levels of the Web. This paper reflects upon the experiences and practices I have developed, which draw on both traditional and current technologies.

Introduction

Information searching research and practice converge in my world daily as I perform the many information-related activities of my job. As a medical information researcher in an Architectural/Engineering (AE) firm, my job is to provide engineers and architects with medical equipment information. To do this, I have to bring together my knowledge of information searching theory and my knowledge of practice.

What is the “deep” web and how is it different from the web?

The “deep” or “invisible” web, according to Price and Sherman (2001), is the collection of web pages that are not searchable by the major search engines. These pages are usually found by burrowing into deeper levels of the Web. One main difference between the Deep Web and the Web is the type of content. The content on the Deep Web is much more specific and focused. Major search engines, which attempt to offer information on a wide range of topics, do not index this content. Often, information contained in the Deep Web is password protected or not free and thus is “invisible” to the automatic spider programs used by the major search engines. These characteristics make information in the Deep Web harder to retrieve.

A day in the life of …

Sellen, Murphy and Shaw (2002) conducted a study of 24 knowledge workers from various fields and characterized their information searching activities as finding, information gathering, browsing and transacting.

As a healthcare information researcher, my tasks are no different. My typical information needs involve finding specific items of information such as a technical specifications sheet (a.k.a. cutsheet) or a set of installation guidelines for a specific product. This information is extremely technical and primarily comes in the form of a Portable Document Format (PDF) file, an audio/visual file or an AutoCAD drawing. Although this information is free, it is often not publicly available. For instance, while a cutsheet on a $1500 treadmill might be readily available on the company’s website, a cutsheet on a $2 million MRI unit may not be. Many companies regard this data as too valuable to put online because of the competitive market. In these cases, phone calls to a regional or local sales representative are usually required. Some companies have an entire technical support hierarchy that one must go through in order to get the documents. Here the free information on the Web is still useful: it supplies the proper terminology and context for the information request, which helps in getting to the deeper information.

As for any information researcher, information gathering is a crucial activity for me and takes up the majority of my time. Defined by Choo et al. (2000) as “less-specific finding”, information gathering is similar to finding in that it involves having a goal. Much of my time is spent researching current changes in technology, the evolution of products and general healthcare issues that might affect medical equipment. For instance, the Americans with Disabilities Act (ADA) has had the greatest impact on how health facilities have been designed within the past few decades. The ADA’s requirements affect the size and dimensions of equipment such as bathroom dispensers for towels and soap, patient beds, tables, and countertop and under-the-counter (UTC) units like microwaves and refrigerators, so familiarity with the Act is essential.

In the Sellen et al. (2002) survey, the participants described browsing as including activities such as scanning websites, bookmarking articles from online newsletters and viewing pictures for “interest or entertainment”. All of these activities are a daily part of my work. Finally, the survey listed transacting, which can be defined as executing online transactions. In my work, completing online transactions usually involves filling out online forms to receive the desired information. Depending upon the location of the company and my familiarity with it, a phone call will sometimes get quicker results. If I need to fill out an online form, I tend to use existing text that can be copied and pasted into the form fields.

What I’ve learned

What I have learned is that there is a short list of obstacles that can prevent me from fulfilling an information request: time, money, the type of information needed and the tools/technology available. To minimize and overcome these obstacles, I have drawn on prior knowledge and research experience in information searching. Table 1 summarizes some realizations that I have come to and how I deal with them daily.

Considerations for future design

As mentioned previously, one big problem with searching the Deep Web for medical information is that the content is usually located in databases, and the spider programs used by search engines cannot reach into databases. Many of these databases require a fee to access the information, while others require professional affiliation with the field. Interestingly, information that is available free from manufacturers is sometimes listed in paid databases, which can be quite frustrating. In other words, the fee is for the organization and easy accessibility, not for the information content.

To minimize these types of costs, many firms like mine have begun to develop their own in-house libraries. For instance, my company is in the process of developing a technical specifications database that will store product information from over 250 companies. These are companies that my firm has developed relationships with over more than 30 years, and they make their information easily accessible by providing updated product binders, CDs and access to their extranets. One such company that we have a strong relationship with is XYZ Corporation*. (*Name has been changed for privacy.) XYZ Corporation is a leading manufacturer of equipment that is used to sterilize medical devices used in operating rooms. XYZ has given our company access to its web-based extranet system, which stores updated technical cutsheets and AutoCAD drawings specific to our projects. The advantage of this system for our company is that the information comes directly from XYZ, is updated by them and is available for us to download from anywhere to any device (e.g., computer or PDA).

Another well-publicized problem with searching both the Deep Web and the Web in general is poor interfaces, or interfaces that are not specifically designed to retrieve technical information. Problems in terminology and the lack of industry-appropriate terms on most search engines hinder effective searching. Many search engines don’t give enough information on their construction and indexing techniques. A typical piece of medical equipment has many descriptors, which makes it tricky to retrieve, and short query strings often do not include enough information.

Most fee-based medical equipment databases handle this with a “general-specific” system for indexing. For example, a typical description for a display refrigerator used in a hospital gift shop might be “refrigerator, merchandiser, glass, radius front, 2-door”. Read backwards, this description makes logical sense: “Radius front 2-door glass merchandiser refrigerator”, which is how the item appears on a manufacturer’s technical cutsheet. Although this system may seem cumbersome, it is actually easier to use because each descriptor allows you to burrow into the database’s levels. Even if you didn’t know all of the above descriptors, you still have the option of just entering “refrigerator” and going from there. What works well in this system is that you can apply filters to choose the particular features that you want to see. For example, entering “refrigerator, 2-door” would give you all 2-door refrigerators: radius (curved) front, swing-door and sliding-door models. You can also indicate size restrictions such as “at least 60 inches high” by using advanced search options. Unfortunately, the advanced features of the major search engines, even the most intelligent ones, don’t allow for this type of specificity.
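The general-specific lookup described above can be sketched in a few lines of Python. This is only an illustration of the idea, not how any particular fee-based database is implemented: the `Item` class, its field names, the catalog entries and the height values are all hypothetical and made up for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Item:
    """One catalog entry. Class and field names are hypothetical."""
    descriptors: tuple  # ordered from general to specific
    height_in: float    # overall height in inches (made-up sample values)


def parse_query(query):
    """Split a comma-separated query into normalized descriptors."""
    return [part.strip().lower() for part in query.split(",") if part.strip()]


def search(catalog, query, min_height_in=None):
    """Return items that carry every query descriptor.

    min_height_in plays the role of an "advanced search" size restriction.
    """
    wanted = parse_query(query)
    results = []
    for item in catalog:
        have = {d.lower() for d in item.descriptors}
        if not all(w in have for w in wanted):
            continue
        if min_height_in is not None and item.height_in < min_height_in:
            continue
        results.append(item)
    return results


def cutsheet_name(item):
    """Read the descriptors backwards, specific to general."""
    return " ".join(reversed(item.descriptors))


# Invented sample data for illustration only.
CATALOG = [
    Item(("refrigerator", "merchandiser", "glass", "radius front", "2-door"), 78.0),
    Item(("refrigerator", "merchandiser", "glass", "sliding door", "2-door"), 54.0),
    Item(("refrigerator", "undercounter", "1-door"), 34.0),
]
```

With this sample catalog, `search(CATALOG, "refrigerator")` returns all three entries, `search(CATALOG, "refrigerator, 2-door")` narrows to the two 2-door units, and adding `min_height_in=60` leaves only the radius-front model. Note that a strict reversal in `cutsheet_name` yields “2-door radius front glass merchandiser refrigerator”, a slightly different word order from the cutsheet phrasing quoted above.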

One promising design approach has been to create domain-specific search environments. These search spaces are designed by the actual communities that they serve. Researchers and practitioners alike have explored this idea. Many promising examples can be seen, from informally created library guides (http://www.lii.org/) to more formal deep web browser projects such as the Directed Query Engine projects at the Department of Energy (Warnick et al., 2001). That project uses a tool called the Distributed Explorer, which works like the older meta-crawling search technology: it delivers search results from multiple search engines simultaneously. As a practitioner, I think that both approaches are valuable and necessary. The directory approach gives knowledge workers in specific domains opportunities to create and share their own resources (e.g., the Open Directory Project, www.dmoz.org). These directories become the best sources of information because the entries are continually used and updated by the users.
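The meta-search idea, sending one query to several sources at once and merging what comes back, can be sketched as follows. This is a minimal illustration under stated assumptions and is not the Distributed Explorer’s actual design: the two in-memory sources, their contents and the example.org URLs are all hypothetical stand-ins for real remote databases.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical in-memory "sources"; a real meta-search tool would query
# remote databases or search engines instead.
SOURCE_A = {
    "sterilizer": [("Steam sterilizer cutsheet", "https://example.org/a/sterilizer")],
}
SOURCE_B = {
    "sterilizer": [
        ("Steam sterilizer cutsheet", "https://example.org/a/sterilizer"),
        ("Sterilizer installation guide", "https://example.org/b/install"),
    ],
}


def make_source(index):
    """Wrap a dict as a search function returning (title, url) pairs."""
    def search(query):
        return index.get(query.lower(), [])
    return search


def meta_search(query, sources):
    """Send the query to every source concurrently and merge the results.

    Results are kept in source order, and a URL already seen is skipped.
    """
    if not sources:
        return []
    with ThreadPoolExecutor(max_workers=len(sources)) as pool:
        # pool.map returns results in the same order as `sources`.
        result_lists = list(pool.map(lambda source: source(query), sources))

    merged, seen = [], set()
    for results in result_lists:
        for title, url in results:
            if url not in seen:
                seen.add(url)
                merged.append((title, url))
    return merged


SOURCES = [make_source(SOURCE_A), make_source(SOURCE_B)]
```

Calling `meta_search("sterilizer", SOURCES)` returns two results here, because the cutsheet that both sources list is reported only once. Deduplicating by URL is one simple merging choice; a real tool would likely also rank results.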

References

Cannon, N. (2002). Yahoo! Do You Google? Virtual Reference Overview. The Reference Librarian, 77, 31-37.
Choo, C. W., Detlor, B. and Turnbull, D. (2000). Information seeking on the Web: An integrated model of browsing and searching. First Monday, 5(2).
Price, G. and Sherman, C. (2001). Exploring the Invisible Web. Online, 25(4), 32-34.
Price, G. and Sherman, C. (2001). The Invisible Web. Searcher, 9(6), 62-74.
Sellen, A.J., Murphy, R. and Shaw, K. (2002). How knowledge workers use the Web. Proceedings of CHI, 4(1), 227-234.
Sherman, C. (2001). Google unveils more of the Invisible Web. SearchDay, No. 128.
Warnick, W. et al. (2001). Searching the Deep Web: Directed Query Engine Applications at the Energy Department. D-Lib Magazine, 7(1).