1
Positioning Enterprise Search: Case Studies of What it is and What it should be
Boston KM Forum, August 18, 2005
Lynda Moulton
LWM Technology Services
http://www.lwmtechnology.com
2
Topics
• What is search?
• What should search be?
• Evolution of Search
• Search Jargon
• Contemporary Evolution of Search
• Generic Search Technologies
• Search Interfaces
• Under the Cover of Search
• The Next Big Step Depends on …
3
What Search Is
Usually software - implemented to help people find specific content
4
What Search Should Be
Positioned as finding answers to questions or satisfying inquiry
5
The Evolution of the Search for Knowledge
• 1000 BC – Xi’An – cited as the world’s oldest library
• Assyria, 883-859 BC. The importance of Ashurbanipal's Library can not be overstated… Like a modern library this collection was spread out into many rooms according to subject matter. [tablet from 1500 BC]
• Greeks searched for knowledge through metaphysical rhetoric and polemics (800 – 600 BC).
Aristotle classified and categorized knowledge by type
6
The Evolution of the Search for Knowledge, continued
The transmission of knowledge
A case study: the Arab acquisition of Greek science
Although the Arabs were aware of the Greek scientific tradition, it was not they who were using it … it had yet to be adopted. The next phase seems to have been started under Harun ar-Rashid (ruled 786-808), who sent agents to purchase Greek manuscripts from the Byzantine empire.
Nicholas Whyte, Sint-Genesius-Rode/Rhode-St-Génèse, 10 September 2000, last modified 15 September 2002
http://www.worldinvisible.com/images/apolog/codsin.jpg
Here is an example of what “is arguably the first large bound book to have been produced. For one volume to contain all the Christian scriptures book technology had to make a great technological leap forward.” [roughly 4th century]
7
The Evolution of the Search for Knowledge, continued
Jumping to the 1600s, we find this approach to capturing and safeguarding knowledge:
Hereford Cathedral Chained Library
8
The Evolution of the Search for Knowledge, continued
• Fast-forward to the 1900s, when we used a card catalogue index for search. This was supplemented with the Reader’s Guide to Periodical Literature and other, more specialized print indices to find content beyond books.
• In the 1970s the SDC Corporation developed Orbit as a result of work done for the National Library of Medicine to enable electronic searching of NLM’s gold standard index, Index Medicus (the database name was Medline).
• Searching was done via a teletype device (TI Silent 700) hooked up through a 90-baud acoustic coupler modem to a mainframe computer housing the index in Palo Alto, CA.
• In the late 1970s we upgraded to 300 baud modems.
9
Some Search Jargon
• Structured search
• Spidering
• Indexing
• Free text search
• Keyword
• Phrase
• Date
• Numeric
• Federated search
• Structured with authority control
• Operators
• Metadata
• Tagging
• Thesaurus
• Taxonomy
• Ontology
• Semantic search
• Portals & search boxes
10
Command Language Interface
The first on-line searching was executed using commands that described the fields to be searched, the operator (e.g. begins with, contains), explicit Boolean commands (and, or, & not), and the string of text to be found.
The typed commands appeared on the thermo fax paper and were transmitted with a SEND key; the remote computer returned your command intact, with errors in syntax underscored.
http://www.cas.org/ONLINE/DBSS/copperlitss.html
Example of command language commands and symbols:
QS (titlew,subjw)#’solar’
SS (ti@’solar energy’) ! (sh@’solar energy’)
A ma,ts
L ma,ts,sn,pd,pd,rn
E sk1,50
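The ingredients of a command language search described on this slide (a field, an operator such as "begins with" or "contains", a string of text, and Boolean and/or/not) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the records, the field names `title` and `subject`, and the `Term` and `search` helpers are all hypothetical, and the sketch does not reproduce the syntax of Orbit, Dialog, or any other real service.

```python
from dataclasses import dataclass

# Hypothetical records; the field names are illustrative only.
RECORDS = [
    {"id": 1, "title": "Solar energy storage", "subject": "solar energy"},
    {"id": 2, "title": "Wind turbine design", "subject": "wind power"},
    {"id": 3, "title": "Passive heating", "subject": "solar energy"},
]


@dataclass(frozen=True)
class Term:
    field: str     # which field to search
    operator: str  # "contains" or "begins with"
    text: str      # the string of text to be found

    def matches(self, record: dict) -> bool:
        value = record.get(self.field, "").lower()
        needle = self.text.lower()
        if self.operator == "contains":
            return needle in value
        if self.operator == "begins with":
            return value.startswith(needle)
        raise ValueError(f"unknown operator: {self.operator}")


def search(records, term):
    """Return the set of record ids that satisfy a single term."""
    return {r["id"] for r in records if term.matches(r)}


title_hits = search(RECORDS, Term("title", "contains", "solar"))
subject_hits = search(RECORDS, Term("subject", "begins with", "solar energy"))

# Boolean commands map naturally onto set operations.
either = title_hits | subject_hits        # OR
both = title_hits & subject_hits          # AND
subject_only = subject_hits - title_hits  # NOT
```

Treating each result as a set of record ids keeps the Boolean step separate from the field matching, which is roughly how a searcher combined numbered result sets in a command language session.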
11
Enterprise Search 1970s - GUIs
Librarians trained in command language searching and specific database specialties did all the searching for the corporation.
Typical Specialties: Scientific & technical, Medicine, Social sciences, Theatre & Arts, Competitive intelligence, Patents, Government Documents
• Some professionals outside the library field took search courses in their specialties but, without regular (daily) exposure to searching, soon realized that their time was more valuable spent elsewhere.
• Costs were based on connect time and number of records retrieved – those who fumbled incurred big costs.
12
Enterprise Search 1980s - GUIs
Automated library systems for universities and public consortia became common.
Corporations often procured the software used by the major search services (Dialog, BRS, and Orbit) to full-text index any ASCII text files they had.
Specialized library systems for corporations evolved into the forerunners of today’s content management systems.
• Terminals with screens made command language searching somewhat easier through scrolling and editing (self-service was encouraged).
• Pre-Windows CRTs (dumb terminals) supported scrolling through indices (browsing) and menu-driven Boolean searching through good forms design (self-service became possible and easier).
• Before Macs and Windows, ESC and CTRL keys supported early versions of pop-up windows for help, look-ups, and efficient movement among forms to guide users in system navigation.
13
Enterprise Search 1990s – GUIs w/ Web Alternatives
Dumb terminals connected to a single processor were replaced with client-server systems (Windows clients, any server).
Distributed database technology grew popular, opening possibilities for federated searching, that is, single-query searches across databases.
Web-based searching emerged in about 1994-96
Librarians still controlled the domain of enterprise content
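Federated searching, as mentioned on this slide, means sending one query to several databases and merging what comes back. A minimal Python sketch of the idea follows. The two sources, their names, and their contents are hypothetical stand-ins; a real federated search product would also need to handle remote connections, differing query syntaxes, and ranking.

```python
from typing import Callable, Dict, List


def make_source(titles: List[str]) -> Callable[[str], List[str]]:
    """Build a stand-in database: a function from query to matching titles."""
    def run(query: str) -> List[str]:
        q = query.lower()
        return [t for t in titles if q in t.lower()]
    return run


# Hypothetical sources; names and contents are invented for illustration.
SOURCES: Dict[str, Callable[[str], List[str]]] = {
    "library_catalog": make_source(["Patent law basics", "Solar cell patents"]),
    "lab_reports": make_source(["Solar cell efficiency tests", "Alloy fatigue"]),
}


def federated_search(query: str) -> List[dict]:
    """Send the same query to every source and tag each hit with its origin."""
    results = []
    for name, run in SOURCES.items():
        for title in run(query):
            results.append({"source": name, "title": title})
    return results


hits = federated_search("solar cell")
```

Tagging each hit with its source keeps the origin visible to the searcher, so a merged result list does not hide which database answered.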
14
Enterprise Search 21st Century – Hybrid and Siloed
• Search embedded in all applications (e.g. CRM, email, Adobe Reader, QuickPlace)
• Search engines spidering enterprises (e.g. Verity, Inxight, Endeca)
• Web-based searching for published information (fee-based and free) (e.g. Dialog, Lexis-Nexis, Google)
• Federated search to search many structured databases (e.g. BookWhere? by SeaChange)
• Structured search – Specified Fields (http://66.187.153.86/archives/frame.htm)
• Free-text search – Anywhere in record (of metadata) or anywhere in source document
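The difference between the last two items, structured search on specified fields versus free-text search anywhere in the record, can be shown with a small Python sketch. The records and the field names `author`, `year`, and `body` are hypothetical, and exact matching on a field is just one assumed way to model a structured search.

```python
# Hypothetical records; field names are illustrative only.
RECORDS = [
    {"author": "Smith", "year": "2004", "body": "Notes on taxonomy design by Jones"},
    {"author": "Jones", "year": "2005", "body": "Search engine evaluation"},
]


def structured_search(records, field, value):
    """Match only inside the one field the searcher specified."""
    return [r for r in records if r.get(field, "").lower() == value.lower()]


def free_text_search(records, text):
    """Match the text anywhere in the record, whatever the field."""
    needle = text.lower()
    return [r for r in records
            if any(needle in str(v).lower() for v in r.values())]


by_author = structured_search(RECORDS, "author", "Jones")
anywhere = free_text_search(RECORDS, "jones")
```

Here the structured search returns only the record authored by Jones, while the free-text search also returns the record that merely mentions Jones in its body: broader recall, less precision.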
15
The Order of Implementation
Typical Flow but Starting with Taxonomies is Smarter
Search is implemented to index key words and phrases in targeted documents.
Intranet Web sites or portals are put on corporate systems to deliver search options.
Taxonomies are built to offer a visible list of navigable topics as an alternative to search.
Because navigation requires a link from each taxonomy term to each relevant document, a metadata structure must be built.
Search box is placed on the Intranet without clear explanation of what is being searched.
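Two of the structures named in this flow, a keyword index over document text and a metadata link from taxonomy terms to documents, can be sketched in Python as follows. The documents, the `topics` metadata field, and both helper functions are hypothetical; the sketch indexes single words only (not phrases) and is not how any particular search product works.

```python
from collections import defaultdict

# Hypothetical documents; "topics" stands in for taxonomy metadata.
DOCS = {
    "doc1": {"text": "Quarterly sales forecast", "topics": ["Finance"]},
    "doc2": {"text": "Sales training manual", "topics": ["Human Resources", "Sales"]},
}


def build_keyword_index(docs):
    """Map each lowercased word to the ids of the documents containing it."""
    index = defaultdict(set)
    for doc_id, doc in docs.items():
        for word in doc["text"].lower().split():
            index[word].add(doc_id)
    return index


def build_taxonomy_links(docs):
    """Map each taxonomy term to the documents tagged with it."""
    links = defaultdict(set)
    for doc_id, doc in docs.items():
        for topic in doc["topics"]:
            links[topic].add(doc_id)
    return links


keyword_index = build_keyword_index(DOCS)
taxonomy_links = build_taxonomy_links(DOCS)
```

The keyword index can be built from the text alone, but the taxonomy links exist only if someone has tagged each document, which is why the metadata structure has to be built before navigation can work.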
16
Classes of Search Products
• Embedded Search – combo index & sequential
• Spider, Categorize, Index
• Spider and Index
• Structured (database or metadata back-end)
• Hybrid – a mix of search – most common
• Semantic Search – index against ontology and answer question
17
Who is Doing What?
• MITRE (Cases-08182005.htm)
• Raytheon
• Lincoln Laboratory
• Air Products
• DuPont (Cases-08182005.htm)
• Johnson & Johnson (http://www.sla.org/Documents/conf/SharingKnowledge.doc)
18
Where Search is Headed
• Natural Language – But how can this become real?
Barriers:
• Ontologies need to be built
• Migration from existing systems must be addressed
• We must find better ways to tag and codify enterprise content
• Enabling technologies, like browsers were for the Internet, are needed (voice processing anyone?)
19
What Are the Enabling Factors
• Interfaces that guide thought
• Enable inquiry in human-like manner
• Systems that enable systematic and natural learning
• Systems that identify holes in knowledge and intelligently seek out the missing ingredients
20
THANK YOU FOR LISTENING AND LEARNING
THE END