Page 1: Terence K. Huwe Director of Library & Information Resources

Track A | Tech Developments & Trends | A105

Meaning-Based Computing: New Functionalities from the World of Enterprise Search

Terence K. Huwe
Director of Library & Information Resources
Institute for Research on Labor & Employment
University of California, Berkeley

[email protected]

Page 2: Terence K. Huwe Director of Library & Information Resources

Tech Trends | A105 | IRLE | University of California, Berkeley

Overview

Meaning-Based Computing (MBC):
– What it is, where it came from
– Its rapidly evolving potential
– Some examples of its impact on work and the professions
– Its applicability to the research process
– Some forecasts on what MBC may bring to the information professions

Page 3: Terence K. Huwe Director of Library & Information Resources

The Importance of Forecasting Probability

How should we modify our beliefs in the light of new information?
“When the facts change, I change my opinion. What do you do, sir?”
– John Maynard Keynes

From: The Theory That Would Not Die, Sharon Bertsch McGrayne, Yale, 2011

Page 4: Terence K. Huwe Director of Library & Information Resources

Bayesian Theory Sheds Light

Thomas Bayes' work was published in 1763, after his death, by Richard Price
Bayes was interested in proving the existence of God
He labored in obscurity but has since gained fame

Page 5: Terence K. Huwe Director of Library & Information Resources

What is Bayesian Analysis?

“Scientific inquiry is an iterative process of integrating and accumulating information. Investigators assess the current state of knowledge regarding the issue of interest, gather new data to address remaining questions, and then update and refine their understanding to incorporate both new and old data. Bayesian inference provides a logical, quantitative framework for this process. It has been applied in a multitude of scientific, technological, and policy settings.”

--The International Society for Bayesian Analysis
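
The "assess, gather, update" cycle described in the quotation can be illustrated with a minimal sketch of Bayes' rule. This is not taken from the presentation or from any vendor's product: the function name `bayes_update` and all of the probabilities below are hypothetical, chosen only to show how a belief shifts as evidence accumulates.

```python
from fractions import Fraction


def bayes_update(prior, likelihood_if_true, likelihood_if_false):
    """Return P(H | E) given P(H), P(E | H), and P(E | not H), via Bayes' rule."""
    # Total probability of seeing the evidence under both possibilities.
    evidence = prior * likelihood_if_true + (1 - prior) * likelihood_if_false
    return prior * likelihood_if_true / evidence


# Hypothetical numbers: a 10% prior belief that a document is relevant,
# and a clue that appears in 80% of relevant and 20% of irrelevant documents.
belief = Fraction(1, 10)
for _ in range(2):  # two independent observations of the clue
    belief = bayes_update(belief, Fraction(8, 10), Fraction(2, 10))

print(belief)  # 16/25
```

Each pass uses the previous posterior as the new prior, which is the iterative refinement the quotation describes: the belief moves from 1/10 to 4/13 after one observation and to 16/25 after two.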

Page 6: Terence K. Huwe Director of Library & Information Resources

Page 7: Terence K. Huwe Director of Library & Information Resources

1. Applications and Potential Uses

Used to help break the Enigma Code
Handwriting and speech recognition
Military uses
Manufacturing and sales efficiencies
Legal compliance and due diligence
Pure scientific research
And: information & records management

Page 8: Terence K. Huwe Director of Library & Information Resources

Page 9: Terence K. Huwe Director of Library & Information Resources

New Life from British Roots

Michael Lynch (Cambridge) saw the broad potential of probability study
His academic research led to the founding of Autonomy, a FTSE 100 firm, recently acquired by Hewlett-Packard
The catalyst for market success: enterprise search, and the ascendance of unstructured data

Page 10: Terence K. Huwe Director of Library & Information Resources

Page 11: Terence K. Huwe Director of Library & Information Resources

Enterprise Search As Test Bed

80 percent of a firm’s info assets are unstructured, and thus hard to retrieve conventionally
Conventional search “imposes” structure onto data to categorize and retrieve it
The Intelligent Data Operating Layer (IDOL) searches both structured (databases) and unstructured data

Page 12: Terence K. Huwe Director of Library & Information Resources

Two Events Furthered the Growth of MBC

In 2007, the U.S. Federal Rules of Civil Procedure made all data forms admissible for litigation (remember Enron?)

The explosion in social media has created new challenges for firms, requiring that they track unstructured information more effectively

Page 13: Terence K. Huwe Director of Library & Information Resources

Since So Much Data Are Now Admissible

If firms have liability attached to online discourse, social media, etc., they need protection

Michael Lynch founded Autonomy to provide just that: enterprise search across all media types, enabling fuller awareness of data assets

Page 14: Terence K. Huwe Director of Library & Information Resources

Enterprise Search is Booming

Enterprise Search is now Pan-Enterprise
Many Fortune 500 firms recognize that they need new tools for managing both structured and unstructured data
It’s big business: MBC thrives in commercial and pure research settings

Page 15: Terence K. Huwe Director of Library & Information Resources

Autonomy’s MBC-Based Tools

Implicit Query: “hotkey” to related information without leaving a primary task
Hyperlinking: live links, diverse sources
Smart, or Active Folders
Automatic Taxonomy Generation
Sentiment Analysis
Automatic clustering of all data types

Page 16: Terence K. Huwe Director of Library & Information Resources

2. The Lawyer vs. the Algorithm

The “Discovery” process meets “E-Discovery”
Tracing meaning by linking varied word use
Firms like Blackstone Discovery are building a market niche
Result: No armies of lawyers billing their time
The client saves on legal bills, and law firms confront a new challenge

Page 17: Terence K. Huwe Director of Library & Information Resources

Page 18: Terence K. Huwe Director of Library & Information Resources

John Kelly, CEO, Blackstone Discovery, Palo Alto, CA:

“Data are talking to each other in the ‘third person’”

MBC-driven techniques can uncover crucial data for litigation by tracing relationships

Page 19: Terence K. Huwe Director of Library & Information Resources

3. Impact on the Information Professions

Coming our way soon?
Still seeping from the enterprise search world. Some highlights:
– “Meaning Based Healthcare”
– Universities use it at the enterprise level (admission, etc.)
– Consulting
– Telecommunications

Page 20: Terence K. Huwe Director of Library & Information Resources

Page 21: Terence K. Huwe Director of Library & Information Resources

Potential Applications

Turbo-charged meta-search
Effective search of unstructured data (including social media)
Establish relationships between structured information (libraries and databases) and unstructured information (social media, voicemail, audio)

Page 22: Terence K. Huwe Director of Library & Information Resources

MBC and Taxonomy-Based Search

Taxonomies continue to gain market share
Taxonomy & MBC solutions might coexist
Why? Because MBC can manage social media categorization as an automated process
For this to happen, developers need to get involved

Page 23: Terence K. Huwe Director of Library & Information Resources

Trend: 21st Century Reference

Pattern recognition is practiced at the reference desk; MBC proves that it is a high-level skill
“Better” data requires more interpretation and analysis, not less
More machine assistance is a good thing
We need a place at the table, perhaps without invitation

“There is a massive space for information professional to analyze data” --John Kelly, Blackstone Discovery

Page 24: Terence K. Huwe Director of Library & Information Resources

Some Forecasts

Academic-based digital library developers may take an interest
Vendors might explore MBC as a meta-search tool
Repositories may get a boost
The practice of reference librarianship would benefit from this kind of tool

Page 25: Terence K. Huwe Director of Library & Information Resources

Conclusions

We need to be aware of Meaning Based Computing
We should analyze its as-yet-unknown potential for search and discovery within our digital libraries
Social media are growing
Be prepared to make the case for library-based information analysis and counsel

Page 26: Terence K. Huwe Director of Library & Information Resources

References

ACM Digital Library: http://dl.acm.org

Autonomy: http://www.autonomy.com

Bayes, Thomas: http://en.wikipedia.org/wiki/Thomas_Bayes

Bertsch McGrayne, Sharon. The Theory That Would Not Die: How Bayes' Rule Cracked the Enigma Code, Hunted Down Russian Submarines and Emerged Triumphant from Two Centuries of Controversy. Yale, 2011

Blackstone Discovery: http://blackstonedisocvery.com

Markoff, John. “Armies of expensive lawyers, replaced by software.” The New York Times, March 4, 2011

Page 27: Terence K. Huwe Director of Library & Information Resources

Track A | Tech Developments & Trends | A105

Meaning-Based Computing: New Functionalities from the World of Enterprise Search

Terence K. Huwe
Director of Library & Information Resources
Institute for Research on Labor & Employment
University of California, Berkeley

[email protected]