MSDN Event 03/06/2004 Welcome

Upload: collin-boone

Post on 24-Dec-2015

TRANSCRIPT

MSDN Event 03/06/2004

Welcome

SharePoint Search
Inge De Neef
SharePoint Consultant
Getronics
[email protected]

Agenda

• Definition
• Comparing Search Technologies
• Search Extensibility
  • Modifying the Built-in Default User Interface
  • Thesaurus
  • Using SharePoint Portal Server Search from other applications
  • Extending SharePoint Portal Server Search to Index Other Content
• Summary

Search - Definition

A single location from which to search multiple information sources simultaneously.

Microsoft Search Service ships with a number of default content sources:

• File shares
• Websites (both http and https)
• Exchange folders
• Active Directory
• Lotus Notes databases
• Other SharePoint sites and portals

Search - Definition

• Keyword searches that search the full text of a document and the document's properties (metadata).
• Best Bet classification for documents selected as the best recommendation for a category or a specific keyword.

Search Technology Comparison

SharePoint Portal Server 2003 | Windows SharePoint Services
Uses SharePoint Portal Server 2003-specific search (richer UI) | Uses Microsoft SQL Server Full-Text Search
Aggregator of multiple sites, portals and external sources | Limited to a single site
E-mail and Web Part delivery (optionally, custom delivery channels can be created) | E-mail delivery only
Shared Services | -

Search Extensibility

• Modifying the Built-in Default User Interface
  • Search Web Part Page
  • Search Box, Search Menu, Advanced Search, Search Results Page
• Thesaurus
• Using SharePoint Portal Server Search from Other Applications
  • SharePoint Portal Server Query Web service (http://<portal>/_vti_bin/search.asmx)
  • Research Library Task Pane
• Extending SharePoint Portal Server Search to Index Other Content
  • IFilter, IProtocolHandler, IWordbreaker, IStemmer

Modifying the Built-in User Interface

• Search Web Part Page
• Ghosted page

SharePoint Portal Server's Search Web Part Page

Working Together

Search Menu

Search Box

Advanced Search

Search Result

JavaScript (Search.js and in HTML)

Hidden Fields

Search Box

SearchBox
• Used to enter the search keyword
• Present on every SharePoint page
• Interesting property: SearchResultPageURL
  • By default opens Search.aspx
• Property: ContextSensitiveScopeType determines the default search scope (All Sources or This Topic)

<SPSWC:RightBodySectionSearchBox runat="server" ContextSensitiveScopeType="1" SearchResultPageURL="SEARCH_HOME" FrameType="None"/>

Search Menu

JavaScript functions:
• Onshd
• OnToggleAllGroups
• OnPinSearch
• OnSubscribeSearch
• TooleMgmtAdv

Advanced Search

Adding custom metadata to the Advanced Search

Creating a New Advanced Search Web Part

• Can be created by inheriting from Microsoft.SharePoint.Portal.WebControls.AdvancedSearchControl
• Your own controls can be added
• Only the CreateChildControls and RenderWebPart methods need to change

Search Results

Search Results Web Part

• 3 ways of customizing:
  • Property Sheet (basic customization)
  • Dwp file
  • FrontPage 2003
• Creating your own SearchResults page

Search Results Web Part

Customize through the Property Sheet

• The number of items returned
• The display text for the "no results" condition
• Widths of columns

Search Results Web Part

Customize through the DWP file

• Call the search page with http://localhost/Search.aspx?mode=Edit&PageView=Shared
• Export the Search Results Web Part

Search Results Web Part

Customize using FrontPage 2003

• The search page is displayed as XML
• Properties can be changed by modifying their values

Search Results Web Part – Properties

• ResultListID
• FixLayout
• GroupByList
• DefaultGroupBy
• SortByList
• DefaultSortBy
• ColumnURIs
• ColumnWidths
• TextForNoResults
• RowNumberForEachItem
• EnableQueryLoggingSearch
• SupportExpandCollapseAll
• EnableSQLCommandLogging
• ColumnDisplayNames
• OpenNewWindowForMatchingItems
• ShowRankForEachItem
• MaxMatchingItemsNumber

Search Results Web Part – Interesting properties

DefaultSortBy:
• Default value: "urn:schemas-microsoft-com:fulltextqueryinfo:rank DESC"
• Most relevant result is shown first (based on the OKAPI algorithm)

MaxMatchingItemsNumber:
• Maximum number of search results shown

Search Results Web Part – Interesting properties

OpenNewWindowForMatchingItems:
• Opens search results in a new window
• Set to false by default
• No default XML tag is available in the search page; it must be added:

<OpenNewWindowForMatchingItems xmlns="urn:schemas-microsoft-com:sharepoint:DataResultBase">true</OpenNewWindowForMatchingItems>
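As a rough illustration of adding that tag to an exported Web Part file, the following Python sketch inserts the element if it is missing. The sample DWP content and the helper name `enable_new_window` are hypothetical; only the element name and its namespace come from the slide. The serialized output uses a namespace prefix for the added element, which is equivalent XML but has not been verified against a real portal.

```python
import xml.etree.ElementTree as ET

# Namespace and element name as given on the slide.
RESULT_NS = "urn:schemas-microsoft-com:sharepoint:DataResultBase"
TAG = f"{{{RESULT_NS}}}OpenNewWindowForMatchingItems"

# Hypothetical, heavily shortened stand-in for an exported .dwp file.
SAMPLE = (
    '<WebPart xmlns="http://schemas.microsoft.com/WebPart/v2">'
    "<Title>Search Results</Title>"
    "</WebPart>"
)


def enable_new_window(dwp_xml: str) -> str:
    """Return the DWP XML with OpenNewWindowForMatchingItems set to true."""
    root = ET.fromstring(dwp_xml)
    node = root.find(TAG)
    if node is None:
        # The tag is absent by default, so it has to be created.
        node = ET.SubElement(root, TAG)
    node.text = "true"
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    print(enable_new_window(SAMPLE))
```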

Creating your own Search Results Page

• Can be created by inheriting from Microsoft.SharePoint.Portal.WebControls.SearchResults
• The number of methods to override depends on the complexity of the search.

Your own Search Results Page: method examples

• GenerateQueryString
  • Parameters:
    • QueryTemplateSelectPart
    • QueryTemplateFromPart
    • QueryTemplateWherePart
    • QueryTemplateOrderByPart
    • (out) strSavedQuery
• IssueQuery
• GenerateHtmlOneRowForOneItem

Example: How to Add Support for Wildcard Searches

Thesaurus and Noise Words

Thesaurus
• Allows you to search for:
  • Search term
  • Synonyms
  • Translations
  • Chemical formulas
  • … other matching words
• SharePoint uses different thesaurus files for different languages

Thesaurus

• Expansion tags
  • E.g. holiday verlof congé
• Replacement tags
  • E.g. MS Microsoft
• Stemming
  • E.g. talk** speak**
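The difference between expansion and replacement can be sketched in a few lines of Python. This is only a conceptual model, not the thesaurus file format: the data structures and the function name `expand_term` are assumptions made for illustration. An expansion set makes every member match all the others, while a replacement swaps the typed pattern for its substitute so the pattern itself is no longer searched.

```python
# Illustrative data only, using the examples from the slide.
EXPANSIONS = [{"holiday", "verlof", "congé"}]
REPLACEMENTS = {"ms": ["microsoft"]}


def expand_term(term: str) -> list[str]:
    """Return the terms that would actually be searched for `term`."""
    key = term.lower()
    if key in REPLACEMENTS:
        # Replacement: the pattern is dropped in favour of its substitutes.
        return sorted(REPLACEMENTS[key])
    for group in EXPANSIONS:
        if key in group:
            # Expansion: the term and all of its equivalents are searched.
            return sorted(group)
    return [key]


if __name__ == "__main__":
    for word in ("verlof", "MS", "budget"):
        print(word, "->", expand_term(word))
```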

Thesaurus

Thesaurus – Noise words

• Noise words are words that are ignored in search queries, such as "the", "or", "if", numbers, …
• Stored in:
  • C:\Program Files\SharePoint Portal Server\DATA\Config
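To make the effect of noise words concrete, here is a minimal Python sketch that drops them from a query before it is searched. The word list is limited to the examples on the slide and the function name `strip_noise` is hypothetical; the real lists live in the per-language files in the folder named above.

```python
# Only the examples from the slide; real noise-word files are much longer.
NOISE_WORDS = {"the", "or", "if"}


def strip_noise(query: str) -> list[str]:
    """Return the query terms that remain after noise words are ignored."""
    kept = []
    for token in query.lower().split():
        # Plain numbers are treated as noise, like the listed words.
        if token in NOISE_WORDS or token.isdigit():
            continue
        kept.append(token)
    return kept


if __name__ == "__main__":
    print(strip_noise("the budget or forecast 2004"))
```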

Noise Words

Summary

• Search is powerful
• Search is meant to be used by many clients
• Search is extensible with many components
• Search is customizable with many options

Get out there and build on it!

SharePoint Portal Server Query Web Service

Query Web Service

• Query
  • Accepts Query XML, defined by the urn:Microsoft.Search.Query namespace
  • Returns Response XML, defined by the urn:Microsoft.Search.Response namespace
• QueryEx
  • Accepts Query XML, defined by the urn:Microsoft.Search.Query namespace
  • Returns search results as a DataSet for the specified query string

SharePoint Portal Server Query Web Service

• Registration
  • Defined by the urn:Microsoft.Search.Registration namespaces
  • Returns the name of a portal site
• SPSGetPortalSearchInfo
  • Returns a list of search and catalog scopes
• Status
  • Returns a success code to indicate that the search provider is available

SharePoint Portal Server Query Web Service

• Add Web Reference
  • Service found at http://<portal>/_vti_bin/search.asmx
• Authenticate
• Formulate and send a query

SharePoint Portal Server Query Web Service

Syntax Help
• Microsoft SharePoint Portal Server 2001 SDK
• Manage Properties of Indexed Content
• Manage Content Sources
• SPSGetPortalSearchInfo()
• SPSQueryServiceConst Class
  • Templates for SELECT, WHERE, CONTAINS
• View Source on search.aspx

SharePoint Portal Server Query Web Service

QueryText Pointers
• QueryText type='STRING'
  • Returns results with some Research Task Pane intelligent formatting
• QueryText type='MSSQLFT'
  • Query() returns 2 columns regardless of query: DAV:DisplayName, DAV:href
  • SELECT must contain urn:schemas.microsoft.com:fulltextqueryinfo:sdid
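A client has to wrap its query text in a Query XML packet before calling the service. The Python sketch below builds such a packet with the standard library. The namespace and the QueryText type values come from the slides; the surrounding element names (QueryPacket, Query, Context, Range, StartAt, Count) and the language attribute are assumptions based on the general shape of the Microsoft.Search.Query schema and should be checked against the SDK before use. The function name `build_query_packet` is hypothetical.

```python
import xml.etree.ElementTree as ET

QUERY_NS = "urn:Microsoft.Search.Query"  # namespace named on the slides
ET.register_namespace("", QUERY_NS)


def _q(name: str) -> str:
    """Qualify an element name with the query namespace."""
    return f"{{{QUERY_NS}}}{name}"


def build_query_packet(text: str, query_type: str = "STRING",
                       start_at: int = 1, count: int = 10) -> str:
    """Build a Query XML packet (element layout is an assumption)."""
    if query_type not in ("STRING", "MSSQLFT"):
        raise ValueError("query_type must be 'STRING' or 'MSSQLFT'")
    packet = ET.Element(_q("QueryPacket"))
    query = ET.SubElement(packet, _q("Query"))
    context = ET.SubElement(query, _q("Context"))
    query_text = ET.SubElement(
        context, _q("QueryText"), {"type": query_type, "language": "en-us"}
    )
    query_text.text = text  # special characters are escaped on output
    result_range = ET.SubElement(query, _q("Range"))
    ET.SubElement(result_range, _q("StartAt")).text = str(start_at)
    ET.SubElement(result_range, _q("Count")).text = str(count)
    return ET.tostring(packet, encoding="unicode")


if __name__ == "__main__":
    print(build_query_packet("budget forecast", count=5))
```

The resulting string would be passed as the argument of the Query or QueryEx method once a Web reference to search.asmx has been added.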

Research Library Task Pane

Research and Reference Task Pane

• Task Pane in Microsoft Office System applications
• Allows users to search information sources
• A platform that content providers can build on
• It supports rich content and forms
• SharePoint Portal Server is compatible!

Research and Reference Task Pane

• Registration Function
• Query Function
• Response XML

Extending SharePoint Portal Server Search to Index Other Content

Extending SharePoint Portal Server to Index More Content

• Architecture overview
• Tools to be Built
  • Protocol Handlers
  • Filters
  • Word Breakers

Search Characteristics

• Enterprise Scalability
  • From ~5 M docs to ~20 M docs
  • Cross-catalog querying, load-balanced queries
  • Very significant for enterprise search scenarios
• Shared Portal Services
• Content Aggregation
• Probabilistic Relevance Ranking
• Notifications/Alerts, Topic Assistant
• Adaptive Crawling
• Common search technology across Microsoft product offerings

Protocol Handlers and IFilters

SharePoint Portal Server's indexing capability is extensible through the development of Protocol Handlers and IFilters.

• Protocol Handlers extend the indexing capability of SharePoint Portal Server to other content sources
• IFilters are generally used for indexing specific types of files
  • They are called by the Protocol Handler, and thus can be skipped if the Protocol Handler is willing to do all the work

This is low-level technology; you'll still need to use COM.

• You must write a COM component; your end result will be a .dll
• You can use VC.NET to develop these components; attributed C++ is an advantage, but the code is not managed

Search Structure

Protocol Handler General Features

• Registers with the gatherer
• Connects to the external content source
• Collects data from the external content source
• Binds to content in the external content source and streams it back to the gatherer
• Obtains metadata and security information on the external content source and sends it back to the gatherer
• Sends LCID info to the gatherer where appropriate

Protocol Handlers Provided by Microsoft

Microsoft Search Service ships with a number of protocol handlers out of the box:

• file://
• http://
• Exchange
• Active Directory
• Lotus Notes databases
• SharePoint sites and portals

IFilter General Features

• Extends the types of files which can be indexed
• Also COM based; the end result is a .DLL
• Extracts internal properties from files as well as body text
• IFilters can be used with any Microsoft search vehicle, not just SharePoint Portal Server:
  • Microsoft® Windows®
  • SQL Server
  • Microsoft® Exchange Server

IFilters Provided by Microsoft

Microsoft Search Service ships with a number of IFilters out of the box:

• All Office System document formats
• TIFF
• XML

Popular 3rd-party IFilters:

• PDF
• CAD (.dwg)

Word Breaker General Features

• Decomposes text into individual text tokens, or words
• Extends the locales of data which can be indexed
• Also COM based; the end result is a .DLL
• Word breakers can also be used with any Microsoft search vehicle, not just SharePoint Portal Server
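The decomposition step can be pictured with a very small Python sketch. Real word breakers are locale-specific COM components; this stand-in, with the hypothetical name `break_words`, only shows the idea of turning a text stream into word tokens with their character offsets, using a simple Unicode-aware regular expression rather than any linguistic rules.

```python
import re

# \w+ matches runs of letters, digits and underscores, including accented letters.
TOKEN = re.compile(r"\w+")


def break_words(text: str) -> list[tuple[str, int]]:
    """Return (lower-cased token, start offset) pairs for the given text."""
    return [(m.group(0).lower(), m.start()) for m in TOKEN.finditer(text)]


if __name__ == "__main__":
    for token, offset in break_words("SharePoint Portal-Server 2003"):
        print(offset, token)
```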

Word Breakers Provided by Microsoft

• Many word breakers ship out of the box in the Microsoft Search Service
• Interface recently published in the Microsoft Platform SDK
  • http://msdn.microsoft.com/library/default.asp?url=/library/en-us/indexsrv/html/ixrefint_9sfm.asp

Steps to Building a Protocol Handler

• Install the sample and get it running
  • http://msdn.microsoft.com/library/default.asp?url=/library/en-us/spssdk/html/_creating_a_protocol_handler_sample.asp
• Connect to your content source
• Iterate through contents
• Extract data and pass it to the Gatherer
• Write metadata mapping code
• Write security mapping code
• Test

IFilters

• Typically for parsing file formats
• Implement them within a protocol handler to expose:
  • Contents of directories (pretend the directory is a document and the files' names form a multi-valued property; see the sample PH in the SDK for an example)
  • Properties taken from the document store, not from inside the document itself
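The "directory as a document" idea can be sketched as follows. This is not the SDK sample; it is a small Python illustration in which the dictionary layout, the property name `FileNames` and the function name `directory_as_document` are all hypothetical. The point is only that a folder is returned as one item whose multi-valued property lists the files it contains.

```python
import os


def directory_as_document(path: str) -> dict:
    """Describe a directory as a single item with a multi-valued property."""
    names = sorted(entry.name for entry in os.scandir(path) if entry.is_file())
    return {
        "url": path,
        "is_folder": True,
        # Multi-valued property: one value per file in the directory.
        "properties": {"FileNames": names},
    }


if __name__ == "__main__":
    print(directory_as_document("."))
```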

The Gatherer Pulls Data

• You are reactive to the gatherer: you must wait for requests
• The gatherer pulls; you can't push
• You must buffer/cache data until asked for it
• You must keep more state than you might like
• This isn't rocket science, but it can be a pitfall.
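The pull model can be sketched in Python as a small buffering wrapper. This is a conceptual illustration, not the protocol handler interface: the class name `BufferedSource`, the method `next_item` and the `fetch_batch` callback are hypothetical. The content source delivers data in batches, while the caller (standing in for the gatherer) asks for one item at a time, so the handler has to hold the rest until it is requested.

```python
from collections import deque
from typing import Callable, Optional


class BufferedSource:
    """Holds fetched items until the caller pulls them one by one."""

    def __init__(self, fetch_batch: Callable[[], list]):
        # fetch_batch returns the next batch, or an empty list when done.
        self._fetch_batch = fetch_batch
        self._buffer: deque = deque()
        self._exhausted = False

    def next_item(self) -> Optional[object]:
        """Return the next item, or None once the source is exhausted."""
        if not self._buffer and not self._exhausted:
            batch = self._fetch_batch()
            if batch:
                self._buffer.extend(batch)
            else:
                self._exhausted = True
        return self._buffer.popleft() if self._buffer else None


if __name__ == "__main__":
    batches = iter([["doc1", "doc2"], ["doc3"], []])
    source = BufferedSource(lambda: next(batches))
    while (item := source.next_item()) is not None:
        print(item)
```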

The Gatherer is Multithreaded

• Address data locking early in the design process
• Think about COM apartments and threading models
• The gatherer promises some thread affinity; see the SDK for details
• Are your libraries thread safe?
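As a language-neutral illustration of the locking concern (real handlers are COM components with their own threading rules), this Python sketch guards shared crawl state with a lock so that several worker threads cannot claim the same item twice. The class name `CrawlState` and the method `mark_seen` are hypothetical.

```python
import threading


class CrawlState:
    """Shared state that several crawl threads may touch at once."""

    def __init__(self):
        self._lock = threading.Lock()
        self._seen = set()

    def mark_seen(self, url: str) -> bool:
        """Return True only for the first caller that claims this URL."""
        with self._lock:
            if url in self._seen:
                return False
            self._seen.add(url)
            return True


if __name__ == "__main__":
    state = CrawlState()
    print(state.mark_seen("doc1"), state.mark_seen("doc1"))
```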

Returning Data

• Return File/Folder indicators
• Document Body
• Relevant Metadata, including:
  • Standard Metadata
  • Custom Metadata
  • LCID (if applicable)
• Location information (URL)
• Security Information

Clickable URLs in Search Results

• Search is no good if you can't get to the document
• The default result URL is the search URL, but IE probably doesn't understand your protocol (e.g. dctm://)
• The DAV:href property controls the URL that search renders; you can overwrite it with a URL that the browser will like
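The rewrite itself is usually a simple mapping from the crawl URL to something a browser can open. The Python sketch below shows one possible shape, assuming a hypothetical Web viewer at `VIEWER_BASE` that accepts the document path; neither the viewer address nor the function name `browser_href` comes from the slides. The returned value is what a handler would supply for DAV:href.

```python
from urllib.parse import quote, urlsplit

# Hypothetical Web front end for the external repository.
VIEWER_BASE = "http://docserver.example.com/view"


def browser_href(source_url: str) -> str:
    """Map a custom-protocol crawl URL to a browser-friendly URL."""
    parts = urlsplit(source_url)
    if parts.scheme in ("http", "https"):
        return source_url  # already clickable
    doc_path = (parts.netloc + parts.path).strip("/")
    return f"{VIEWER_BASE}/{quote(doc_path)}"


if __name__ == "__main__":
    print(browser_href("dctm://docbase/cabinet/report.doc"))
```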

Security Mapping

• Search results contain file names, locations, and excerpts of the contents
• The default allows every user to see every file
• Search can "trim" results, but you must provide an NT ACL for each file at crawl time; there is no callback
• Mapping users between domains or even operating systems can be tricky
• Mapping mechanism
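Trimming can be pictured as a filter over results using access information captured during the crawl. The Python sketch below is a deliberately simplified model: the ACL is reduced to a list of allowed group names rather than a real NT security descriptor, and the result records, group names and function name `trim_results` are hypothetical.

```python
def trim_results(results: list[dict], user_groups: list[str]) -> list[dict]:
    """Keep only results whose crawl-time ACL intersects the user's groups."""
    allowed = set(user_groups)
    return [r for r in results if allowed & set(r["acl"])]


if __name__ == "__main__":
    # ACLs were stored when the items were crawled; nothing is looked up now.
    crawled = [
        {"title": "Budget.xls", "acl": ["Finance"]},
        {"title": "Handbook.doc", "acl": ["Everyone"]},
    ]
    for hit in trim_results(crawled, ["Everyone", "Sales"]):
        print(hit["title"])
```

Because the check uses only stored data, a permission change in the source system is not reflected until the next crawl, which is the consequence of the "no callback" point above.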

Property Mapping

• Done through the administrative UI
• Maps source document properties (e.g., Author, Subject, Description, etc.) to specific target properties (e.g., Auteur, Onderwerp, Omschrijving)
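Although the mapping is configured in the administrative UI rather than in code, its effect can be sketched as a simple lookup. The Python example below uses the property names from the slide; the function name `map_properties` and the choice to pass unmapped properties through unchanged are assumptions.

```python
# Source property -> target property, using the examples from the slide.
PROPERTY_MAP = {
    "Author": "Auteur",
    "Subject": "Onderwerp",
    "Description": "Omschrijving",
}


def map_properties(source_props: dict) -> dict:
    """Rename mapped properties; leave unmapped ones unchanged."""
    return {PROPERTY_MAP.get(name, name): value
            for name, value in source_props.items()}


if __name__ == "__main__":
    print(map_properties({"Author": "Inge De Neef", "Title": "Search"}))
```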