![Page 1: dl.ebooksworld.irdl.ebooksworld.ir/motoman/Packt.Elasticsearch.Server.3rd.Edition.w… · Table of Contents Elasticsearch Server Third Edition Credits About the Authors About the](https://reader035.vdocuments.mx/reader035/viewer/2022070908/5f87a74e61533a78dd166ada/html5/thumbnails/1.jpg)
www.EBooksWorld.ir
Elasticsearch Server Third Edition
Table of Contents
Elasticsearch Server Third Edition
Credits
About the Authors
About the Reviewer
www.PacktPub.com
eBooks, discount offers, and more
Why subscribe?
Preface
What this book covers
What you need for this book
Who this book is for
Conventions
Reader feedback
Customer support
Downloading the example code
Downloading the color images of this book
Errata
Piracy
Questions
1. Getting Started with Elasticsearch Cluster
Full text searching
The Lucene glossary and architecture
Input data analysis
Indexing and querying
Scoring and query relevance
The basics of Elasticsearch
Key concepts of Elasticsearch
Index
Document
Document type
Mapping
Key concepts of the Elasticsearch infrastructure
Nodes and clusters
Shards
Replicas
Gateway
Indexing and searching
Installing and configuring your cluster
Installing Java
Installing Elasticsearch
Running Elasticsearch
Shutting down Elasticsearch
The directory layout
Configuring Elasticsearch
The system-specific installation and configuration
Installing Elasticsearch on Linux
Installing Elasticsearch using RPM packages
Installing Elasticsearch using the DEB package
Elasticsearch configuration file localization
Configuring Elasticsearch as a system service on Linux
Elasticsearch as a system service on Windows
Manipulating data with the REST API
Understanding the REST API
Storing data in Elasticsearch
Creating a new document
Automatic identifier creation
Retrieving documents
Updating documents
Dealing with non-existing documents
Adding partial documents
Deleting documents
Versioning
Usage example
Versioning from external systems
Searching with the URI request query
Sample data
URI search
Elasticsearch query response
Query analysis
URI query string parameters
The query
The default search field
Analyzer
The default operator property
Query explanation
The fields returned
Sorting the results
The search timeout
The results window
Limiting per-shard results
Ignoring unavailable indices
The search type
Lowercasing term expansion
Wildcard and prefix analysis
Lucene query syntax
Summary
2. Indexing Your Data
Elasticsearch indexing
Shards and replicas
Write consistency
Creating indices
Altering automatic index creation
Settings for a newly created index
Index deletion
Mappings configuration
Type determining mechanism
Disabling the type determining mechanism
Tuning the type determining mechanism for numeric types
Tuning the type determining mechanism for dates
Index structure mapping
Type and types definition
Fields
Core types
Common attributes
String
Number
Boolean
Binary
Date
Multi fields
The IP address type
Token count type
Using analyzers
Out-of-the-box analyzers
Defining your own analyzers
Default analyzers
Different similarity models
Setting per-field similarity
Available similarity models
Configuring default similarity
Configuring BM25 similarity
Configuring DFR similarity
Configuring IB similarity
Batch indexing to speed up your indexing process
Preparing data for bulk indexing
Indexing the data
The _all field
The _source field
Additional internal fields
Introduction to segment merging
Segment merging
The need for segment merging
The merge policy
The merge scheduler
Throttling
Introduction to routing
Default indexing
Default searching
Routing
The routing parameters
Routing fields
Summary
3. Searching Your Data
Querying Elasticsearch
The example data
A simple query
Paging and result size
Returning the version value
Limiting the score
Choosing the fields that we want to return
Source filtering
Using the script fields
Passing parameters to the script fields
Understanding the querying process
Query logic
Search type
Search execution preference
Search shards API
Basic queries
The term query
The terms query
The match all query
The type query
The exists query
The missing query
The common terms query
The match query
The Boolean match query
The phrase match query
The match phrase prefix query
The multi match query
The query string query
Running the query string query against multiple fields
The simple query string query
The identifiers query
The prefix query
The fuzzy query
The wildcard query
The range query
Regular expression query
The more like this query
Compound queries
The bool query
The dis_max query
The boosting query
The constant_score query
The indices query
Using span queries
A span
Span term query
Span first query
Span near query
Span or query
Span not query
Span within query
Span containing query
Span multi query
Performance considerations
Choosing the right query
The use cases
Limiting results to given tags
Searching for values in a range
Boosting some of the matched documents
Ignoring lower scoring partial queries
Using Lucene query syntax in queries
Handling user queries without errors
Autocomplete using prefixes
Finding terms similar to a given one
Matching phrases
Spans, spans everywhere
Summary
4. Extending Your Querying Knowledge
Filtering your results
The context is the key
Explicit filtering with bool query
Highlighting
Getting started with highlighting
Field configuration
Under the hood
Forcing highlighter type
Configuring HTML tags
Controlling highlighted fragments
Global and local settings
Require matching
Custom highlighting query
The Postings highlighter
Validating your queries
Using the Validate API
Sorting data
Default sorting
Selecting fields used for sorting
Sorting mode
Specifying behavior for missing fields
Dynamic criteria
Calculate scoring when sorting
Query rewrite
Prefix query as an example
Getting back to Apache Lucene
Query rewrite properties
Summary
5. Extending Your Index Structure
Indexing tree-like structures
Data structure
Analysis
Indexing data that is not flat
Data
Objects
Arrays
Mappings
Final mappings
Sending the mappings to Elasticsearch
To be or not to be dynamic
Disabling object indexing
Using nested objects
Scoring and nested queries
Using the parent-child relationship
Index structure and data indexing
Child mappings
Parent mappings
The parent document
Child documents
Querying
Querying data in the child documents
Querying data in the parent documents
Performance considerations
Modifying your index structure with the update API
The mappings
Adding a new field to the existing index
Modifying fields of an existing index
Summary
6. Make Your Search Better
Introduction to Apache Lucene scoring
When a document is matched
Default scoring formula
Relevancy matters
Scripting capabilities of Elasticsearch
Objects available during script execution
Script types
In file scripts
Inline scripts
Indexed scripts
Querying with scripts
Scripting with parameters
Script languages
Using other than embedded languages
Using native code
The factory implementation
Implementing the native script
The plugin definition
Installing the plugin
Running the script
Searching content in different languages
Handling languages differently
Handling multiple languages
Detecting the language of the document
Sample document
The mappings
Querying
Queries with an identified language
Queries with an unknown language
Combining queries
Influencing scores with query boosts
The boost
Adding the boost to queries
Modifying the score
Constant score query
Boosting query
The function score query
Structure of the function query
The weight factor function
Field value factor function
The script score function
The random score function
Decay functions
When does index-time boosting make sense?
Defining boosting in the mappings
Words with the same meaning
Synonym filter
Synonyms in the mappings
Synonyms stored on the file system
Defining synonym rules
Using Apache Solr synonyms
Explicit synonyms
Equivalent synonyms
Expanding synonyms
Using WordNet synonyms
Query or index-time synonym expansion
Understanding the explain information
Understanding field analysis
Explaining the query
Summary
7. Aggregations for Data Analysis
Aggregations
General query structure
Inside the aggregations engine
Aggregation types
Metrics aggregations
Minimum, maximum, average, and sum
Missing values
Using scripts
Field value statistics and extended statistics
Value count
Field cardinality
Percentiles
Percentile ranks
Top hits aggregation
Additional parameters
Geo bounds aggregation
Scripted metrics aggregation
Buckets aggregations
Filter aggregation
Filters aggregation
Terms aggregation
Counts are approximate
Minimum document count
Range aggregation
Keyed buckets
Date range aggregation
IPv4 range aggregation
Missing aggregation
Histogram aggregation
Date histogram aggregation
Time zones
Geo distance aggregations
Geohash grid aggregation
Global aggregation
Significant terms aggregation
Choosing significant terms
Multiple value analysis
Sampler aggregation
Children aggregation
Nested aggregation
Reverse nested aggregation
Nesting aggregations and ordering buckets
Buckets ordering
Pipeline aggregations
Available types
Referencing other aggregations
Gaps in the data
Pipeline aggregation types
Min, max, sum, and average bucket aggregations
Cumulative sum aggregation
Bucket selector aggregation
Bucket script aggregation
Serial differencing aggregation
Derivative aggregation
Moving avg aggregation
Predicting future buckets
The models
Summary
8. Beyond Full-text Searching
Percolator
The index
Percolator preparation
Getting deeper
Controlling the size of returned results
Percolator and score calculation
Combining percolators with other functionalities
Getting the number of matching queries
Indexed document percolation
Elasticsearch spatial capabilities
Mapping preparation for spatial searches
Example data
Additional geo_field properties
Sample queries
Distance-based sorting
Bounding box filtering
Limiting the distance
Arbitrary geo shapes
Point
Envelope
Polygon
Multipolygon
An example usage
Storing shapes in the index
Using suggesters
Available suggester types
Including suggestions
Suggester response
Term suggester
Term suggester configuration options
Additional term suggester options
Phrase suggester
Configuration
Completion suggester
Indexing data
Querying indexed completion suggester data
Custom weights
Context suggester
Context types
Using context
Using the geo location context
The Scroll API
Problem definition
Scrolling to the rescue
Summary
9. Elasticsearch Cluster in Detail
Understanding node discovery
Discovery types
Node roles
Master node
Data node
Client node
Configuring node roles
Setting the cluster’s name
Zen discovery
Master election configuration
Configuring unicast
Fault detection ping settings
Cluster state updates control
Dealing with master unavailability
Adjusting HTTP transport settings
Disabling HTTP
HTTP port
HTTP host
The gateway and recovery modules
The gateway
Recovery control
Additional gateway recovery options
Indices recovery API
Delayed allocation
Index recovery prioritization
Templates and dynamic templates
Templates
An example of a template
Dynamic templates
The matching pattern
Field definitions
Elasticsearch plugins
The basics
Installing plugins
Removing plugins
Elasticsearch caches
Fielddata cache
Fielddata size
Circuit breakers
Fielddata and doc values
Shard request cache
Enabling and configuring the shard request cache
Per request shard request cache disabling
Shard request cache usage monitoring
Node query cache
Indexing buffers
When caches should be avoided
The update settings API
The cluster settings API
The indices settings API
Summary
10. Administrating Your Cluster
Elasticsearch time machine
Creating a snapshot repository
Creating snapshots
Additional parameters
Restoring a snapshot
Cleaning up – deleting old snapshots
Monitoring your cluster’s state and health
Cluster health API
Controlling information details
Additional parameters
Indices stats API
Docs
Store
Indexing, get, and search
Additional information
Nodes info API
Returned information
Nodes stats API
Cluster state API
Cluster stats API
Pending tasks API
Indices recovery API
Indices shard stores API
Indices segments API
Controlling the shard and replica allocation
Explicitly controlling allocation
Specifying node parameters
Configuration
Index creation
Excluding nodes from allocation
Requiring node attributes
Using the IP address for shard allocation
Disk-based shard allocation
Configuring disk based shard allocation
Disabling disk based shard allocation
The number of shards and replicas per node
Allocation throttling
Cluster-wide allocation
Allocation awareness
Forcing allocation awareness
Filtering
What do include, exclude, and require mean
Manually moving shards and replicas
Moving shards
Canceling shard allocation
Forcing shard allocation
Multiple commands per HTTP request
Allowing operations on primary shards
Handling rolling restarts
Controlling cluster rebalancing
Understanding rebalance
Cluster being ready
The cluster rebalance settings
Controlling when rebalancing will be allowed
Controlling the number of shards being moved between nodes concurrently
Controlling which shards may be rebalanced
The Cat API
The basics
Using Cat API
Common arguments
The examples
Getting information about the master node
Getting information about the nodes
Retrieving recovery information for an index
Warming up
Defining a new warming query
Retrieving the defined warming queries
Deleting a warming query
Disabling the warming up functionality
Choosing queries for warming
Index aliasing and using it to simplify your everyday work
An alias
Creating an alias
Modifying aliases
Combining commands
Retrieving aliases
Removing aliases
Filtering aliases
Aliases and routing
Zero downtime reindexing and aliases
Summary
11. Scaling by Example
Hardware
Physical servers or a cloud
CPU
RAM memory
Mass storage
The network
How many servers
Cost cutting
Preparing a single Elasticsearch node
The general preparations
Avoiding swapping
File descriptors
Virtual memory
The memory
Fielddata cache and breaking the circuit
Use doc values
RAM buffer for indexing
Index refresh rate
Thread pools
Horizontal expansion
Automatically creating the replicas
Redundancy and high availability
Cost and performance flexibility
Continuous upgrades
Multiple Elasticsearch instances on a single physical machine
Preventing a shard and its replicas from being on the same node
Designated node roles for larger clusters
Query aggregator nodes
Data nodes
Master eligible nodes
Preparing the cluster for high indexing and querying throughput
Indexing related advice
Index refresh rate
Thread pools tuning
Automatic store throttling
Handling time-based data
Multiple data paths
Data distribution
Bulk indexing
RAM buffer for indexing
Advice for high query rate scenarios
Shard request cache
Think about the queries
Parallelize your queries
Fielddata cache and breaking the circuit
Keep size and shard size under control
Monitoring
Elasticsearch HQ
Marvel
SPM for Elasticsearch
Summary
Index
Elasticsearch Server Third Edition
Copyright © 2016 Packt Publishing
All rights reserved. No part of this book may be reproduced, stored in a retrieval system, or transmitted in any form or by any means, without the prior written permission of the publisher, except in the case of brief quotations embedded in critical articles or reviews.
Every effort has been made in the preparation of this book to ensure the accuracy of the information presented. However, the information contained in this book is sold without warranty, either express or implied. Neither the authors, nor Packt Publishing, and its dealers and distributors will be held liable for any damages caused or alleged to be caused directly or indirectly by this book.
Packt Publishing has endeavored to provide trademark information about all of the companies and products mentioned in this book by the appropriate use of capitals. However, Packt Publishing cannot guarantee the accuracy of this information.
First published: October 2013
Second edition: February 2015
Third edition: February 2016
Production reference: 1230216
Published by Packt Publishing Ltd.
Livery Place
35 Livery Street
Birmingham B3 2PB, UK.
ISBN 978-1-78588-881-6
www.packtpub.com
Credits
Authors
Rafał Kuć
Marek Rogoziński
Reviewer
Paige Cook
Commissioning Editor
Nadeem Bagban
Acquisition Editor
Divya Poojari
Content Development Editor
Kirti Patil
Technical Editor
Utkarsha S. Kadam
Copy Editor
Alpha Singh
Project Coordinator
Nidhi Joshi
Proofreader
Safis Editing
Indexer
Rekha Nair
Graphics
Jason Monteiro
Production Coordinator
Manu Joseph
Cover Work
Manu Joseph
About the Authors
Rafał Kuć is a software engineer, trainer, speaker and consultant. He is working as a consultant and software engineer at Sematext Group Inc. where he concentrates on open source technologies such as Apache Lucene, Solr, and Elasticsearch. He has more than 14 years of experience in various software domains, from banking software to e-commerce products. He is mainly focused on Java; however, he is open to every tool and programming language that might help him to achieve his goals easily and quickly. Rafał is also one of the founders of the solr.pl site, where he tries to share his knowledge and help people solve their Solr and Lucene problems. He is also a speaker at various conferences around the world such as Lucene Eurocon, Berlin Buzzwords, ApacheCon, Lucene/Solr Revolution, Velocity, and DevOps Days.
Rafał began his journey with Lucene in 2002; however, it wasn’t love at first sight. When he came back to Lucene in late 2003, he revised his thoughts about the framework and saw the potential in search technologies. Then Solr came and that was it. He started working with Elasticsearch in the middle of 2010. At present, Lucene, Solr, Elasticsearch, and information retrieval are his main areas of interest.
Rafał is also the author of the Solr Cookbook series, ElasticSearch Server and its second edition, and the first and second editions of Mastering ElasticSearch, all published by Packt Publishing.
Marek Rogoziński is a software architect and consultant with more than 10 years of experience. He specializes in solutions based on open source search engines, such as Solr and Elasticsearch, and in the software stack for big data analytics, including Hadoop, HBase, and Twitter Storm.
He is also a cofounder of the solr.pl site, which publishes information and tutorials about Solr and Lucene libraries. He is the coauthor of ElasticSearch Server and its second edition, and the first and second editions of Mastering ElasticSearch, all published by Packt Publishing.
He is currently the chief technology officer and lead architect at ZenCard, a company that processes and analyzes large quantities of payment transactions in real time, allowing automatic and anonymous identification of retail customers on all retailer channels (m-commerce/e-commerce/brick&mortar) and giving retailers a customer retention and loyalty tool.
About the Reviewer
Paige Cook works as a software architect for Videa, part of the Cox Family of Companies, and lives near Atlanta, Georgia. He has twenty years of experience in software development, primarily with the Microsoft .NET Framework. His career has been largely focused on building enterprise solutions for the media and entertainment industry. He is especially interested in search technologies using the Apache Lucene search engine and has experience with both Elasticsearch and Apache Solr. Apart from his work, he enjoys DIY home projects and spending time with his wife and two daughters.
www.PacktPub.com
eBooks, discount offers, and more
Did you know that Packt offers eBook versions of every book published, with PDF and ePub files available? You can upgrade to the eBook version at www.PacktPub.com and as a print book customer, you are entitled to a discount on the eBook copy. Get in touch with us at <[email protected]> for more details.
At www.PacktPub.com, you can also read a collection of free technical articles, sign up for a range of free newsletters and receive exclusive discounts and offers on Packt books and eBooks.
https://www2.packtpub.com/books/subscription/packtlib
Do you need instant solutions to your IT questions? PacktLib is Packt’s online digital book library. Here, you can search, access, and read Packt’s entire library of books.
Why subscribe?
Fully searchable across every book published by Packt
Copy and paste, print, and bookmark content
On demand and accessible via a web browser
Preface
Welcome to Elasticsearch Server, Third Edition. This is the third instalment of the book, dedicated to yet another major release of Elasticsearch, this time version 2.2. In the third edition, we have decided to follow a route similar to the one we took when we wrote the second edition of the book. We not only updated the content to match the new version of Elasticsearch, but also restructured the book by removing some sections and chapters and adding new ones. We read the suggestions we got from you, the readers of the book, and we carefully tried to incorporate the suggestions and comments received since the release of the first and second editions.
While reading this book, you will be taken on a journey to the wonderful world of full-text search provided by the Elasticsearch server. We will start with a general introduction to Elasticsearch, which covers how to start and run Elasticsearch, its basic concepts, and how to index and search your data in the most basic way. This book will also discuss the query language, the so-called Query DSL, which allows you to create complicated queries and filter returned results. In addition to all of this, you’ll see how you can use the aggregation framework to calculate aggregated data based on the results returned by your queries. We will implement the autocomplete functionality together and learn how to use Elasticsearch spatial capabilities and prospective search.
Finally, this book will show you Elasticsearch’s administration API capabilities with features such as shard placement control, cluster handling, and more, ending with a dedicated chapter that will discuss Elasticsearch’s preparation for small and large deployments, both those that concentrate on indexing and those that concentrate on querying.
What this book covers
Chapter 1, Getting Started with Elasticsearch Cluster, covers what full-text searching is, what Apache Lucene is, what text analysis is, how to run and configure Elasticsearch, and finally, how to index and search your data in the most basic way.
Chapter 2, Indexing Your Data, shows how indexing works, how to prepare index structure, what data types we are allowed to use, how to speed up indexing, what segments are, how merging works, and what routing is.
Chapter 3, Searching Your Data, introduces the full-text search capabilities of Elasticsearch by discussing how to query it, how the querying process works, and what types of basic and compound queries are available. In addition to this, we will show how to use position-aware queries in Elasticsearch.
Chapter 4, Extending Your Query Knowledge, shows how to efficiently narrow down your search results by using filters, how highlighting works, how to sort your results, and how query rewrite works.
Chapter 5, Extending Your Index Structure, shows how to index more complex data structures. We learn how to index tree-like data types, how to index data with relationships between documents, and how to modify index structure.
Chapter 6, Make Your Search Better, covers Apache Lucene scoring and how to influence it in Elasticsearch, the scripting capabilities of Elasticsearch, and its language analysis capabilities.
Chapter 7, Aggregations for Data Analysis, introduces you to the great world of data analysis by showing you how to use the Elasticsearch aggregation framework. We will discuss all types of aggregations: metrics, buckets, and the new pipeline aggregations that have been introduced in Elasticsearch.
Chapter 8, Beyond Full-text Searching, discusses functionalities not related to full-text search, such as the percolator (reversed search) and the geo-spatial capabilities of Elasticsearch. This chapter also discusses suggesters, which allow us to build a spellchecking functionality and an efficient autocomplete mechanism, and we will show how to handle deep paging efficiently.
Chapter 9, Elasticsearch Cluster in Detail, discusses the node discovery mechanism, the recovery and gateway Elasticsearch modules, templates, caches, and the settings update API.
Chapter 10, Administrating Your Cluster, covers the Elasticsearch backup functionality, rebalancing, and moving shards. In addition to this, you will learn how to use the warm up functionality, use the Cat API, and work with aliases.
Chapter 11, Scaling by Example, is dedicated to scaling and tuning. We will start with hardware preparations and considerations and tuning related to a single Elasticsearch node. We will go through cluster setup and vertical scaling, ending the chapter with high querying and indexing use cases and cluster monitoring.
What you need for this book
This book was written using Elasticsearch server 2.2, and all the examples and functions should work with this version. In addition to this, you’ll need a command that allows you to send HTTP requests, such as curl, which is available for most operating systems. Please note that all the examples in this book use the previously mentioned curl tool. If you want to use another tool, please remember to format the request in an appropriate way that is understood by the tool of your choice.
In addition to this, some chapters may require additional software, such as Elasticsearch plugins; where needed, this is explicitly mentioned.
Who this book is for
If you are a beginner to the world of full-text search and Elasticsearch, then this book is especially for you. You will be guided through the basics of Elasticsearch and you will learn how to use some of the advanced functionalities.
If you know Elasticsearch and you have worked with it, then you may find this book interesting as it provides a nice overview of all the functionalities with examples and descriptions. However, you may encounter sections that you already know.
If you know the Apache Solr search engine, this book can also be used to compare some functionalities of Apache Solr and Elasticsearch. This may help you decide which tool is more appropriate for your use case.
If you know all the details about Elasticsearch and you know how each of the configuration parameters works, then this is definitely not the book you are looking for.
Conventions
In this book, you will find a number of text styles that distinguish between different kinds of information. Here are some examples of these styles and an explanation of their meaning.
Code words in text, database table names, folder names, filenames, file extensions, pathnames, dummy URLs, user input, and Twitter handles are shown as follows: “If you use the Linux or OS X command, the cURL package should already be available.”
A block of code is set as follows:
{
  "mappings": {
    "post": {
      "properties": {
        "id": { "type": "long" },
        "name": { "type": "string" },
        "published": { "type": "date" },
        "contents": { "type": "string" }
      }
    }
  }
}
When we wish to draw your attention to a particular part of a code block, the relevant lines or items are set in bold:
{
  "mappings": {
    "post": {
      "properties": {
        "id": { "type": "long" },
        "name": { "type": "string" },
        "published": { "type": "date" },
        "contents": { "type": "string" }
      }
    }
  }
}
Any command-line input or output is written as follows:
curl -XPUT http://localhost:9200/users/?pretty -d '{
  "mappings": {
    "user": {
      "numeric_detection": true
    }
  }
}'
Note
Warnings or important notes appear in a box like this.
Reader feedback
Feedback from our readers is always welcome. Let us know what you think about this book: what you liked or disliked. Reader feedback is important for us as it helps us develop titles that you will really get the most out of.
To send us general feedback, simply e-mail <[email protected]>, and mention the book’s title in the subject of your message.
If there is a topic that you have expertise in and you are interested in either writing or contributing to a book, see our author guide at www.packtpub.com/authors.
Customer support
Now that you are the proud owner of a Packt book, we have a number of things to help you to get the most from your purchase.
Downloading the example code
You can download the example code files for this book from your account at http://www.packtpub.com. If you purchased this book elsewhere, you can visit http://www.packtpub.com/support and register to have the files e-mailed directly to you.
You can download the code files by following these steps:
1. Log in or register to our website using your e-mail address and password.
2. Hover the mouse pointer on the SUPPORT tab at the top.
3. Click on Code Downloads & Errata.
4. Enter the name of the book in the Search box.
5. Select the book for which you’re looking to download the code files.
6. Choose from the drop-down menu where you purchased this book from.
7. Click on Code Download.
Once the file is downloaded, please make sure that you unzip or extract the folder using the latest version of:
WinRAR / 7-Zip for Windows
Zipeg / iZip / UnRarX for Mac
7-Zip / PeaZip for Linux
Downloading the color images of this book
We also provide you with a PDF file that has color images of the screenshots/diagrams used in this book. The color images will help you better understand the changes in the output. You can download this file from https://www.packtpub.com/sites/default/files/downloads/ElasticsearchServerThirdEdition_ColorImages.pdf
Errata
Although we have taken every care to ensure the accuracy of our content, mistakes do happen. If you find a mistake in one of our books, maybe a mistake in the text or the code, we would be grateful if you could report this to us. By doing so, you can save other readers from frustration and help us improve subsequent versions of this book. If you find any errata, please report them by visiting http://www.packtpub.com/submit-errata, selecting your book, clicking on the Errata Submission Form link, and entering the details of your errata. Once your errata are verified, your submission will be accepted and the errata will be uploaded to our website or added to any list of existing errata under the Errata section of that title.
To view the previously submitted errata, go to https://www.packtpub.com/books/content/support and enter the name of the book in the search field. The required information will appear under the Errata section.
Piracy
Piracy of copyrighted material on the Internet is an ongoing problem across all media. At Packt, we take the protection of our copyright and licenses very seriously. If you come across any illegal copies of our works in any form on the Internet, please provide us with the location address or website name immediately so that we can pursue a remedy.
Please contact us at <[email protected]> with a link to the suspected pirated material.
We appreciate your help in protecting our authors and our ability to bring you valuable content.
Questions
If you have a problem with any aspect of this book, you can contact us at <[email protected]>, and we will do our best to address the problem.
Chapter 1. Getting Started with Elasticsearch Cluster
Welcome to the wonderful world of Elasticsearch, a great full text search and analytics engine. It doesn’t matter if you are new to Elasticsearch and full text searches in general, or if you already have some experience in this. We hope that, by reading this book, you’ll be able to learn and extend your knowledge of Elasticsearch. As this book is also dedicated to beginners, we decided to start with a short introduction to full text searches in general, and after that, a brief overview of Elasticsearch.
Please remember that Elasticsearch is a rapidly changing piece of software. Not only are features added, but the Elasticsearch core functionality is also constantly evolving and changing. We try to keep up with these changes, and because of this we are giving you the third edition of the book dedicated to Elasticsearch 2.x.
The first thing we need to do with Elasticsearch is install and configure it. With many applications, you start with the installation and configuration and usually forget the importance of these steps. We will try to guide you through these steps so that they become easier to remember. In addition to this, we will show you the simplest way to index and retrieve data without going into too much detail. The first chapter will take you on a quick ride through Elasticsearch and the full text search world. By the end of this chapter, you will have learned the following topics:
Full text searching
The basics of Apache Lucene
Performing text analysis
The basic concepts of Elasticsearch
Installing and configuring Elasticsearch
Using the Elasticsearch REST API to manipulate data
Searching using basic URI requests
Full text searching
Back in the days when full text searching was a term known to a small percentage of engineers, most of us used SQL databases to perform search operations. Using SQL databases to search for the data stored in them was okay to some extent. Such a search wasn’t fast, especially on large amounts of data. Even now, small applications are usually good with a standard LIKE %phrase% search in a SQL database. However, as we go deeper and deeper, we start to see the limits of such an approach: a lack of scalability, not enough flexibility, and a lack of language analysis. Of course, there are additional modules that extend SQL databases with full text search capabilities, but they are still limited compared to dedicated full text search libraries and search engines such as Elasticsearch. Some of those reasons led to the creation of Apache Lucene (http://lucene.apache.org/), a library written completely in Java (http://java.com/en/), which is very fast, light, and provides language analysis for a large number of languages spoken throughout the world.
The Lucene glossary and architecture
Before going into the details of the analysis process, we would like to introduce you to the glossary and overall architecture of Apache Lucene. We decided that this information is crucial for understanding how Elasticsearch works, and even though the book is not about Apache Lucene, knowing the foundation of the Elasticsearch analytics and indexing engine is vital to fully understand how this great search engine works.
The basic concepts of the mentioned library are as follows:
Document: This is the main data carrier used during indexing and searching, comprising one or more fields that contain the data we put in and get from Lucene.
Field: This is a section of the document, which is built of two parts: the name and the value.
Term: This is a unit of search representing a word from the text.
Token: This is an occurrence of a term in the text of the field. It consists of the term text, start and end offsets, and a type.
Apache Lucene writes all the information to a structure called the inverted index. It is a data structure that maps the terms in the index to the documents and not the other way around as a relational database does in its tables. You can think of an inverted index as a data structure where data is term-oriented rather than document-oriented. Let’s see how a simple inverted index will look. For example, let’s assume that we have documents with only a single field called title to be indexed, and the values of that field are as follows:
Elasticsearch Server (document 1)
Mastering Elasticsearch Second Edition (document 2)
Apache Solr Cookbook Third Edition (document 3)
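A very simplified visualization of the Lucene inverted index could look as follows:
Term           Count  Documents
apache         1      3
cookbook       1      3
edition        2      2, 3
elasticsearch  2      1, 2
mastering      1      2
second         1      2
server         1      1
solr           1      3
third          1      3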
Each term points to the documents it is present in, along with their count. For example, the term edition is present twice, in the second and third documents. Such a structure allows for very efficient and fast search operations in term-based queries (but not exclusively). Because the occurrences of the term are connected to the terms themselves, Lucene can use information about the term occurrences to perform fast and precise scoring, giving each returned document a value that represents how well it matched the query.
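To make the term-oriented layout concrete, here is a minimal Python sketch that builds such a structure for the three example titles. It is only an illustration: splitting on whitespace and lowercasing are assumptions standing in for real analysis, and the function name build_inverted_index is hypothetical, not part of Lucene.

```python
from collections import defaultdict

# The three example titles from the text, keyed by document number.
titles = {
    1: "Elasticsearch Server",
    2: "Mastering Elasticsearch Second Edition",
    3: "Apache Solr Cookbook Third Edition",
}


def build_inverted_index(docs):
    """Map each lowercased term to the sorted list of documents containing it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        # Assumption: whitespace splitting plus lowercasing stands in for analysis.
        for term in text.lower().split():
            index[term].add(doc_id)
    return {term: sorted(ids) for term, ids in sorted(index.items())}


inverted_index = build_inverted_index(titles)

# Print one row per term: the term, its document count, and the documents.
for term, doc_ids in inverted_index.items():
    print(f"{term:<15} {len(doc_ids)}  {doc_ids}")
```

Looking up a term is then a single dictionary access, for example inverted_index["edition"], instead of a scan over every document.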
Of course, the actual index created by Lucene is much more complicated and advanced because of additional files that include information such as term vectors (per document inverted index), doc values (column oriented field information), stored fields (the original and not the analyzed value of the field), and so on. However, all you need to know for now is how the data is organized and not what exactly is stored.
Each index is divided into multiple write-once and read-many-time structures called segments. Each segment is a miniature Apache Lucene index on its own. When indexing, after a single segment is written to the disk it can’t be updated, or we should rather say it can’t be fully updated; documents can’t be removed from it, they can only be marked as deleted in a separate file. The reason that Lucene doesn’t allow segments to be updated is the nature of the inverted index. After the fields are analyzed and put into the inverted index, there is no easy way of building the original document structure. When deleting, Lucene would have to delete the information from the segment, which translates to updating all the information within the inverted index itself.
Because segments are write-once structures, Lucene is able to merge segments together in a process called segment merging. During indexing, if Lucene thinks that there are too many segments falling into the same criterion, a new and bigger segment will be created, one that will have data from the other segments. During that process, Lucene will try to remove deleted data and get back the space needed to hold information about those documents. Segment merging is a demanding operation both in terms of the I/O and CPU. What we have to remember for now is that searching with one large segment is faster than searching with multiple smaller ones holding the same data. That’s because, in general, searching translates to just matching the query terms to the ones that are indexed. You can imagine how searching through multiple small segments and merging those results will be slower than having a single segment preparing the results.
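The write-once behavior and the effect of a merge can be sketched in a few lines of Python. This is a toy model under stated assumptions, not how Lucene stores data on disk: the Segment class, its delete and search methods, and the merge function are all hypothetical names chosen for the illustration.

```python
class Segment:
    """A write-once segment: its postings never change after creation."""

    def __init__(self, docs):
        # docs: {doc_id: text}; frozen at construction time.
        self.docs = dict(docs)
        self.postings = {}
        for doc_id, text in self.docs.items():
            for term in text.lower().split():
                self.postings.setdefault(term, set()).add(doc_id)
        # Deletions are only recorded on the side, never applied to postings.
        self.deleted = set()

    def delete(self, doc_id):
        if doc_id in self.docs:
            self.deleted.add(doc_id)

    def search(self, term):
        # Deleted documents are filtered out at search time.
        return sorted(self.postings.get(term, set()) - self.deleted)


def merge(segments):
    """Build one bigger segment from live documents only, dropping deleted ones."""
    live = {}
    for segment in segments:
        for doc_id, text in segment.docs.items():
            if doc_id not in segment.deleted:
                live[doc_id] = text
    return Segment(live)


seg_a = Segment({1: "Elasticsearch Server", 2: "Mastering Elasticsearch"})
seg_b = Segment({3: "Apache Solr Cookbook"})
seg_a.delete(2)  # only marked as deleted; the postings still mention document 2

merged = merge([seg_a, seg_b])  # the space used by document 2 is reclaimed here
```

In this sketch, the deleted document keeps occupying space in seg_a until the merge rewrites the data into a new segment, which mirrors the reason given above for why merging reclaims space.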
Input data analysis
The transformation of a document that comes to Lucene and is processed and put into the inverted index format is called indexation. One of the things Lucene has to do during this is data analysis. You may want some of your fields to be processed by a language analyzer so that words such as car and cars are treated as the same by your index. On the other hand, you may want other fields to be divided only on the whitespace character or be only lowercased.
Analysis is done by the analyzer, which is built of a tokenizer and zero or more token filters, and it can also have zero or more character mappers.
A tokenizer in Lucene is used to split the text into tokens, which are basically the terms with additional information such as their position in the original text and their length. The result of the tokenizer’s work is called a token stream, where the tokens are put one by one and are ready to be processed by the filters.
Apart from the tokenizer, the Lucene analyzer is built of zero or more token filters that are used to process tokens in the token stream. Some examples of filters are as follows:
Lowercase filter: Makes all the tokens lowercased
Synonyms filter: Changes one token to another on the basis of synonym rules
Language stemming filters: Responsible for reducing tokens (actually, the text part that they provide) into their root or base forms called the stem (https://en.wikipedia.org/wiki/Word_stem)
Filters are processed one after another, so we have almost unlimited analytical possibilities with the addition of multiple filters, one after another.
Finally, the character mappers operate on non-analyzed text; they are used before the tokenizer. Therefore, we can easily remove HTML tags from whole parts of text without worrying about tokenization.
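The order described above (character mappers, then the tokenizer, then token filters applied one after another) can be sketched as a small Python pipeline. Every function here is a hypothetical stand-in written for this illustration; in particular, the plural stemmer is deliberately naive and is not a real stemming algorithm.

```python
import re


def strip_html(text):
    """Character mapper: runs on the raw text, before tokenization."""
    return re.sub(r"<[^>]+>", " ", text)


def whitespace_tokenizer(text):
    """Tokenizer: yields (term, position, start_offset, end_offset) tuples."""
    for position, match in enumerate(re.finditer(r"\S+", text)):
        yield (match.group(), position, match.start(), match.end())


def lowercase_filter(tokens):
    """Token filter: lowercases the term text and keeps the other attributes."""
    for term, position, start, end in tokens:
        yield (term.lower(), position, start, end)


def naive_plural_stemmer(tokens):
    """Toy token filter: strips a trailing 's'. Real stemmers are far more careful."""
    for term, position, start, end in tokens:
        if len(term) > 3 and term.endswith("s"):
            term = term[:-1]
        yield (term, position, start, end)


def analyze(text, char_mappers, tokenizer, token_filters):
    """Apply character mappers, then the tokenizer, then each filter in order."""
    for mapper in char_mappers:
        text = mapper(text)
    stream = tokenizer(text)
    for token_filter in token_filters:
        stream = token_filter(stream)
    return list(stream)


tokens = analyze(
    "<b>Red Cars</b> and a car",
    char_mappers=[strip_html],
    tokenizer=whitespace_tokenizer,
    token_filters=[lowercase_filter, naive_plural_stemmer],
)
terms = [term for term, _, _, _ in tokens]
```

With this chain, cars and car end up as the same term, which is the effect the text describes for language analysis. Note that in this sketch the offsets refer to the text after the character mapper has run.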
Indexing and querying
You may wonder how all the information we’ve described so far affects indexing and querying when using Lucene and all the software that is built on top of it. During indexing, Lucene will use an analyzer of your choice to process the contents of your document; of course, different analyzers can be used for different fields, so the name field of your document can be analyzed differently compared to the summary field. For example, the name field may only be tokenized on whitespaces and lowercased, so that exact matches are done, while the summary field is stemmed in addition to that. We can also decide to not analyze the fields at all; we have full control over the analysis process.
During a query, your query text can be analyzed as well. However, you can also choose not to analyze your queries. This is crucial to remember because some Elasticsearch queries are analyzed and some are not. For example, prefix and term queries are not analyzed, and match queries are analyzed (we will get to that in Chapter 3, Searching Your Data). Having queries that are analyzed and not analyzed is very useful; sometimes, you may want to query a field that is not analyzed, while sometimes you may want to have a full text search analysis. For example, if we search for the Light Red term and the query is being analyzed by the standard analyzer, then the terms that would be searched are light and red. If we use a query type that is not analyzed, then we will explicitly search for the Light Red term. We may not want to analyze the content of the query if we are only interested in exact matches.
What you should remember about indexing and querying analysis is that the terms in the index should match the terms in the query. If they don’t match, Lucene won’t return the desired documents. For example, if you use stemming and lowercasing during indexing, you need to ensure that the terms in the query are also lowercased and stemmed, or your queries won’t return any results at all. For example, let’s get back to our Light Red term that we analyzed during indexing; we have it as two terms in the index: light and red. If we run a Light Red query against that data and don’t analyze it, we won’t get the document in the results because the query term does not match the indexed terms. It is important to keep the token filters in the same order during indexing and query time analysis so that the terms resulting from such an analysis are the same.
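The Light Red example can be reproduced with a small Python sketch. The analyze function below is an assumption that only splits on whitespace and lowercases, which is enough to show the mismatch; the two query functions are hypothetical simplifications and do not reproduce the exact behavior of Elasticsearch match and term queries.

```python
def analyze(text):
    """Stand-in for an analyzer: split on whitespace and lowercase."""
    return text.lower().split()


# Index time: the field value "Light Red" of document 1 becomes two terms.
index = {}
for term in analyze("Light Red"):
    index.setdefault(term, set()).add(1)


def analyzed_query(text):
    """Analyzed query: run the same analysis, then require every resulting term."""
    doc_sets = [index.get(term, set()) for term in analyze(text)]
    return sorted(set.intersection(*doc_sets)) if doc_sets else []


def not_analyzed_query(text):
    """Not analyzed query: look the text up exactly as it was given."""
    return sorted(index.get(text, set()))


hits_analyzed = analyzed_query("Light Red")          # finds document 1
hits_not_analyzed = not_analyzed_query("Light Red")  # finds nothing
```

The second lookup fails because the index holds light and red, while the query asks for the single, unprocessed string Light Red.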
Scoring and query relevance
There is one additional thing that we only mentioned once till now: scoring. What is the score of a document? The score is a result of a scoring formula that describes how well the document matches the query. By default, Apache Lucene uses the TF/IDF (term frequency/inverse document frequency) scoring mechanism, which is an algorithm that calculates how relevant the document is in the context of our query. Of course, it is not the only algorithm available, and we will mention other algorithms in the Mappings configuration section of Chapter 2, Indexing Your Data.
Note
If you want to read more about the Apache Lucene TF/IDF scoring formula, please visit the Apache Lucene Javadocs for the TFIDFSimilarity class, available at http://lucene.apache.org/core/5_4_0/core/org/apache/lucene/search/similarities/TFIDFSimilarity.html
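As a rough illustration of the idea behind TF/IDF, the following Python sketch scores the three example titles from earlier in the chapter against a query. It is a heavily simplified assumption, not the full Lucene formula: it uses a square-root term frequency and a logarithmic inverse document frequency, and it leaves out the normalization and boost factors that the Javadocs describe.

```python
import math

# Example documents, already lowercased for simplicity.
docs = {
    1: "elasticsearch server",
    2: "mastering elasticsearch second edition",
    3: "apache solr cookbook third edition",
}


def tf(term, text):
    """Term frequency factor: square root of the raw count in the document."""
    return math.sqrt(text.split().count(term))


def idf(term):
    """Inverse document frequency: rarer terms get a higher weight."""
    doc_freq = sum(1 for text in docs.values() if term in text.split())
    return 1.0 + math.log(len(docs) / (doc_freq + 1))


def score(query, text):
    """Simplified score: sum of tf * idf^2 over the query terms."""
    return sum(tf(term, text) * idf(term) ** 2 for term in query.split())


query = "mastering elasticsearch"
ranking = sorted(docs, key=lambda doc_id: score(query, docs[doc_id]), reverse=True)
```

Document 2 ranks first because it contains both query terms, and the rarer term mastering contributes more than elasticsearch, which appears in two of the three documents.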
The basics of Elasticsearch
Elasticsearch is an open source search server project started by Shay Banon and published in February 2010. Since then, the project has grown into a major player in the field of search and data analysis solutions and is widely used in many common or lesser-known search and data analysis platforms. In addition, due to its distributed nature and real-time search and analytics capabilities, many organizations use it as a document store.
Key concepts of Elasticsearch
In the next few pages, we will get you through the basic concepts of Elasticsearch. You can skip this section if you are already familiar with Elasticsearch architecture. However, if you are not familiar with Elasticsearch, we strongly advise you to read this section. We will refer to the keywords used in this section in the rest of the book, and understanding those concepts is crucial to fully utilize Elasticsearch.
Index
An index is the logical place where Elasticsearch stores the data. Each index can be spread onto multiple Elasticsearch nodes and is divided into one or more smaller pieces called shards that are physically placed on the hard drives. If you are coming from the relational database world, you can think of an index like a table. However, the index structure is prepared for fast and efficient full text searching and, in particular, does not store original values. That structure is called an inverted index (https://en.wikipedia.org/wiki/Inverted_index).
If you know MongoDB, you can think of the Elasticsearch index as a collection in MongoDB. If you are familiar with CouchDB, you can think about an index as you would about the CouchDB database. Elasticsearch can hold many indices located on one machine or spread them over multiple servers. As we have already said, every index is built of one or more shards, and each shard can have many replicas.
Document
The main entity stored in Elasticsearch is a document. A document can have multiple fields, each having its own type and treated differently. Using the analogy to relational databases, a document is a row of data in a database table. When you compare an Elasticsearch document to a MongoDB document, you will see that both can have different structures. The thing to keep in mind when it comes to Elasticsearch is that fields that are common to multiple types in the same index need to have the same type. This means that all the documents with a field called title need to have the same data type for it, for example, string.
Documents consist of fields, and each field may occur several times in a single document (such a field is called multivalued). Each field has a type (text, number, date, and so on). The field types can also be complex: a field can contain other subdocuments or arrays. The field type is important to Elasticsearch because the type determines how various operations such as analysis or sorting are performed. Fortunately, this can be determined automatically (however, we still suggest using mappings; take a look at what follows).
Unlike in relational databases, documents don’t need to have a fixed structure: every document may have a different set of fields, and in addition to this, fields don’t have to be known during application development. Of course, one can force a document structure with the use of a schema. From the client’s point of view, a document is a JSON object (see more about the JSON format at https://en.wikipedia.org/wiki/JSON). Each document is stored in one index and has its own unique identifier, which can be generated automatically by Elasticsearch, and a document type. The thing to remember is that the document identifier needs to be unique inside an index for a given type. This means that, in a single index, two documents can have the same identifier if they are not of the same type.
Document type
In Elasticsearch, one index can store many objects serving different purposes. For example, a blog application can store articles and comments. The document type lets us easily differentiate between the objects in a single index. Every document can have a different structure, but in real-world deployments, dividing documents into types significantly helps in data manipulation. Of course, one needs to keep the limitations in mind. That is, different document types can’t set different types for the same property. For example, a field called title must have the same type across all document types in a given index.
Mapping
In the section about the basics of full text searching (the Full text searching section), we wrote about the process of analysis: the preparation of the input text for indexing and searching done by the underlying Apache Lucene library. Every field of the document must be properly analyzed depending on its type. For example, a different analysis chain is required for the numeric fields (numbers shouldn’t be sorted alphabetically) and for the text fetched from web pages (for example, the first step would require you to omit the HTML tags as they are useless information). To be able to properly analyze at indexing and querying time, Elasticsearch stores the information about the fields of the documents in so-called mappings. Every document type has its own mapping, even if we don’t explicitly define it.
Key concepts of the Elasticsearch infrastructure
Now, we already know that Elasticsearch stores its data in one or more indices and every index can contain documents of various types. We also know that each document has many fields and how Elasticsearch treats these fields is defined by the mappings. But there is more. From the beginning, Elasticsearch was created as a distributed solution that can handle billions of documents and hundreds of search requests per second. This is due to several important key features and concepts that we are going to describe in more detail now.
Nodes and clusters
Elasticsearch can work as a standalone, single search server. Nevertheless, to be able to process large sets of data and to achieve fault tolerance and high availability, Elasticsearch can be run on many cooperating servers. Collectively, these servers connected together are called a cluster and each server forming a cluster is called a node.
Shards
When we have a large number of documents, we may come to a point where a single node may not be enough, for example, because of RAM limitations, hard disk capacity, insufficient processing power, or an inability to respond to client requests fast enough. In such cases, an index (and the data in it) can be divided into smaller parts called shards (where each shard is a separate Apache Lucene index). Each shard can be placed on a different server, and thus your data can be spread among the cluster nodes. When you query an index that is built from multiple shards, Elasticsearch sends the query to each relevant shard and merges the results in such a way that your application doesn’t know about the shards. In addition to this, having multiple shards can speed up indexing, because documents end up in different shards and thus the indexing operation is parallelized.
Replicas
In order to increase query throughput or achieve high availability, shard replicas can be used. A replica is just an exact copy of the shard, and each shard can have zero or more replicas. In other words, Elasticsearch can have many identical shards and one of them is automatically chosen as a place where the operations that change the index are directed. This special shard is called a primary shard, and the others are called replica shards. When the primary shard is lost (for example, a server holding the shard data is unavailable), the cluster will promote the replica to be the new primary shard.
Gateway
The cluster state is held by the gateway, which stores the cluster state and indexed data across full cluster restarts. By default, every node has this information stored locally; it is synchronized among nodes. We will discuss the gateway module in The gateway and recovery modules section of Chapter 9, Elasticsearch Cluster in Detail.
Indexing and searching
You may wonder how you can tie all the indices, shards, and replicas together in a single environment. Theoretically, it would be very difficult to fetch data from the cluster when you have to know where your document is: on which server, and in which shard. Even more difficult would be searching when one query can return documents from different shards placed on different nodes in the whole cluster. In fact, this is a complicated problem; fortunately, we don’t have to care about this at all because it is handled automatically by Elasticsearch. Let’s look at the following diagram:
When you send a new document to the cluster, you specify a target index and send it to any of the nodes. The node knows how many shards the target index has and is able to determine which shard should be used to store your document. Elasticsearch lets us alter this behavior; we will talk about this in the Introduction to routing section in Chapter 2, Indexing Your Data. The important information that you have to remember for now is that Elasticsearch calculates the shard in which the document should be placed using the unique identifier of the document. This is one of the reasons each document needs a unique identifier. After the indexing request is sent to a node, that node forwards the document to the target node, which hosts the relevant shard.
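The shard selection described above can be sketched in a few lines. This is only a simplified illustration and not the code Elasticsearch runs: the choice of CRC32 as the hash and the function name pick_shard are assumptions made for the example. The point it shows is that, for a fixed number of shards, the same identifier always maps to the same shard, so any node can work out where a document lives.

```python
import zlib


def pick_shard(doc_id: str, number_of_shards: int) -> int:
    """Map a document identifier to a shard number (illustrative only)."""
    if number_of_shards <= 0:
        raise ValueError("number_of_shards must be positive")
    # A stable hash: the same identifier always produces the same value,
    # regardless of which node performs the calculation.
    hashed = zlib.crc32(doc_id.encode("utf-8"))
    return hashed % number_of_shards


if __name__ == "__main__":
    for doc_id in ["1", "2", "AU1y-s6w2WzST_RhTvCJ"]:
        print(doc_id, "-> shard", pick_shard(doc_id, 5))
```

Note that a scheme like this also suggests why the shard count matters: with a different number of shards, the same identifier would generally map to a different shard.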
Now, let’s look at the following diagram on searching request execution:
When you try to fetch a document by its identifier, the node you send the query to uses the same routing algorithm to determine the shard and the node holding the document and again forwards the request, fetches the result, and sends the result to you. On the other hand, the querying process is a more complicated one. The node receiving the query forwards it to all the nodes holding the shards that belong to a given index and asks for minimum information about the documents that match the query (the identifier and score are returned by default), unless routing is used, when the query will go directly to a single shard only. This is called the scatter phase. After receiving this information, the aggregator node (the node that receives the client request) sorts the results and sends a second request to get the documents that are needed to build the results list (all the other information apart from the document identifier and score). This is called the gather phase. After this phase is executed, the results are returned to the client.
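The two phases described above can be illustrated with a small in-memory simulation. It is a sketch under stated assumptions rather than a picture of Elasticsearch internals: the SHARDS data, the scores, and the function names scatter and gather are all hypothetical. What it shows is the division of work: the first phase collects only identifiers and scores from every shard, and the second phase fetches full documents for the globally best hits only.

```python
import heapq

# Hypothetical per-shard data: document id -> (score, full document).
SHARDS = [
    {"a1": (2.5, {"title": "first"}), "a2": (0.7, {"title": "second"})},
    {"b1": (1.9, {"title": "third"})},
    {"c1": (3.1, {"title": "fourth"}), "c2": (0.2, {"title": "fifth"})},
]


def scatter(shards, size):
    """Phase 1: ask every shard for (score, id, shard number) of its best hits."""
    hits = []
    for shard_no, shard in enumerate(shards):
        local = [(score, doc_id, shard_no) for doc_id, (score, _) in shard.items()]
        # Each shard returns at most `size` lightweight entries.
        hits.extend(heapq.nlargest(size, local))
    return hits


def gather(shards, hits, size):
    """Phase 2: sort globally, then fetch full documents for the winners only."""
    top = heapq.nlargest(size, hits)
    return [
        (doc_id, score, shards[shard_no][doc_id][1])
        for score, doc_id, shard_no in top
    ]


results = gather(SHARDS, scatter(SHARDS, size=2), size=2)

if __name__ == "__main__":
    for doc_id, score, source in results:
        print(doc_id, score, source)
```

Keeping the first phase lightweight means that full documents travel over the network only for the hits that end up in the final result list.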
Now the question arises: what is the replica’s role in the previously described process? While indexing, replicas are only used as an additional place to store the data. When executing a query, by default, Elasticsearch will try to balance the load among the shard and its replicas so that they are evenly stressed. Also, remember that we can change this behavior; we will discuss this in the Understanding the querying process section in Chapter 3, Searching Your Data.
Installing and configuring your cluster
Installing and running Elasticsearch, even in production environments, is very easy nowadays compared to how it was in the days of Elasticsearch 0.20.x. Only a few steps are needed to get from a system without Elasticsearch to one that runs it. We will explore these steps in the following sections.
Installing Java
Elasticsearch is a Java application and to use it we need to make sure that the Java SE environment is installed properly. Elasticsearch requires Java Version 7 or later to run. You can download it from http://www.oracle.com/technetwork/java/javase/downloads/index.html. You can also use OpenJDK (http://openjdk.java.net/) if you wish. You can, of course, use Java Version 7, but it is not supported by Oracle anymore, at least without commercial support. For example, you can’t expect new, patched versions of Java 7 to be released. Because of this, we strongly suggest that you install Java 8, especially given that Java 9 seems to be right around the corner with the general availability planned to be released in September 2016.
Installing Elasticsearch
To install Elasticsearch you just need to go to https://www.elastic.co/downloads/elasticsearch, choose the latest stable version of Elasticsearch, download it, and unpack it. That’s it! The installation is complete.
Note
At the time of writing, we used a snapshot of Elasticsearch 2.2. This means that we’ve skipped describing some properties that were marked as deprecated and are or will be removed in the future versions of Elasticsearch.
The main interface to communicate with Elasticsearch is based on the HTTP protocol and REST. This means that you can even use a web browser for some basic queries and requests, but for anything more sophisticated you’ll need to use additional software, such as the cURL command. If you use Linux or OS X, the cURL package should already be available. If you use Windows, you can download the package from http://curl.haxx.se/download.html.
Running Elasticsearch
Let’s run our first instance that we just downloaded as the ZIP archive and unpacked. Go to the bin directory and run the following commands depending on the OS:
Linux or OS X: ./elasticsearch
Windows: elasticsearch.bat
Congratulations! Now, you have your Elasticsearch instance up-and-running. During its work, the server usually uses two port numbers: the first one for communication with the REST API using the HTTP protocol, and the second one for the transport module used for communication in a cluster and between the native Java client and the cluster. The default port used for the HTTP API is 9200, so we can check search readiness by pointing the web browser to http://127.0.0.1:9200/. The browser should show a code snippet similar to the following:
{
  "name" : "Blob",
  "cluster_name" : "elasticsearch",
  "version" : {
    "number" : "2.2.0",
    "build_hash" : "5b1dd1cf5a1957682d84228a569e124fedf8e325",
    "build_timestamp" : "2016-01-13T18:12:26Z",
    "build_snapshot" : true,
    "lucene_version" : "5.4.0"
  },
  "tagline" : "You Know, for Search"
}
The output is structured as a JavaScript Object Notation (JSON) object. If you are not familiar with JSON, please take a minute and read the article available at https://en.wikipedia.org/wiki/JSON.
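Because the response is plain JSON, any language with a JSON parser can read it. The following minimal sketch uses Python's standard json module on a shortened copy of the response shown above. The body string is pasted in by hand here; fetching it over HTTP from a running node is left out so that the example works without a server.

```python
import json

# A shortened copy of the response body shown above.
body = """
{
  "name": "Blob",
  "cluster_name": "elasticsearch",
  "version": {
    "number": "2.2.0",
    "build_snapshot": true,
    "lucene_version": "5.4.0"
  }
}
"""

info = json.loads(body)

if __name__ == "__main__":
    # Nested JSON objects become nested dictionaries.
    print("node:", info["name"])
    print("cluster:", info["cluster_name"])
    print("Elasticsearch:", info["version"]["number"])
    print("Lucene:", info["version"]["lucene_version"])
```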
Note
Elasticsearch is smart. If the default port is not available, the engine binds to the next free port. You can find information about this on the console during booting as follows:
[2016-01-13 20:04:49,953][INFO ][http ] [Blob] publish_address {127.0.0.1:9201}, bound_addresses {[fe80::1]:9200}, {[::1]:9200}, {127.0.0.1:9201}
Note the fragment with [http]. Elasticsearch uses a few ports for various tasks. The interface that we are using is handled by the HTTP module.
Now, we will use the cURL program to communicate with Elasticsearch. For example, to check the cluster health, we will use the following command:
curl -XGET http://127.0.0.1:9200/_cluster/health?pretty
The -X parameter defines the HTTP request method. The default value is GET (so in this example, we can omit this parameter). For now, do not worry about the GET value; we will describe it in more detail later in this chapter.
As a standard, the API returns information in a JSON object in which new line characters are omitted. The pretty parameter added to our request forces Elasticsearch to add new line characters to the response, making the response more user-friendly. You can try running the preceding query with and without the ?pretty parameter to see the difference.
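The difference between the two output styles can be reproduced locally. The sketch below serializes the same object twice with Python's json module, once compactly and once with indentation. The health dictionary holds made-up values for illustration and is not the output of a real cluster.

```python
import json

# Illustrative values only; a real response contains more fields.
health = {"cluster_name": "elasticsearch", "status": "green", "number_of_nodes": 1}

# Compact form: a single line without new line characters.
compact = json.dumps(health, separators=(",", ":"))

# Pretty form: one key per line, indented for readability.
pretty = json.dumps(health, indent=2)

if __name__ == "__main__":
    print(compact)
    print(pretty)
```

Both strings decode to the same object, so the choice affects only readability and response size.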
Elasticsearch is useful in small and medium-sized applications, but it has been built with large clusters in mind. So, now we will set up our big two-node cluster. Unpack the Elasticsearch archive in a different directory and run the second instance. If we look at the log, we will see the following:
[2016-01-13 20:07:58,561][INFO ][cluster.service ] [Big Man] detected_master {Blob}{5QPh00RUQraeLHAInbR4Jw}{127.0.0.1}{127.0.0.1:9300}, added {{Blob}{5QPh00RUQraeLHAInbR4Jw}{127.0.0.1}{127.0.0.1:9300},}, reason: zen-disco-receive(from master [{Blob}{5QPh00RUQraeLHAInbR4Jw}{127.0.0.1}{127.0.0.1:9300}])
This means that our second instance (named Big Man) discovered the previously running instance (named Blob). Here, Elasticsearch automatically formed a new two-node cluster. Starting from Elasticsearch 2.0, this will only work with nodes running on the same physical machine, because Elasticsearch 2.0 no longer supports multicast. To allow a cluster to form across machines, you need to inform Elasticsearch about the nodes that should be contacted initially using the discovery.zen.ping.unicast.hosts array in elasticsearch.yml. For example, like this:
discovery.zen.ping.unicast.hosts: ["192.168.2.1", "192.168.2.2"]
Shutting down Elasticsearch
Even though we expect our cluster (or node) to run flawlessly for a lifetime, we may need to restart it or shut it down properly (for example, for maintenance). The following are the two ways in which we can shut down Elasticsearch:
If your node is attached to the console, just press Ctrl + C
The second option is to kill the server process by sending the TERM signal (see the kill command on the Linux boxes and Task Manager on Windows)
Note
The previous versions of Elasticsearch exposed a dedicated shutdown API but, in 2.0, this option has been removed because of security reasons.
The directory layout
Now, let’s go to the newly created directory. We should see the following directory structure:
Directory   Description
bin         The scripts needed to run Elasticsearch instances and for plugin management
config      The directory where configuration files are located
lib         The libraries used by Elasticsearch
modules     The plugins bundled with Elasticsearch
After Elasticsearch starts, it will create the following directories (if they don’t exist):
Directory   Description
data        The directory used by Elasticsearch to store all the data
logs        The files with information about events and errors
plugins     The location to store the installed plugins
work        The temporary files used by Elasticsearch
Configuring Elasticsearch
One of the reasons (of course, not the only one) why Elasticsearch is gaining more and more popularity is that getting started with Elasticsearch is quite easy. Because of the reasonable default values and automatic settings for simple environments, we can skip the configuration and go straight to indexing and querying (or to the next chapter of the book). We can do all this without changing a single line in our configuration files. However, in order to truly understand Elasticsearch, it is worth understanding some of the available settings.
We will now explore the default directories and the layout of the files provided with the Elasticsearch tar.gz archive. The entire configuration is located in the config directory. We can see two files here: elasticsearch.yml (or elasticsearch.json, which will be used if present) and logging.yml. The first file is responsible for setting the default configuration values for the server. Keep in mind that these are only defaults: some of these values can be changed at runtime and kept as a part of the cluster state, so the values in this file may not reflect the running configuration. The two values that we cannot change at runtime are cluster.name and node.name.
The cluster.name property is responsible for holding the name of our cluster. The cluster name separates different clusters from each other. Nodes configured with the same cluster name will try to form a cluster.
The second value is the instance name (the node.name property). We can leave this parameter undefined. In this case, Elasticsearch automatically chooses a unique name for itself. Note that this name is chosen during each startup, so the name can be different on each restart. Defining the name can be helpful when referring to concrete instances by the API or when using monitoring tools to see what is happening to a node during long periods of time and between restarts. Think about giving descriptive names to your nodes.
Other parameters are commented well in the file, so we advise you to look through it; don’t worry if you do not understand the explanation. We hope that everything will become clearer after reading the next few chapters.
Note
Remember that most of the parameters that have been set in the elasticsearch.yml file can be overwritten with the use of the Elasticsearch REST API. We will talk about this API in The update settings API section of Chapter 9, Elasticsearch Cluster in Detail.
The second file (logging.yml) defines how much information is written to system logs, defines the log files, and creates new files periodically. Changes in this file are usually required only when you need to adapt to monitoring or backup solutions or during system debugging; however, if you want to have more detailed logging, you need to adjust it accordingly.
Let’s leave the configuration files for now and look at the base for all the applications: the operating system. Tuning your operating system is one of the key points to ensure that your Elasticsearch instance will work well. During indexing, especially when having many shards and replicas, Elasticsearch will create many files; so, the system should not limit the open file descriptors to less than 32,000. For Linux servers, this can usually be changed in /etc/security/limits.conf and the current value can be displayed using the ulimit command. If you end up reaching the limit, Elasticsearch will not be able to create new files; so merging will fail, indexing may fail, and new indices will not be created.
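The limit can also be checked programmatically. The following sketch uses Python's standard resource module, which is available on Unix-like systems only, to read the open file limit of the current process and compare it with the 32,000 threshold mentioned above. The helper name limit_is_sufficient is our own, and the check describes the process running the script, not an already running Elasticsearch instance.

```python
import resource

RECOMMENDED_MIN = 32000  # the threshold mentioned in the text


def limit_is_sufficient(limit: int, minimum: int = RECOMMENDED_MIN) -> bool:
    """Return True if a file descriptor limit meets the recommended minimum."""
    # RLIM_INFINITY means that the limit is unbounded.
    return limit == resource.RLIM_INFINITY or limit >= minimum


if __name__ == "__main__":
    # The soft limit is the one enforced; it can be raised up to the hard limit.
    soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    print("soft limit:", soft, "hard limit:", hard)
    if not limit_is_sufficient(soft):
        print("The open file limit is below", RECOMMENDED_MIN)
```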
Note
On Microsoft Windows platforms, the default limit is more than 16 million handles per process, which should be more than enough. You can read more about file handles on the Microsoft Windows platform at https://blogs.technet.microsoft.com/markrussinovich/2009/09/29/pushing-the-limits-of-windows-handles/.
The next set of settings is connected to the Java Virtual Machine (JVM) heap memory limit for a single Elasticsearch instance. For small deployments, the default memory limit (1,024 MB) will be sufficient, but for large ones it will not be enough. If you spot entries that indicate OutOfMemoryError exceptions in a log file, set the ES_HEAP_SIZE variable to a value greater than 1024. When choosing the right amount of memory size to be given to the JVM, remember that, in general, no more than 50 percent of your total system memory should be given. However, as with all the rules, there are exceptions. We will discuss this in greater detail later, but you should always monitor your JVM heap usage and adjust it when needed.
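The 50 percent guideline is easy to turn into a small calculation. The sketch below is only an illustration of that rule of thumb: the function names are hypothetical, and the string form with the m suffix assumes that ES_HEAP_SIZE accepts the same size notation as the JVM -Xmx option.

```python
def heap_upper_bound_mb(total_ram_mb: int) -> int:
    """Return at most half of the total system memory, in megabytes."""
    if total_ram_mb <= 0:
        raise ValueError("total_ram_mb must be positive")
    return total_ram_mb // 2


def es_heap_size(total_ram_mb: int) -> str:
    """Format the upper bound as a value for the ES_HEAP_SIZE variable."""
    # Assumption: the 'm' suffix is accepted, as with the JVM -Xmx option.
    return f"{heap_upper_bound_mb(total_ram_mb)}m"


if __name__ == "__main__":
    for ram in (2048, 16384, 65536):
        print(ram, "MB of RAM -> ES_HEAP_SIZE up to", es_heap_size(ram))
```

This gives only an upper bound under the general rule; as the text notes, the right value depends on the deployment and should be verified by monitoring heap usage.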
The system-specific installation and configuration
Although downloading an archive with Elasticsearch and unpacking it works and is convenient for testing, there are dedicated methods for Linux operating systems that give you several advantages when you do production deployment. In production deployments, the Elasticsearch service should be run automatically with a system boot; we should have dedicated start and stop scripts, unified paths, and so on. Elasticsearch supports installation packages for various Linux distributions that we can use. Let’s see how this works.
Installing Elasticsearch on Linux
The other way to install Elasticsearch on a Linux operating system is to use packages such as RPM or DEB, depending on your Linux distribution and the supported package type. This way we can automatically adapt to system directory layout; for example, configuration and logs will go into their standard places in the /etc/ or /var/log directories. But this is not the only thing. When using packages, Elasticsearch will also install startup scripts and make our life easier. What’s more, we will be able to upgrade Elasticsearch easily by running a single command from the command line. Of course, the mentioned packages can be found at the same URL address as we mentioned previously when we talked about installing Elasticsearch from zip or tar.gz packages: https://www.elastic.co/downloads/elasticsearch. Elasticsearch can also be installed from remote repositories via standard distribution tools such as apt-get or yum.
Note
Before installing Elasticsearch, make sure that you have a proper version of Java Virtual Machine installed.
Installing Elasticsearch using RPM packages
When using a Linux distribution that supports RPM packages, such as Fedora Linux (https://getfedora.org/), Elasticsearch installation is very easy. After downloading the RPM package, we just need to run the following command as root:
yum install elasticsearch-2.2.0.noarch.rpm
Alternatively, you can add the remote repository and install Elasticsearch from it (this command needs to be run as root as well):
rpm --import https://packages.elastic.co/GPG-KEY-elasticsearch
This command adds the GPG key and allows the system to verify that the fetched package really comes from Elasticsearch developers. In the second step, we need to create the repository definition in the /etc/yum.repos.d/elasticsearch.repo file. We need to add the following entries to this file:
[elasticsearch-2.2]
name=Elasticsearch repository for 2.2.x packages
baseurl=http://packages.elastic.co/elasticsearch/2.x/centos
gpgcheck=1
gpgkey=http://packages.elastic.co/GPG-KEY-elasticsearch
enabled=1
Now it’s time to install the Elasticsearch server, which is as simple as running the following command (again, don’t forget to run it as root):
yum install elasticsearch
Elasticsearch will be automatically downloaded, verified, and installed.
Installing Elasticsearch using the DEB package
When using a Linux distribution that supports DEB packages (such as Debian), installing Elasticsearch is again very easy. After downloading the DEB package, all you need to do is run the following command:
sudo dpkg -i elasticsearch-2.2.0.deb
It is as simple as that. Another way, which is similar to what we did with RPM packages, is to create a new packages source and install Elasticsearch from the remote repository. The first step is to add the public GPG key used for package verification. We can do that using the following command:
wget -qO - https://packages.elastic.co/GPG-KEY-elasticsearch | sudo apt-key add -
The second step is to add the DEB package location. We need to add the following line to the /etc/apt/sources.list file:
deb http://packages.elastic.co/elasticsearch/2.2/debian stable main
This defines the source for the Elasticsearch packages. The last step is updating the list of remote packages and installing Elasticsearch using the following command:
sudo apt-get update && sudo apt-get install elasticsearch
Elasticsearch configuration file localization
When using packages to install Elasticsearch, the configuration files are in slightly different directories than the default config directory. After the installation, the configuration files should be stored in the following locations:
/etc/sysconfig/elasticsearch or /etc/default/elasticsearch: A file with the configuration of the Elasticsearch process, such as the user to run as, the directories for logs and data, and memory settings
/etc/elasticsearch/: A directory for the Elasticsearch configuration files, such as the elasticsearch.yml file
Configuring Elasticsearch as a system service on Linux
If everything goes well, you can run Elasticsearch using the following command:
/bin/systemctl start elasticsearch.service
If you want Elasticsearch to start automatically every time the operating system starts, you can set up Elasticsearch as a system service by running the following command:
/bin/systemctl enable elasticsearch.service
Elasticsearch as a system service on Windows
Installing Elasticsearch as a system service on Windows is also very easy. You just need to go to your Elasticsearch installation directory, then go to the bin subdirectory, and run the following command:
service.bat install
You’ll be asked for permission to do so. If you allow the script to run, Elasticsearch will be installed as a Windows service.
If you would like to see all the commands exposed by the service.bat script file, just run the following command in the same directory as earlier:
service.bat
For example, to start Elasticsearch, we will just run the following command:
service.bat start
Manipulating data with the REST API
Elasticsearch exposes a very rich REST API that can be used to search through the data, index the data, and control Elasticsearch behavior. Using the REST API, you can get a single document, index or update a document, get information about the current state of Elasticsearch, create or delete indices, or force Elasticsearch to move around shards of your indices. Of course, these are only examples that show what you can expect from the Elasticsearch REST API. For now, we will concentrate on using the create, retrieve, update, delete (CRUD) part of the Elasticsearch API (https://en.wikipedia.org/wiki/Create,_read,_update_and_delete), which allows us to use Elasticsearch in a fashion similar to how we would use any other NoSQL (https://en.wikipedia.org/wiki/NoSQL) data store.
Understanding the REST API
If you’ve never used an application exposing the REST API, you may be surprised how easy it is to use such applications and remember how to use them. In REST-like architectures, every request is directed to a concrete object indicated by a path in the address. For example, let’s assume that our hypothetical application exposes the /books REST end-point as a reference to the list of books. In such case, a call to /books/1 could be a reference to a concrete book with the identifier 1. You can think of it as a data-oriented model of an API. Of course, we can nest the paths; for example, a path such as /books/1/chapters could return the list of chapters of our book with identifier 1 and a path such as /books/1/chapters/6 could be a reference to the sixth chapter in that particular book.
We talked about paths, but when using the HTTP protocol (https://en.wikipedia.org/wiki/Hypertext_Transfer_Protocol), we have some additional verbs (such as POST, GET, PUT, and so on) that we can use to define system behavior in addition to paths. So if we would like to retrieve the book with identifier 1, we would use the GET request method with the /books/1 path. However, we would use the PUT request method with the same path to create a book record with the identifier of 1, the POST request method to alter the record, DELETE to remove that entry, and the HEAD request method to get basic information about the data referenced by the path.
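To make the pairing of verbs and paths concrete, the following sketch builds one request object per verb for the hypothetical /books application, using Python's standard urllib.request module. Nothing is sent over the network. The host http://localhost:8080, the helper name book_request, and the JSON bodies are assumptions made for illustration.

```python
from urllib.request import Request

BASE = "http://localhost:8080"  # hypothetical application, not Elasticsearch


def book_request(method, book_id, body=None):
    """Build (but do not send) a request for a single book resource."""
    data = body.encode("utf-8") if body is not None else None
    return Request(f"{BASE}/books/{book_id}", data=data, method=method)


# The same path combined with different verbs expresses different operations.
examples = [
    book_request("GET", 1),                                    # retrieve
    book_request("PUT", 1, '{"title": "Example book"}'),       # create
    book_request("POST", 1, '{"title": "Example book, v2"}'),  # alter
    book_request("DELETE", 1),                                 # remove
    book_request("HEAD", 1),                                   # basic information
]

if __name__ == "__main__":
    for request in examples:
        print(request.get_method(), request.full_url)
```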
Now, let’s look at example HTTP requests that are sent to real Elasticsearch REST API endpoints, so the preceding hypothetical information will be turned into something real:
GET http://localhost:9200/: This retrieves basic information about Elasticsearch, such as the version, the name of the node that the command has been sent to, the name of the cluster that node is connected to, the Apache Lucene version, and so on.
GET http://localhost:9200/_cluster/state/nodes/: This retrieves information about all the nodes in the cluster, such as their identifiers, names, transport addresses with ports, and additional node attributes for each node.
DELETE http://localhost:9200/books/book/123: This deletes a document that is indexed in the books index, with the book type and an identifier of 123.
We now know what REST means and we can start concentrating on Elasticsearch to see how we can store, retrieve, alter, and delete the data from its indices. If you would like to read more about REST, please refer to http://en.wikipedia.org/wiki/Representational_state_transfer.
Storing data in Elasticsearch
In Elasticsearch, every document is represented by three attributes: the index, the type, and the identifier. Each document must be indexed into a single index, needs to have its type correspond to the document structure, and is described by the identifier. These three attributes allow us to identify any document in Elasticsearch and need to be provided when the document is physically written to the underlying Apache Lucene index. Having this knowledge, we are now ready to create our first Elasticsearch document.
Creating a new document
We will start learning the Elasticsearch REST API by indexing one document. Let’s imagine that we are building a CMS system (http://en.wikipedia.org/wiki/Content_management_system) that will provide the functionality of a blogging platform for our internal users. We will have different types of documents in our indices, but the most important ones are the articles that will be published and are readable by users.
Because we talk to Elasticsearch using JSON notation and Elasticsearch responds to us again using JSON, our example document could look as follows:
{
  "id": "1",
  "title": "New version of Elasticsearch released!",
  "content": "Version 2.2 released today!",
  "priority": 10,
  "tags": ["announce", "elasticsearch", "release"]
}
As you can see in the preceding code snippet, the JSON document is built with a set of fields, where each field can have a different format. In our example, we have a set of text fields (id, title, and content), we have a number (the priority field), and an array of text values (the tags field). We will show documents that are more complicated in the next examples.
Note
One of the changes introduced in Elasticsearch 2.0 has been that field names can’t contain the dot character. Such field names were possible in older versions of Elasticsearch, but could result in serialization errors in certain cases and thus Elasticsearch creators decided to remove that possibility.
One thing to remember is that by default Elasticsearch works as a schema-less data store. This means that it can try to guess the type of the field in a document sent to Elasticsearch. It will try to use numeric types for the values that are not enclosed in quotation marks and strings for data enclosed in quotation marks. It will try to guess the date and index them in dedicated fields and so on. This is possible because the JSON format is semi-typed. Internally, when the first document with a new field is sent to Elasticsearch, it will be processed and mappings will be written (we will talk more about mappings in the Mappings configuration section of Chapter 2, Indexing Your Data).
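The idea of guessing a field type from a JSON value can be sketched as follows. This is a deliberately simplified illustration and does not reproduce the actual detection rules of Elasticsearch: the function guess_field_type, the single date format it recognizes, the type names it returns, and the published field are all assumptions made for the example.

```python
import json
from datetime import datetime


def guess_field_type(value):
    """Guess a field type from a decoded JSON value (simplified sketch)."""
    # bool must be checked before int because bool is a subclass of int.
    if isinstance(value, bool):
        return "boolean"
    if isinstance(value, int):
        return "long"
    if isinstance(value, float):
        return "double"
    if isinstance(value, str):
        try:
            # Only one date format is tried in this sketch.
            datetime.strptime(value, "%Y-%m-%d")
            return "date"
        except ValueError:
            return "string"
    if isinstance(value, list):
        # An array takes the type of its first element.
        return guess_field_type(value[0]) if value else "unknown"
    if isinstance(value, dict):
        return "object"
    return "unknown"


doc = json.loads(
    '{"title": "New version of Elasticsearch released!", "priority": 10,'
    ' "tags": ["announce", "release"], "published": "2016-01-13"}'
)
guessed = {name: guess_field_type(value) for name, value in doc.items()}

if __name__ == "__main__":
    print(guessed)
```

In this sketch the quoted value "10" is classified as a string, while the unquoted 10 is classified as a number, which is the distinction the text describes.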
Note
A schema-less approach and dynamic mappings can be problematic when documents come with a slightly different structure. For example, the first document would contain the value of the priority field without quotation marks (like the one shown in the discussed example), while the second document would have quotation marks for the value in the priority field. This will result in an error because Elasticsearch will try to put a text value in the numeric field and this is not possible in Lucene. Because of this, it is advisable to define your own mappings, which you will learn in the Mappings configuration section of Chapter 2, Indexing Your Data.
Let’s now index our document and make it available for retrieval and searching. We will index our articles to an index called blog under a type named article. We will also give our document an identifier of 1, as this is our first document. To index our example document, we will execute the following command:
curl -XPUT 'http://localhost:9200/blog/article/1' -d '{"title": "New version of Elasticsearch released!", "content": "Version 2.2 released today!", "priority": 10, "tags": ["announce", "elasticsearch", "release"] }'
Note a new option to the curl command, the -d parameter. The value of this option is the text that will be used as a request payload (a request body). This way, we can send additional information such as the document definition. Also, note that the unique identifier is placed in the URL and not in the body. If you omit this identifier (while using the HTTP PUT request), the indexing request will return the following error:
No handler found for uri [/blog/article] and method [PUT]
If everything worked correctly, Elasticsearch will return a JSON response informing us about the status of the indexing operation. This response should be similar to the following one:
{
"_index":"blog",
"_type":"article",
"_id":"1",
"_version":1,
"_shards":{
"total":2,
"successful":1,
"failed":0},
"created":true
}
In the preceding response, Elasticsearch included information about the status of the operation, index, type, identifier, and version. We can also see information about the shards that took part in the operation: all of them, the ones that were successful, and the ones that failed.
Automatic identifier creation
In the previous example, we specified the document identifier manually when we were sending the document to Elasticsearch. However, there are use cases when we don’t have an identifier for our documents, for example, when handling logs as our data. In such cases, we would like some application to create the identifier for us, and Elasticsearch can be such an application. Of course, generating document identifiers doesn’t make sense when your documents already have them, such as data in a relational database. In such cases, you may want to update the documents later, so automatic identifier generation is not the best idea. However, when we are in need of such functionality, instead of using the HTTP PUT method we can use POST and omit the identifier in the REST API path. So if we would like Elasticsearch to generate the identifier in the previous example, we would send a command like this:
curl -XPOST 'http://localhost:9200/blog/article/' -d '{
"title":"New version of Elasticsearch released!",
"content":"Version 2.2 released today!",
"priority":10,
"tags":["announce","elasticsearch","release"]
}'
We’ve used the HTTP POST method instead of PUT and we’ve omitted the identifier. The response produced by Elasticsearch in such a case would be as follows:
{
"_index":"blog",
"_type":"article",
"_id":"AU1y-s6w2WzST_RhTvCJ",
"_version":1,
"_shards":{
"total":2,
"successful":1,
"failed":0},
"created":true
}
As you can see, the response returned by Elasticsearch is almost the same as in the previous example, with a minor difference in the value of the _id field. Now, instead of the 1 value, we have a value of AU1y-s6w2WzST_RhTvCJ, which is the identifier Elasticsearch generated for our document.
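The choice between PUT with an identifier and POST without one can be sketched in a few lines of Python. The build_index_request helper is a hypothetical name introduced only for this illustration; it builds the method, path, and body and does not send anything to Elasticsearch.

```python
# Sketch: pick the HTTP method and path for an indexing request.
# build_index_request is a hypothetical helper; it only builds the request.
import json


def build_index_request(index, doc_type, document, doc_id=None):
    """Return (method, path, body) for indexing a document."""
    body = json.dumps(document)
    if doc_id is None:
        # No identifier: POST lets Elasticsearch generate one.
        return "POST", "/%s/%s/" % (index, doc_type), body
    # Explicit identifier: PUT with the identifier placed in the path.
    return "PUT", "/%s/%s/%s" % (index, doc_type, doc_id), body


article = {"title": "New version of Elasticsearch released!", "priority": 10}
print(build_index_request("blog", "article", article, doc_id=1)[:2])
print(build_index_request("blog", "article", article)[:2])
```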
Retrieving documents
We now have two documents indexed into our Elasticsearch instance: one using an explicit identifier and one using a generated identifier. Let’s now try to retrieve one of the documents using its unique identifier. To do this, we will need information about the index the document is indexed in, what type it has, and of course what identifier it has. For example, to get the document from the blog index with the article type and the identifier of 1, we would run the following HTTP GET request:
curl -XGET 'localhost:9200/blog/article/1?pretty'
Note
The additional URI property called pretty tells Elasticsearch to include new line characters and additional white spaces in the response to make the output easier to read for users.
Elasticsearch will return a response similar to the following:
{
"_index":"blog",
"_type":"article",
"_id":"1",
"_version":1,
"found":true,
"_source":{
"title":"New version of Elasticsearch released!",
"content":"Version 2.2 released today!",
"priority":10,
"tags":["announce","elasticsearch","release"]
}
}
As you can see in the preceding response, Elasticsearch returned the _source field, which is the original document sent to Elasticsearch, and a few additional fields that tell us about the document, such as the index, type, identifier, document version, and of course information as to whether the document was found or not (the found property).
If we try to retrieve a document that is not present in the index, such as the one with the 12345 identifier, we get a response like this:
{
"_index":"blog",
"_type":"article",
"_id":"12345",
"found":false
}
As you can see, this time the value of the found property was set to false and there was no _source field because the document has not been retrieved.
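A client has to check the found property before reading _source. The following Python sketch shows one way to do that on the two response shapes described above; extract_source is a hypothetical helper and the JSON strings are shortened copies of the responses.

```python
# Sketch: read _source from a GET response only when the document was found.
# extract_source is a hypothetical helper name.
import json


def extract_source(response_text):
    """Return the original document, or None when found is false."""
    response = json.loads(response_text)
    if not response.get("found", False):
        return None
    return response["_source"]


found = ('{"_index":"blog","_type":"article","_id":"1","_version":1,"found":true,'
         '"_source":{"title":"New version of Elasticsearch released!","priority":10}}')
missing = '{"_index":"blog","_type":"article","_id":"12345","found":false}'

print(extract_source(found)["title"])
print(extract_source(missing))
```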
Updating documents
Updating documents in the index is a more complicated task compared to indexing. When the document is indexed and Elasticsearch flushes the document to disk, it creates segments: immutable structures that are written once and read many times. This is done because the inverted index created by Apache Lucene is currently impossible to update (at least most of its parts). To update a document, Elasticsearch internally first fetches the document using the GET request, modifies its _source field, removes the old document, and indexes a new document using the updated content. The content update is done using scripts in Elasticsearch (we will talk more about scripting in Elasticsearch in the Scripting capabilities of Elasticsearch section in Chapter 6, Make Your Search Better).
Note
Please note that the following document update examples require you to put the script.inline: on property into your elasticsearch.yml configuration file. This is needed because inline scripting is disabled in Elasticsearch for security reasons. The other way to handle updates is to store the script content in a file in the Elasticsearch configuration directory, but we will talk about that in the Scripting capabilities of Elasticsearch section in Chapter 6, Make Your Search Better.
Let’s now try to update our document with identifier 1 by modifying its content field to contain the This is the updated document sentence. To do this, we need to run a POST HTTP request on the document path using the _update REST end-point. Our request to modify the document would look as follows:
curl -XPOST 'http://localhost:9200/blog/article/1/_update' -d '{
"script":"ctx._source.content=new_content",
"params":{
"new_content":"This is the updated document"
}
}'
As you can see, we’ve sent the request to the /blog/article/1/_update REST end-point. In the request body, we’ve provided two parameters: the update script in the script property and the parameters of the script. The script is very simple; it takes the _source field and modifies the content field by setting its value to the value of the new_content parameter. The params property contains all the script parameters.
For the preceding update command execution, Elasticsearch would return the following response:
{"_index":"blog","_type":"article","_id":"1","_version":2,"_shards":
{"total":2,"successful":1,"failed":0}}
The thing to look at in the preceding response is the _version field. Right now, the version is 2, which means that the document has been updated (or re-indexed) once. Basically, each update makes Elasticsearch increment the _version field.
We could also update the document using the doc section and providing the changed field, for example:
curl -XPOST 'http://localhost:9200/blog/article/1/_update' -d '{
"doc":{
"content":"This is the updated document"
}
}'
We now retrieve the document using the following command:
curl -XGET 'http://localhost:9200/blog/article/1?pretty'
And we get the following response from Elasticsearch:
{
"_index":"blog",
"_type":"article",
"_id":"1",
"_version":2,
"found":true,
"_source":{
"title":"New version of Elasticsearch released!",
"content":"This is the updated document",
"priority":10,
"tags":["announce","elasticsearch","release"]
}
}
As you can see, the document has been updated properly.
Note
The thing to remember when using the update API of Elasticsearch is that the _source field needs to be present because this is the field that Elasticsearch uses to retrieve the original document content from the index. By default, that field is enabled and Elasticsearch uses it to store the original document.
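The fetch, modify, and re-index cycle behind an update, and the reason the _source field must be available, can be modeled with a toy in-memory store. This is only an illustrative sketch: the store dictionary and the update function are hypothetical and do not reflect Elasticsearch internals beyond the steps described in this section.

```python
# Toy in-memory model of an update: fetch _source, modify a copy, re-index it
# as a new version. The store layout and update() are hypothetical.
import copy

store = {
    "1": {
        "_version": 1,
        "_source": {
            "title": "New version of Elasticsearch released!",
            "content": "Version 2.2 released today!",
        },
    }
}


def update(store, doc_id, changes):
    """Apply changes to a copy of _source and store the result as a new version."""
    current = store[doc_id]                         # fetch the existing document
    new_source = copy.deepcopy(current["_source"])  # the old content is left untouched
    new_source.update(changes)                      # modify the fetched content
    # Replace the old document with the newly built one.
    store[doc_id] = {"_version": current["_version"] + 1, "_source": new_source}
    return store[doc_id]["_version"]


print(update(store, "1", {"content": "This is the updated document"}))
```

Without the stored _source there would be nothing to copy in the first step, which is why the update API depends on that field.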
Dealing with non-existing documents
The nice thing when it comes to document updates, which we would like to mention as it can come in handy when using the Elasticsearch Update API, is that we can define what Elasticsearch should do when the document we try to update is not present.
For example, let’s try incrementing the priority field value for a non-existing document with identifier 2:
curl -XPOST 'http://localhost:9200/blog/article/2/_update' -d '{
"script":"ctx._source.priority += 1"
}'
The response returned by Elasticsearch would look more or less as follows:
{"error":{"root_cause":[{"type":"document_missing_exception","reason":"[article][2]: document missing","shard":"2","index":"blog"}],
"type":"document_missing_exception","reason":"[article][2]: document missing","shard":"2","index":"blog"},"status":404}
As you can imagine, the document has not been updated because it doesn’t exist. So now, let’s modify our request to include the upsert section in the request body, which will tell Elasticsearch what to do when the document is not present. The new command would look as follows:
curl -XPOST 'http://localhost:9200/blog/article/2/_update' -d '{
"script":"ctx._source.priority += 1",
"upsert":{
"title":"Empty document",
"priority":0,
"tags":["empty"]
}
}'
With the modified request, a new document would be indexed; if we retrieve it using the GET API, it will look as follows:
{
"_index":"blog",
"_type":"article",
"_id":"2",
"_version":1,
"found":true,
"_source":{
"title":"Empty document",
"priority":0,
"tags":["empty"]
}
}
As you can see, the fields from the upsert section of our update request were taken by Elasticsearch and used as document fields.
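The upsert behavior can be sketched with a small in-memory model. The scripted_update and increment_priority functions are hypothetical stand-ins for the update request and its script; the sketch assumes, as the response above suggests, that the script is not run when the document is created from the upsert section.

```python
# Toy model of a scripted update with an upsert section.
# scripted_update and increment_priority are hypothetical names.

def scripted_update(store, doc_id, script, upsert=None):
    """Run script on an existing document, or index the upsert content if missing."""
    if doc_id not in store:
        if upsert is None:
            raise KeyError("document missing: %s" % doc_id)
        # The document is created from the upsert section; the script is not run.
        store[doc_id] = {"_version": 1, "_source": dict(upsert)}
    else:
        script(store[doc_id]["_source"])
        store[doc_id]["_version"] += 1
    return store[doc_id]["_version"]


def increment_priority(source):
    source["priority"] += 1


store = {}
empty = {"title": "Empty document", "priority": 0, "tags": ["empty"]}
# First call: the document does not exist, so the upsert content is indexed as is.
print(scripted_update(store, "2", increment_priority, upsert=empty))
# Second call: the document exists now, so the script increments priority.
print(scripted_update(store, "2", increment_priority, upsert=empty))
print(store["2"]["_source"]["priority"])
```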
Adding partial documents
In addition to what we already wrote about the update API, Elasticsearch is also capable of merging partial documents from the update request into already existing documents or indexing new documents using information from the request, similar to what we saw with the upsert section.
Let’s imagine that we would like to update our initial document and add a new field called count to it (setting it to 1 initially). We would also like to index the document under the specified identifier if the document is not present. We can do this by running the following command:
curl -XPOST 'http://localhost:9200/blog/article/1/_update' -d '{
"doc":{
"count":1
},
"doc_as_upsert":true
}'
We specified the new field in the doc section and we said that we want the doc section to be treated as the upsert section when the document is not present (with the doc_as_upsert property set to true).
If we now retrieve that document, we see the following response:
{
"_index":"blog",
"_type":"article",
"_id":"1",
"_version":3,
"found":true,
"_source":{
"title":"New version of Elasticsearch released!",
"content":"This is the updated document",
"priority":10,
"tags":["announce","elasticsearch","release"],
"count":1
}
}
Note
For a full reference on document updates, please refer to the official Elasticsearch documentation on the Update API, which is available at https://www.elastic.co/guide/en/elasticsearch/reference/current/docs-update.html.
Deleting documents
Now that we know how to index documents, update them, and retrieve them, it is time to learn how we can delete them. Deleting a document from an Elasticsearch index is very similar to retrieving it, but with one major difference: instead of using the HTTP GET method, we have to use the HTTP DELETE one.
For example, if we would like to delete the document indexed in the blog index under the article type and with an identifier of 1, we would run the following command:
curl -XDELETE 'localhost:9200/blog/article/1'
The response from Elasticsearch indicates that the document has been deleted and should look as follows:
{
"found":true,
"_index":"blog",
"_type":"article",
"_id":"1",
"_version":4,
"_shards":{
"total":2,
"successful":1,
"failed":0
}
}
Of course, this is not the only thing when it comes to deleting. We can also remove an entire index together with all the documents it holds. For example, if we would like to delete the entire blog index, we should just omit the identifier and the type, so the command would look like this:
curl -XDELETE 'localhost:9200/blog'
The preceding command would result in the deletion of the blog index.
Versioning
Finally, there is one last thing that we would like to talk about when it comes to data manipulation in Elasticsearch: the great feature of versioning. As you may have already noticed, Elasticsearch increments the document version when it updates the document. We can leverage this functionality and use optimistic locking (http://en.wikipedia.org/wiki/Optimistic_concurrency_control) to avoid conflicts and overwrites when multiple processes or threads access the same document concurrently. Imagine that your indexing application wants to update a document while a user is updating the same document as part of some manual work. The question that arises is: which document should be the correct one, the one updated by the indexing application, the one updated by the user, or the merged document of the changes? What if the changes are conflicting? To handle such cases, we can use versioning.
Usage example
Let’s index a new document to our blog index, one with an identifier of 10, and let’s index its second version soon after we do that. The commands that do this look as follows:
curl -XPUT 'localhost:9200/blog/article/10' -d '{"title":"Test document"}'
curl -XPUT 'localhost:9200/blog/article/10' -d '{"title":"Updated test document"}'
Because we’ve indexed the document with the same identifier twice, it should have a version of 2 (you can check it using the GET request).
Now, let’s try deleting the document we’ve just indexed, but let’s specify a version property equal to 1. By doing this, we tell Elasticsearch that we are interested in deleting the document only if it has the provided version. Because the document is at a different version now, Elasticsearch shouldn’t allow the delete with version 1. Let’s check if what we say is true. The command we will use to send the delete request looks as follows:
curl -XDELETE 'localhost:9200/blog/article/10?version=1'
The response generated by Elasticsearch should be similar to the following one:
{
"error":{
"root_cause":[{
"type":"version_conflict_engine_exception",
"reason":"[article][10]: version conflict, current [2], provided [1]",
"shard":1,
"index":"blog"
}],
"type":"version_conflict_engine_exception",
"reason":"[article][10]: version conflict, current [2], provided [1]",
"shard":1,
"index":"blog"
},
"status":409
}
As you can see, the delete operation was not successful because the versions didn’t match. If we set the version property to 2, the delete operation would be successful:
curl -XDELETE 'localhost:9200/blog/article/10?version=2&pretty'
The response this time will look as follows:
{
"found":true,
"_index":"blog",
"_type":"article",
"_id":"10",
"_version":3,
"_shards":{
"total":2,
"successful":1,
"failed":0
}
}
This time the delete operation has been successful because the provided version matched the current one.
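The version check behind this exchange can be modeled in a few lines. The sketch below is a toy in-memory illustration of optimistic locking, not Elasticsearch code: VersionConflict, delete, and the store dictionary are hypothetical names.

```python
# Toy model of optimistic locking on delete: the operation succeeds only when
# the caller's version matches the stored one. All names are hypothetical.

class VersionConflict(Exception):
    """Raised when the provided version does not match the stored one."""


def delete(store, doc_id, version=None):
    """Delete a document, optionally requiring a specific current version."""
    current = store[doc_id]["_version"]
    if version is not None and version != current:
        raise VersionConflict(
            "version conflict, current [%d], provided [%d]" % (current, version)
        )
    del store[doc_id]
    # The delete itself counts as a new version, as in the response above.
    return current + 1


store = {"10": {"_version": 2, "_source": {"title": "Updated test document"}}}
try:
    delete(store, "10", version=1)
except VersionConflict as error:
    print(error)
print(delete(store, "10", version=2))
```

A client that receives the conflict would typically re-read the document, reapply its change, and retry with the fresh version.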
Versioning from external systems
The very good thing about Elasticsearch versioning capabilities is that we can provide the version of the document that we would like Elasticsearch to use. This allows us to provide versions from external data systems that are our primary data stores. To do this, we need to provide an additional parameter during indexing, version_type=external, and, of course, the version itself. For example, if we would like our document to have the 12345 version, we could send a request like this:
curl -XPUT 'localhost:9200/blog/article/20?version=12345&version_type=external' -d '{"title":"Test document"}'
The response returned by Elasticsearch is as follows:
{
"_index":"blog",
"_type":"article",
"_id":"20",
"_version":12345,
"_shards":{
"total":2,
"successful":1,
"failed":0
},
"created":true
}
We just need to remember that, when using version_type=external, we need to provide the version every time we index the document. In cases where we would like to change the document and use optimistic locking, we need to provide a version parameter higher than the version present in the document (with version_type=external_gte, an equal version is accepted as well).
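The comparison applied to externally supplied versions can be sketched as follows. The accepts_version function is a hypothetical helper; it assumes the documented behavior of the two external version types, where external accepts only a strictly higher version and external_gte also accepts an equal one.

```python
# Sketch of the version comparison for externally managed versions.
# accepts_version is a hypothetical helper, not an Elasticsearch API.

def accepts_version(stored, provided, version_type="external"):
    """Return True if a write carrying `provided` would be accepted."""
    if version_type == "external":
        return provided > stored       # strictly higher only
    if version_type == "external_gte":
        return provided >= stored      # equal is accepted as well
    raise ValueError("unsupported version_type: %s" % version_type)


print(accepts_version(12345, 12346))
print(accepts_version(12345, 12345))
print(accepts_version(12345, 12345, "external_gte"))
```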
Searching with the URI request query
Before getting into the wonderful world of the Elasticsearch query language, we would like to introduce you to the simple but pretty flexible URI request search, which allows us to use a simple Elasticsearch query combined with the Lucene query language. Of course, we will extend our search knowledge using Elasticsearch in Chapter 3, Searching Your Data, but for now we will stick to the simplest approach.
Sample data
For the purpose of this section of the book, we will create a simple index with two document types. To do this, we will run the following six commands:
curl -XPOST 'localhost:9200/books/es/1' -d '{"title":"Elasticsearch Server","published":2013}'
curl -XPOST 'localhost:9200/books/es/2' -d '{"title":"Elasticsearch Server Second Edition","published":2014}'
curl -XPOST 'localhost:9200/books/es/3' -d '{"title":"Mastering Elasticsearch","published":2013}'
curl -XPOST 'localhost:9200/books/es/4' -d '{"title":"Mastering Elasticsearch Second Edition","published":2015}'
curl -XPOST 'localhost:9200/books/solr/1' -d '{"title":"Apache Solr 4 Cookbook","published":2012}'
curl -XPOST 'localhost:9200/books/solr/2' -d '{"title":"Solr Cookbook Third Edition","published":2015}'
Running the preceding commands will create the books index with two types: es and solr. The title and published fields will be indexed and thus searchable.
URI search
All queries in Elasticsearch are sent to the _search endpoint. You can search a single index or multiple indices, and you can restrict your search to a given document type or multiple types. For example, in order to search our books index, we will run the following command:
curl -XGET 'localhost:9200/books/_search?pretty'
The results returned by Elasticsearch will include all the documents from our books index (because no query has been specified) and should look similar to the following:
{
"took":3,
"timed_out":false,
"_shards":{
"total":5,
"successful":5,
"failed":0
},
"hits":{
"total":6,
"max_score":1.0,
"hits":[{
"_index":"books",
"_type":"es",
"_id":"2",
"_score":1.0,
"_source":{
"title":"Elasticsearch Server Second Edition",
"published":2014
}
},{
"_index":"books",
"_type":"es",
"_id":"4",
"_score":1.0,
"_source":{
"title":"Mastering Elasticsearch Second Edition",
"published":2015
}
},{
"_index":"books",
"_type":"solr",
"_id":"2",
"_score":1.0,
"_source":{
"title":"Solr Cookbook Third Edition",
"published":2015
}
},{
"_index":"books",
"_type":"es",
"_id":"1",
"_score":1.0,
"_source":{
"title":"Elasticsearch Server",
"published":2013
}
},{
"_index":"books",
"_type":"solr",
"_id":"1",
"_score":1.0,
"_source":{
"title":"Apache Solr 4 Cookbook",
"published":2012
}
},{
"_index":"books",
"_type":"es",
"_id":"3",
"_score":1.0,
"_source":{
"title":"Mastering Elasticsearch",
"published":2013
}
}]
}
}
As you can see, the response has a header that tells you the total time of the query and the shards used in the query process. In addition to this, we have documents matching the query (the top 10 documents by default). Each document is described by the index, type, identifier, score, and the source of the document, which is the original document sent to Elasticsearch.
We can also run queries against many indices. For example, if we had another index called clients, we could also run a single query against these two indices as follows:
curl -XGET 'localhost:9200/books,clients/_search?pretty'
We can also run queries against all the data in Elasticsearch by omitting the index names completely or by using _all as the index name:
curl -XGET 'localhost:9200/_search?pretty'
curl -XGET 'localhost:9200/_all/_search?pretty'
In a similar manner, we can also choose the types we want to use during searching. For example, if we want to search only in the es type in the books index, we run a command as follows:
curl -XGET 'localhost:9200/books/es/_search?pretty'
Please remember that, in order to search for a given type, we need to specify the index or multiple indices. Elasticsearch allows us to have quite rich semantics when it comes to choosing index names. If you are interested, please refer to https://www.elastic.co/guide/en/elasticsearch/reference/current/multi-index.html; however, there is one thing we would like to point out. When running a query against multiple indices, it may happen that some of them do not exist or are closed. In such cases, the ignore_unavailable property comes in handy. When set to true, it tells Elasticsearch to ignore unavailable or closed indices.
For example, let’s try running the following query:
curl -XGET 'localhost:9200/books,non_existing/_search?pretty'
The response would be similar to the following one:
{
"error":{
"root_cause":[{
"type":"index_missing_exception",
"reason":"no such index",
"index":"non_existing"
}],
"type":"index_missing_exception",
"reason":"no such index",
"index":"non_existing"
},
"status":404
}
Now let’s check what will happen if we add ignore_unavailable=true to our request and execute the following command:
curl -XGET 'localhost:9200/books,non_existing/_search?pretty&ignore_unavailable=true'
In this case, Elasticsearch would return the results without any error.
Elasticsearch query response
Let’s assume that we want to find all the documents in our books index that contain the elasticsearch term in the title field. We can do this by running the following query:
curl -XGET 'localhost:9200/books/_search?pretty&q=title:elasticsearch'
The response returned by Elasticsearch for the preceding request will be as follows:
{
"took":37,
"timed_out":false,
"_shards":{
"total":5,
"successful":5,
"failed":0
},
"hits":{
"total":4,
"max_score":0.625,
"hits":[{
"_index":"books",
"_type":"es",
"_id":"1",
"_score":0.625,
"_source":{
"title":"Elasticsearch Server",
"published":2013
}
},{
"_index":"books",
"_type":"es",
"_id":"2",
"_score":0.5,
"_source":{
"title":"Elasticsearch Server Second Edition",
"published":2014
}
},{
"_index":"books",
"_type":"es",
"_id":"4",
"_score":0.5,
"_source":{
"title":"Mastering Elasticsearch Second Edition",
"published":2015
}
},{
"_index":"books",
"_type":"es",
"_id":"3",
"_score":0.19178301,
"_source":{
"title":"Mastering Elasticsearch",
"published":2013
}
}]
}
}
The first section of the response gives us information about how much time the request took (the took property is specified in milliseconds), whether it timed out (the timed_out property), and information about the shards that were queried during the request execution: the number of queried shards (the total property of the _shards object), the number of shards that returned the results successfully (the successful property of the _shards object), and the number of failed shards (the failed property of the _shards object). The query may also time out if it is executed for a longer period than we want. (We can specify the maximum query execution time using the timeout parameter.) A failed shard means that something went wrong with that shard or it was not available during the search execution.
Of course, the mentioned information can be useful, but usually, we are interested in the results that are returned in the hits object. We have the total number of documents returned by the query (in the total property) and the maximum score calculated (in the max_score property). Finally, we have the hits array that contains the returned documents. In our case, each returned document contains its index name (the _index property), the type (the _type property), the identifier (the _id property), the score (the _score property), and the _source field (usually, this is the JSON object sent for indexing).
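The structure just described can be walked with a short Python sketch. The summarize function is a hypothetical helper, and the sample string is an abbreviated response that keeps only two hits, so the total property (4) is larger than the number of entries in the hits array.

```python
# Sketch: pull the commonly used parts out of a search response.
# summarize is a hypothetical helper; sample is an abbreviated response.
import json


def summarize(response_text):
    """Return timing, shard, and hit information from a search response."""
    response = json.loads(response_text)
    shards = response["_shards"]
    hits = response["hits"]
    return {
        "took_ms": response["took"],
        "timed_out": response["timed_out"],
        "failed_shards": shards["failed"],
        "total_hits": hits["total"],
        "max_score": hits["max_score"],
        "titles": [hit["_source"]["title"] for hit in hits["hits"]],
    }


sample = """
{"took": 37, "timed_out": false,
 "_shards": {"total": 5, "successful": 5, "failed": 0},
 "hits": {"total": 4, "max_score": 0.625, "hits": [
   {"_index": "books", "_type": "es", "_id": "1", "_score": 0.625,
    "_source": {"title": "Elasticsearch Server", "published": 2013}},
   {"_index": "books", "_type": "es", "_id": "2", "_score": 0.5,
    "_source": {"title": "Elasticsearch Server Second Edition", "published": 2014}}
 ]}}
"""
summary = summarize(sample)
print(summary)
```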
Query analysis
You may wonder why the query we’ve run in the previous section worked. We indexed the Elasticsearch term and ran a query for elasticsearch and, even though they differ (capitalization), the relevant documents were found. The reason for this is the analysis. During indexing, the underlying Lucene library analyzes the documents and indexes the data according to the Elasticsearch configuration. By default, Elasticsearch will tell Lucene to index and analyze both string-based data and numbers. The same happens during querying because the URI request query maps to the query_string query (which will be discussed in Chapter 3, Searching Your Data), and this query is analyzed by Elasticsearch.
Let’s use the indices-analyze API (https://www.elastic.co/guide/en/elasticsearch/reference/current/indices-analyze.html). It allows us to see how the analysis process is done. With this, we can see what happened to one of the documents during indexing and what happened to our query phrase during querying.
In order to see what was indexed in the title field for the Elasticsearch Server phrase, we will run the following command:
curl -XGET 'localhost:9200/books/_analyze?pretty&field=title' -d 'Elasticsearch Server'
The response will be as follows:
{
"tokens":[{
"token":"elasticsearch",
"start_offset":0,
"end_offset":13,
"type":"<ALPHANUM>",
"position":0
},{
"token":"server",
"start_offset":14,
"end_offset":20,
"type":"<ALPHANUM>",
"position":1
}]
}
You can see that Elasticsearch has divided the text into two terms: the first one has a token value of elasticsearch and the second one has a token value of server.
Now let’s look at how the query text was analyzed. We can do this by running the following command:
curl -XGET 'localhost:9200/books/_analyze?pretty&field=title' -d 'elasticsearch'
The response of the request will look as follows:
{
"tokens":[{
"token":"elasticsearch",
"start_offset":0,
"end_offset":13,
"type":"<ALPHANUM>",
"position":0
}]
}
We can see that the word is the same as the original one that we passed to the query. We won’t get into the Lucene query details and how the query parser constructed the query, but in general the indexed term after the analysis was the same as the one in the query after the analysis; so, the document matched the query and the result was returned.
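Why the analyzed query term matches the analyzed title can be shown with a rough Python approximation. The analyze function below is a hypothetical stand-in that only splits on non-alphanumeric characters and lowercases; it is not the analyzer Elasticsearch uses, although for these two inputs it yields the same tokens and offsets as the responses above.

```python
# Rough approximation of analysis: tokenize on non-alphanumeric characters and
# lowercase each token. analyze is a hypothetical stand-in, not the real analyzer.
import re


def analyze(text):
    """Return tokens with their character offsets and positions."""
    return [
        {
            "token": match.group().lower(),
            "start_offset": match.start(),
            "end_offset": match.end(),
            "position": position,
        }
        for position, match in enumerate(re.finditer(r"[A-Za-z0-9]+", text))
    ]


indexed = analyze("Elasticsearch Server")   # what happens at index time
query = analyze("elasticsearch")            # what happens at query time
indexed_terms = {token["token"] for token in indexed}

print([token["token"] for token in indexed])
# The document matches because the query term equals one of the indexed terms.
print(query[0]["token"] in indexed_terms)
```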
URI query string parameters
There are a few parameters that we can use to control URI query behavior, which we will discuss now. The thing to remember is that the parameters in the query should be joined with the & character, as shown in the following example:
curl -XGET 'localhost:9200/books/_search?pretty&q=published:2013&df=title&explain=true&default_operator=AND'
Please remember to enclose the URL of the request in ' characters because, on Linux-based systems, the & character will be interpreted by the shell.
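When the request is built in a program rather than typed into a shell, a URL library can join and encode the parameters. The following is a minimal Python sketch using the standard urllib.parse module; build_search_url is a hypothetical helper, and pretty=true is used here instead of the bare pretty flag so that every parameter has a value.

```python
# Sketch: build a URI search request with the standard library.
# build_search_url is a hypothetical helper name.
from urllib.parse import parse_qs, urlencode, urlsplit


def build_search_url(host, index, params):
    """Join the parameters with & and percent-encode their values."""
    return "http://%s/%s/_search?%s" % (host, index, urlencode(params))


url = build_search_url("localhost:9200", "books", {
    "pretty": "true",
    "q": "published:2013",
    "df": "title",
    "explain": "true",
    "default_operator": "AND",
})
print(url)
# Decoding the query string gives back the original parameter values.
print(parse_qs(urlsplit(url).query))
```

Note that urlencode percent-encodes the colon in the q value; decoding restores the original published:2013 text.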
The query
The q parameter allows us to specify the query that we want our documents to match. It allows us to specify the query using the Lucene query syntax described in the Lucene query syntax section later in this chapter. For example, a simple query would look like this: q=title:elasticsearch.
The default search field
Using the df parameter, we can specify the default search field that should be used when no field indicator is used in the q parameter. By default, the _all field will be used. (This is the field that Elasticsearch uses to copy the content of all the other fields. We will discuss this in greater depth in Chapter 2, Indexing Your Data.) An example of the df parameter value can be df=title.
Analyzer
The analyzer property allows us to define the name of the analyzer that should be used to analyze our query. By default, our query will be analyzed by the same analyzer that was used to analyze the field contents during indexing.
The default operator property
The default_operator property, which can be set to OR or AND, allows us to specify the default Boolean operator used for our query (http://en.wikipedia.org/wiki/Boolean_algebra). By default, it is set to OR, which means that a single query term match will be enough for a document to be returned. Setting this parameter to AND for a query will result in returning the documents that match all the query terms.
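The difference between the two operators can be illustrated with a tiny Python model of term matching. The matches function is hypothetical and ignores scoring and analysis; it only shows when a document qualifies under OR and under AND.

```python
# Toy model of the default Boolean operator: OR needs any query term to match,
# AND needs all of them. matches is a hypothetical helper.

def matches(document_terms, query_terms, default_operator="OR"):
    """Return True if the document satisfies the query under the given operator."""
    found = [term in document_terms for term in query_terms]
    if default_operator == "AND":
        return all(found)
    if default_operator == "OR":
        return any(found)
    raise ValueError("default_operator must be OR or AND")


title_terms = {"mastering", "elasticsearch"}
print(matches(title_terms, ["elasticsearch", "server"]))
print(matches(title_terms, ["elasticsearch", "server"], "AND"))
```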
Query explanation
If we set the explain parameter to true, Elasticsearch will include additional explain information with each document in the result, such as the shard from which the document was fetched and detailed information about the scoring calculation (we will talk more about it in the Understanding the explain information section in Chapter 6, Make Your Search Better). Also remember not to fetch the explain information during normal search queries because it requires additional resources and degrades query performance. For example, a query that includes explain information could look as follows:
curl -XGET 'localhost:9200/books/_search?pretty&explain=true&q=title:solr'
The results returned by Elasticsearch for the preceding query would be as follows:
{
"took":2,
"timed_out":false,
"_shards":{
"total":5,
"successful":5,
"failed":0
},
"hits":{
"total":2,
"max_score":0.70273256,
"hits":[{
"_shard":2,
"_node":"v5iRsht9SOWVzu-GY-YHlA",
"_index":"books",
"_type":"solr",
"_id":"2",
"_score":0.70273256,
"_source":{
"title":"Solr Cookbook Third Edition",
"published":2015
},
"_explanation":{
"value":0.70273256,
"description":"weight(title:solr in 0) [PerFieldSimilarity], result of:",
"details":[{
"value":0.70273256,
"description":"fieldWeight in 0, product of:",
"details":[{
"value":1.0,
"description":"tf(freq=1.0), with freq of:",
"details":[{
"value":1.0,
"description":"termFreq=1.0",
"details":[]
}]
},{
"value":1.4054651,
"description":"idf(docFreq=1,maxDocs=3)",
"details":[]
},{
"value":0.5,
"description":"fieldNorm(doc=0)",
"details":[]
}]
}]
}
},{
"_shard":3,
"_node":"v5iRsht9SOWVzu-GY-YHlA",
"_index":"books",
www.EBooksWorld.ir
![Page 112: dl.ebooksworld.irdl.ebooksworld.ir/motoman/Packt.Elasticsearch.Server.3rd.Edition.w… · Table of Contents Elasticsearch Server Third Edition Credits About the Authors About the](https://reader035.vdocuments.mx/reader035/viewer/2022070908/5f87a74e61533a78dd166ada/html5/thumbnails/112.jpg)
"_type":"solr",
"_id":"1",
"_score":0.5,
"_source":{
"title":"ApacheSolr4Cookbook",
"published":2012
},
"_explanation":{
"value":0.5,
"description":"weight(title:solrin1)[PerFieldSimilarity],
resultof:",
"details":[{
"value":0.5,
"description":"fieldWeightin1,productof:",
"details":[{
"value":1.0,
"description":"tf(freq=1.0),withfreqof:",
"details":[{
"value":1.0,
"description":"termFreq=1.0",
"details":[]
}]
},{
"value":1.0,
"description":"idf(docFreq=1,maxDocs=2)",
"details":[]
},{
"value":0.5,
"description":"fieldNorm(doc=1)",
"details":[]
}]
}]
}
}]
}
}
The fields returned
By default, for each document returned, Elasticsearch will include the index name, the type name, the document identifier, score, and the _source field. We can modify this behavior by adding the fields parameter and specifying a comma-separated list of field names. The fields will be retrieved from the stored fields (if they exist; we will discuss them in Chapter 2, Indexing Your Data) or from the internal _source field. By default, the value of the fields parameter is _source. An example is: fields=title,priority.
We can also disable the fetching of the _source field by adding the _source parameter with its value set to false.
Sorting the results
Using the sort parameter, we can specify custom sorting. The default behavior of Elasticsearch is to sort the returned documents in descending order of the value of the _score field. If we want to sort our documents differently, we need to specify the sort parameter. For example, adding sort=published:desc will sort the documents in descending order of the published field. By adding the sort=published:asc parameter, we will tell Elasticsearch to sort the documents on the basis of the published field in ascending order.
If we specify custom sorting, Elasticsearch will omit the _score field calculation for the documents. This may not be the desired behavior in your case. If you still want to keep track of the scores for each document when using a custom sort, you should add the track_scores=true property to your query. Please note that tracking the scores when doing custom sorting will make the query a little bit slower (you may not even notice the difference) due to the processing power needed to calculate the score.
The search timeout
By default, Elasticsearch doesn’t have a timeout for queries, but you may want your queries to time out after a certain amount of time (for example, 5 seconds). Elasticsearch allows you to do this by exposing the timeout parameter. When the timeout parameter is specified, the query will be executed up to the given timeout value and the results that were gathered up to that point will be returned. To specify a timeout of 5 seconds, you will have to add the timeout=5s parameter to your query.
The results window
Elasticsearch allows you to specify the results window (the range of documents in the results list that should be returned). We have two parameters that allow us to specify the results window: size and from. The size parameter defaults to 10 and defines the maximum number of results returned. The from parameter defaults to 0 and specifies the zero-based offset of the first document to return. In order to return five documents starting from the 11th one, we will add the following parameters to the query: size=5&from=10.
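The relation between a position in the results list and the from parameter can be sketched in a few lines of Python. This is only an illustration under our own assumptions: the results_window_params helper is hypothetical (it is not part of Elasticsearch or of any client library), and the block only builds the query string without contacting a cluster.

```python
from urllib.parse import urlencode


def results_window_params(query, first_result, count):
    """Build URI search parameters that return `count` hits starting at the
    1-based position `first_result` (hypothetical helper, for illustration)."""
    if first_result < 1 or count < 0:
        raise ValueError("first_result must be >= 1 and count must be >= 0")
    params = {
        "q": query,
        "size": count,             # maximum number of results returned
        "from": first_result - 1,  # from is zero-based, so the 11th hit is 10
    }
    # Keep ':' unescaped so field:term queries look like the ones in the text.
    return urlencode(params, safe=":")


# Five documents starting from the 11th one.
print(results_window_params("title:solr", first_result=11, count=5))
```

The resulting string can be appended after the ? character of a _search URI request such as the ones shown earlier.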
Limiting per-shard results
Elasticsearch allows us to specify the maximum number of documents that should be fetched from each shard using the terminate_after property. For example, if we want to get no more than 100 documents from each shard, we can add terminate_after=100 to our URI request.
Ignoring unavailable indices
When running queries against multiple indices, it is handy to tell Elasticsearch that we don’t care about the indices that are not available. By default, Elasticsearch will throw an error if one of the indices is not available, but we can change this by simply adding the ignore_unavailable=true parameter to our URI request.
The search type
The URI query allows us to specify the search type using the search_type parameter, which defaults to query_then_fetch. The two values that we can use here are dfs_query_then_fetch and query_then_fetch. The rest of the search types available in older Elasticsearch versions are now deprecated or removed. We’ll learn more about search types in the Understanding the querying process section of Chapter 3, Searching Your Data.
Lowercasing term expansion
Some queries, such as the prefix query, use query expansion. We will discuss this in the Query rewrite section in Chapter 4, Extending Your Querying Knowledge. We are allowed to define whether the expanded terms should be lowercased or not using the lowercase_expanded_terms property. By default, the lowercase_expanded_terms property is set to true, which means that the expanded terms will be lowercased.
Wildcard and prefix analysis
By default, wildcard queries and prefix queries are not analyzed. If we want to change this behavior, we can set the analyze_wildcard property to true.
Note
If you want to see all the parameters exposed by Elasticsearch as the URI request parameters, please refer to the official documentation available at: https://www.elastic.co/guide/en/elasticsearch/reference/current/search-uri-request.html.
Lucene query syntax
We thought that it would be good to know a bit more about what syntax can be used in the q parameter passed in the URI query. Some of the queries in Elasticsearch (such as the one currently being discussed) support the Lucene query parser syntax, the language that allows you to construct queries. Let’s take a look at it and discuss some basic features.
A query that we pass to Lucene is divided into terms and operators by the query parser. Let’s start with the terms; they come in two types: single terms and phrases. For example, to query for the book term in the title field, we will pass the following query:
title:book
To query for the elasticsearch book phrase in the title field, we will pass the following query:
title:"elasticsearch book"
You may have noticed that the name of the field comes first, followed by the term or the phrase.
As we already said, the Lucene query syntax supports operators. For example, the + operator tells Lucene that the given part must be matched in the document, meaning that the term we are searching for must be present in the field in the document. The - operator is the opposite, which means that such a part of the query can’t be present in the document. A part of the query without the + or - operator is treated as optional: it can be matched, but it is not mandatory. So, if we want to find a document with the book term in the title field and without the cat term in the description field, we send the following query:
+title:book -description:cat
We can also group multiple terms with parentheses, as shown in the following query:
title:(crime punishment)
We can also boost parts of the query with the ^ operator and the boost value after it (this increases their importance for the scoring algorithm; the higher the boost, the more important the query part is), as shown in the following query:
title:book^4
These are the basics of the Lucene query language and should allow you to use Elasticsearch and construct queries without any problems. However, if you are interested in the Lucene query syntax and you would like to explore it in depth, please refer to the official documentation of the query parser available at http://lucene.apache.org/core/5_4_0/queryparser/org/apache/lucene/queryparser/classic/package-summary.html.
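To make the role of the +, -, and ^ operators more concrete, here is a minimal Python sketch that assembles query strings like the ones shown in this section. The lucene_query and boosted helpers are hypothetical names introduced only for this illustration, and the sketch does no escaping of special characters, so treat it as a toy rather than as a query builder.

```python
def lucene_query(must=(), must_not=(), optional=()):
    """Join clauses into a Lucene query string (hypothetical helper).

    must     -> prefixed with '+', the clause has to match
    must_not -> prefixed with '-', the clause can't match
    optional -> no prefix, the clause may match but is not mandatory
    """
    parts = ["+" + clause for clause in must]
    parts += ["-" + clause for clause in must_not]
    parts += list(optional)
    return " ".join(parts)


def boosted(clause, boost):
    """Append the ^ operator and a boost value to a clause."""
    return f"{clause}^{boost}"


print(lucene_query(must=["title:book"], must_not=["description:cat"]))
print(boosted("title:book", 4))
```

A string produced this way is what we would pass as the value of the q parameter.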
Summary
In this chapter, we learned what full text search is and the contribution Apache Lucene makes to this. In addition to this, we are now familiar with the basic concepts of Elasticsearch and its top-level architecture. We used the Elasticsearch REST API not only to index data, but also to update, retrieve, and finally delete it. We’ve learned what versioning is and how we can use it for optimistic locking in Elasticsearch. Finally, we searched our data using the simple URI query.
In the next chapter, we’ll focus on indexing our data. We will see how Elasticsearch indexing works and what the role of primary shards and replicas is. We’ll see how Elasticsearch handles data that it doesn’t know and how to create our own mappings, the JSON structure that describes the structure of our index. We’ll also learn how to use batch indexing to speed up the indexing process and what additional information can be stored along with our index to help us achieve our goal. In addition, we will discuss what an index segment is, what segment merging is, and how to tune a segment. Finally, we’ll see how routing works in Elasticsearch and what options we have when it comes to both indexing and querying routing.
Chapter 2. Indexing Your Data
In the previous chapter, we learned what full text search is and how Apache Lucene fits there. We were introduced to the basic concepts of Elasticsearch and we are now familiar with its top-level architecture, so we know how it works. We used the REST API to index data, to update it, to delete it, and of course to retrieve it. We searched our data with the simple URI query and we used versioning that allowed us to use optimistic locking functionality. By the end of this chapter, you will have learned the following topics:
Basic information about Elasticsearch indexing
Adjusting Elasticsearch schema-less behavior
Creating your own mappings
Using out of the box analyzers
Configuring your own analyzers
Indexing data in batches
Adding additional internal information to indices
Segment merging
Routing
Elasticsearch indexing
So far we have our Elasticsearch cluster up and running. We also know how to use the Elasticsearch REST API to index our data, we know how to retrieve it, and we also know how to remove the data that we no longer need. We’ve also learned how to search in our data by using the URI request search and the Apache Lucene query language. However, until now we’ve used Elasticsearch functionality that allows us not to care about indices, shards, and data structure. This is not something that you may be used to when you are coming from the world of SQL databases, where you need the database and the tables with all the columns created upfront. In general, there you need to describe the data structure to be able to put data into the database. Elasticsearch is schema-less and by default creates indices automatically, and because of that we can just install it and index data without any preparation. However, this is usually not the best situation when it comes to production environments, where you want to control the analysis of your data. Because of that, we will start by showing you how to manage your indices and then we will take you through the world of mappings in Elasticsearch.
Shards and replicas
In Chapter 1, Getting Started with Elasticsearch Cluster, we told you that indices in Elasticsearch are built from one or more shards. Each of those shards contains part of the document set and each shard is a separate Lucene index. In addition to that, each shard can have replicas, which are physical copies of the primary shard itself. When we create an index, we can tell Elasticsearch how many shards it should be built from.
Note
The default number of shards that Elasticsearch uses is 5 and each index will also contain a single replica. The default configuration can be changed by setting the index.number_of_shards and index.number_of_replicas properties in the elasticsearch.yml configuration file.
When defaults are used, we will end up with five Apache Lucene indices that our Elasticsearch index is built of and one replica for each of those. So, with five shards and one replica, we would actually get 10 shards. This is because each shard would get its own copy, so the total number of shards in the cluster would be 10.
Dividing indices in such a way allows us to spread the shards across the cluster. The nice thing about that is that all the shards will be automatically spread throughout the cluster. If we have a single node, Elasticsearch will put the five primary shards on that node and will leave the replicas unassigned, because Elasticsearch doesn’t assign shards and their replicas to the same node. The reason for that is simple: if a node crashed, we would lose both the primary source of the data and all the copies. So, if you have one Elasticsearch node, don’t worry about replicas not being assigned; it is something to be expected. Of course, when you have enough nodes for Elasticsearch to assign all the replicas (in addition to shards), it is not good to have them unassigned and you should look for the probable causes of that situation.
The thing to remember is that having shards and replicas is not free. First of all, each replica needs additional disk space, exactly the same amount of space that the original shard needs. So if we have 3 replicas for our index, we will actually need 4 times the space. If our primary shards weigh 100 GB in total, with 3 replicas we would need 400 GB: 100 GB for the primary shards and 100 GB for each set of replicas. However, this is not the only cost. Each replica is a Lucene index on its own and Elasticsearch needs some memory to handle that. The more shards in the cluster, the more memory is being used. And finally, having replicas means that we will have to do indexation on each of the replicas, in addition to the indexation on the primary shard. There is a notion of shadow replicas which can copy the whole binary index, but, in most cases, each replica will do its own indexation. The good thing about replicas is that Elasticsearch will try to spread the query and get requests evenly between the shards and their replicas, which means that we can scale our cluster horizontally by using them.
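The arithmetic behind these numbers is simple enough to write down. The following Python sketch only restates the calculation from the text; the function names are ours, and the disk estimate assumes, as the text does, that every replica takes exactly as much space as the shard it copies.

```python
def total_shards(primary_shards, replicas):
    """Number of Lucene indices (primaries plus all their copies)."""
    return primary_shards * (1 + replicas)


def total_disk_gb(primary_data_gb, replicas):
    """Disk space needed when each replica is as large as its primary."""
    return primary_data_gb * (1 + replicas)


# Default settings: five shards and one replica of each.
print(total_shards(primary_shards=5, replicas=1))
# 100 GB of primary data with three replicas.
print(total_disk_gb(primary_data_gb=100, replicas=3))
```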
So to sum up the conclusions:
Having more shards in the index allows us to spread the index between more servers and parallelize the indexing operations and thus have better indexing throughput.
Depending on your deployment, having more shards may increase query throughput and lower query latency, especially in environments that don’t have a large number of queries per second.
Having more shards may be slower compared to a single shard query, because Elasticsearch needs to retrieve the data from multiple servers and combine them together in memory, before returning the final query results.
Having more replicas results in a more resilient cluster, because when the primary shard is not available, its copy will take that role. Basically, having a single replica allows us to lose one copy of a shard and still serve the whole data. Having two replicas allows us to lose two copies of the shard and still serve the whole data.
The higher the replica count, the higher the query throughput the cluster will have. That’s because each replica can serve the data it has independently from all the others.
A higher number of shards (both primary and replicas) will result in more memory needed by Elasticsearch.
Of course, these are not the only relationships between the number of shards and replicas in Elasticsearch. We will talk about most of them later in the book.
So, how many shards and replicas should we have for our indices? That depends. We believe that the defaults are quite good but nothing can replace a good test. Note that the number of replicas is not very important because you can adjust it on a live cluster after index creation. You can remove and add them if you want and have the resources to run them. Unfortunately, this is not true when it comes to the number of shards. Once you have your index created, the only way to change the number of shards is to create another index and re-index your data.
Write consistency
Elasticsearch allows us to control the write consistency to prevent writes happening when they should not. By default, an Elasticsearch indexing operation is successful when the write is successful on a quorum of active shards, meaning 50% of the active shards plus one. We can control this behavior by adding action.write_consistency to our elasticsearch.yml file or by adding the consistency parameter to our index request. The mentioned properties can take the following values:
quorum: The default value, requiring 50% plus 1 active shards to be successful for the index operation to succeed
one: Requires only a single active shard to be successful for the index operation to succeed
all: Requires all the active shards to be successful for the index operation to succeed
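As a rough illustration of the three values, the following Python sketch computes how many shard copies would have to be active under each setting. It follows the "50% plus one" wording of the text literally, using integer division (an assumption on our side); the required_copies name is hypothetical, and the real Elasticsearch behavior for indices with very few copies may differ from this simplified formula.

```python
def required_copies(consistency, shard_copies):
    """Shard copies that must be active for a write to proceed.

    shard_copies counts the primary and all of its replicas.
    Simplified arithmetic based on the description in the text.
    """
    if consistency == "one":
        return 1
    if consistency == "all":
        return shard_copies
    if consistency == "quorum":
        # 50% of the copies plus one, rounded down (our assumption).
        return shard_copies // 2 + 1
    raise ValueError(f"unknown consistency value: {consistency}")


# One primary and two replicas give three copies in total.
for level in ("one", "quorum", "all"):
    print(level, required_copies(level, shard_copies=3))
```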
Creating indices
When we were indexing our documents in Chapter 1, Getting Started with Elasticsearch Cluster, we didn’t care about index creation at all. We assumed that Elasticsearch would do everything for us and actually it was true; we just used the following command:
curl -XPUT 'http://localhost:9200/blog/article/1' -d '{"title": "New version of Elasticsearch released!", "content": "Version 1.0 released today!", "tags": ["announce", "elasticsearch", "release"]}'
This is just fine. If such an index does not exist, Elasticsearch automatically creates the index for us. However, there are times when we want to create indices ourselves for various reasons. Maybe we would like to have control over which indices are created to avoid errors, or maybe we have some nondefault settings that we would like to use when creating a particular index. The reasons may differ, but it’s good to know that we can create indices without indexing documents.
The simplest way to create an index is to run a PUT HTTP request with the name of the index we want to create. For example, to create an index called blog, we could use the following command:
curl -XPUT http://localhost:9200/blog/
We just told Elasticsearch that we want to create the index with the name blog. If everything goes right, you will see the following response from Elasticsearch:
{"acknowledged":true}
Altering automatic index creation
We already mentioned that automatic index creation is not the best idea in some cases. For example, a simple typo in an index name can lead to creating hundreds of unused indices and make the cluster state information larger than it should be, putting more pressure on Elasticsearch and the underlying JVM. Because of that, we can turn off automatic index creation by adding a simple property to the elasticsearch.yml configuration file:
action.auto_create_index: false
Let’s stop for a while and discuss the action.auto_create_index property, because it allows us to do more complicated things than just allowing (setting it to true) and disabling (setting it to false) automatic index creation. The mentioned property allows us to use patterns that specify the index names which should be allowed to be automatically created and which should be disallowed. For example, let’s assume that we would like to allow automatic index creation for indices starting with logs and we would like to disallow all the others. To do something like this, we would set the action.auto_create_index property as follows:
action.auto_create_index: +logs*,-*
Now if we would like to create an index called logs_2015-10-01, we would succeed. To create such an index, we would use the following command:
curl -XPUT http://localhost:9200/logs_2015-10-01/log/1 -d '{"message": "Test log message"}'
Elasticsearch would respond with:
{
  "_index": "logs_2015-10-01",
  "_type": "log",
  "_id": "1",
  "_version": 1,
  "_shards": {
    "total": 2,
    "successful": 1,
    "failed": 0
  },
  "created": true
}
However, suppose we now try to create the blog index by indexing a document with the following command:
curl -XPUT http://localhost:9200/blog/article/1 -d '{"title": "Test article title"}'
Elasticsearch would respond with an error similar to the following one:
{
  "error": {
    "root_cause": [{
      "type": "index_not_found_exception",
      "reason": "no such index",
      "resource.type": "index_expression",
      "resource.id": "blog",
      "index": "blog"
    }],
    "type": "index_not_found_exception",
    "reason": "no such index",
    "resource.type": "index_expression",
    "resource.id": "blog",
    "index": "blog"
  },
  "status": 404
}
One thing to remember is that the order of pattern definitions matters. Elasticsearch checks the patterns up to the first pattern that matches, so if we move -* to be the first pattern, the +logs* pattern won’t be used at all.
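The first-match-wins rule can be emulated with a short Python sketch. This is not how Elasticsearch implements the check; it is a hypothetical auto_create_allowed function built on Python's fnmatch module, and it assumes that a name matching no pattern is not created automatically.

```python
from fnmatch import fnmatchcase


def auto_create_allowed(index_name, setting):
    """Emulate an action.auto_create_index pattern list (illustration only).

    Patterns are checked from left to right and the first match decides:
    a '+' prefix (or no prefix) allows creation, a '-' prefix forbids it.
    """
    for raw in setting.split(","):
        pattern = raw.strip()
        allow = not pattern.startswith("-")
        glob = pattern.lstrip("+-")
        if fnmatchcase(index_name, glob):
            return allow
    return False  # assumption: no matching pattern means no auto creation


print(auto_create_allowed("logs_2015-10-01", "+logs*,-*"))
print(auto_create_allowed("blog", "+logs*,-*"))
# With -* first, the +logs* pattern is never reached.
print(auto_create_allowed("logs_2015-10-01", "-*,+logs*"))
```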
Settings for a newly created index
Manual index creation is also necessary when we want to pass nondefault configuration options during index creation; for example, the initial number of shards and replicas. We can do that by including a JSON payload with settings as the PUT HTTP request body. For example, if we would like to tell Elasticsearch that our blog index should only have a single shard and two replicas initially, the following command could be used:
curl -XPUT http://localhost:9200/blog/ -d '{
  "settings": {
    "number_of_shards": 1,
    "number_of_replicas": 2
  }
}'
The preceding command will result in the creation of the blog index with one shard and two replicas, making a total of three physical Lucene indices, called shards as we already know. Of course there are a lot more settings that we can use, but what we did is enough for now and we will learn about the rest throughout the book.
Index deletion
Of course, similar to how we handled documents, Elasticsearch allows us to delete indices as well. Deleting an index is very similar to creating it, but instead of using the PUT HTTP method, we use the DELETE one. For example, if we would like to delete our previously created blog index, we would run the following command:
curl -XDELETE http://localhost:9200/blog
The response will be the same as the one we saw earlier when we created an index and should look as follows:
{"acknowledged":true}
Now that we know what an index is, how to create it, and how to delete it, we are ready to create indices with the mappings we have defined. Even though Elasticsearch is schema-less, there are a lot of situations where we would like to manually create the schema, to avoid any problems with the index structure.
Mappings configuration
If you are used to SQL databases, you may know that before you can start inserting the data in the database, you need to create a schema, which will describe what your data looks like. Although Elasticsearch is a schema-less (we rather call it data driven schema) search engine and can figure out the data structure on the fly, we think that controlling the structure and thus defining it ourselves is a better way. The field type determining mechanism is not going to guess the future. For example, if you first send an integer value, such as 60, and then send a float value such as 70.23 for the same field, an error can happen or Elasticsearch will just cut off the decimal part of the float value (which is actually what happens). This is because Elasticsearch will first set the field type to integer and will then try to index the float value into the integer field, which truncates the decimal part of the floating point number. In the next few pages you’ll see how to create mappings that suit your needs and match your data structure.
Note
Note that we didn’t include all the information about the available types in this chapter and some features of Elasticsearch, such as nested type, parent-child handling, storing geographical points, and search, are described in the following chapters of this book.
Type determining mechanism
Before we start describing how to create mappings manually, we want to get back to the automatic type determining algorithm used in Elasticsearch. As we already said, Elasticsearch can try guessing the schema for our documents by looking at the JSON that the document is built from. Because JSON is structured, that seems easy to do. For example, strings are surrounded by quotation marks, Booleans are defined using specific words, and numbers are just a few digits. This is a simple trick, but it usually works. For example, let’s look at the following document:
{
  "field1": 10,
  "field2": "10"
}
The preceding document has two fields. The field1 field will be given a numeric type (to be precise, that field will be given the long type). The second field, called field2, will be given the string type, because it is surrounded by quotation marks. Of course, for some use cases this can be the desired behavior. However, if we surrounded all the data with quotation marks (which is not the best idea anyway), our index structure would contain only string type fields.
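The guessing trick described above can be imitated in a few lines of Python. The sketch below is only an approximation written for this section: the guess_field_types function is hypothetical, it looks at top-level fields only, and the type names it returns (long, double, boolean, string) are our assumption of what the automatic mechanism would choose.

```python
import json


def guess_field_types(document_json):
    """Guess a type name for every top-level field of a JSON document."""
    guessed = {}
    for name, value in json.loads(document_json).items():
        if isinstance(value, bool):   # checked first: bool is a kind of int
            guessed[name] = "boolean"
        elif isinstance(value, int):
            guessed[name] = "long"
        elif isinstance(value, float):
            guessed[name] = "double"
        elif isinstance(value, str):  # anything in quotation marks
            guessed[name] = "string"
        else:
            guessed[name] = "other"   # objects, arrays, and null are skipped here
    return guessed


print(guess_field_types('{"field1": 10, "field2": "10"}'))
```

Because the decision is based purely on how the value is written in JSON, the quoted "10" ends up as a string even though it looks like a number.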
Note
Don’t worry about the fact that you are not familiar with the numeric types, the string types, and so on. We will describe them after we show you what you can do to tune the automatic type determining mechanism in Elasticsearch.
Disabling the type determining mechanism
The first solution is to completely disable the schema-less behavior in Elasticsearch. We can do that by adding the index.mapper.dynamic property to our index properties and setting it to false. We can do that by running the following command to create the index:
curl -XPUT 'localhost:9200/sites' -d '{
  "index.mapper.dynamic": false
}'
By doing that, we told Elasticsearch that we don’t want it to guess the type of our documents in the sites index and that we will provide the mappings ourselves. If we try indexing an example document to the sites index, we will get the following error:
{
  "error": {
    "root_cause": [{
      "type": "type_missing_exception",
      "reason": "type[[doc, trying to auto create mapping, but dynamic mapping is disabled]] missing",
      "index": "sites"
    }],
    "type": "type_missing_exception",
    "reason": "type[[doc, trying to auto create mapping, but dynamic mapping is disabled]] missing",
    "index": "sites"
  },
  "status": 404
}
This is because we didn’t create any mappings, so no schema for documents was created. Elasticsearch couldn’t create one for us because we didn’t allow it, and the indexation command failed.
Of course this is not the only thing we can do when it comes to configuring how the type determining mechanism works. We can also tune it or disable it for a given type on the object level. We will talk about the second case in Chapter 5, Extending Your Index Structure. For now, let’s look at the possibilities of tuning the type determining mechanism in Elasticsearch.
Tuning the type determining mechanism for numeric types
One of the problems with JSON documents and type guessing is that we are not always in control of the data. The documents that we are indexing can come from multiple places and some systems in our environment may include quotation marks for all the fields in the document. This can lead to problems and bad guesses. Because of that, Elasticsearch allows us to enable more aggressive field value checking for numeric fields by setting the numeric_detection property to true in the mappings definition. For example, let’s assume that we want to create an index called users and we want it to have the user type, on which we want more aggressive numeric fields parsing. To do that, we will use the following command:
curl -XPUT 'http://localhost:9200/users/?pretty' -d '{
  "mappings": {
    "user": {
      "numeric_detection": true
    }
  }
}'
Now let’s run the following command to index a single document to the users index:
curl -XPOST http://localhost:9200/users/user/1 -d '{"name": "User 1", "age": "20"}'
Earlier, with the default settings, the age field would be set to the string type. With the numeric_detection property set to true, the type of the age field will be set to long. We can check that by running the following command (it will retrieve the mappings for all the types in the users index):
curl -XGET 'localhost:9200/users/_mapping?pretty'
The preceding command should result in the following response returned by Elasticsearch:
{
  "users": {
    "mappings": {
      "user": {
        "numeric_detection": true,
        "properties": {
          "age": {
            "type": "long"
          },
          "name": {
            "type": "string"
          }
        }
      }
    }
  }
}
As we can see, the age field was indeed set to be of the long type.
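The effect of numeric_detection on quoted values can be pictured with the following Python sketch. It is a simplification under our own assumptions: the guess_string_type function and its regular expressions are hypothetical, and the exact set of number formats that Elasticsearch accepts may be broader than what is matched here.

```python
import re

# Very strict patterns chosen for the sketch (an assumption, not the real rules).
_INTEGER = re.compile(r"[+-]?\d+")
_DECIMAL = re.compile(r"[+-]?\d+\.\d+")


def guess_string_type(value, numeric_detection=False):
    """Guess the type of a value that arrived in quotation marks."""
    if numeric_detection:
        if _INTEGER.fullmatch(value):
            return "long"
        if _DECIMAL.fullmatch(value):
            return "double"
    return "string"


print(guess_string_type("20"))                          # default behavior
print(guess_string_type("20", numeric_detection=True))  # aggressive checking
print(guess_string_type("User 1", numeric_detection=True))
```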
Tuning the type determining mechanism for dates
Another type of data that causes trouble is fields with dates. Dates can come in different flavors, for example, 2015-10-01 11:22:33 is a proper date and so is 2015-10-01T11:22:33+00. Because of that, Elasticsearch tries to match the fields to timestamps or strings that match some given date format. If that matching operation is successful, the field is treated as a date-based one. If we know how our date fields look, we can help Elasticsearch by providing a list of recognized date formats using the dynamic_date_formats property, which allows us to specify the formats array. Let’s look at the following command for creating an index:
curl -XPUT 'http://localhost:9200/blog/' -d '{
  "mappings": {
    "article": {
      "dynamic_date_formats": ["yyyy-MM-dd hh:mm"]
    }
  }
}'
The preceding command will result in the creation of an index called blog with the single type called article. We’ve also used the dynamic_date_formats property with a single date format that will result in Elasticsearch using the date core type (refer to the Core types section in this chapter for more information about field types) for fields matching the defined format. Elasticsearch uses the joda-time library to define the date formats, so visit http://joda-time.sourceforge.net/api-release/org/joda/time/format/DateTimeFormat.html if you are interested in knowing about them.
Note
Remember that the dynamic_date_formats property accepts an array of values. That means that we can handle several date formats simultaneously.
With the preceding index, we can now try indexing a new document using the following command:
curl -XPUT localhost:9200/blog/article/1 -d '{"name": "Test", "test_field": "2015-10-01 11:22"}'
Elasticsearch will of course index that document, but let’s look at the mappings created for our index:
curl -XGET 'localhost:9200/blog/_mapping?pretty'
The response for the preceding command will be as follows:
{
  "blog": {
    "mappings": {
      "article": {
        "dynamic_date_formats": ["yyyy-MM-dd hh:mm"],
        "properties": {
          "name": {
            "type": "string"
          },
          "test_field": {
            "type": "date",
            "format": "yyyy-MM-dd hh:mm"
          }
        }
      }
    }
  }
}
As we can see, the test_field field was given a date type, so our tuning works.
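To make the matching step more tangible, the following is a minimal Python sketch of the idea, not Elasticsearch’s actual implementation. The guess_type function and the DYNAMIC_DATE_FORMATS list are hypothetical names introduced only for this illustration, and the Joda-style pattern yyyy-MM-dd hh:mm is hand-translated to Python’s strptime syntax (in Joda-Time, hh is the 12-hour clock, which corresponds to %I).

```python
from datetime import datetime

# Hypothetical hand translation of the Joda-style pattern "yyyy-MM-dd hh:mm"
# into strptime syntax; hh (12-hour clock) becomes %I.
DYNAMIC_DATE_FORMATS = ["%Y-%m-%d %I:%M"]


def guess_type(value, formats=DYNAMIC_DATE_FORMATS):
    """Return "date" if the string matches any configured format, else "string"."""
    for fmt in formats:
        try:
            datetime.strptime(value, fmt)
            return "date"
        except ValueError:
            # This format did not match; try the next one in the array.
            continue
    return "string"


print(guess_type("2015-10-01 11:22"))     # matches the configured format
print(guess_type("Test"))                 # no format matches
print(guess_type("2015-10-01T11:22:33"))  # a valid date, but not in a listed format
```

The last call shows why the property takes an array: a value is only treated as a date if at least one of the listed formats matches it.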
Unfortunately, the problem still exists if we want the Boolean type to be guessed. There is no option to force the guessing of Boolean types from the text. In such cases, when a change of source format is impossible, we can only define the field directly in the mappings definition.
Index structure mapping
Every dataset has its own structure: some are very simple, and some include complicated object relations, child documents, and nested properties. In each case, we need a schema in Elasticsearch, called mappings, that defines how the data looks. Of course, we can rely on the schema-less nature of Elasticsearch, but we usually want to prepare the mappings up front, so that we know how the data is handled.
For the purposes of this chapter, we will use a single type in the index. Of course, Elasticsearch as a multitenant system allows us to have multiple types in a single index, but we want to simplify the example to make it easier to understand. So, for the purpose of the next few pages, we will create an index called posts that will hold data for documents in a post type. We also assume that the index will hold the following information:
Unique identifier of the blog post
Name of the blog post
Publication date
Contents (the text of the post itself)
In Elasticsearch, mappings, as with almost all communication, are sent as JSON objects in the request body. So, if we want to create the simplest mappings that match our needs, they will look as follows (we stored the mappings in the posts.json file, so we can easily send it):
{
"mappings":{
"post":{
"properties":{
"id":{"type":"long"},
"name":{"type":"string"},
"published":{"type":"date"},
"contents":{"type":"string"}
}
}
}
}
To create our posts index with the preceding mappings file, we will just run the following command:
curl -XPOST 'http://localhost:9200/posts' -d @posts.json
Note
You can store your mappings in a file with whatever name you like. The curl command will just take the contents of it.
And again, if everything goes well, we see the following response:
{"acknowledged":true}
Elasticsearch reported that our index has been created. If we look at the logs of the Elasticsearch node that is the current master, we will see something like the following:
[2015-10-14 15:02:12,840][INFO][cluster.metadata][Shalla-Bal] [posts] creating index, cause [api], templates [], shards [5]/[1], mappings [post]
We can see that the posts index has been created, with 5 shards and 1 replica (shards [5]/[1]) and with mappings for a single post type (mappings [post]). Let’s now discuss the contents of the posts.json file and the possibilities when it comes to mappings.
Type and types definition
The mappings definition in Elasticsearch is just another JSON object, so it needs to be properly started and ended with curly brackets. All the mappings definitions are nested inside a single mappings object. In our example, we had a single post type, but we can have multiple of them. For example, if we would like to have more than a single type in our mappings, we just need to separate them with a comma character. Let’s assume that we would like to have an additional user type in our posts index. The mappings definition in such a case will look as follows (we stored it in the posts_with_user.json file):
{
"mappings":{
"post":{
"properties":{
"id":{"type":"long"},
"name":{"type":"string"},
"published":{"type":"date"},
"contents":{"type":"string"}
}
},
"user":{
"properties":{
"id":{"type":"long"},
"name":{"type":"string"}
}
}
}
}
As you can see, we can name the types with the names we want. Under each type we have the properties object in which we store the actual names of the fields and their definitions.
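The nesting of types and properties can be checked with a few lines of Python. The sketch below mirrors the posts_with_user.json body as a Python dictionary and walks it; the fields_by_type helper is a hypothetical name used only for this illustration and is not part of any Elasticsearch client.

```python
import json

# The same structure as the posts_with_user.json file, written as a dict.
POSTS_WITH_USER = {
    "mappings": {
        "post": {
            "properties": {
                "id": {"type": "long"},
                "name": {"type": "string"},
                "published": {"type": "date"},
                "contents": {"type": "string"},
            }
        },
        "user": {
            "properties": {
                "id": {"type": "long"},
                "name": {"type": "string"},
            }
        },
    }
}


def fields_by_type(body):
    """Map every type name to the sorted list of field names defined under it."""
    return {
        type_name: sorted(definition.get("properties", {}))
        for type_name, definition in body["mappings"].items()
    }


# json.dumps produces the request body that would be stored in the file.
print(json.dumps(fields_by_type(POSTS_WITH_USER), indent=2))
```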
Fields
Each field in the mappings definition is just a name and an object describing the properties of the field. For example, we can have a field defined as the following:
"body":{"type":"string","store":"yes","index":"analyzed"}
The preceding field definition starts with a name: body. After that we have an object with three properties: the type of the field (the type property), whether the original field value should be stored (the store property), and whether the field should be indexed and how (the index property). And, of course, multiple field definitions are separated from each other using the comma character, just like other JSON objects.
Core types
Each field type in Elasticsearch can be given one of the provided core types. The core types in Elasticsearch are as follows:
String
Number (integer, long, float, double)
Date
Boolean
Binary
In addition to the core types, Elasticsearch provides additional types that can handle more complicated data, such as nested documents, object, and so on. We will talk about them in Chapter 5, Extending Your Index Structure.
Common attributes
Before continuing with all the core type descriptions, we would like to discuss some common attributes that you can use to describe all the types (except for the binary one):
index_name: This attribute defines the name of the field that will be stored in the index. If this is not defined, the name will be set to the name of the object that the field is defined with. Usually, you don’t need to set this property, but it may be useful in some cases; for example, when you don’t have control over the name of the fields in the JSON documents that are sent to Elasticsearch.
index: This attribute can take the values analyzed and no and, for string-based fields, the additional not_analyzed value. If set to analyzed, the field will be indexed and thus searchable. If set to no, you won’t be able to search on such a field. The default value is analyzed. For string-based fields, the not_analyzed value means that the field will be indexed but not analyzed. So, the field is written into the index as it was sent to Elasticsearch and only a perfect match will be counted during a search: the query will have to include exactly the same value as the value in the index. If we compare it to the SQL databases world, setting the index property of a field to not_analyzed would work just like using where field = value. Also remember that setting the index property to no excludes that field from the _all field (the include_in_all property is discussed as the last property in the list).
store: This attribute can take the values yes and no and specifies whether the original value of the field should be written into the index. The default value is no, which means that Elasticsearch won’t store the original value of the field and will try to use the _source field (the JSON representing the original document that has been sent to Elasticsearch) when you want to retrieve the field value. Stored fields are not used for searching; however, they can be used for highlighting if enabled (which may be more efficient than loading the _source field in case it is big).
doc_values: This attribute can take the values true and false. When set to true, Elasticsearch will create a special on-disk structure during indexing for non-tokenized fields (such as not-analyzed string fields, number-based fields, Boolean fields, and date fields). This structure is highly efficient and is used by Elasticsearch for operations that require uninverted data, such as aggregations, sorting, or scripting. Starting with Elasticsearch 2.0, the default value of this is true for non-tokenized fields. Setting this value to false will result in Elasticsearch using the field data cache instead of doc values, which has a higher memory demand, but may be faster in some rare situations.
boost: This attribute defines how important the field is inside the document; the higher the boost, the more important the values in the field are. The default value of this attribute is 1, which means a neutral value: anything above 1 will make the field more important, anything less than 1 will make it less important.
null_value: This attribute specifies a value that should be written into the index in case that field is not a part of an indexed document. The default behavior will just omit that field.
copy_to: This attribute specifies an array of fields to which the original value will be copied. This allows for different kinds of analysis of the same data. For example, you could imagine having two fields, one called title and one called title_sort, each having the same value but processed differently. We could use copy_to to copy the title field value to title_sort.
include_in_all: This attribute specifies whether the field should be included in the _all field. The _all field is a special field used by Elasticsearch to allow easy searching in the contents of the whole indexed document. Elasticsearch creates the content of the _all field by copying all the document fields there. By default, if the _all field is used, all the fields will be included in it.
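The difference between the three values of the index attribute can be illustrated with a small Python sketch. It is only a rough model under stated assumptions: the index_terms and term_matches functions are hypothetical, and the analyzed branch approximates analysis by lowercasing the value and splitting it into word tokens, which is much simpler than what a real Lucene analyzer does.

```python
import re


def index_terms(value, index="analyzed"):
    """Return the terms that would be searchable for a string value."""
    if index == "no":
        # The field is not indexed, so nothing can be searched.
        return []
    if index == "not_analyzed":
        # The whole value is written as a single, unmodified term.
        return [value]
    # "analyzed": approximate analysis with lowercased word tokens.
    return re.findall(r"\w+", value.lower())


def term_matches(query_term, value, index):
    """Check whether a single query term hits the indexed value."""
    return query_term in index_terms(value, index)


title = "Elasticsearch Server"
print(index_terms(title, "analyzed"))
print(index_terms(title, "not_analyzed"))
print(term_matches("server", title, "analyzed"))
print(term_matches("server", title, "not_analyzed"))
```

With not_analyzed, only the exact value Elasticsearch Server matches, which is the where field = value behavior described in the list above.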
String
String is the basic text type, which allows us to store one or more characters inside it. A sample definition of such a field is as follows:
"body":{"type":"string","store":"yes","index":"analyzed"}
In addition to the common attributes, the following attributes can also be set for the string-based fields:
term_vector: This attribute can take the values no (the default one), yes, with_offsets, with_positions, and with_positions_offsets. It defines whether or not to calculate the Lucene term vectors for that field. If you are using highlighting (marking which terms were matched in a document during the query), you will need to calculate the term vector for the so-called fast vector highlighting, a more efficient highlighting version.
analyzer: This attribute defines the name of the analyzer used for indexing and searching. It defaults to the globally defined analyzer name.
search_analyzer: This attribute defines the name of the analyzer used for processing the part of the query string that is sent to a particular field.
norms.enabled: This attribute specifies whether the norms should be loaded for a field. By default, it is set to true for analyzed fields (which means that the norms will be loaded for such fields) and to false for non-analyzed fields. Norms are values inside the Lucene index that are used when calculating a score for a document; they are usually not needed for not-analyzed fields and are used only during query time. An example index creation command that disables norms for a single field looks as follows:
curl -XPOST 'localhost:9200/essb' -d '{
  "mappings": {
    "book": {
      "properties": {
        "name": {
          "type": "string",
          "norms": {
            "enabled": false
          }
        }
      }
    }
  }
}'
norms.loading: This attribute takes the values eager and lazy and defines how Elasticsearch will load the norms. The first value means that the norms for such fields are always loaded. The second value means that the norms will be loaded only when needed. Norms are useful for scoring, but may require a vast amount of memory for large datasets. Having norms loaded eagerly (property set to eager) means less work during query time, but will lead to more memory consumption. An example index creation command that eagerly loads norms for a single field looks as follows:
curl -XPOST 'localhost:9200/essb_eager' -d '{
  "mappings": {
    "book": {
      "properties": {
        "name": {
          "type": "string",
          "norms": {
            "loading": "eager"
          }
        }
      }
    }
  }
}'
position_offset_gap: This attribute defaults to 0 and specifies the gap in the index between instances of the given field with the same name. Setting this to a higher value may be useful if you want position-based queries (such as phrase queries) to match only inside a single instance of the field.
index_options: This attribute defines the indexing options for the postings list, the structure holding the terms (we talk more about it in the Postings format section of this chapter). The possible values are docs (only document numbers are indexed), freqs (document numbers and term frequencies are indexed), positions (document numbers, term frequencies, and their positions are indexed), and offsets (document numbers, term frequencies, their positions, and offsets are indexed). The default value for this property is positions for analyzed fields and docs for fields that are indexed but not analyzed.
ignore_above: This attribute defines the maximum size of the field in characters. A field whose size is above the specified value will be ignored by the analyzer.
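The effect of position_offset_gap is easier to see with concrete token positions. The following Python sketch is a simplified model, not Lucene code: the positions and phrase_matches functions are hypothetical, tokens are produced by a plain whitespace split, and we assume that the gap is simply added to the position counter between two values of a multi-valued field.

```python
def positions(values, gap=0):
    """Assign a position to every token of a multi-valued field."""
    result = []
    position = -1
    for i, value in enumerate(values):
        if i > 0:
            # Leave an artificial gap between two instances of the field.
            position += gap
        for token in value.lower().split():
            position += 1
            result.append((token, position))
    return result


def phrase_matches(values, phrase, gap=0):
    """Return True if the phrase words occupy consecutive positions."""
    indexed = positions(values, gap)
    words = phrase.lower().split()
    for token, start in indexed:
        if token == words[0] and all(
            (word, start + offset) in indexed for offset, word in enumerate(words)
        ):
            return True
    return False


tags = ["quick brown", "fox jumps"]
print(positions(tags, gap=0))
print(positions(tags, gap=100))
print(phrase_matches(tags, "brown fox", gap=0))    # crosses the two values
print(phrase_matches(tags, "brown fox", gap=100))  # blocked by the gap
```

With the default gap of 0, the phrase brown fox matches even though the two words come from different values; a larger gap keeps phrase matches inside a single value.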
Note
In one of the upcoming Elasticsearch versions, the string type may be deprecated and may be replaced by two new types, text and keyword, to better indicate what the string-based field is representing. The text type will be used for analyzed text fields and the keyword type will be used for not-analyzed text fields. If you are interested in the incoming changes, refer to the following GitHub issue: https://github.com/elastic/elasticsearch/issues/12394.
Number
This is the common name for a few core types that gather all the numeric field types that are available and waiting to be used. The following types are available in Elasticsearch (we specify them by using the type property):
byte: This type defines a byte value; for example, 1. It allows for values between -128 and 127 inclusive.
short: This type defines a short value; for example, 12. It allows for values between -32768 and 32767 inclusive.
integer: This type defines an integer value; for example, 134. It is a signed 32-bit value, so it allows for values between -2^31 and 2^31-1 inclusive.
long: This type defines a long value; for example, 123456789. It is a signed 64-bit value, so it allows for values between -2^63 and 2^63-1 inclusive.
float: This type defines a float value; for example, 12.23. For information about the possible values, refer to https://docs.oracle.com/javase/specs/jls/se8/html/jls-4.html#jls-4.2.3.
double: This type defines a double value; for example, 123.45. For information about the possible values, refer to https://docs.oracle.com/javase/specs/jls/se8/html/jls-4.html#jls-4.2.3.
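The ranges of the whole-number types can be turned into a small helper that picks the narrowest type able to hold a value. This is only an illustrative Python sketch; the INTEGER_RANGES table and the smallest_integer_type function are hypothetical names and are not part of Elasticsearch.

```python
# Signed ranges of the whole-number types, narrowest first.
INTEGER_RANGES = [
    ("byte", -2**7, 2**7 - 1),
    ("short", -2**15, 2**15 - 1),
    ("integer", -2**31, 2**31 - 1),
    ("long", -2**63, 2**63 - 1),
]


def smallest_integer_type(value):
    """Return the narrowest type whose range contains the value."""
    for name, low, high in INTEGER_RANGES:
        if low <= value <= high:
            return name
    raise ValueError(f"{value} does not fit into a long field")


for sample in (1, 128, 123456789, 2**31):
    print(sample, smallest_integer_type(sample))
```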
Note
You can learn more about the mentioned Java types at http://docs.oracle.com/javase/tutorial/java/nutsandbolts/datatypes.html.
A sample definition of a field based on one of the numeric types is as follows:
"price":{"type":"float","precision_step":"4"}
In addition to the common attributes, the following ones can also be set for the numeric fields:
precision_step: This attribute defines the number of terms generated for each value in the numeric field. The lower the value, the higher the number of terms generated. For fields with a higher number of terms per value, range queries will be faster at the cost of a slightly larger index. The default value is 16 for long and double, 8 for integer, short, and float, and 2147483647 for byte.
coerce: This attribute defaults to true and can take the value of true or false. It defines whether Elasticsearch should try to convert string values to numbers for a given field and whether the decimal part of a float value should be truncated for integer-based fields.
ignore_malformed: This attribute can take the value true or false (which is the default). It should be set to true in order to omit badly formatted values.
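To get a feeling for how precision_step relates to the number of terms, we can model it with a short Python sketch. It rests on an assumption about the underlying Lucene trie encoding: one term is generated for every precision_step bits of the value, so a 64-bit value with a step of 16 yields four terms. The terms_per_value function and the VALUE_BITS table are hypothetical and cover only the 32-bit and 64-bit types.

```python
import math

# Assumed width in bits of the encoded value for each type.
VALUE_BITS = {"long": 64, "double": 64, "integer": 32, "float": 32}


def terms_per_value(field_type, precision_step):
    """Estimate how many terms are indexed for a single numeric value."""
    bits = VALUE_BITS[field_type]
    if precision_step >= bits:
        # A step at least as wide as the value leaves only the full-precision term.
        return 1
    return math.ceil(bits / precision_step)


print(terms_per_value("long", 16))           # default step for long
print(terms_per_value("long", 4))            # lower step, more terms
print(terms_per_value("integer", 8))         # default step for integer
print(terms_per_value("float", 2147483647))  # effectively disables extra terms
```

A lower step produces more terms per value, which is the trade-off described above: faster range queries for a somewhat larger index.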
Boolean
The boolean core type is designed for indexing Boolean values (true or false). A sample definition of a field based on the boolean type is as follows:
"allowed":{"type":"boolean","store":"yes"}
Binary
The binary field is a BASE64 representation of the binary data stored in the index. You can use it to store data that is normally written in binary form, such as images. Fields based on this type are by default stored and not indexed, so you can only retrieve them and not perform search operations on them. The binary type only supports the index_name, type, store, and doc_values properties. A sample field definition based on the binary type may look like the following:
"image":{"type":"binary"}
Date
The date core type is designed to be used for date indexing. A date field allows us to specify a format that will be recognized by Elasticsearch. It is worth noting that all the dates are indexed in UTC and are internally indexed as long values. In addition to that, for the date-based fields, Elasticsearch accepts long values representing UTC milliseconds since epoch regardless of the format specified for the date field.
The default date format recognized by Elasticsearch is quite universal and allows us to provide the date and optionally the time; for example, 2012-12-24T12:10:22. A sample definition of a field based on the date type is as follows:
"published":{"type":"date","format":"yyyy-MM-dd"}
A sample document that uses the above date field with the specified format is as follows:
{
"name":"Sample document",
"published":"2012-12-22"
}
In addition to the common attributes, the following ones can also be set for the fields based on the date type:
format: This attribute specifies the format of the date. The default value is dateOptionalTime. For a full list of formats, visit https://www.elastic.co/guide/en/elasticsearch/reference/current/mapping-date-format.html.
precision_step: This attribute defines the number of terms generated for each value in the field. Refer to the numeric core type description for more information about this parameter.
numeric_resolution: This attribute defines the unit of time that Elasticsearch will use when a numeric value is passed to the date-based field instead of a date following a format. By default, Elasticsearch uses the milliseconds value, which means that the numeric value will be treated as milliseconds since epoch. Another value is seconds.
ignore_malformed: This attribute can take the value true or false. The default value is false. It should be set to true in order to omit badly formatted values.
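Because dates are indexed internally as long values in UTC, it can help to see the conversion spelled out. The Python sketch below is an illustration only: to_epoch_millis and numeric_to_millis are hypothetical helper names, and the strptime pattern %Y-%m-%d is our hand translation of a year-month-day date format.

```python
from datetime import datetime, timezone


def to_epoch_millis(text, fmt="%Y-%m-%d"):
    """Parse a formatted date as UTC and return milliseconds since epoch."""
    parsed = datetime.strptime(text, fmt).replace(tzinfo=timezone.utc)
    return int(parsed.timestamp() * 1000)


def numeric_to_millis(value, numeric_resolution="milliseconds"):
    """Interpret a numeric date value according to numeric_resolution."""
    factor = {"milliseconds": 1, "seconds": 1000}[numeric_resolution]
    return value * factor


millis = to_epoch_millis("2012-12-22")
print(millis)
# The same instant sent as a number, once in seconds and once in milliseconds.
print(numeric_to_millis(millis // 1000, "seconds") == millis)
print(numeric_to_millis(millis) == millis)
```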
Multi fields
There are situations where we need to have the same field analyzed differently. For example, one for sorting, one for searching, and one for analysis with aggregations, but all using the same field value, just indexed differently. We could of course use the previously described field value copying, but we can also use so-called multi fields. To be able to use that feature of Elasticsearch, we need to define an additional property in our field definition called fields. The fields property is an object that can contain one or more additional fields that will be present in our index and will have the value of the field that they are assigned to. For example, if we would like to have aggregations done on the name field and in addition to that search on that field, we would define it as follows:
"name":{
"type":"string",
"fields":{
"agg":{"type":"string","index":"not_analyzed"}
}
}
The preceding definition will create two fields: one called name and the second called name.agg. Of course, you don’t have to specify two separate fields in the data you are sending to Elasticsearch; a single one named name is enough. Elasticsearch will do the rest, which means copying the value of the field to all the fields from the preceding definition.
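The naming rule for multi fields (the main field name, a dot, and the name of the additional field) can be sketched in a few lines of Python. The indexed_field_names function is a hypothetical helper written for this illustration; it only derives names from a properties object and does not talk to Elasticsearch.

```python
def indexed_field_names(properties):
    """List the field names that a properties object gives rise to."""
    names = []
    for name, definition in properties.items():
        names.append(name)
        # Every entry under "fields" becomes <main field>.<sub field>.
        for sub_name in definition.get("fields", {}):
            names.append(f"{name}.{sub_name}")
    return names


properties = {
    "name": {
        "type": "string",
        "fields": {
            "agg": {"type": "string", "index": "not_analyzed"},
        },
    }
}

print(indexed_field_names(properties))
```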
The IP address type
The ip field type was added to Elasticsearch to simplify the use of IPv4 addresses in a numeric form. This field type allows us to search data that is indexed as an IP address, sort on such data, and use range queries using IP values.
A sample definition of a field based on the ip type is as follows:
"address":{"type":"ip"}
In addition to the common attributes, the precision_step attribute can also be set for the ip type based fields. Refer to the numeric type description for more information about that property.
A sample document that uses the ip based field looks as follows:
{
"name":"Tom PC",
"address":"192.168.2.123"
}
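To see why a numeric form makes sorting and range queries straightforward, we can convert the sample address with Python’s standard ipaddress module. The ip_to_long and in_range functions are hypothetical helpers for this sketch and only model the idea of comparing addresses as numbers.

```python
import ipaddress


def ip_to_long(address):
    """Convert a dotted IPv4 address to its numeric form."""
    return int(ipaddress.IPv4Address(address))


def in_range(address, low, high):
    """Model a range query on IP values by comparing the numeric forms."""
    return ip_to_long(low) <= ip_to_long(address) <= ip_to_long(high)


print(ip_to_long("192.168.2.123"))
print(in_range("192.168.2.123", "192.168.2.0", "192.168.2.255"))
print(in_range("192.168.3.1", "192.168.2.0", "192.168.2.255"))
```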
Token count type
The token_count field type allows us to store and index information about how many tokens the given field has instead of storing and indexing the text provided to the field. It accepts the same configuration options as the number type, but in addition to that, we need to specify the analyzer which will be used to divide the field value into tokens. We do that by using the analyzer property.
A sample definition of a field based on the token_count field type looks as follows:
"title_count":{"type":"token_count","analyzer":"standard"}
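What ends up in a token_count field is simply a number. The Python sketch below approximates it under a loud assumption: the token_count function is hypothetical and splits text into runs of word characters, which is only a rough stand-in for the standard analyzer named in the sample definition.

```python
import re


def token_count(text):
    """Count word-like tokens; a rough stand-in for a real analyzer."""
    return len(re.findall(r"\w+", text))


print(token_count("Elasticsearch Server Third Edition"))
print(token_count(""))
```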
Using analyzers
The great thing about Elasticsearch is that it leverages the analysis capabilities of Apache Lucene. This means that for fields that are based on the string type, we can specify which analyzer Elasticsearch should use. As you remember from the Full text searching section of Chapter 1, Getting Started with Elasticsearch Cluster, the analyzer is a functionality that is used to analyze data or queries in the way we want. For example, when we split the text on whitespace and lowercase the resulting words, we don’t have to worry about the users sending words that are lowercased or uppercased. This means that Elasticsearch, elasticsearch, and ElAstIcSeaRCh will be treated as the same word. What’s more, Elasticsearch allows us to use not only the analyzers provided out of the box, but also to create our own configurations. We can also use different analyzers at the time of indexing and at the time of querying, so we can choose how we want our data to be processed at each stage of the search process. Let’s now have a look at the analyzers provided by Elasticsearch and at Elasticsearch analysis functionality in general.
Out-of-the-box analyzers
Elasticsearch allows us to use one of the many analyzers defined by default. The following analyzers are available out of the box:
standard: This analyzer is convenient for most European languages (refer to https://www.elastic.co/guide/en/elasticsearch/reference/current/analysis-standard-analyzer.html for the full list of parameters).
simple: This analyzer splits the provided value on non-letter characters and converts the resulting tokens to lowercase.
whitespace: This analyzer splits the provided value on the basis of whitespace characters.
stop: This is similar to the simple analyzer, but in addition to the functionality of the simple analyzer, it filters the data on the basis of the provided set of stop words (refer to https://www.elastic.co/guide/en/elasticsearch/reference/current/analysis-stop-analyzer.html for the full list of parameters).
keyword: This is a very simple analyzer that just passes the provided value through unchanged. You’ll achieve the same by specifying a particular field as not_analyzed.
pattern: This analyzer allows flexible text separation by the use of regular expressions (refer to https://www.elastic.co/guide/en/elasticsearch/reference/current/analysis-pattern-analyzer.html for the full list of parameters). The key point to remember when it comes to the pattern analyzer is that the provided pattern should match the separators of the words, not the words themselves.
language: This analyzer is designed to work with a specific language. The full list of languages supported by this analyzer can be found at https://www.elastic.co/guide/en/elasticsearch/reference/current/analysis-lang-analyzer.html.
snowball: This is an analyzer that is similar to standard, but additionally provides the stemming algorithm (refer to https://www.elastic.co/guide/en/elasticsearch/reference/current/analysis-snowball-analyzer.html for the full list of parameters).
Note
Stemming is the process of reducing inflected and derived words to their stem or base form. Such a process allows words such as cars and car to be reduced to a common form. For the mentioned words, a stemmer (which is an implementation of the stemming algorithm) will produce a single stem, car. After indexing, the documents containing such words will be matched while using any of them. Without stemming, the documents with the word “cars” will only be matched by a query containing the same word. You can find more information about stemming on Wikipedia at https://en.wikipedia.org/wiki/Stemming.
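The behavior of three of the simpler analyzers can be approximated in Python to show how differently they treat the same input. These are rough models based only on the descriptions in the list above; the functions simple, whitespace, and keyword are hypothetical and do not reproduce every detail of the Lucene implementations.

```python
import re


def simple(text):
    """Split on non-letter characters and lowercase the tokens."""
    return [token.lower() for token in re.findall(r"[^\W\d_]+", text)]


def whitespace(text):
    """Split on whitespace only, leaving the tokens untouched."""
    return text.split()


def keyword(text):
    """Pass the whole value through as a single token."""
    return [text]


sample = "ElAstIcSeaRCh-Server 2.0 rocks"
print(simple(sample))
print(whitespace(sample))
print(keyword(sample))
```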
Defining your own analyzers
In addition to the analyzers mentioned previously, Elasticsearch allows us to define new ones without the need for writing a single line of Java code. In order to do that, we need to add an additional section to our mappings file; that is, the settings section, which holds additional information used by Elasticsearch during index creation. The following code snippet shows how we can define our custom settings section:
"settings":{
"index":{
"analysis":{
"analyzer":{
"en":{
"tokenizer":"standard",
"filter":[
"asciifolding",
"lowercase",
"ourEnglishFilter"
]
}
},
"filter":{
"ourEnglishFilter":{
"type":"kstem"
}
}
}
}
}
We specified that we want a new analyzer named en to be present. Each analyzer is built from a single tokenizer and multiple filters. A complete list of the default filters and tokenizers can be found at https://www.elastic.co/guide/en/elasticsearch/reference/current/analysis-tokenizers.html. Our en analyzer includes the standard tokenizer and three filters: asciifolding and lowercase, which are the ones available by default, and a custom ourEnglishFilter, which is a filter we have defined.
To define a filter, we need to provide its name, its type (the type property), and any number of additional parameters required by that filter type. The full list of filter types available in Elasticsearch can be found at https://www.elastic.co/guide/en/elasticsearch/reference/current/analysis-tokenfilters.html. Please be aware that we won’t be discussing each filter, as the list of filters is constantly changing. If you are interested in the full filters list, please refer to the mentioned page in the documentation.
So, the final mappings file with our custom analyzer defined will be as follows:
{
"settings":{
"index":{
"analysis":{
"analyzer":{
"en":{
"tokenizer":"standard",
"filter":[
"asciifolding",
"lowercase",
"ourEnglishFilter"
]
}
},
"filter":{
"ourEnglishFilter":{
"type":"kstem"
}
}
}
}
},
"mappings":{
"post":{
"properties":{
"id":{"type":"long"},
"name":{"type":"string","analyzer":"en"}
}
}
}
}
If we save the preceding mappings to a file called posts_mappings.json, we can run the following command to create the posts index:
curl -XPOST 'http://localhost:9200/posts' -d @posts_mappings.json
We can see how our analyzer works by using the Analyze API (https://www.elastic.co/guide/en/elasticsearch/reference/current/indices-analyze.html). For example, let’s look at the following command:
curl -XGET 'localhost:9200/posts/_analyze?pretty&field=name' -d 'robots cars'
The command asks Elasticsearch to show the result of the analysis of the given phrase (robots cars) with the use of the analyzer defined for the post type and its name field. The response that we will get from Elasticsearch is as follows:
{
"tokens":[{
"token":"robot",
"start_offset":0,
"end_offset":6,
"type":"<ALPHANUM>",
"position":0
},{
"token":"car",
"start_offset":7,
"end_offset":11,
"type":"<ALPHANUM>",
"position":1
}]
}
As you can see, the robots cars phrase was divided into two tokens. In addition to that, the robots word was changed to robot and the cars word was changed to car.
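The way an analyzer chains a single tokenizer with a list of filters can be modeled in Python. The sketch below is not how Elasticsearch is implemented: all function names are hypothetical, the tokenizer is a simple regular expression, and naive_stem is a deliberately crude stand-in that only strips a trailing s. It is not the kstem algorithm used by ourEnglishFilter.

```python
import re
import unicodedata


def standard_tokenizer(text):
    """Rough stand-in for the standard tokenizer: runs of word characters."""
    return re.findall(r"\w+", text)


def asciifolding(token):
    """Strip accents by decomposing characters and dropping combining marks."""
    decomposed = unicodedata.normalize("NFKD", token)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def lowercase(token):
    return token.lower()


def naive_stem(token):
    """Crude plural stripping; an illustrative stand-in, NOT kstem."""
    if token.endswith("s") and len(token) > 3:
        return token[:-1]
    return token


def analyze(text, filters=(asciifolding, lowercase, naive_stem)):
    """Run the tokenizer once, then apply every filter in order."""
    tokens = standard_tokenizer(text)
    for token_filter in filters:
        tokens = [token_filter(token) for token in tokens]
    return tokens


print(analyze("robots cars"))
print(analyze("Caf\u00e9s"))
```

The order of the filters matters: each one receives the output of the previous one, just as the filter array in the en analyzer is applied from top to bottom.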
Default analyzers
There is one more thing to say about analyzers. Elasticsearch allows us to specify the analyzer that should be used by default if no analyzer is defined. This is done in the same way as we configured a custom analyzer in the settings section of the mappings file, but instead of specifying a custom name for the analyzer, the name default should be used. So, to make our previously defined analyzer the default, we can change the en analyzer to the following:
{
"settings":{
"index":{
"analysis":{
"analyzer":{
"default":{
"tokenizer":"standard",
"filter":[
"asciifolding",
"lowercase",
"ourEnglishFilter"
]
}
},
"filter":{
"ourEnglishFilter":{
"type":"kstem"
}
}
}
}
}
}
We can also choose one default analyzer for searching and a different one for indexing. To do that, instead of using the default name for the analyzer, we should use default_search and default_index respectively.
Different similarity models
With the release of Apache Lucene 4.0 in 2012, all the users of this great full text search library were given the opportunity to alter the default TF/IDF-based algorithm and use a different one (we’ve mentioned it in the Full text searching section of Chapter 1, Getting Started with Elasticsearch Cluster). Because of that, we are able to choose a similarity model in Elasticsearch, which basically allows us to use different scoring formulas for our documents.
Note
Note that the similarity models topic ranges from intermediate to advanced and in most cases the TF/IDF-based algorithm will be sufficient for your use case. However, we decided to have it described in the book, so you know that you have the possibility of changing the scoring algorithm behavior if needed.
Setting per-field similarity
Since Elasticsearch 0.90, we are allowed to set a different similarity for each of the fields that we have in our mappings file. For example, let’s assume that we have the following simple mappings that we use in order to index the blog posts:
{
"mappings":{
"post":{
"properties":{
"id":{"type":"long"},
"name":{"type":"string"},
"contents":{"type":"string"}
}
}
}
}
Let’s say that we want to use the BM25 similarity model for the name field and the contents field. In order to do that, we need to extend our field definitions and add the similarity property with the value of the chosen similarity name. Our changed mappings will look like the following:
{
"mappings":{
"post":{
"properties":{
"id":{"type":"long"},
"name":{"type":"string","similarity":"BM25"},
"contents":{"type":"string","similarity":"BM25"}
}
}
}
}
And that’s all, nothing more is needed. After the above change, Apache Lucene will use the BM25 similarity to calculate the score factor for the name and the contents fields.
Available similarity models
There are at least five new similarity models available. For most of the use cases, apart from the default one, you may find the following models useful:
Okapi BM25 model: This similarity model is based on a probabilistic model that estimates the probability of finding a document for a given query. In order to use this similarity in Elasticsearch, you need to set the similarity property for a field to BM25. The Okapi BM25 similarity is said to perform best when dealing with short text documents where term repetitions are especially hurtful to the overall document score. This similarity is defined out of the box and doesn’t need additional properties to be set.
Divergence from randomness model: This similarity model is based on the probabilistic model of the same name. In order to use this similarity in Elasticsearch, you need to use the DFR name. It is said that the divergence from randomness similarity model performs well on text that is similar to natural language.
Information-based model: This is the last model of the newly introduced similarity models and is very similar to the divergence from randomness model. In order to use this similarity in Elasticsearch, you need to use the IB name. Similar to the DFR similarity, it is said that the information-based model performs well on data similar to natural language text.
ThetwoothersimilaritymodelscurrentlyavailableareLMDirichletsimilarity(touseit,setthetypepropertytoLMDirichlet)andLMJelinekMercersimilarity(touseit,setthetypepropertytoLMJelinekMercer).YoucanfindmoreaboutthesesimilaritymodelsinApacheLuceneJavadocs,MasteringElasticsearchSecondEdition,publishedbyPacktPublishingorinofficialdocumentationofElasticsearchavailableathttps://www.elastic.co/guide/en/elasticsearch/reference/current/index-modules-similarity.html.
Configuring default similarity
The default similarity allows us to provide an additional discount_overlaps property. It allows us to control whether the tokens on the same positions in the token stream (with a position increment of 0) are omitted during score calculation. By default, it is set to true, which means that the tokens on the same positions are omitted; if you want them to be counted, you can set that property to false. For example, the following command shows how to create an index with the discount_overlaps property changed for the default similarity:
curl -XPUT 'localhost:9200/test_similarity' -d '{
"settings":{
"similarity":{
"altered_default":{
"type":"default",
"discount_overlaps":false
}
}
},
"mappings":{
"doc":{
"properties":{
"name":{"type":"string","similarity":"altered_default"}
}
}
}
}'
Configuring BM25 similarity
Even though we don't need to configure the BM25 similarity, we can provide some additional options to tune its behavior. The BM25 similarity allows us to provide the discount_overlaps property similar to the default similarity and two additional properties: k1 and b. The k1 property specifies the term frequency normalization factor and the b property value determines to what degree the document length will normalize the term frequency values.
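To get a feeling for what k1 and b do, the following is a minimal Python sketch of the term-frequency part of the textbook BM25 formula. It is an illustration only: the function name and the sample document lengths are made up for this example, the values 1.2 and 0.75 are the commonly cited BM25 defaults, and the actual Lucene implementation differs in details such as how the document length is stored:

```python
def bm25_tf_component(tf, doc_len, avg_doc_len, k1=1.2, b=0.75):
    """Term-frequency part of the textbook BM25 formula (hypothetical helper)."""
    # b controls how strongly the document length rescales the term frequency:
    # b=0 ignores the length, b=1 applies full length normalization.
    length_norm = 1.0 - b + b * (doc_len / avg_doc_len)
    # k1 controls saturation: the result grows with tf but stays below k1 + 1.
    return tf * (k1 + 1.0) / (tf + k1 * length_norm)


# The same term frequency in a short and in a long document (made-up lengths).
short_doc = bm25_tf_component(tf=3, doc_len=50, avg_doc_len=100)
long_doc = bm25_tf_component(tf=3, doc_len=200, avg_doc_len=100)
print(short_doc > long_doc)
```

Under these assumptions, a term that occurs three times contributes more in the shorter document, and setting b to 0 removes the length effect entirely.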
Configuring DFR similarity
In case of the DFR similarity, we can configure the basic_model property (which can take the value be, d, g, if, in, p, or ine), the after_effect property (with values of no, b, or l), and the normalization property (which can be no, h1, h2, h3, or z). If we choose a normalization value other than no, we need to set the normalization factor.
Depending on the chosen normalization value, we should use normalization.h1.c (the float value) for h1 normalization, normalization.h2.c (the float value) for h2 normalization, normalization.h3.c (the float value) for h3 normalization, and normalization.z.z (the float value) for z normalization. For example, the following is how the example similarity configuration will look (we put this into the settings section of our mappings file):
"similarity":{
"esserverbook_dfr_similarity":{
"type":"DFR",
"basic_model":"g",
"after_effect":"l",
"normalization":"h2",
"normalization.h2.c":"2.0"
}
}
Configuring IB similarity
In case of the IB similarity, we can configure the distribution property (which can take the value of ll or spl) and the lambda property (which can take the value of df or ttf). In addition to that, we can choose the normalization factor, which is the same as for the DFR similarity, so we'll omit describing it a second time. The following is how the example IB similarity configuration will look (we put this into the settings section of our mappings file):
"similarity":{
"esserverbook_ib_similarity":{
"type":"IB",
"distribution":"ll",
"lambda":"df",
"normalization":"z",
"normalization.z.z":"0.25"
}
}
Batch indexing to speed up your indexing process
In Chapter 1, Getting Started with Elasticsearch Cluster, we saw how to index a particular document into Elasticsearch. It required opening an HTTP connection, sending the document, and closing the connection. Of course, we were not responsible for most of that as we used the curl command, but in the background this is what happened. However, sending the documents one by one is not efficient. Because of that, it is now time to find out how to index a large number of documents in a more convenient and efficient way than doing so one by one.
Preparing data for bulk indexing
Elasticsearch allows us to merge many requests into one package. This package can be sent as a single request. What's more, we are not limited to having a single type of request in the so-called bulk; we can mix different types of operations together, which include:
Adding or replacing the existing documents in the index (index)
Removing documents from the index (delete)
Adding new documents into the index when there is no other definition of the document in the index (create)
Modifying the documents or creating new ones if the document doesn't exist (update)
The format of the request was chosen for processing efficiency. It assumes that every line of the request contains a JSON object with the description of the operation followed by the second line with a document, which is another JSON object itself. We can treat the first line as a kind of information line and the second as the data line. The exception to this rule is the delete operation, which contains only the information line, because the document is not needed. Let's look at the following example:
{"index":{"_index":"addr","_type":"contact","_id":1}}
{"name":"Fyodor Dostoevsky","country":"RU"}
{"create":{"_index":"addr","_type":"contact","_id":2}}
{"name":"Erich Maria Remarque","country":"DE"}
{"create":{"_index":"addr","_type":"contact","_id":2}}
{"name":"Joseph Heller","country":"US"}
{"delete":{"_index":"addr","_type":"contact","_id":4}}
{"delete":{"_index":"addr","_type":"contact","_id":1}}
It is very important that every document or action description is placed in one line (ended by a newline character). This means that the document cannot be pretty-printed. There is a default limitation on the size of the bulk indexing file, which is set to 100 megabytes and can be changed by specifying the http.max_content_length property in the Elasticsearch configuration file. This lets us avoid issues with possible request timeouts and memory problems when dealing with requests that are too large.
Note
Note that with a single batch indexing file, we can load the data into many indices and documents in the bulk request can have different types.
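When the bulk file is generated by a program rather than written by hand, the line-oriented format described above can be produced with a small helper. The following Python sketch is only an illustration; the build_bulk_body function and the shape of its input are assumptions made for this example and are not part of Elasticsearch:

```python
import json


def build_bulk_body(operations):
    """Serialize (action, metadata, document) triples into a bulk request body.

    Hypothetical helper: `document` is None for delete, which has no data line.
    """
    lines = []
    for action, metadata, document in operations:
        # Information line: the action name wrapping its metadata.
        lines.append(json.dumps({action: metadata}))
        # Data line: every action except delete is followed by a document.
        if action != "delete":
            lines.append(json.dumps(document))
    # json.dumps without indent never emits raw newlines, so each object
    # stays on one line; the body must also end with a newline.
    return "\n".join(lines) + "\n"


operations = [
    ("index", {"_index": "addr", "_type": "contact", "_id": 1},
     {"name": "Fyodor Dostoevsky", "country": "RU"}),
    ("delete", {"_index": "addr", "_type": "contact", "_id": 4}, None),
]
body = build_bulk_body(operations)
print(body, end="")
```

The resulting string can be written to a file such as documents.json and sent as a single request.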
Indexing the data
In order to execute the bulk request, Elasticsearch provides the _bulk endpoint. This can be used as /_bulk or with an index name as /index_name/_bulk or even with a type and index name as /index_name/type_name/_bulk. The second and third forms define the default values for the index name and the type name. We can omit these properties in the information line of our request and Elasticsearch will use the default values from the URI. It is also worth knowing that the default URI values can be overwritten by the values in the information lines.
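The precedence between the URI defaults and the information line can be pictured with a few lines of Python. This is a sketch of the rule as described above, not code taken from Elasticsearch; the resolve_target function is hypothetical:

```python
def resolve_target(metadata, default_index=None, default_type=None):
    """Return the (index, type) pair an information line would resolve to.

    Hypothetical helper: values in the information line win,
    the defaults taken from the URI are used only when a value is missing.
    """
    index_name = metadata.get("_index", default_index)
    type_name = metadata.get("_type", default_type)
    return index_name, type_name


# Request sent to /addr/contact/_bulk: both defaults come from the URI.
print(resolve_target({"_id": 1}, "addr", "contact"))
# The information line names another index, which overrides the URI default.
print(resolve_target({"_index": "other", "_id": 2}, "addr", "contact"))
```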
Assuming we've stored our data in the documents.json file, we can run the following command to send this data to Elasticsearch:
curl -XPOST 'localhost:9200/_bulk?pretty' --data-binary @documents.json
The ?pretty parameter is of course not necessary. We've used this parameter only for the ease of analyzing the response of the preceding command. What is important, in this case, is using curl with the --data-binary parameter instead of using -d. This is because the standard -d parameter strips newline characters when it reads data from a file and these characters, as we said earlier, are important for parsing the bulk request content by Elasticsearch. Now let's look at the response returned by Elasticsearch:
{
"took":469,
"errors":true,
"items":[{
"index":{
"_index":"addr",
"_type":"contact",
"_id":"1",
"_version":1,
"_shards":{
"total":2,
"successful":1,
"failed":0
},
"status":201
}
},{
"create":{
"_index":"addr",
"_type":"contact",
"_id":"2",
"_version":1,
"_shards":{
"total":2,
"successful":1,
"failed":0
},
"status":201
}
},{
"create":{
"_index":"addr",
"_type":"contact",
"_id":"2",
"status":409,
"error":{
"type":"document_already_exists_exception",
"reason":"[contact][2]: document already exists",
"shard":"2",
"index":"addr"
}
}
},{
"delete":{
"_index":"addr",
"_type":"contact",
"_id":"4",
"_version":1,
"_shards":{
"total":2,
"successful":1,
"failed":0
},
"status":404,
"found":false
}
},{
"delete":{
"_index":"addr",
"_type":"contact",
"_id":"1",
"_version":2,
"_shards":{
"total":2,
"successful":1,
"failed":0
},
"status":200,
"found":true
}
}]
}
As we can see, every result is a part of the items array. Let's briefly compare these results with our input data. The first two commands, named index and create, were executed without any problems. The third operation failed because we wanted to create a record with an identifier that already existed in the index. The next two operations were deletions. Both succeeded. Note that the first of them tried to delete a nonexistent document; as you can see, this wasn't a problem for Elasticsearch. The thing worth noting, though, is that for the nonexistent document we saw a status of 404, which in the HTTP response code means not found (http://www.w3.org/Protocols/rfc2616/rfc2616-sec10.html). As you can see, Elasticsearch returns information about each operation, so for large bulk requests the response can be massive.
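Because the response can be massive, it is usually processed by code rather than read by eye. The following Python sketch shows one possible way of picking out the failed operations; the failed_items function is hypothetical and the response dictionary is a trimmed copy of the response shown earlier:

```python
def failed_items(bulk_response):
    """Return (position, action, status, error type) for items carrying an error."""
    failures = []
    for position, item in enumerate(bulk_response["items"]):
        # Each item is a single-key object such as {"index": {...}}.
        action, result = next(iter(item.items()))
        if "error" in result:
            failures.append(
                (position, action, result["status"], result["error"]["type"])
            )
    return failures


# Trimmed copy of the response from the example above.
response = {
    "took": 469,
    "errors": True,
    "items": [
        {"index": {"_id": "1", "status": 201}},
        {"create": {"_id": "2", "status": 201}},
        {"create": {"_id": "2", "status": 409,
                    "error": {"type": "document_already_exists_exception"}}},
        {"delete": {"_id": "4", "status": 404, "found": False}},
        {"delete": {"_id": "1", "status": 200, "found": True}},
    ],
}
for failure in failed_items(response):
    print(failure)
```

Note that the delete operation with the 404 status is not reported by this sketch, because its item contains no error object.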
The _all field
The _all field is used by Elasticsearch to store data from all the other fields in a single field for ease of searching. This kind of field may be useful when we want to implement a simple search feature and we want to search all the data (or only the fields we copy to the _all field), but we don't want to think about the field names and things like that. By default, the _all field is enabled and contains all the data from all the fields from the document. However, this field makes the index a bit bigger and that is not always needed.
For example, when you input a search phrase into a search box in the library catalog site, you expect that you can search using the author's name, the ISBN number, and the words that the book title contains, but searching for the number of pages or the cover type usually does not make sense. We can either disable the _all field completely or exclude the copying of certain fields to it. In order not to include a certain field in the _all field, we use the include_in_all property, which was discussed earlier in this chapter. To completely turn off the _all field functionality, we modify our mappings file as follows:
{
"book":{
"_all":{
"enabled":false
},
"properties":{
...
}
}
}
In addition to the enabled property, the _all field supports the following ones:
store
term_vector
analyzer
For information about the preceding properties, refer to the Mappings configuration section in this chapter.
The _source field
The _source field allows us to store the original JSON document that was sent to Elasticsearch during indexation. By default, the _source field is turned on as some of the Elasticsearch functionalities depend on it (for example, the partial update feature). In addition to that, the _source field can be used as the source of data for the highlighting functionality if a field is not stored. However, if we don't need such a functionality, we can disable the _source field as it causes some storage overhead. In order to do that, we need to set the _source object's enabled property to false, as follows:
{
"book":{
"_source":{
"enabled":false
},
"properties":{
...
}
}
}
We can also tell Elasticsearch which fields we want to exclude from the _source field and which fields we want to include. We do that by adding the includes and excludes properties to the _source field definition. For example, if we want to exclude all the fields in the author path from the _source field, our mappings will look as follows:
{
"book":{
"_source":{
"excludes":["author.*"]
},
"properties":{
...
}
}
}
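To picture the effect of such patterns, the following Python sketch filters a flattened document with wildcard patterns. It is a toy model and an assumption on our side: it uses the fnmatch module from the Python standard library, whose pattern syntax is only an approximation of the matching done by Elasticsearch, and the filter_source function is hypothetical:

```python
from fnmatch import fnmatchcase


def filter_source(flat_doc, includes=(), excludes=()):
    """Keep the field paths allowed by the includes and excludes patterns.

    Toy model: `flat_doc` maps dotted field paths to values.
    """
    kept = {}
    for path, value in flat_doc.items():
        # When includes are given, a path has to match at least one of them.
        if includes and not any(fnmatchcase(path, p) for p in includes):
            continue
        # A path matching any of the excludes is dropped.
        if any(fnmatchcase(path, p) for p in excludes):
            continue
        kept[path] = value
    return kept


book = {
    "title": "Catch-22",
    "author.name": "Joseph Heller",
    "author.country": "US",
}
print(filter_source(book, excludes=["author.*"]))
```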
Additional internal fields
There are additional fields that are internally used by Elasticsearch, but which we can't configure. Those fields are:
_id: This field is used to hold the identifier of the document inside the index and type
_uid: This field is used to hold the unique identifier of the document in the index and is built of _id and _type (this allows documents with the same identifier but different types to exist inside the same index)
_type: This field is the type name for the document
_field_names: This field is the list of fields existing in the document
Introduction to segment merging
In the Full text searching section of Chapter 1, Getting Started with Elasticsearch Cluster, we mentioned segments and their immutability. We wrote that the Lucene library, and thus Elasticsearch, writes data to certain structures that are written once and never change. This allows for some simplification, but also introduces the need for additional work. One such example is deletion. Because segments cannot be altered, information about deletions must be stored alongside and dynamically applied during search. This is done by filtering deleted documents from the returned result set. The other example is the inability to modify the documents (however, some modifications are possible, such as modifying numeric doc values). Of course, one can say that Elasticsearch supports document updates (refer to the Manipulating data with the REST API section of Chapter 1, Getting Started with Elasticsearch Cluster). However, under the hood, the old document is marked as deleted and the one with the updated contents is indexed.
As time passes and you continue to index or delete your data, more and more segments are created. Depending on how often you modify the index, Lucene creates segments with various numbers of documents; thus, segments have different sizes. Because of that, the search performance may be lower and your index may be larger than it should be, as it still contains the deleted documents. The equation is simple: the more segments your index has, the slower the search speed is. This is when segment merging comes into play. We don't want to describe this process in detail; in the current Elasticsearch version, this part of the engine was simplified but it is still a rather advanced topic. We decided to mention merging because we think that it is handy to know where to look for the cause of troubles connected with too many open files, suspicious CPU usage, expanding indices, or searching and indexing speed degrading with time.
Segment merging
Segment merging is the process during which the underlying Lucene library takes several segments and creates a new segment based on the information found in them. The resulting segment has all the documents stored in the original segments except the ones that were marked for deletion. After the merge operation, the source segments are deleted from the disk. Because segment merging is rather costly in terms of CPU and I/O usage, it is crucial to appropriately control when and how often this process is invoked.
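The idea can be shown with a toy model in Python. It has nothing to do with how Lucene actually stores data: here a segment is assumed to be a plain dictionary holding its documents and a set of identifiers marked as deleted, and merge_segments is a hypothetical function written for this illustration:

```python
def merge_segments(segments):
    """Build one new segment from several, skipping documents marked as deleted.

    Toy model: a segment is {"docs": {doc_id: document}, "deleted": set_of_ids}.
    """
    merged_docs = {}
    for segment in segments:
        for doc_id, document in segment["docs"].items():
            # Deleted documents are not copied, so their space is reclaimed.
            if doc_id not in segment["deleted"]:
                merged_docs[doc_id] = document
    # The new segment starts without any deletion marks.
    return {"docs": merged_docs, "deleted": set()}


# Document 1 was updated: the old version is marked as deleted in the
# older segment and the new version lives in the newer one.
older = {"docs": {1: "version 1", 2: "other"}, "deleted": {1}}
newer = {"docs": {1: "version 2", 3: "another"}, "deleted": set()}
merged = merge_segments([older, newer])
print(len(merged["docs"]))
```

In this model, the two source segments hold four stored documents between them, while the merged segment holds only the three live ones.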
The need for segment merging
You may ask yourself why you have to bother with segment merging. First of all, the more segments the index is built from, the slower the search will be and the more memory Lucene will use. The second is the disk space and resources, such as file descriptors, used by the index. If you delete many documents from your index then, until the merge happens, those documents are only marked as deleted and not deleted physically. So, it may happen that most of the documents that use our CPU and memory don't exist! Fortunately, Elasticsearch uses reasonable defaults for segment merging and it is very probable that no changes are necessary.
The merge policy
The merge policy defines when the merging process should be performed. Elasticsearch merges segments of approximately similar sizes, taking into account the maximum number of segments allowed per tier. The algorithm of merging can find segments with the lowest cost of merge and the most impact on the resulting segment.
The basic properties of the tiered merge policy are as follows:
index.merge.policy.expunge_deletes_allowed: This property defaults to 10 and tells Elasticsearch to merge segments with a percentage of deleted documents higher than this value.
index.merge.policy.floor_segment: This property defaults to 2mb and tells Elasticsearch to treat smaller segments as ones with size equal to the value of this property. It prevents the flushing of tiny segments to avoid a high number of them.
index.merge.policy.max_merge_at_once: This property defines the maximum number of segments to be merged at once and defaults to 10.
index.merge.policy.max_merge_at_once_explicit: This property defines the maximum number of segments merged at once during expunge deletes or optimize operations and defaults to 10.
index.merge.policy.max_merged_segment: This property defines the maximum size of a segment that can be produced during normal merging and defaults to 5gb.
index.merge.policy.segments_per_tier: This property defaults to 10 and roughly defines the number of segments. Smaller values mean more merging but fewer segments, which results in higher search speed but lower indexing speed and more I/O pressure. Higher values of the property will result in a higher segment count, thus slower search speed but higher indexing speed.
index.merge.policy.reclaim_deletes_weight: This property tells Elasticsearch how important it is to choose segments with many deleted documents. It defaults to 2.0.
For example, to update the merge policy settings of an already created index we could run a command like this:
curl -XPUT 'localhost:9200/essb/_settings' -d '{
"index.merge.policy.max_merged_segment":"10gb"
}'
To get deeper into segment merging, refer to our book Mastering Elasticsearch Second Edition, published by Packt Publishing. You can also find more information about the tiered merge policy at https://www.elastic.co/guide/en/elasticsearch/reference/current/index-modules-merge.html.
Note
Up to the 2.0 version of Elasticsearch, we were able to choose between three merge policies: tiered, log_byte_size, and log_doc. The currently used merge policy is based on the tiered merge policy and we are forced to use it.
The merge scheduler
The merge scheduler tells Elasticsearch how the merge process should occur. The current implementation is based on a concurrent merge scheduler that is started in a separate thread and uses the defined number of threads doing merges in parallel. Elasticsearch allows you to set the number of threads that can be used for simultaneous merging by using the index.merge.scheduler.max_thread_count property.
Throttling
As we have already mentioned, merging may be expensive when it comes to server resources. The merge process usually works in parallel to other operations, so theoretically it shouldn't have too much influence. In practice, the number of disk input/output operations can be so large as to significantly affect the overall performance. In such cases, throttling is something that may help. In fact, this feature can be used for limiting the speed of the merge, but it may also be used for all the operations using the data store. Throttling can be set in the Elasticsearch configuration file (the elasticsearch.yml file) or dynamically by using the settings API (refer to The update settings API section of Chapter 9, Elasticsearch Cluster, for details). There are two settings that adjust throttling: type and value.
To set the throttling type, set the indices.store.throttle.type property, which allows us to use the following values:
none: This value defines that no throttling is on
merge: This value defines that throttling affects only the merge process
all: This value defines that throttling is used for all the data store activities
The second property, indices.store.throttle.max_bytes_per_sec, describes how much the throttling limits the I/O operations. As its name suggests, it tells us how many bytes can be processed per second. For example, let's look at the following configuration:
indices.store.throttle.type: merge
indices.store.throttle.max_bytes_per_sec: 10mb
In this example, we limit the merge operations to 10 megabytes per second. By default, Elasticsearch uses the merge throttling type with the max_bytes_per_sec property set to 20mb. This means that all the merge operations are limited to 20 megabytes per second.
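A quick back-of-the-envelope calculation shows what such a limit means in practice. The following Python sketch estimates the minimum time needed to write a merged segment at a given throttling limit; it assumes that 1gb equals 1024mb and ignores everything apart from the limit itself, and the seconds_to_write function is hypothetical:

```python
def seconds_to_write(size_mb, max_mb_per_sec):
    """Lower bound, in seconds, for writing size_mb at the given throttle limit."""
    return size_mb / max_mb_per_sec


# A segment of 5gb, the default maximum size produced by normal merging.
segment_mb = 5 * 1024
print(seconds_to_write(segment_mb, 20))  # default limit of 20mb per second
print(seconds_to_write(segment_mb, 10))  # limit from the example above
```

Under these assumptions, writing the largest normally merged segment takes at least 256 seconds with the default limit and at least 512 seconds with the limit of 10mb.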
Introduction to routing
By default, Elasticsearch will try to distribute your documents evenly among all the shards of the index. However, that's not always the desired situation. In order to retrieve the documents, Elasticsearch must query all the shards and merge the results. What if we could divide our data on some basis (for example, the client identifier) and use that information to put data with the same properties in the same place in the cluster? Elasticsearch allows us to do that by exposing a powerful document and query distribution control mechanism: routing. In short, it allows us to choose a shard to be used to index or search the data.
Default indexing
During indexing operations, when you send a document for indexing, Elasticsearch looks at its identifier to choose the shard in which the document should be indexed. By default, Elasticsearch calculates the hash value of the document's identifier and, on the basis of that, it puts the document in one of the available primary shards. Then, those documents are redistributed to the replicas. The following diagram shows a simple illustration of how indexing works by default:
Default searching
Searching is a bit different from indexing, because in most situations you need to query all the shards to get the data you are interested in (we will talk about that in Chapter 3, Searching Your Data), at least in the initial scatter phase of the query. Imagine a situation when you have the following mappings describing your index:
{
"mappings":{
"post":{
"properties":{
"id":{"type":"long"},
"name":{"type":"string"},
"contents":{"type":"string"},
"userId":{"type":"long"}
}}
}}
As you can see, our index consists of four fields: the identifier (the id field), name of the document (the name field), contents of the document (the contents field), and the identifier of the user to which the documents belong (the userId field). To get all the documents for a particular user, one with userId equal to 12, you can run the following query:
curl -XGET 'http://localhost:9200/posts/_search?q=userId:12'
Depending on the search type (we will talk more about it in Chapter 3, Searching Your Data), Elasticsearch will run your query. It usually means that it will first query all the nodes for the identifiers and score of the matching documents and then it will send an internal query again, but only to the relevant shards (the ones containing the needed documents) to get the documents needed to build the response.
A very simplified view of how the default searching works during its initial phase is shown in the following illustration:
What if we could put all the documents for a single user into a single shard and query on that shard? Wouldn't that be wise for performance? Yes, that is handy and that is what routing allows you to do.
Routing
Routing can control which shard your documents and queries will be forwarded to. By now, you will probably have guessed that we can specify the routing value both during indexing and during querying and, in fact, if you decide to specify explicit routing values, you'll probably want to do that during indexing and searching.
In our case, we will use the userId value to set routing during indexing and the same value will be used during searching. Because we will use the same routing value for all the documents for a single user, the same hash value will be calculated and thus all the documents for that particular user will be placed in the same shard. Using the same value during search will result in searching a single shard instead of the whole index.
There is one thing you should remember when using routing when searching. When searching, you should add a query part that will limit the returned documents to the ones for the given user. Routing is not enough. This is because you'll probably have more distinct routing values than the number of shards your index will be built with. For example, you can have 10 shards building your index, but at the same time have hundreds of users. It is physically impossible to dedicate a single shard to only a single user. It is usually not good from a scaling point of view as well. Because of that, a few distinct values can point to the same shard; in our case, data of a few users will be placed in the same shard. Because of that, we need a query part that will limit the data to a particular user identifier, such as a term query.
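Both observations, that the same routing value always lands in the same shard and that many routing values have to share a shard, can be seen in a short Python sketch. It is a simplified model: the shard is chosen as the hash of the routing value modulo the number of shards, and zlib.crc32 is used only as a deterministic stand-in, which is an assumption of this example and not the hash function used by Elasticsearch. The pick_shard function is hypothetical:

```python
import zlib


def pick_shard(routing_value, number_of_shards):
    """Map a routing value to a shard number (simplified model)."""
    # Stand-in hash: zlib.crc32 is NOT the function Elasticsearch uses;
    # it is chosen here only because it is deterministic.
    hashed = zlib.crc32(str(routing_value).encode("utf-8"))
    return hashed % number_of_shards


NUMBER_OF_SHARDS = 10

# The same routing value always points to the same shard.
print(pick_shard(12, NUMBER_OF_SHARDS) == pick_shard("12", NUMBER_OF_SHARDS))

# One hundred users distributed over ten shards: some shards must be shared.
users_per_shard = {}
for user_id in range(1, 101):
    shard = pick_shard(user_id, NUMBER_OF_SHARDS)
    users_per_shard.setdefault(shard, []).append(user_id)
print(max(len(users) for users in users_per_shard.values()))
```

Whatever hash function is used, at least one shard has to hold the data of 10 or more users in this setup, which is why a routed search still needs a query part limiting the results to a single user.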
The following diagram shows a very simple illustration of how searching works with a provided custom routing value:
As you can see, Elasticsearch will send our query to a single shard. Now let's look at how we can specify the routing values.
The routing parameters
The idea is very simple. The endpoint used for all the operations connected with fetching or storing documents in Elasticsearch allows us to use an additional parameter called routing. You can add it to your HTTP request or set it by using the client library of your choice.
So, in order to index a sample document to the previously shown index, we will use the following command:
curl -XPUT 'http://localhost:9200/posts/post/1?routing=12' -d '{
"id":"1",
"name":"Test document",
"contents":"Test document",
"userId":"12"
}'
If we now get back to our previous query fetching our user's data and we modify it to use routing, it would look as follows:
curl -XGET 'http://localhost:9200/posts/_search?routing=12&q=userId:12'
As you can see, the same routing value was used during indexing and querying. This is possible in most cases when routing is used. We know which user data we are indexing and we will probably know which user is searching for the data. In our case, our imaginary user was given the identifier of 12 and we used that value during indexing and searching.
Note that during searching you can specify multiple routing values separated by commas. For example, if we want the preceding query to be additionally routed by the value of the section parameter (if it existed) and we also want to filter by this parameter, our query will look like the following:
curl -XGET 'http://localhost:9200/posts/_search?routing=12,6654&q=userId:12+AND+section:6654'
Of course, the preceding command can match multiple shards now as the values given to routing can point to multiple shards. You can query multiple shards at the same time and because of that multiple routing values can be provided during searching. During indexation, however, you need to provide only a single routing value, because Elasticsearch needs to be pointed to a single shard or indexation will fail.
Note
Remember that routing is not the only thing that is required to get results for a given user. That's because we usually have fewer shards than unique routing values. This means that we will have data from multiple users in a single shard. So, when using routing, you should also narrow down your results to the ones for a given user. You'll learn more about how you can do that in Chapter 3, Searching Your Data.
RoutingfieldsSpecifyingtheroutingvaluewitheachrequestiscriticalwhenusinganindexoperation.Withoutit,Elasticsearchusesthedefaultwayofdeterminingwherethedocumentshouldbestored–itusesthehashvalueofthedocumentidentifier.Thismayleadtoasituationwhereonedocumentexistsinmanyversionsondifferentshards.Asimilarsituationmayoccurwhenfetchingthedocument.Whenadocumentisstoredwithagivenroutingvalue,wemayhitthewrongshardandthedocumentmaybenotfound.
In fact, Elasticsearch allows us to change the default behavior and force the use of routing for a given type. To do that, we need to add the following section to our type definition:
"_routing":{
"required":true
}
The preceding definition means that the routing value needs to be provided (the "required":true property); without it, an index request will fail.
Summary
In this chapter, we’ve learned a lot when it comes to indexation and data handling in Elasticsearch. We started with basic information about Elasticsearch and we proceeded to tuning the schema-less behavior in Elasticsearch. We learned how to configure our mappings, use out of the box language analysis capabilities of Elasticsearch, and create our own mappings. We looked at batch indexing to speed up indexation and we added additional internal information to the documents in our indices. Finally, we looked at segment merging and routing.
In the next chapter, we will fully concentrate on searching and the extensive query language of Elasticsearch. We will start with how to query Elasticsearch and how the Elasticsearch query process works. We will learn about all the basic queries and compound queries to be able to use them in our applications. Finally, we will see which query should be chosen for the given use case.
Chapter 3. Searching Your Data
In the previous chapter, we dived into Elasticsearch indexing. We learned a lot when it comes to data handling. We saw how to tune the Elasticsearch schema-less mechanism and we now know how to create our own mappings. We also saw the core types of Elasticsearch and we used analyzers, both the ones that come out of the box with Elasticsearch and the ones we defined ourselves. We used bulk indexing and we added additional internal information to our indices. Finally, we learned what segment merging is, how we can fine tune it, how to use routing in Elasticsearch, and what it gives us. This chapter is fully dedicated to querying. By the end of this chapter, you will have learned the following topics:
How to query Elasticsearch
What happens internally when queries are run
What are the basic queries in Elasticsearch
What are the compound queries in Elasticsearch that allow us to group other queries
How to use position aware queries, that is, span queries
How to choose the right query for the job
Querying Elasticsearch
So far, when we have searched our data, we used the REST API and a simple query or the GET request. Similarly, when we were changing the index, we also used the REST API and sent the JSON-structured data to Elasticsearch. Regardless of the type of operation we wanted to perform, whether it was a mapping change or document indexation, we used a JSON-structured request body to inform Elasticsearch about the operation details.
A similar situation happens when we want to send more than a simple query to Elasticsearch: we structure it using JSON objects and send it to Elasticsearch in the request body. This is called the query DSL. In a broader view, Elasticsearch supports two kinds of queries: basic ones and compound ones. Basic queries, such as the term query, are used for querying the actual data. We will cover these in the Basic queries section of this chapter. The second type of query is the compound query, such as the bool query, which can combine multiple queries. We will cover these in the Compound queries section of this chapter.
However, this is not the whole picture. In addition to these two types of queries, certain queries can have filters that are used to narrow down your results with certain criteria. Filter queries don’t affect scoring and are usually very efficient and easily cached.
To make it even more complicated, queries can contain other queries (don’t worry; we will try to explain all this!). Furthermore, some queries can contain filters and others can contain both queries and filters. Although this is not everything, we will stick with this working explanation for now. We will go over this in greater detail in the Compound queries section in this chapter and the Filtering your results section in Chapter 4, Extending Your Querying Knowledge.
The example data
If not stated otherwise, the following mappings will be used for the rest of the chapter:
{
"book":{
"properties":{
"author":{
"type":"string"
},
"characters":{
"type":"string"
},
"copies":{
"type":"long",
"ignore_malformed":false
},
"otitle":{
"type":"string"
},
"tags":{
"type":"string",
"index":"not_analyzed"
},
"title":{
"type":"string"
},
"year":{
"type":"long",
"ignore_malformed":false,
"index":"analyzed"
},
"available":{
"type":"boolean"
}
}
}
}
The preceding mappings represent a simple library and were used to create the library index. One thing to remember is that Elasticsearch will analyze the string based fields if we don’t configure it differently.
The preceding mappings were stored in the mapping.json file and, in order to create the mentioned library index, we can use the following commands:
curl -XPOST 'localhost:9200/library'
curl -XPUT 'localhost:9200/library/book/_mapping' -d @mapping.json
We also used the following sample data as the examples for this chapter:
{"index":{"_index":"library","_type":"book","_id":"1"}}
{"title":"All Quiet on the Western Front","otitle":"Im Westen nichts Neues","author":"Erich Maria Remarque","year":1929,"characters":["Paul Bäumer","Albert Kropp","Haie Westhus","Fredrich Müller","Stanislaus Katczinsky","Tjaden"],"tags":["novel"],"copies":1,"available":true,"section":3}
{"index":{"_index":"library","_type":"book","_id":"2"}}
{"title":"Catch-22","author":"Joseph Heller","year":1961,"characters":["John Yossarian","Captain Aardvark","Chaplain Tappman","Colonel Cathcart","Doctor Daneeka"],"tags":["novel"],"copies":6,"available":false,"section":1}
{"index":{"_index":"library","_type":"book","_id":"3"}}
{"title":"The Complete Sherlock Holmes","author":"Arthur Conan Doyle","year":1936,"characters":["Sherlock Holmes","Dr. Watson","G. Lestrade"],"tags":[],"copies":0,"available":false,"section":12}
{"index":{"_index":"library","_type":"book","_id":"4"}}
{"title":"Crime and Punishment","otitle":"Преступлéние и наказáние","author":"Fyodor Dostoevsky","year":1886,"characters":["Raskolnikov","Sofia Semyonovna Marmeladova"],"tags":[],"copies":0,"available":true}
We stored our sample data in the documents.json file and we use the following command to index it:
curl -s -XPOST 'localhost:9200/_bulk' --data-binary @documents.json
This command runs bulk indexing. You can learn more about it in the Batch indexing to speed up your indexing process section in Chapter 2, Indexing Your Data.
A simple query
The simplest way to query Elasticsearch is to use the URI request query. We already discussed it in the Searching with the URI request query section of Chapter 1, Getting Started with Elasticsearch Cluster. For example, to search for the word crime in the title field, you could send a query using the following command:
curl -XGET 'localhost:9200/library/book/_search?q=title:crime&pretty'
This is a very simple, but limited, way of submitting queries to Elasticsearch. If we look from the point of view of the Elasticsearch query DSL, the preceding query is a query_string query. It searches for the documents that have the term crime in the title field and can be rewritten as follows:
{
"query":{
"query_string":{"query":"title:crime"}
}
}
Sending a query using the query DSL is a bit different, but still not rocket science. We send the HTTP GET request (POST is also accepted in case your tool or library doesn’t allow sending a request body in HTTP GET requests) to the _search REST endpoint as earlier and include the query in the request body. Let’s take a look at the following command:
curl -XGET 'localhost:9200/library/book/_search?pretty' -d '{
"query":{
"query_string":{"query":"title:crime"}
}
}'
As you can see, we used the request body (the -d switch) to send the whole JSON-structured query to Elasticsearch. The pretty request parameter tells Elasticsearch to structure the response in such a way that we humans can read it more easily. In response to the preceding command, we get the following output:
{
"took":4,
"timed_out":false,
"_shards":{
"total":5,
"successful":5,
"failed":0
},
"hits":{
"total":1,
"max_score":0.5,
"hits":[{
"_index":"library",
"_type":"book",
"_id":"4",
"_score":0.5,
"_source":{
"title":"Crime and Punishment",
"otitle":"Преступлéние и наказáние",
"author":"Fyodor Dostoevsky",
"year":1886,
"characters":["Raskolnikov","Sofia Semyonovna Marmeladova"],
"tags":[],
"copies":0,
"available":true
}
}]
}
}
Nice! We got our first search results with the query DSL.
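The same request can also be prepared from a program. The following is a minimal sketch using only the Python standard library; build_search_request is a hypothetical helper, the host and index names are taken from the examples in this chapter, and the Content-Type header is an addition of this sketch rather than something the text requires. The request is only built here, not sent, so no running node is needed to try it.

```python
import json
import urllib.request


def build_search_request(body, index="library", doc_type="book",
                         host="http://localhost:9200"):
    """Build (but do not send) a query DSL search request."""
    url = "%s/%s/%s/_search?pretty" % (host, index, doc_type)
    data = json.dumps(body).encode("utf-8")
    # The Content-Type header is an assumption of this sketch.
    headers = {"Content-Type": "application/json"}
    return urllib.request.Request(url, data=data, headers=headers, method="GET")


request = build_search_request(
    {"query": {"query_string": {"query": "title:crime"}}}
)
print(request.get_method(), request.full_url)

# To run the query against a live node, uncomment the following line:
# print(urllib.request.urlopen(request).read().decode("utf-8"))
```

If your HTTP client refuses to send a body with GET, passing method="POST" instead matches the alternative mentioned earlier.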
Paging and result size
Elasticsearch allows us to control how many results we want to get (at most) and from which result we want to start. The following are the two additional properties that can be set in the request body:
from: This property specifies the document that we want to have our results from. Its default value is 0, which means that we want to get our results from the first document.
size: This property specifies the maximum number of documents we want as the result of a single query (which defaults to 10). For example, if we are only interested in aggregations results and don’t care about the documents returned by the query, we can set this parameter to 0.
If we want our query to get documents starting from the tenth item on the list and fetch 20 documents, we send the following query:
curl -XGET 'localhost:9200/library/book/_search?pretty' -d '{
"from":9,
"size":20,
"query":{
"query_string":{"query":"title:crime"}
}
}'
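Because from is zero-based, applications that think in page numbers need a small conversion. The following sketch shows one way to do it; page_window and paged_query are hypothetical helper names, and pages are assumed to be numbered starting from 1.

```python
def page_window(page, page_size):
    """Translate a 1-based page number into the from and size properties."""
    if page < 1 or page_size < 0:
        raise ValueError("page must be >= 1 and page_size must be >= 0")
    return {"from": (page - 1) * page_size, "size": page_size}


def paged_query(query, page, page_size):
    """Return a request body that combines a query with its paging window."""
    body = page_window(page, page_size)
    body["query"] = query
    return body


query = {"query_string": {"query": "title:crime"}}
print(paged_query(query, page=1, page_size=20))
print(paged_query(query, page=2, page_size=20))
```

The example in the text (from set to 9) does not fall on a page boundary; it simply skips the first nine documents.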
Returning the version value
In addition to all the information returned, Elasticsearch can return the version of the document (we mentioned versioning in Chapter 1, Getting Started with Elasticsearch Cluster). To do this, we need to add the version property with the value of true to the top level of our JSON object. So, the final query, which requests the version information, will look as follows:
curl -XGET 'localhost:9200/library/book/_search?pretty' -d '{
"version":true,
"query":{
"query_string":{"query":"title:crime"}
}
}'
After running the preceding query, we get the following results:
{
"took":4,
"timed_out":false,
"_shards":{
"total":5,
"successful":5,
"failed":0
},
"hits":{
"total":1,
"max_score":0.5,
"hits":[{
"_index":"library",
"_type":"book",
"_id":"4",
"_version":1,
"_score":0.5,
"_source":{
"title":"Crime and Punishment",
"otitle":"Преступлéние и наказáние",
"author":"Fyodor Dostoevsky",
"year":1886,
"characters":["Raskolnikov","Sofia Semyonovna Marmeladova"],
"tags":[],
"copies":0,
"available":true
}
}]
}
}
As you can see, the _version section is present for the single hit we got.
Limiting the score
For nonstandard use cases, Elasticsearch provides a feature that lets us filter the results on the basis of a minimum score value that the document must have to be considered a match. In order to use this feature, we must provide the min_score value at the top level of our JSON object with the value of the minimum score. For example, if we want our query to only return documents with a score of at least 0.75, we send the following query:
curl -XGET 'localhost:9200/library/book/_search?pretty' -d '{
"min_score":0.75,
"query":{
"query_string":{"query":"title:crime"}
}
}'
We get the following response after running the preceding query:
{
"took":3,
"timed_out":false,
"_shards":{
"total":5,
"successful":5,
"failed":0
},
"hits":{
"total":0,
"max_score":null,
"hits":[]
}
}
If you look at the previous examples, the score of our document was 0.5, which is lower than 0.75, and thus we didn’t get any documents in response.
Limiting the score usually doesn’t make much sense because comparing scores between the queries is quite hard. However, maybe in your case, this functionality will be needed.
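The effect of min_score can be pictured as a filter applied to the scored hits. The sketch below emulates it on the client side with a hypothetical apply_min_score function; keeping a hit whose score equals the threshold is an assumption of this sketch, and in practice Elasticsearch applies the limit itself while the query runs.

```python
def apply_min_score(hits, min_score):
    """Keep only the hits whose _score is not lower than min_score."""
    return [hit for hit in hits if hit["_score"] >= min_score]


# The single hit returned by the earlier examples had a score of 0.5.
hits = [{"_id": "4", "_score": 0.5}]
print(apply_min_score(hits, 0.75))
print(apply_min_score(hits, 0.25))
```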
Choosing the fields that we want to return
With the use of the fields array in the request body, Elasticsearch allows us to define which fields to include in the response. Remember that you can only return these fields if they are marked as stored in the mappings used to create the index, or if the _source field was used (Elasticsearch uses the _source field to provide the stored values and the _source field is turned on by default).
So, for example, to return only the title and the year fields in the results (for each document), send the following query to Elasticsearch:
curl -XGET 'localhost:9200/library/book/_search?pretty' -d '{
"fields":["title","year"],
"query":{
"query_string":{"query":"title:crime"}
}
}'
In response, we get the following output:
{
"took":5,
"timed_out":false,
"_shards":{
"total":5,
"successful":5,
"failed":0
},
"hits":{
"total":1,
"max_score":0.5,
"hits":[{
"_index":"library",
"_type":"book",
"_id":"4",
"_score":0.5,
"fields":{
"title":["Crime and Punishment"],
"year":[1886]
}
}]
}
}
As you can see, everything worked as we wanted. There are four things we would like to share with you at this point, which are as follows:
If we don’t define the fields array, it will use the default value and return the _source field if available.
If we use the _source field and request a field that is not stored, then that field will be extracted from the _source field (however, this requires additional processing).
If we want to return all the stored fields, we just pass an asterisk (*) as the field name.
From a performance point of view, it’s better to return the _source field instead of multiple stored fields. This is because getting multiple stored fields may be slower compared to retrieving a single _source field.
Source filtering
In addition to choosing which fields are returned, Elasticsearch allows us to use so-called source filtering. This functionality allows us to control which fields are returned from the _source field. Elasticsearch exposes several ways to do this. The simplest source filtering allows us to decide whether the document source should be returned or not. Consider the following query:
curl -XGET 'localhost:9200/library/book/_search?pretty' -d '{
"_source":false,
"query":{
"query_string":{"query":"title:crime"}
}
}'
The result returned by Elasticsearch should be similar to the following one:
{
"took":12,
"timed_out":false,
"_shards":{
"total":5,
"successful":5,
"failed":0
},
"hits":{
"total":1,
"max_score":0.5,
"hits":[{
"_index":"library",
"_type":"book",
"_id":"4",
"_score":0.5
}]
}
}
Note that the response is limited to base information about a document and the _source field was not included. If you use Elasticsearch as a second source of data and the content of the document is served from an SQL database or cache, the document identifier is all you need.
The second way is similar to the fields array described earlier, although here we define which fields should be returned in the document source itself. Let’s see that using the following example query:
curl -XGET 'localhost:9200/library/book/_search?pretty' -d '{
"_source":["title","otitle"],
"query":{
"query_string":{"query":"title:crime"}
}
}'
We wanted to get the title and the otitle document fields in the returned _source field.
Elasticsearch extracted those values from the original _source value and included the _source field only with the requested fields. The whole response returned by Elasticsearch looked as follows:
{
"took":2,
"timed_out":false,
"_shards":{
"total":5,
"successful":5,
"failed":0
},
"hits":{
"total":1,
"max_score":0.5,
"hits":[{
"_index":"library",
"_type":"book",
"_id":"4",
"_score":0.5,
"_source":{
"otitle":"Преступлéние и наказáние",
"title":"Crime and Punishment"
}
}]
}
}
We can also use an asterisk to select which fields should be returned in the _source field; for example, title* will return values for the title field and for title10 (if we have such a field in our data). If we have documents with nested parts, we can use notation with a dot; for example, title.* to select all the fields nested under the title object.
Finally, we can also specify explicitly which fields we want to include and which to exclude from the _source field. We can include fields using the include property and we can exclude fields using the exclude property (both of them are arrays of values). For example, if we want the returned _source field to include all the fields starting with the letter t but not the title field, we will run the following query:
curl -XGET 'localhost:9200/library/book/_search?pretty' -d '{
"_source":{
"include":["t*"],
"exclude":["title"]
},
"query":{
"query_string":{"query":"title:crime"}
}
}'
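To see what the preceding include and exclude patterns select, the following sketch emulates them in plain Python. It is a simplified illustration, not the way Elasticsearch implements source filtering: filter_source is a hypothetical function, it handles top-level fields only, and it relies on fnmatchcase from the standard library for the asterisk matching.

```python
from fnmatch import fnmatchcase


def filter_source(source, include=None, exclude=None):
    """Keep fields matching any include pattern and no exclude pattern."""
    include = include or ["*"]
    exclude = exclude or []
    return {
        name: value
        for name, value in source.items()
        if any(fnmatchcase(name, pattern) for pattern in include)
        and not any(fnmatchcase(name, pattern) for pattern in exclude)
    }


# The Crime and Punishment document from the sample data.
SOURCE = {
    "title": "Crime and Punishment",
    "otitle": "Преступлéние и наказáние",
    "author": "Fyodor Dostoevsky",
    "year": 1886,
    "characters": ["Raskolnikov", "Sofia Semyonovna Marmeladova"],
    "tags": [],
    "copies": 0,
    "available": True,
}

print(filter_source(SOURCE, include=["t*"], exclude=["title"]))
print(sorted(filter_source(SOURCE, include=["title", "otitle"])))
```

For this document, only tags survives the t* pattern once title is excluded, because otitle does not start with the letter t.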
Using the script fields
Elasticsearch allows us to use script-evaluated values that will be returned with the result documents (we will discuss Elasticsearch scripting capabilities in greater detail in the Scripting capabilities of Elasticsearch section in Chapter 6, Make Your Search Better). To use the script fields functionality, we add the script_fields section to our JSON query object and an object with a name of our choice for each scripted value that we want to return. For example, to return a value named correctYear, which is calculated as the year field minus 1800, we run the following query:
curl -XGET 'localhost:9200/library/book/_search?pretty' -d '{
"script_fields":{
"correctYear":{
"script":"doc[\"year\"].value-1800"
}
},
"query":{
"query_string":{"query":"title:crime"}
}
}'
Note
By default, Elasticsearch doesn’t allow us to use dynamic scripting. If you tried the preceding query, you probably got an error with information stating that the scripts of type [inline] with operation [search] and language [groovy] are disabled. To make this example work, you should add the script.inline: on property to the elasticsearch.yml file. However, this exposes a security threat. Make sure to read the Scripting capabilities of Elasticsearch section in Chapter 6, Make Your Search Better, to learn about the consequences.
Using the doc notation, like we did in the preceding example, allows us to cache the results returned and speed up script execution at the cost of higher memory consumption. We are also limited to single-valued and single term fields. If we care about memory usage, or if we are using more complicated field values, we can always use the _source field. The same query using the _source field looks as follows:
curl -XGET 'localhost:9200/library/book/_search?pretty' -d '{
"script_fields":{
"correctYear":{
"script":"_source.year-1800"
}
},
"query":{
"query_string":{"query":"title:crime"}
}
}'
The following response is returned by Elasticsearch with dynamic scripting enabled:
{
"took":76,
"timed_out":false,
"_shards":{
"total":5,
"successful":5,
"failed":0
},
"hits":{
"total":1,
"max_score":0.5,
"hits":[{
"_index":"library",
"_type":"book",
"_id":"4",
"_score":0.5,
"fields":{
"correctYear":[86]
}
}]
}
}
As you can see, we got the calculated correctYear field in response.
Passing parameters to the script fields
Let’s take a look at one more feature of the script fields: the passing of additional parameters. Instead of having the value 1800 in the equation, we can use a variable name and pass its value in the params section. If we do this, our query will look as follows:
curl -XGET 'localhost:9200/library/book/_search?pretty' -d '{
"script_fields":{
"correctYear":{
"script":"_source.year-paramYear",
"params":{
"paramYear":1800
}
}
},
"query":{
"query_string":{"query":"title:crime"}
}
}'
As you can see, we added the paramYear variable as part of the scripted equation and provided its value in the params section. This allows Elasticsearch to execute the same script with different parameter values in a slightly more efficient way.
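In application code, this usually means keeping the script text constant and changing only the params section between requests. The sketch below builds such request bodies; script_field and correct_year_body are hypothetical helper names, and the bodies are only constructed here, not sent.

```python
import json


def script_field(script, **params):
    """Build one script_fields entry; values go into params, not the script text."""
    return {"script": script, "params": params}


def correct_year_body(base_year):
    """Request body that returns correctYear computed relative to base_year."""
    return {
        "script_fields": {
            "correctYear": script_field("_source.year-paramYear",
                                        paramYear=base_year)
        },
        "query": {"query_string": {"query": "title:crime"}},
    }


first = correct_year_body(1800)
second = correct_year_body(1900)
# The script string is identical in both bodies; only the parameter differs.
print(first["script_fields"]["correctYear"]["script"]
      == second["script_fields"]["correctYear"]["script"])
print(json.dumps(second, indent=2))
```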
Understanding the querying process
After reading the previous section, we now know how querying works in Elasticsearch. You know that Elasticsearch, in most cases, needs to scatter the query across multiple nodes, get the results, merge them, fetch the relevant documents from one or more shards, and return the final results to the client requesting the documents. What we didn’t talk about are two additional things that define how queries behave: search type and query execution preference. We will now concentrate on these functionalities of Elasticsearch.
Query logic
Elasticsearch is a distributed search engine and so all functionality provided must be distributed in its nature. It is exactly the same with querying. Because we would like to discuss some more advanced topics on how to control the query process, we first need to know how it works.
Let’s now get back to how querying works. We started the theory in the first chapter and we would like to get back to it. By default, if we don’t alter anything, the query process will consist of two phases: the scatter and the gather phase. The aggregator node (the one that receives the request) will run the scatter phase first. During that phase, the query is distributed to all the shards that our index is built from (of course if routing is not used). For example, if it is built of 5 shards and 1 replica then 5 physical shards will be queried (we don’t need to query a shard and its replica as they contain the same data). Each of the queried shards will only return the document identifier and the score of the document. The node that sent the scatter query will wait for all the shards to complete their task, gather the results, and sort them appropriately (in this case, from top scoring to the lowest scoring ones).
After that, a new request will be sent to build the search results, but this time only to those shards that hold the documents needed to build the response. In most cases, Elasticsearch won’t send the request to all the shards but only to a subset of them. That’s because we usually don’t get the complete result of the query but only a portion of it. This phase is called the gather phase. After all the documents are gathered, the final response is built and returned as the query result. This is the basic and default Elasticsearch behavior but we can change it.
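The two phases can be simulated in a few lines of Python. This is a toy model rather than Elasticsearch code: the SHARDS structure, the scores, and the scatter and gather functions are all invented for illustration, and each shard is just a dictionary held in memory.

```python
import heapq

# Hypothetical in-memory "shards": document id -> (score for the query, source).
SHARDS = [
    {"1": (0.2, {"title": "All Quiet on the Western Front"})},
    {"2": (0.9, {"title": "Catch-22"}),
     "3": (0.4, {"title": "The Complete Sherlock Holmes"})},
    {"4": (0.5, {"title": "Crime and Punishment"})},
]


def scatter(shards, size):
    """Scatter phase: each shard reports only (score, shard number, id) of its top hits."""
    partial = []
    for shard_no, shard in enumerate(shards):
        entries = ((score, shard_no, doc_id)
                   for doc_id, (score, _) in shard.items())
        partial.extend(heapq.nlargest(size, entries))
    return partial


def gather(shards, partial, size):
    """Gather phase: pick the global top hits, then fetch sources from their shards only."""
    winners = heapq.nlargest(size, partial)
    contacted = {shard_no for _, shard_no, _ in winners}
    hits = [
        {"_id": doc_id, "_score": score, "_source": shards[shard_no][doc_id][1]}
        for score, shard_no, doc_id in winners
    ]
    return hits, contacted


hits, contacted = gather(SHARDS, scatter(SHARDS, size=2), size=2)
print([hit["_id"] for hit in hits], sorted(contacted))
```

With size set to 2, all three shards take part in the scatter phase, but only the two shards holding the winning documents are contacted again in the gather phase.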
Search type
Elasticsearch allows us to choose how we want our query to be processed internally. We can do that by specifying the search type. There are different situations where different search types are appropriate: sometimes one can care only about the performance while sometimes query relevance is the most important factor. You should remember that each shard is a small Lucene index and, in order to return more relevant results, some information, such as frequencies, needs to be transferred between the shards. To control how the queries are executed, we can pass the search_type request parameter and set it to one of the following values:
query_then_fetch: In the first step, the query is executed to get the information needed to sort and rank the documents. This step is executed against all the shards. Then only the relevant shards are queried for the actual content of the documents. This is the search type used by default if no search type is provided with the query and this is the query type we described previously.
dfs_query_then_fetch: This is similar to query_then_fetch. However, compared to query_then_fetch, it contains an additional query phase which calculates distributed term frequencies.
There are also two deprecated search types: count and scan. The first one is deprecated starting from Elasticsearch 2.0 and the second one starting with Elasticsearch 2.1. The first search type used to provide benefits where only aggregations or the number of documents was relevant, but now it is enough to add size equal to 0 to your queries. The scan request was used for scrolling functionality.
So if we would like to use the simplest search type, we would run the following command:
curl -XGET 'localhost:9200/library/book/_search?pretty&search_type=query_then_fetch' -d '{
"query":{
"term":{"title":"crime"}
}
}'
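Why distributed term frequencies matter for dfs_query_then_fetch can be shown with a small calculation. The sketch below assumes a classic TF-IDF style inverse document frequency, 1 + ln(N / (df + 1)), and two invented shards with very different statistics for the same term; the actual scoring formula depends on the similarity configured in Elasticsearch.

```python
import math


def idf(doc_count, doc_freq):
    """Inverse document frequency in a classic TF-IDF style (an assumption here)."""
    return 1.0 + math.log(doc_count / (doc_freq + 1))


# Invented per-shard statistics for a single term.
SHARD_STATS = [
    {"docs": 100, "df": 1},   # the term is rare on this shard
    {"docs": 100, "df": 50},  # the term is common on this shard
]

# query_then_fetch style: every shard scores with its own local statistics.
local_idfs = [idf(stats["docs"], stats["df"]) for stats in SHARD_STATS]

# dfs_query_then_fetch style: statistics are summed first, then shared by all shards.
global_idf = idf(sum(stats["docs"] for stats in SHARD_STATS),
                 sum(stats["df"] for stats in SHARD_STATS))

print(local_idfs, global_idf)
```

With local statistics the same term is weighted differently on each shard, so scores coming from different shards are not directly comparable. The extra phase trades some performance for a single, shared weight.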
Search execution preference
In addition to the possibility of controlling how the query is executed, we can also control on which shards to execute the query. By default, Elasticsearch uses shards and replicas on any node in a round robin manner, so that each shard is queried a similar number of times. The default behavior is the proper method of shard execution preference for most use cases. But there may be times when we want to change the default behavior. For example, you may want the search to only be executed on the primary shards. To do that, we can set the preference request parameter to one of the following values:
_primary: The operation will be only executed on the primary shards, so the replicas won’t be used. This can be useful when we need to use the latest information from the index but our data is not replicated right away.
_primary_first: The operation will be executed on the primary shards if they are available. If not, it will be executed on the other shards.
_replica: The operation will be executed only on the replica shards.
_replica_first: This operation is similar to _primary_first, but uses replica shards. The operation will be executed on the replica shards if possible and on the primary shards if the replicas are not available.
_local: The operation will be executed on the shards available on the node which the request was sent from and, if such shards are not present, the request will be forwarded to the appropriate nodes.
_only_node:node_id: This operation will be executed on the node with the provided node identifier.
_only_nodes:nodes_spec: This operation will be executed on the nodes that are defined in nodes_spec. This can be an IP address, a name, a name or IP address using wildcards, and so on. For example, if nodes_spec is set to 192.168.1.*, the operation will be run on the nodes with IP addresses starting with 192.168.1.
_prefer_node:node_id: Elasticsearch will try to execute the operation on the node with the provided identifier. However, if the node is not available, it will be executed on the nodes that are available.
_shards:1,2: Elasticsearch will execute the operation on the shards with the given identifiers; in this case, on shards with identifiers 1 and 2. The _shards parameter can be combined with other preferences, but the shards identifiers need to be provided first. For example, _shards:1,2;_local.
Custom value: Any custom, string value may be passed. Requests with the same values provided will be executed on the same shards.
For example, if we would like to execute a query only on the local shards, we would run the following command:
curl -XGET 'localhost:9200/library/_search?pretty&preference=_local' -d '{
"query":{
"term":{"title":"crime"}
}
}'
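Preference values such as _shards:1,2;_local contain characters that are worth encoding when the URL is built in code. The following sketch uses the Python standard library for that; search_url is a hypothetical helper, the host is the one used throughout this chapter, and the session-style custom value is an invented example.

```python
from urllib.parse import parse_qs, urlencode, urlsplit


def search_url(index, preference=None, host="localhost:9200"):
    """Build a _search URL, optionally with a percent-encoded preference value."""
    params = {"pretty": "true"}
    if preference is not None:
        params["preference"] = preference
    return "http://%s/%s/_search?%s" % (host, index, urlencode(params))


print(search_url("library", "_local"))
print(search_url("library", "_shards:1,2;_local"))
# A custom value (here an invented session identifier) keeps one user's
# requests on the same shards.
print(search_url("library", "session-8f3a"))
```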
Search shards API
When discussing the search preference, we would also like to mention the search shards API exposed by Elasticsearch. This API allows us to check which shards the query will be executed on. In order to use this API, run a request against the _search_shards REST endpoint. For example, to see how the query will be executed, we run the following command:
curl -XGET 'localhost:9200/library/_search_shards?pretty' -d '{"query":{"match_all":{}}}'
The response to the preceding command will be as follows:
{
"nodes":{
"my0DcA_MTImm4NE3cG3ZIg":{
"name":"Cloud9",
"transport_address":"127.0.0.1:9300",
"attributes":{}
}
},
"shards":[[{
"state":"STARTED",
"primary":true,
"node":"my0DcA_MTImm4NE3cG3ZIg",
"relocating_node":null,
"shard":0,
"index":"library",
"version":4,
"allocation_id":{
"id":"9ayLDbL1RVSyJRYIJkuAxg"
}
}],[{
"state":"STARTED",
"primary":true,
"node":"my0DcA_MTImm4NE3cG3ZIg",
"relocating_node":null,
"shard":1,
"index":"library",
"version":4,
"allocation_id":{
"id":"wfpvtaLER-KVyOsuD46Yqg"
}
}],[{
"state":"STARTED",
"primary":true,
"node":"my0DcA_MTImm4NE3cG3ZIg",
"relocating_node":null,
"shard":2,
"index":"library",
"version":4,
"allocation_id":{
"id":"zrLPWhCOSTmjlb8TY5rYQA"
}
}],[{
"state":"STARTED",
"primary":true,
"node":"my0DcA_MTImm4NE3cG3ZIg",
"relocating_node":null,
"shard":3,
"index":"library",
"version":4,
"allocation_id":{
"id":"efnvY7YcSz6X8X8USacA7g"
}
}],[{
"state":"STARTED",
"primary":true,
"node":"my0DcA_MTImm4NE3cG3ZIg",
"relocating_node":null,
"shard":4,
"index":"library",
"version":4,
"allocation_id":{
"id":"XJHW2J63QUKdh3bK3T2nzA"
}
}]]
}
As you can see, in the response returned by Elasticsearch, we have the information about the shards that will be used during the query process. Of course, with the search shards API, we can use additional parameters that control the querying process. These parameters are routing, preference, and local. We are already familiar with the first two. The local parameter is a Boolean (true or false) that allows us to tell Elasticsearch to use the cluster state information stored on the local node (setting local to true) instead of the one from the master node (setting local to false). This allows us to diagnose problems with cluster state synchronization.
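The following Python sketch shows one way the request URL for the search shards API could be assembled with the routing, preference, and local parameters. The helper name search_shards_url and the default host are assumptions made for illustration; only the endpoint and parameter names come from the text above.

```python
from urllib.parse import urlencode


def search_shards_url(index, host="localhost:9200", **params):
    """Build a _search_shards URL (hypothetical helper, not part of any client library).

    params may contain routing, preference, and local, as described in the text.
    """
    query = {"pretty": "true"}
    for name, value in params.items():
        # Booleans are sent as the lowercase strings true/false.
        if isinstance(value, bool):
            value = "true" if value else "false"
        query[name] = str(value)
    # urlencode escapes characters such as ':' , ',' and ';' used in preference values.
    return "http://{}/{}/_search_shards?{}".format(host, index, urlencode(query))


print(search_shards_url("library", preference="_local", local=True))
print(search_shards_url("library", preference="_shards:1,2;_local"))
```

The resulting URL can be passed to curl in the same way as in the commands shown earlier.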
Basic queries
Elasticsearch has extensive search and data analysis capabilities that are exposed in the form of different queries, filters, aggregates, and so on. In this section, we will concentrate on the basic queries provided by Elasticsearch. By basic queries we mean the ones that don’t combine other queries together but run on their own.
The term query
The term query is one of the simplest queries in Elasticsearch. It just matches the documents that have a term in a given field: the exact, not analyzed term. The simplest term query is as follows:
{
"query":{
"term":{
"title":"crime"
}
}
}
It will match the documents that have the term crime in the title field. Remember that the term query is not analyzed, so you need to provide the exact term that will match the term in the indexed document. Note that in our input data, we have the title field with the value of Crime and Punishment (uppercased), but we are searching for crime, because the Crime term becomes crime after analysis during indexing.
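To make the effect of analysis concrete, here is a small Python sketch. The simple_analyze function is only a rough stand-in for a real analyzer (it lowercases the text and splits it into word characters), and both function names are assumptions made for this illustration.

```python
import re


def simple_analyze(text):
    """Very rough stand-in for index-time analysis: lowercase and split into words."""
    return re.findall(r"\w+", text.lower())


def term_query_matches(indexed_terms, term):
    """A term query is not analyzed, so the term is compared verbatim."""
    return term in indexed_terms


indexed = simple_analyze("Crime and Punishment")
print(indexed)
print(term_query_matches(indexed, "crime"))  # the lowercased term is in the index
print(term_query_matches(indexed, "Crime"))  # the uppercased form was never indexed
```

This is why the example query searches for crime and not for Crime.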
In addition to the term we want to find, we can also include the boost attribute in our term query, which will affect the importance of the given term. We will talk more about boosts in the Introduction to Apache Lucene scoring section of Chapter 6, Make Your Search Better. For now, we just need to remember that it changes the importance of the given part of the query.
For example, to change our previous query and give our term query a boost of 10.0, send the following query:
{
"query":{
"term":{
"title":{
"value":"crime",
"boost":10.0
}
}
}
}
As you can see, the query changed a bit. Instead of a simple term value, we nested a new JSON object which contains the value property and the boost property. The value property should contain the term we are interested in and the boost property is the boost value we want to use.
The terms query
The terms query is an extension to the term query. It allows us to match documents that have certain terms in their contents instead of a single term. The term query allowed us to match a single, not analyzed term and the terms query allows us to match multiple of those. For example, let’s say that we want to get all the documents that have the terms novel or book in the tags field. To achieve this, we will run the following query:
{
"query":{
"terms":{
"tags":["novel","book"]
}
}
}
The preceding query returns all the documents that have one or both of the searched terms in the tags field. This is a key point to remember: the terms query will find documents having any of the provided terms.
The match all query
The match all query is one of the simplest queries available in Elasticsearch. It allows us to match all of the documents in the index. If we want to get all the documents from our index, we just run the following query:
{
"query":{
"match_all":{}
}
}
We can also include boost in the query, which will be given to all the documents matched by it. For example, if we want to add a boost of 2.0 to all the documents in our match all query, we will send the following query to Elasticsearch:
{
"query":{
"match_all":{
"boost":2.0
}
}
}
The type query
The type query is a very simple query that allows us to find all the documents with a certain type. For example, if we would like to search for all the documents with the book type in our library index, we will run the following query:
{
"query":{
"type":{
"value":"book"
}
}
}
The exists query
The exists query allows us to find all the documents that have a value in the defined field. For example, to find the documents that have a value in the tags field, we will run the following query:
{
"query":{
"exists":{
"field":"tags"
}
}
}
The missing query
Opposite to the exists query, the missing query returns the documents that have a null value or no value at all in a given field. For example, to find all the documents that don’t have a value in the tags field, we will run the following query:
{
"query":{
"missing":{
"field":"tags"
}
}
}
The common terms query
The common terms query is a modern Elasticsearch solution for improving query relevance and precision with common words when we are not using stop words (http://en.wikipedia.org/wiki/Stop_words). For example, a crime and punishment query results in three term queries and each of them has a cost in terms of performance. However, the and term is a very common one and its impact on the document score will be very low. The solution is the common terms query, which divides the query into two groups. The first group is the one with important terms, which are the ones that have lower frequency. The second group is the one with less important terms, which are the ones with high frequency. The query for the first group is executed first and Elasticsearch calculates the score for all of the terms from the first group. This way the low frequency terms, which are usually the ones that have more importance, are always taken into consideration. Then Elasticsearch executes the second query for the second group of terms, but calculates the score only for the documents matched by the first query. This way the score is only calculated for the relevant documents and thus higher performance can be achieved.
An example of the common terms query is as follows:
{
"query":{
"common":{
"title":{
"query":"crime and punishment",
"cutoff_frequency":0.001
}
}
}
}
The query can take the following parameters:
query: The actual query contents.
cutoff_frequency: The percentage (0.001 means 0.1%) or an absolute value (when the property is set to a value equal to or larger than 1). High and low frequency groups are constructed using this value. Setting this parameter to 0.001 means that the low frequency terms group will be constructed for terms having a frequency of 0.1% and lower.
low_freq_operator: This can be set to or or and, but defaults to or. It specifies the Boolean operator used for constructing queries in the low frequency term group. If we want all the terms to be present in a document for it to be considered a match, we should set this parameter to and.
high_freq_operator: This can be set to or or and, but defaults to or. It specifies the Boolean operator used for constructing queries in the high frequency term group. If we want all the terms to be present in a document for it to be considered a match, we should set this parameter to and.
minimum_should_match: Instead of using low_freq_operator and high_freq_operator, we can use minimum_should_match. Just like with the other queries, it allows us to specify the minimum number of terms that should be found in a document for it to be considered a match. We can also specify high_freq and low_freq inside the minimum_should_match object, which allows us to define the different number of terms that need to be matched for the high and low frequency terms.
boost: The boost given to the score of the documents.
analyzer: The name of the analyzer that will be used to analyze the query text, which defaults to the default analyzer.
disable_coord: Defaults to false and allows us to enable or disable the score factor computation that is based on the fraction of all the query terms that a document contains. Set it to true for less precise scoring, but slightly faster queries.
Note
Unlike the term and terms queries, the common terms query is analyzed by Elasticsearch.
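The grouping controlled by cutoff_frequency can be sketched in a few lines of Python. This is only an illustration of the idea described above, not Elasticsearch's implementation; the function name and the document frequencies used in the example are made up.

```python
def split_by_frequency(terms, doc_freq, num_docs, cutoff_frequency):
    """Split query terms into (low_frequency, high_frequency) groups.

    cutoff_frequency below 1 is read as a fraction of all documents,
    a value of 1 or more as an absolute number of documents.
    """
    if cutoff_frequency < 1:
        threshold = cutoff_frequency * num_docs
    else:
        threshold = cutoff_frequency
    low = [t for t in terms if doc_freq.get(t, 0) <= threshold]
    high = [t for t in terms if doc_freq.get(t, 0) > threshold]
    return low, high


# Hypothetical document frequencies in an index of 10,000 documents.
doc_freq = {"crime": 3, "and": 9000, "punishment": 2}
low, high = split_by_frequency(["crime", "and", "punishment"], doc_freq, 10000, 0.001)
print(low)
print(high)
```

With these made-up numbers, crime and punishment land in the low frequency group that is scored first, while and ends up in the high frequency group.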
The match query
The match query takes the value given in the query parameter, analyzes it, and constructs the appropriate query out of it. When using a match query, Elasticsearch will choose the proper analyzer for the field we choose, so you can be sure that the terms passed to the match query will be processed by the same analyzer that was used during indexing. Remember that the match query (and the multi_match query) doesn’t support Lucene query syntax; however, it perfectly fits as a query handler for your search box. The simplest match (and the default) query will look like the following:
{
"query":{
"match":{
"title":"crime and punishment"
}
}
}
The preceding query will match all the documents that have the terms crime, and, or punishment in the title field. However, the previous query is only the simplest one; there are multiple types of match query which we will discuss now.
The Boolean match query
The Boolean match query is a query which analyzes the provided text and makes a Boolean query out of it. This is also the default type for the match query. There are a few parameters which allow us to control the behavior of the Boolean match queries:
operator: This parameter can take the value of or or and, and controls which Boolean operator is used to connect the created Boolean clauses. The default value is or. If we want all the terms in our query to be matched, we should use the and Boolean operator.
analyzer: This specifies the name of the analyzer that will be used to analyze the query text and defaults to the default analyzer.
fuzziness: Providing the value of this parameter allows us to construct fuzzy queries. The value of this parameter can vary. For numeric fields, it should be set to a numeric value; for date based fields, it can be set to a millisecond or time value, such as 2h; and for text fields, it can be set to 0, 1, or 2 (the edit distance in the Levenshtein algorithm, https://en.wikipedia.org/wiki/Levenshtein_distance) or AUTO (which allows Elasticsearch to control how fuzzy queries are constructed and which is the preferred value). Finally, for text fields, it can also be set to values from 0.0 to 1.0, which results in the edit distance being calculated as the term length multiplied by (1.0 minus the provided fuzziness value). In general, the higher the fuzziness, the more difference between terms will be allowed.
prefix_length: This allows control over the behavior of the fuzzy query. For more information on the value of this parameter, refer to The fuzzy query section in this chapter.
max_expansions: This allows control over the behavior of the fuzzy query. For more information on the value of this parameter, refer to The fuzzy query section in this chapter.
zero_terms_query: This allows us to specify the behavior of the query when all the terms are removed by the analyzer (for example, because of stop words). It can be set to none or all, with none as the default. When set to none, no documents will be returned when the analyzer removes all the query terms. If it is set to all, all the documents will be returned.
cutoff_frequency: It allows dividing the query into two groups: one with high frequency terms and one with low frequency terms. Refer to the description of the common terms query to see how this parameter can be used.
lenient: When set to true (by default it is false), it allows us to ignore the exceptions caused by data incompatibility, such as trying to query numeric fields using a string value.
The parameters should be wrapped in the name of the field we are running the query against. So if we want to run a sample Boolean match query against the title field, we send a query as follows:
{
"query":{
"match":{
"title":{
"query":"crime and punishment",
"operator":"and"
}
}
}
}
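The difference between the or and and operators can be illustrated with a short Python sketch. It is a simplification that assumes whitespace tokenization and lowercasing in place of real analysis, and the function name is made up for this example.

```python
def boolean_match(doc_terms, query_text, operator="or"):
    """Return True if the document terms satisfy the analyzed query text.

    Analysis is approximated here by lowercasing and splitting on whitespace.
    """
    query_terms = query_text.lower().split()
    present = [term in doc_terms for term in query_terms]
    if operator == "and":
        return all(present)  # every query term must be in the document
    return any(present)      # a single matching term is enough


title_terms = {"crime", "and", "punishment"}
print(boolean_match(title_terms, "crime and justice", operator="or"))
print(boolean_match(title_terms, "crime and justice", operator="and"))
```

With the or operator the document matches because it contains crime and and; with the and operator it does not, because justice is missing.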
The phrase match query
A phrase match query is similar to the Boolean query, but, instead of constructing the Boolean clauses from the analyzed text, it constructs the phrase query. You may wonder what a phrase is when it comes to Lucene and Elasticsearch. It is two or more terms positioned one after another in order. The following parameters are available:
slop: An integer value that defines how many unknown words can be put between the terms in the text query for a match to be considered a phrase. The default value of this parameter is 0, which means that no additional words are allowed.
analyzer: This specifies the name of the analyzer that will be used to analyze the query text and defaults to the default analyzer.
A sample phrase match query against the title field looks like the following code:
{
"query":{
"match_phrase":{
"title":{
"query":"crime punishment",
"slop":1
}
}
}
}
Note that we removed the and term from our query, but because the slop is set to 1, it will still match our document because we allowed one term to be present between our terms.
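The effect of slop can be approximated with the following Python sketch. It is a deliberately simplified model that only counts extra words between query terms appearing in their original order; Lucene's real sloppy phrase matching also handles reordered terms, which is not covered here. The function name is an assumption.

```python
def phrase_matches(doc_tokens, query_tokens, slop=0):
    """Check whether query_tokens occur in order in doc_tokens with at most
    `slop` extra words in total between consecutive query terms."""

    def search(q_idx, prev_pos, used):
        if q_idx == len(query_tokens):
            return True
        for pos, token in enumerate(doc_tokens):
            if token != query_tokens[q_idx]:
                continue
            if prev_pos is None:
                gap = 0
            elif pos <= prev_pos:
                continue  # terms must keep their order in this simplified model
            else:
                gap = pos - prev_pos - 1  # words sitting between the two terms
            if used + gap <= slop and search(q_idx + 1, pos, used + gap):
                return True
        return False

    return search(0, None, 0)


title = ["crime", "and", "punishment"]
print(phrase_matches(title, ["crime", "punishment"], slop=0))
print(phrase_matches(title, ["crime", "punishment"], slop=1))
```

With slop set to 0 the phrase does not match because and sits between the two terms; with slop set to 1 that single extra word is tolerated.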
The match phrase prefix query
The last type of the match query is the match phrase prefix query. This query is almost the same as the phrase match query, but in addition, it allows prefix matches on the last term in the query text. Also, in addition to the parameters exposed by the match phrase query, it exposes an additional one: the max_expansions parameter, which controls how many prefixes the last term will be rewritten to. Our example query changed to the match_phrase_prefix query will look as follows:
{
"query":{
"match_phrase_prefix":{
"title":{
"query":"crime punishm",
"slop":1,
"max_expansions":20
}
}
}
}
Note that we didn’t provide the full crime and punishment phrase, but only crime punishm and still the query would match our document. This is because we used the match_phrase_prefix query combined with slop set to 1.
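How the last term might be expanded can be sketched as follows. This Python snippet assumes a plain sorted list as a stand-in for the index's term dictionary and a made-up helper name; it only illustrates the role of max_expansions, not the actual Lucene rewrite.

```python
def expand_prefix(term_dictionary, prefix, max_expansions):
    """Return up to max_expansions terms from the dictionary that start with prefix."""
    candidates = sorted(term for term in term_dictionary if term.startswith(prefix))
    return candidates[:max_expansions]


# Hypothetical terms indexed in the title field.
terms = ["crime", "punish", "punisher", "punishment", "punishments"]
print(expand_prefix(terms, "punishm", max_expansions=20))
print(expand_prefix(terms, "punishm", max_expansions=1))
```

A low max_expansions value keeps the rewritten query small, at the price of possibly skipping some terms that share the prefix.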
The multi match query
The multi match query is the same as the match query, but instead of running against a single field, it can be run against multiple fields with the use of the fields parameter. Of course, all the parameters you use with the match query can be used with the multi match query. So if we would like to modify our match query to be run against the title and otitle fields, we will run the following query:
{
"query":{
"multi_match":{
"query":"crime punishment",
"fields":["title^10","otitle"]
}
}
}
As shown in the preceding example, the nice thing about the multi match query is that the fields defined in it support boosting, so we can increase or decrease the importance of matches on certain fields.
However, this is not the only difference when it comes to comparison with the match query. We can also control how the query is run internally by using the type property and setting it to one of the following values:
best_fields: This is the default behavior, which finds documents having matches in any field from the defined ones, but sets the document score to the score of the best matching field. It is the most useful type when searching for multiple words and wanting to boost documents that have those words in the same field.
most_fields: This value finds documents that match any field and sets the score of the document to the combined score from all the matched fields.
cross_fields: This value treats the query as if all the terms were in one big field, thus returning documents matching any field.
phrase: This value uses the match_phrase query on each field and sets the score of the document to the score combined from all the fields.
phrase_prefix: This value uses the match_phrase_prefix query on each field and sets the score of the document to the score combined from all the fields.
In addition to the parameters mentioned in the match query and type, the multi match query exposes some additional ones allowing more control over its behavior:
tie_breaker: This allows us to specify the balance between the minimum and the maximum scoring query items and the value can be from 0.0 to 1.0. When used, the score of the document is equal to the best scoring element plus the tie_breaker multiplied by the score of all the other matching fields in the document. So, when set to 0.0, Elasticsearch will only use the score of the highest scoring matching element. You can read more about it in The dis_max query section in this chapter.
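The tie_breaker formula described above can be written down directly. The following Python sketch uses made-up per-field scores and a hypothetical function name; it only mirrors the arithmetic from the text.

```python
def multi_match_score(field_scores, tie_breaker=0.0):
    """Best field score plus tie_breaker times the scores of the other matching fields."""
    if not field_scores:
        return 0.0
    best = max(field_scores)
    others = sum(field_scores) - best
    return best + tie_breaker * others


# Hypothetical scores of three matching fields for one document.
scores = [2.0, 1.0, 0.5]
print(multi_match_score(scores, tie_breaker=0.0))
print(multi_match_score(scores, tie_breaker=0.5))
print(multi_match_score(scores, tie_breaker=1.0))
```

A tie_breaker of 0.0 keeps only the best field, while 1.0 adds up all the matching fields.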
The query string query
In comparison to the other queries available, the query string query supports full Apache Lucene query syntax, which we discussed earlier in the Lucene query syntax section of Chapter 1, Getting Started with Elasticsearch Cluster. It uses a query parser to construct an actual query using the provided text. An example query string query will look like the following code:
{
"query":{
"query_string":{
"query":"title:crime^10 +title:punishment -otitle:cat +author:(+Fyodor +dostoevsky)",
"default_field":"title"
}
}
}
Because we are familiar with the basics of the Lucene query syntax, we can discuss how the preceding query works. As you can see, we wanted to get the documents that may have the term crime in the title field and such documents should be boosted with the value of 10. Next, we wanted only the documents that have the term punishment in the title field and we didn’t want documents with the term cat in the otitle field. Finally, we told Lucene that we only wanted the documents that had the fyodor and dostoevsky terms in the author field.
Similar to most of the queries in Elasticsearch, the query string query provides quite a few parameters that allow us to control the query behavior and the list of parameters for this query is rather extensive:
query: This specifies the query text.
default_field: This specifies the default field the query will be executed against. It defaults to the index.query.default_field property, which is by default set to _all.
default_operator: This specifies the default logical operator (or or and) used when no operator is specified. The default value of this parameter is or.
analyzer: This specifies the name of the analyzer used to analyze the query provided in the query parameter.
allow_leading_wildcard: This specifies if a wildcard character is allowed as the first character of a term. It defaults to true.
lowercase_expanded_terms: This specifies if the terms that are a result of query rewrite should be lowercased. It defaults to true, which means that the rewritten terms will be lowercased.
enable_position_increments: This specifies if position increments should be turned on in the result query. It defaults to true.
fuzzy_max_expansions: This specifies the maximum number of terms into which the fuzzy query will be expanded, if a fuzzy query is used. It defaults to 50.
fuzzy_prefix_length: This specifies the prefix length for the generated fuzzy queries and defaults to 0. To learn more about it, look at the fuzzy query description.
phrase_slop: This specifies the phrase slop and defaults to 0. To learn more about it, look at the phrase match query description.
boost: This specifies the boost value which will be used and defaults to 1.0.
analyze_wildcard: This specifies if the terms generated by the wildcard query should be analyzed. It defaults to false, which means that those terms won’t be analyzed.
auto_generate_phrase_queries: This specifies if the phrase queries will be automatically generated from the query. It defaults to false, which means that the phrase queries won’t be automatically generated.
minimum_should_match: This controls how many of the generated Boolean should clauses should be matched against a document for the document to be considered a hit. The value can be provided as a percentage; for example, 50%, which would mean that at least 50 percent of the given terms should match. It can also be provided as an integer value, such as 2, which means that at least 2 terms must match.
fuzziness: This controls the behavior of the generated fuzzy query. Refer to the match query description for more information.
max_determinized_states: This defaults to 10000 and sets the number of states that the automaton can have for handling regular expression queries. It is used to disallow very expensive queries using regular expressions.
locale: This sets the locale that should be used for the conversion of string values. By default, it is set to ROOT.
time_zone: This sets the time zone that should be used by range queries that are run on date based fields.
lenient: This can take the value of true or false. If set to true, format-based failures will be ignored. By default, it is set to false.
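As an illustration of the two minimum_should_match forms mentioned in the list above, the following Python sketch converts a specification into a required number of matching clauses. It covers only positive integers and positive percentages, and it assumes that a percentage is rounded down; the function name is hypothetical.

```python
def required_matches(spec, optional_clauses):
    """Translate a minimum_should_match value into a number of clauses.

    Only positive integers (for example 2) and positive percentages
    (for example "50%") are handled; percentages are rounded down,
    which is an assumption of this sketch.
    """
    if isinstance(spec, str) and spec.endswith("%"):
        percent = int(spec[:-1])
        return optional_clauses * percent // 100
    return int(spec)


print(required_matches("50%", 4))
print(required_matches("50%", 3))
print(required_matches(2, 5))
```

For a query with four optional clauses, 50% therefore requires two of them to match.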
Note that Elasticsearch can rewrite the query string query and, because of that, Elasticsearch allows us to pass additional parameters that control the rewrite method. However, for more details about this process, go to the Understanding the querying process section in this chapter.
Running the query string query against multiple fields
It is possible to run the query string query against multiple fields. In order to do that, one needs to provide the fields parameter in the query body, which should hold the array of the field names. There are two methods of running the query string query against multiple fields: the default method uses the Boolean query to make queries and the other method can use the dis_max query.
In order to use the dis_max query, one should add the use_dis_max property in the query body and set it to true. An example query will look like the following code:
{
"query":{
"query_string":{
"query":"crime punishment",
"fields":["title","otitle"],
"use_dis_max":true
}
}
}
The simple query string query
The simple query string query uses one of the newest query parsers in Lucene, the SimpleQueryParser (https://lucene.apache.org/core/5_4_0/queryparser/org/apache/lucene/queryparser/simple/SimpleQueryParser.html). Similar to the query string query, it accepts Lucene query syntax as the query; however, unlike it, it never throws an exception when a parsing error happens. Instead of throwing an exception, it discards the invalid parts of the query and runs the rest.
An example simple query string query will look like the following code:
{
"query":{
"simple_query_string":{
"query":"crime punishment",
"default_operator":"or"
}
}
}
The query supports parameters such as query, fields, default_operator, analyzer, lowercase_expanded_terms, locale, lenient, and minimum_should_match, and can also be run against multiple fields using the fields property.
The identifiers query
This is a simple query that filters the returned documents to only those with the provided identifiers. It works on the internal _uid field, so it doesn’t require the _id field to be enabled. The simplest version of such a query will look like the following:
{
"query":{
"ids":{
"values":["1","2","3"]
}
}
}
This query will only return those documents that have one of the identifiers present in the values array. We can complicate the identifiers query a bit and also limit the documents on the basis of their type. For example, if we want to only include documents from the book type, we will send the following query:
{
"query":{
"ids":{
"type":"book",
"values":["1","2","3"]
}
}
}
As you can see, we’ve added the type property to our query and we’ve set its value to the type we are interested in.
The prefix query
This query is similar to the term query in its configuration and to the multi term query when looking into its logic. The prefix query allows us to match documents that have the value in a certain field that starts with a given prefix. For example, if we want to find all the documents that have values starting with cri in the title field, we will run the following query:
{
"query":{
"prefix":{
"title":"cri"
}
}
}
Similar to the term query, you can also include the boost attribute in your prefix query which will affect the importance of the given prefix. For example, if we would like to change our previous query and give our query a boost of 3.0, we will send the following query:
{
"query":{
"prefix":{
"title":{
"value":"cri",
"boost":3.0
}
}
}
}
Note
The prefix query is rewritten by Elasticsearch and because of that Elasticsearch allows us to pass an additional parameter that controls the rewrite method. For more details about that process, refer to the Understanding the querying process section in this chapter.
The fuzzy query
The fuzzy query allows us to find documents that have values similar to the ones we’ve provided in the query. The similarity of terms is calculated on the basis of the edit distance algorithm. The edit distance is calculated between the terms we provide in the query and the terms in the searched documents. This query can be expensive when it comes to CPU resources, but can help us when we need fuzzy matching; for example, when users make spelling mistakes. In our example, let’s assume that instead of crime, our user enters the crme word into the search box and we would like to run the simplest form of fuzzy query. Such a query will look like this:
{
"query":{
"fuzzy":{
"title":"crme"
}
}
}
The response for such a query will be as follows:
{
"took":81,
"timed_out":false,
"_shards":{
"total":5,
"successful":5,
"failed":0
},
"hits":{
"total":1,
"max_score":0.5,
"hits":[{
"_index":"library",
"_type":"book",
"_id":"4",
"_score":0.5,
"_source":{
"title":"Crime and Punishment",
"otitle":"Преступлéние и наказáние",
"author":"Fyodor Dostoevsky",
"year":1886,
"characters":["Raskolnikov","Sofia Semyonovna Marmeladova"],
"tags":[],
"copies":0,
"available":true
}
}]
}
}
Even though we made a typo, Elasticsearch managed to find the document we were interested in.
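The edit distance between the mistyped and the indexed term can be checked with a textbook Levenshtein implementation. The Python sketch below is the classic dynamic programming version (insertions, deletions, and substitutions only) and is meant as an illustration, not as the exact algorithm Lucene runs internally.

```python
def levenshtein(a, b):
    """Classic edit distance: minimum number of single-character insertions,
    deletions, and substitutions needed to turn a into b."""
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution
        previous = current
    return previous[-1]


print(levenshtein("crme", "crime"))
```

A single inserted letter separates crme from crime, which is why even a small fuzziness value is enough to find the document.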
We can control the fuzzy query behavior by using the following parameters:
value: This specifies the actual query.
boost: This specifies the boost value for the query. It defaults to 1.0.
fuzziness: This controls the behavior of the generated fuzzy query. Refer to the match query description for more information.
prefix_length: This is the length of the common prefix of the differencing terms. It defaults to 0.
max_expansions: This specifies the maximum number of terms the query will be expanded to. The default value is unbounded.
The parameters should be wrapped in the name of the field we are running the query against. So if we would like to modify the previous query and add additional parameters, the query will look like the following code:
{
"query":{
"fuzzy":{
"title":{
"value":"crme",
"fuzziness":2
}
}
}
}
The wildcard query
The wildcard query allows us to use the * and ? wildcards in the values we search. Apart from that, the wildcard query is very similar to the term query in case of its body. To send a query that would match all the documents with the value of the cr?me term (? matching any single character) we would send the following query:
{
"query":{
"wildcard":{
"title":"cr?me"
}
}
}
It will match the documents that have terms matching cr?me in the title field. However, you can also include the boost attribute in your wildcard query which will affect the importance of each term that matches the given value. For example, if we would like to change our previous query and give our query a boost of 20.0, we will send the following query:
{
"query":{
"wildcard":{
"title":{
"value":"cr?me",
"boost":20.0
}
}
}
}
Note
Wildcard queries are not very performance oriented and should be avoided if possible; especially avoid leading wildcards (terms starting with wildcards). The wildcard query is rewritten by Elasticsearch and because of that Elasticsearch allows us to pass an additional parameter that controls the rewrite method. For more details about this process, refer to the Understanding the querying process section in this chapter. Also remember that the wildcard query is not analyzed.
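The meaning of the two wildcard characters can be shown by translating a pattern into a regular expression. This Python sketch is an illustration only and both function names are made up; note that, like the wildcard query itself, it compares terms as they are, without any analysis.

```python
import re


def wildcard_to_regex(pattern):
    """Translate * (any sequence of characters) and ? (any single character)
    into a regular expression; every other character is matched literally."""
    parts = []
    for char in pattern:
        if char == "*":
            parts.append(".*")
        elif char == "?":
            parts.append(".")
        else:
            parts.append(re.escape(char))
    return "".join(parts)


def wildcard_matches(pattern, term):
    """Check whether the whole term matches the wildcard pattern."""
    return re.fullmatch(wildcard_to_regex(pattern), term, flags=re.DOTALL) is not None


print(wildcard_matches("cr?me", "crime"))
print(wildcard_matches("cr?me", "crme"))
print(wildcard_matches("cr?me", "Crime"))
```

The last call shows why the lack of analysis matters: the uppercased Crime does not match the lowercase pattern.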
www.EBooksWorld.ir
![Page 223: dl.ebooksworld.irdl.ebooksworld.ir/motoman/Packt.Elasticsearch.Server.3rd.Edition.w… · Table of Contents Elasticsearch Server Third Edition Credits About the Authors About the](https://reader035.vdocuments.mx/reader035/viewer/2022070908/5f87a74e61533a78dd166ada/html5/thumbnails/223.jpg)
The range query
A query that allows us to find documents that have a field value within a certain range. It works for numerical fields as well as for string-based and date-based fields (it just maps to a different Apache Lucene query). The range query should be run against a single field and the query parameters should be wrapped in the field name. The following parameters are supported:
gte: The query will match documents with the value greater than or equal to the one provided with this parameter
gt: The query will match documents with the value greater than the one provided with this parameter
lte: The query will match documents with the value lower than or equal to the one provided with this parameter
lt: The query will match documents with the value lower than the one provided with this parameter
So for example, if we want to find all the books that have the value from 1700 to 1900 in the year field, we will run the following query:
{
"query":{
"range":{
"year":{
"gte":1700,
"lte":1900
}
}
}
}
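The difference between the inclusive and exclusive bounds is easy to get wrong, so here is a small Python sketch of the comparison each parameter implies. The in_range helper and the list of years are hypothetical and serve only as an illustration.

```python
def in_range(value, gte=None, gt=None, lte=None, lt=None):
    """Check a value against the bounds in the way the range parameters describe."""
    if gte is not None and not value >= gte:
        return False
    if gt is not None and not value > gt:
        return False
    if lte is not None and not value <= lte:
        return False
    if lt is not None and not value < lt:
        return False
    return True


# Hypothetical values of the year field
years = [1699, 1700, 1886, 1900, 1936]

# gte/lte include the boundary values, gt/lt exclude them.
print([y for y in years if in_range(y, gte=1700, lte=1900)])
print([y for y in years if in_range(y, gt=1700, lt=1900)])
```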
Regular expression query
The regular expression query allows us to use regular expressions as the query text. Remember that the performance of such queries depends on the chosen regular expression. If our regular expression would match many terms, the query will be slow. The general rule is that the more terms matched by the regular expression, the slower the query will be.
An example regular expression query looks like this:
{
"query":{
"regexp":{
"title":{
"value":"cr.m[ae]",
"boost":10.0
}
}
}
}
The preceding query will result in Elasticsearch rewriting the query. The number of term queries in the rewritten query depends on how many terms in our index match the given regular expression. The boost parameter seen in the query specifies the boost value for the generated queries.
The full regular expression syntax accepted by Elasticsearch can be found at https://www.elastic.co/guide/en/elasticsearch/reference/current/query-dsl-regexp-query.html#regexp-syntax.
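The rewriting described above can be pictured with a short Python sketch. It is a conceptual illustration only: the index terms are hypothetical, the expand_regexp helper is ours, the real rewritten query depends on the rewrite method in use, and Lucene's regular expression syntax is not the same as that of Python's re module (this particular pattern happens to mean the same thing in both).

```python
import re


def expand_regexp(field, pattern, boost, index_terms):
    """Conceptually expand a regexp query into one term query per matching term."""
    compiled = re.compile(pattern)
    # The pattern has to match the whole term, not just a part of it.
    matching = [term for term in index_terms if compiled.fullmatch(term)]
    return {
        "bool": {
            "should": [
                {"term": {field: {"value": term, "boost": boost}}}
                for term in matching
            ]
        }
    }


# Hypothetical terms indexed in the title field
index_terms = ["crime", "crema", "cream", "creme", "crimes"]
rewritten = expand_regexp("title", "cr.m[ae]", 10.0, index_terms)
print([clause["term"]["title"]["value"] for clause in rewritten["bool"]["should"]])
```

The more index terms the expression matches, the more clauses the expanded query gets, which is the reason for the performance rule given earlier.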
The more like this query
One of the queries that got a major rework in Elasticsearch 2.0, the more like this query allows us to retrieve documents that are similar (or not similar) to the provided text or to the provided documents.
Elasticsearch supports a few parameters to define how the more like this query should work:
fields: An array of fields that the query should be run against. It defaults to the _all field.
like: This parameter comes in two flavors: it allows us to provide a text which the returned documents should be similar to, or an array of documents that the returned documents should be similar to.
unlike: This is similar to the like parameter, but it allows us to define text or documents that our returned documents should not be similar to.
min_term_freq: The minimum term frequency (for the terms in the documents) below which terms will be ignored. It defaults to 2.
max_query_terms: The maximum number of terms that will be included in any generated query. It defaults to 25. A higher value may mean higher precision, but lower performance.
stop_words: An array of words that will be ignored when comparing documents and the query. It is empty by default.
min_doc_freq: The minimum number of documents in which the term has to be present in order not to be ignored. It defaults to 5, which means that a term needs to be present in at least five documents.
max_doc_freq: The maximum number of documents in which the term may be present in order not to be ignored. By default, it is unbounded (set to 0).
min_word_len: The minimum length of a single word below which a word will be ignored. It defaults to 0.
max_word_len: The maximum length of a single word above which it will be ignored. It defaults to unbounded (which means setting the value to 0).
boost_terms: The boost value that will be used when boosting each term. It defaults to 0.
boost: The boost value that will be used when boosting the query. It defaults to 1.
include: This specifies if the input documents should be included in the results returned by the query. It defaults to false, which means that the input documents won't be included.
minimum_should_match: This controls the number of terms that need to be matched in the resulting documents. By default, it is set to 30%.
analyzer: The name of the analyzer that will be used to analyze the text we provided.
An example of a more like this query looks like this:
{
"query":{
"more_like_this":{
"fields":["title","otitle"],
"like":"crime and punishment",
"min_term_freq":1,
"min_doc_freq":1
}
}
}
As we said earlier, the like property can also be used to point to the documents that the results should be similar to. For example, the following query uses the like property to point to a given document (note that the following query won't return documents on our example data):
{
"query":{
"more_like_this":{
"fields":["title","otitle"],
"min_term_freq":1,
"min_doc_freq":1,
"like":[
{
"_index":"library",
"_type":"book",
"_id":"4"
}
]
}
}
}
We can also mix the documents and text together:
{
"query":{
"more_like_this":{
"fields":["title","otitle"],
"min_term_freq":1,
"min_doc_freq":1,
"like":[
{
"_index":"library",
"_type":"book",
"_id":"4"
},
"crime and punishment"
]
}
}
}
Compound queries
In the Basic queries section of this chapter, we discussed the simplest queries exposed by Elasticsearch. We also talk about the position-aware queries, called span queries, in the Using span queries section. However, the simple ones and the span queries are not the only queries that Elasticsearch provides. The compound queries, as we call them, allow us to connect multiple queries together or alter the behavior of other queries. You may wonder if you need such functionality. Your deployment may not need it, but anything apart from a simple query will probably require compound queries. For example, combining a simple term query with a match_phrase query to get better search results may be a good candidate for compound queries usage.
The bool query
The bool query allows us to wrap a virtually unbounded number of queries and connect them with a logical value using one of the following sections:
should: The query wrapped into this section may or may not match. The number of should sections that have to match is controlled by the minimum_should_match parameter.
must: The query wrapped into this section must match in order for the document to be returned.
must_not: The query wrapped into this section must not match in order for the document to be returned.
Each of the preceding sections can contain multiple queries in a single bool query. This allows us to build very complex queries that have multiple levels of nesting (you can include the bool query in another bool query). Remember that the score of the resulting document will be calculated by taking the sum of the scores of all the wrapped queries that the document matched.
In addition to the preceding sections, we can add the following parameters to the query body to control its behavior:
filter: This allows us to specify the part of the query that should be used as a filter. You can read more about filters in the Filtering your results section in Chapter 4, Extending Your Querying Knowledge.
boost: This specifies the boost used in the query, defaulting to 1.0. The higher the boost, the higher the score of the matching document.
minimum_should_match: This describes the minimum number of should clauses that have to match in order for the checked document to be counted as a match. For example, it can be an integer value such as 2 or a percentage value such as 75%. For more information, refer to https://www.elastic.co/guide/en/elasticsearch/reference/current/query-dsl-minimum-should-match.html.
disable_coord: A Boolean parameter (defaults to false), which allows us to enable or disable the score factor computation that is based on the fraction of all the query terms that a document contains. We should set it to true for less precise scoring, but slightly faster queries.
Imagine that we want to find all the documents that have the term crime in the title field. In addition, the documents may or may not have a value in the range of 1900 to 2000 in the year field and must not have the nothing term in the otitle field. Such a query made with the bool query will look as follows:
{
"query":{
"bool":{
"must":{
"term":{
"title":"crime"
}
},
"should":{
"range":{
"year":{
"from":1900,
"to":2000
}
}
},
"must_not":{
"term":{
"otitle":"nothing"
}
}
}
}
}
Note
Note that the must, should, and must_not sections can contain a single query or an array of queries.
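The matching logic of the three sections can be sketched in Python as follows. This is a simplified model, not Elasticsearch code: the documents are hypothetical, clauses are modeled as plain functions, scoring and the filter section are left out, and the default for minimum_should_match (at least one should clause only when there is no must clause) is our assumption about the default behavior.

```python
def bool_matches(doc, must=(), should=(), must_not=(), minimum_should_match=None):
    """Decide whether a document matches a simplified bool query."""
    if not all(clause(doc) for clause in must):
        return False
    if any(clause(doc) for clause in must_not):
        return False
    if minimum_should_match is None:
        # Assumed default: should clauses are optional when a must clause exists.
        minimum_should_match = 0 if must else (1 if should else 0)
    matched_should = sum(1 for clause in should if clause(doc))
    return matched_should >= minimum_should_match


# Clauses mirroring the example query above
def title_has_crime(doc):
    return "crime" in doc["title"]

def year_1900_to_2000(doc):
    return 1900 <= doc["year"] <= 2000

def otitle_has_nothing(doc):
    return "nothing" in doc["otitle"]


# Hypothetical documents, already split into terms
docs = [
    {"id": 1, "title": ["crime", "and", "punishment"], "year": 1886, "otitle": ["prestuplenie"]},
    {"id": 2, "title": ["crime", "story"], "year": 1950, "otitle": ["nothing", "new"]},
    {"id": 3, "title": ["war", "and", "peace"], "year": 1950, "otitle": ["voyna"]},
    {"id": 4, "title": ["modern", "crime"], "year": 1950, "otitle": ["modern"]},
]

matching_ids = [
    d["id"] for d in docs
    if bool_matches(d, must=[title_has_crime], should=[year_1900_to_2000],
                    must_not=[otitle_has_nothing])
]
print(matching_ids)
```

In this model, document 1 is returned even though it misses the should clause; in Elasticsearch the should clause would still contribute to the score of document 4.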
The dis_max query
The dis_max query is very useful as it generates a union of documents returned by all the subqueries and returns it as the result. The good thing about this query is the fact that we can control how the lower scoring subqueries affect the final score of the documents. For the dis_max query, we specify the queries using the queries property (a query or an array of queries) and the tie breaker with the tie_breaker property. We can also include an additional boost by specifying the boost parameter.
The final document score is calculated as the score of the maximum scoring query plus the value of the tie_breaker parameter multiplied by the sum of the scores returned from the rest of the queries. So, the tie_breaker parameter allows us to control how the lower scoring queries affect the final score. If we set the tie_breaker parameter to 1.0, we get the exact sum, while setting the tie_breaker parameter to 0.1 results in only 10 percent of the scores (of all the scores apart from the maximum scoring query) being added to the final score.
An example of the dis_max query is as follows:
{
"query":{
"dis_max":{
"tie_breaker":0.99,
"boost":10.0,
"queries":[
{
"match":{
"title":"crime"
}
},
{
"match":{
"author":"fyodor"
}
}
]
}
}
}
As you can see, we included the tie_breaker and boost parameters. In addition to that, we specified the queries parameter that holds the array of queries that will be run and used to generate the union of documents for results.
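The scoring rule described for the dis_max query can be written down as a short Python sketch. The dis_max_score function and the sample subquery scores are hypothetical, and the sketch deliberately leaves out the boost parameter and any query normalization.

```python
def dis_max_score(sub_scores, tie_breaker=0.0):
    """Best subquery score plus tie_breaker times the sum of the other scores."""
    if not sub_scores:
        return 0.0
    best = max(sub_scores)
    others = sum(sub_scores) - best
    return best + tie_breaker * others


# Hypothetical scores of the two match subqueries for a single document
scores = [0.8, 0.5]

print(dis_max_score(scores, tie_breaker=0.0))  # only the best subquery counts
print(dis_max_score(scores, tie_breaker=1.0))  # behaves like a plain sum
print(dis_max_score(scores, tie_breaker=0.1))  # 10 percent of the other scores
```

With the tie_breaker of 0.99 used in the example query, a document matching both subqueries scores almost as high as it would with a plain sum.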
The boosting query
The boosting query wraps around two queries and lowers the score of the documents returned by one of the queries. There are three sections of the boosting query that need to be defined: the positive section that holds the query whose document score will be left unchanged, the negative section whose resulting documents will have their score lowered, and the negative_boost section that holds the boost value that will be used to lower the second section's query score. The advantage of the boosting query is that the documents matching the positive query stay in the results even when they also match the negative query; only their scores are lowered. For comparison, if we were to use the bool query with the must_not section, such documents would be removed from the results.
Let's assume that we want to have the results of a simple term query for the term crime in the title field and want the score of such documents to not be changed. However, for those of the matched documents that have a value in the range from 1800 to 1900 in the year field, we want the score to be multiplied by a boost of 0.5. Such a query will look like the following:
{
"query":{
"boosting":{
"positive":{
"term":{
"title":"crime"
}
},
"negative":{
"range":{
"year":{
"from":1800,
"to":1900
}
}
},
"negative_boost":0.5
}
}
}
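The effect of negative_boost on ranking can be illustrated with a small Python sketch. It assumes that the score of a document matching the negative query is simply multiplied by negative_boost; the helper name, the documents, and their scores are hypothetical.

```python
def boosting_score(positive_score, matches_negative, negative_boost):
    """Demote, but do not drop, a document that also matches the negative query."""
    if matches_negative:
        return positive_score * negative_boost
    return positive_score


# Hypothetical documents that all match the positive term query on title
candidates = [
    {"id": "A", "score": 1.0, "year": 1866},
    {"id": "B", "score": 0.9, "year": 1950},
]

ranked = sorted(
    (
        (doc["id"], boosting_score(doc["score"], 1800 <= doc["year"] <= 1900, 0.5))
        for doc in candidates
    ),
    key=lambda pair: pair[1],
    reverse=True,
)
print(ranked)
```

Document A stays in the results, but it drops below document B because its year falls into the demoted range.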
The constant_score query
The constant_score query wraps another query and returns a constant score for each document returned by the wrapped query. We specify the score that should be given to the documents by using the boost property, which defaults to 1.0. It allows us to strictly control the score value assigned for a document matched by a query. For example, if we want to have a score of 2.0 for all the documents that have the term crime in the title field, we send the following query to Elasticsearch:
{
"query":{
"constant_score":{
"query":{
"term":{
"title":"crime"
}
},
"boost":2.0
}
}
}
The indices query
The indices query is useful when executing a query against multiple indices. It allows us to provide an array of indices (the indices property) and two queries: one that will be executed if we query an index from the list (the query property) and a second that will be executed on all the other indices (the no_match_query property). For example, assume we have an alias named books, holding two indices: library and users. What we want to do is use this alias. However, we want to run different queries depending on which index is used for searching. An example query following this logic will look as follows:
{
"query":{
"indices":{
"indices":["library"],
"query":{
"term":{
"title":"crime"
}
},
"no_match_query":{
"term":{
"user":"crime"
}
}
}
}
}
In the preceding query, the query described in the query property was run against the library index and the query defined in the no_match_query section was run against all the other indices being searched, which for our hypothetical alias means the users index.
The no_match_query property can also have a string value instead of a query. This string value can either be all or none, and it defaults to all. If the no_match_query property is set to all, all the documents from the indices that are not on the list will be returned. Setting the no_match_query property to none will result in no documents being returned from those indices.
Using span queries
Elasticsearch leverages Lucene span queries, which allow us to make queries when some tokens or phrases are near other tokens or phrases. Basically, we can call them position aware queries. When using the standard non span queries, we are not able to make queries that are position aware; to some extent, the phrase queries allow that, but only to some extent. So, for Elasticsearch and the underlying Lucene, it doesn't matter if the term is in the beginning of the sentence or at the end or near another term. When using span queries, it does matter.
The following span queries are exposed in Elasticsearch:
span term query
span first query
span near query
span or query
span not query
span within query
span containing query
span multi query
Before we continue with the description, let's index a document to a completely new index that we will use to show how span queries work. To do this, we use the following command:
curl -XPUT 'localhost:9200/spans/book/1' -d '{
"title":"Test book",
"author":"Test author",
"description":"The world breaks everyone, and afterward, some are strong at the broken places"
}'
A span
A span, in our context, is a starting and ending token position in a field. For example, in our case, the world breaks everyone could be a single span, and world can be a single span too. As you may know, during analysis, Lucene, in addition to the token itself, includes some additional parameters, such as the position in the token stream. Position information combined with the terms allows us to construct spans using Elasticsearch span queries (which are mapped to Lucene span queries). In the next few pages, we will learn how to construct spans using different span queries and how to control which documents are matched.
Span term query
The span_term query is a building block for the other span queries. A span_term query is similar to the already discussed term query. On its own, it works just like the term query: it matches a term. Its definition is simple and looks as follows (we omitted some parts of the query on purpose, because we will discuss them later):
{
"query":{
...
"span_term":{
"description":{
"value":"world",
"boost":5.0
}
}
}
}
As you can see, it is very similar to the standard term query. The above query is run against the description field and we want to have the documents that have the world term returned. We also specified the boost, which is also allowed.
One thing to remember is that the span_term query, similar to the standard term query, is not analyzed.
Span first query
The span first query allows us to match documents that have matches only in the first positions of the field. In order to define a span first query, we need to nest any other span query inside of it; for example, a span term query we already know. So, let's find the document that has the term world in the first two positions in the description field. We do that by sending the following query:
{
"query":{
"span_first":{
"match":{
"span_term":{"description":"world"}
},
"end":2
}
}
}
In the results, we will get the document that we had indexed in the beginning of this section. In the match section of the span first query, we should include a span query that has to match within the maximum position specified by the end parameter.
So, to understand everything well, if we set the end parameter to 1, we shouldn't get our document with the previous query. So, let's check it by sending the following query:
{
"query":{
"span_first":{
"match":{
"span_term":{"description":"world"}
},
"end":1
}
}
}
The response to the preceding query will be as follows:
{
"took":1,
"timed_out":false,
"_shards":{
"total":5,
"successful":5,
"failed":0
},
"hits":{
"total":0,
"max_score":null,
"hits":[]
}
}
So it is working as expected. This is because the first term in the description field is the term the and not the term world which we searched for.
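The role of the end parameter can be replayed outside Elasticsearch with a small Python sketch. It rests on two assumptions that we label explicitly: analysis is approximated by lowercasing and splitting on non-alphanumeric characters, and a single-term span at the zero-based position p is treated as ending at p + 1, which has to be no greater than end. The helper names are ours.

```python
import re


def tokenize(text):
    """Rough stand-in for analysis: lowercase and split on non-alphanumerics."""
    return re.findall(r"[a-z0-9]+", text.lower())


def span_first_matches(tokens, term, end):
    """True if term occurs so that its span ends at or before position end."""
    return any(tok == term and pos + 1 <= end for pos, tok in enumerate(tokens))


description = ("The world breaks everyone, and afterward, "
               "some are strong at the broken places")
tokens = tokenize(description)

print(tokens[:4])
print(span_first_matches(tokens, "world", 2))  # world is the second token
print(span_first_matches(tokens, "world", 1))  # only the first token is allowed
```

The first call reports a match and the second does not, which mirrors the two responses shown above.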
Span near query
The span near query allows us to match documents that have other spans near each other, and we can call this query a compound query as it wraps other span queries. For example, if we want to find documents that have the term world near the term everyone, we will run the following query:
{
"query":{
"span_near":{
"clauses":[
{"span_term":{"description":"world"}},
{"span_term":{"description":"everyone"}}
],
"slop":0,
"in_order":true
}
}
}
As you can see, we specify our queries in the clauses section of the span near query. It is an array of other span queries. The slop parameter defines the allowed number of terms between the spans. The in_order parameter can be used to limit the matches only to those documents that match our queries in the same order that they were defined in. So, in our case, we will get documents that have world everyone, but not everyone world in the description field.
So let's get back to our query; right now it would return 0 results. If you look at our example document, you will notice that between the terms world and everyone, an additional term is present and we set the slop parameter to 0 (slop was discussed during the phrase query description). If we increase it to 1, we will get our result. To test it, let's send the following query:
{
"query":{
"span_near":{
"clauses":[
{"span_term":{"description":"world"}},
{"span_term":{"description":"everyone"}}
],
"slop":1,
"in_order":true
}
}
}
The results returned by Elasticsearch are as follows:
{
"took":6,
"timed_out":false,
"_shards":{
"total":5,
"successful":5,
"failed":0
},
"hits":{
"total":1,
"max_score":0.10848885,
"hits":[{
"_index":"spans",
"_type":"book",
"_id":"1",
"_score":0.10848885,
"_source":{
"title":"Test book",
"author":"Test author",
"description":"The world breaks everyone, and afterward, some are strong at the broken places"
}
}]
}
}
As we can see, the altered query successfully returned our indexed document.
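The interplay of slop and in_order for two single-term clauses can be sketched in Python. This is a simplified model for illustration, with hypothetical helper names: analysis is approximated by lowercasing and splitting on non-alphanumeric characters, and slop is taken to be the number of positions lying between the two terms.

```python
import re


def tokenize(text):
    """Rough stand-in for analysis: lowercase and split on non-alphanumerics."""
    return re.findall(r"[a-z0-9]+", text.lower())


def span_near_two_terms(tokens, first, second, slop, in_order=True):
    """True if the two terms occur with at most slop positions between them."""
    first_positions = [i for i, tok in enumerate(tokens) if tok == first]
    second_positions = [i for i, tok in enumerate(tokens) if tok == second]
    for p1 in first_positions:
        for p2 in second_positions:
            if p1 == p2:
                continue
            if in_order and p2 < p1:
                continue  # the second clause must come after the first one
            if abs(p2 - p1) - 1 <= slop:
                return True
    return False


tokens = tokenize("The world breaks everyone, and afterward, "
                  "some are strong at the broken places")

print(span_near_two_terms(tokens, "world", "everyone", slop=0))
print(span_near_two_terms(tokens, "world", "everyone", slop=1))
print(span_near_two_terms(tokens, "everyone", "world", slop=1, in_order=True))
print(span_near_two_terms(tokens, "everyone", "world", slop=1, in_order=False))
```

The term breaks sits between world and everyone, so a slop of 0 fails and a slop of 1 succeeds; reversing the clause order only matches when in_order is false.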
Span or query
The span or query allows us to wrap other span queries and aggregate matches of all those that we've wrapped. Similar to the span_near query, the span_or query uses the array of clauses to specify other span queries. For example, if we want to get the documents that have the term world in the first two positions of the description field, or the ones that have the term world not further than a single position from the term everyone, we will send the following query to Elasticsearch:
{
"query":{
"span_or":{
"clauses":[
{
"span_first":{
"match":{
"span_term":{"description":"world"}
},
"end":2
}
},
{
"span_near":{
"clauses":[
{"span_term":{"description":"world"}},
{"span_term":{"description":"everyone"}}
],
"slop":1,
"in_order":true
}
}
]
}
}
}
The preceding query will return our indexed document.
Span not query
The span not query allows us to specify two sections of queries. The first is the include section, which specifies which span queries should be matched, and the second section is the exclude one, which specifies the span queries which shouldn't be overlapping the first ones. To keep it simple, if a query from the exclude section matches the same span (or a part of it) as the query from the include section, such a document won't be returned as a match for such a span not query. Each of these sections holds a span query, which can itself wrap other span queries.
So, to illustrate that query, let's make a query that will return all the documents that have the span constructed from a single term and which have the term breaks in the description field. Let's also exclude the documents that have a span which matches the terms world and everyone at the maximum of a single position from each other, when such a span overlaps the one defined in the first span query.
{
"query":{
"span_not":{
"include":{
"span_term":{"description":"breaks"}
},
"exclude":{
"span_near":{
"clauses":[
{"span_term":{"description":"world"}},
{"span_term":{"description":"everyone"}}
],
"slop":1
}
}
}
}
}
The following is the result:
{
"took":1,
"timed_out":false,
"_shards":{
"total":5,
"successful":5,
"failed":0
},
"hits":{
"total":0,
"max_score":null,
"hits":[]
}
}
As you would have noticed, the result of the query is as we would have expected. Our document wasn't found because the span query from the exclude section was overlapping the span from the include section.
Span within query
The span_within query allows us to find documents that have a span enclosed in another span. We define two sections in the span_within query: the little and the big. The little section defines a span query that needs to be enclosed by the span query defined using the big section.
For example, if we would like to find a document that has the term world near the term breaks and those terms should be inside a span that is bound by the terms world and afterward not more than 10 terms from each other, the query that does that will look as follows:
{
"query":{
"span_within":{
"little":{
"span_near":{
"clauses":[
{"span_term":{"description":"world"}},
{"span_term":{"description":"breaks"}}
],
"slop":0,
"in_order":false
}
},
"big":{
"span_near":{
"clauses":[
{"span_term":{"description":"world"}},
{"span_term":{"description":"afterward"}}
],
"slop":10,
"in_order":false
}
}
}
}
}
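Span containing query
The span_containing query can be seen as the opposite of the span_within query we just discussed. It allows us to match spans that enclose other spans. Again, we use two sections with the span queries: the little and the big. The little section defines a span query that needs to be enclosed by the span query defined using the big section. The difference is in what is matched: span_within returns the spans from the little section that are enclosed by the big one, while span_containing returns the spans from the big section that enclose the little one.
We can use the same example. If we would like to find a document that has the term world near the term breaks, and those terms should be inside a span that is bound by the terms world and afterward not more than 10 terms from each other, the query that does that will look as follows: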
{
"query":{
"span_containing":{
"little":{
"span_near":{
"clauses":[
{"span_term":{"description":"world"}},
{"span_term":{"description":"breaks"}}
],
"slop":0,
"in_order":false
}
},
"big":{
"span_near":{
"clauses":[
{"span_term":{"description":"world"}},
{"span_term":{"description":"afterward"}}
],
"slop":10,
"in_order":false
}
}
}
}
}
Span multi query
The last type of span query that Elasticsearch supports is the span_multi query. It allows us to wrap any multi term query that we've discussed (the range query, the wildcard query, the regexp query, the fuzzy query, or the prefix query) as a span query.
For example, if we want to find documents that have a term starting with the prefix wor in the description field, we can do that by sending the following query (to limit the matches to the first two positions of the field, the span_multi query could additionally be nested in a span first query):
{
"query":{
"span_multi":{
"match":{
"prefix":{
"description":{"value":"wor"}
}
}
}
}
}
There is one thing to remember: the multi term query that we want to use needs to be enclosed in the match section of the span_multi query.
Performance considerations
A few words at the end of discussing span queries. Remember that they are costlier when it comes to processing power, because not only do the terms have to be matched but also positions have to be calculated and checked. This means that Lucene and thus Elasticsearch will need more CPU cycles to calculate all the needed information to find matching documents. You can expect span queries to be slower than the queries that don't take positions into account.
Choosing the right query
By now we've seen what queries are available in Elasticsearch, both the simple ones and the ones that can group other queries as well. Before continuing with more complicated topics, we would like to discuss which of the queries should be used for which use case. Of course, one could dedicate a whole book to showing different query use cases, so we will only show a few of them to help you see what you can expect and which query to use.
The use cases
As you already know which queries can be used to find which data, what we would like to show you are example use cases using the data we indexed in Chapter 2, Indexing Your Data. To do this, we will start with a few guiding lines on how to choose the query and then we will show you example use cases and discuss why those queries could be used.
Limiting results to given tags
One of the simplest examples of querying Elasticsearch is the search for exact terms. By exact we mean character to character comparison of a term that is indexed and written into the Lucene inverted index. To run such a query, we can use the term query provided by Elasticsearch. This is because its content is not analyzed by Elasticsearch. For example, let's assume that we would like to search for all the books with the value novel in the tags field, which as we know from the mappings is not analyzed. To do that, we would run the following command:
curl -XGET 'localhost:9200/library/_search?pretty' -d '{
"query":{
"term":{
"tags":"novel"
}
}
}'
Searching for values in a range
One of the simplest queries that can be run is a query matching documents in a given range of values. Usually such queries are a part of a larger query or a filter. For example, a query that would return books with the number of copies from 1 to 3 inclusive would look as follows:
curl -XGET 'localhost:9200/library/_search?pretty' -d '{
"query":{
"range":{
"copies":{
"gte":1,
"lte":3
}
}
}
}'
Boosting some of the matched documents
There are many common examples of using the bool query, including very simple ones like finding documents having a list of terms. What we would like to show you is how to use the bool query to boost some of the documents. For example, if we want to find all the documents that have one or more copies and give a higher score to the ones that were published after 1950, we will run the following query:
curl -XGET 'localhost:9200/library/_search?pretty' -d '{
"query":{
"bool":{
"must":[
{
"range":{
"copies":{
"gte":1
}
}
}
],
"should":[
{
"range":{
"year":{
"gt":1950
}
}
}
]
}
}
}'
Ignoring lower scoring partial queries
The dis_max query, as we discussed, allows us to control how influential the lower scoring partial queries are. For example, if we only wanted to assign the score of the highest scoring partial query to the documents matching crime punishment in the title field or raskolnikov in the characters field, we would run the following query:
curl -XGET 'localhost:9200/library/_search?pretty' -d '{
"fields":["_id","_score"],
"query":{
"dis_max":{
"tie_breaker":0.0,
"queries":[
{
"match":{
"title":"crime punishment"
}
},
{
"match":{
"characters":"raskolnikov"
}
}
]
}
}
}'
The result for the preceding query will look as follows:
{
"took":2,
"timed_out":false,
"_shards":{
"total":5,
"successful":5,
"failed":0
},
"hits":{
"total":1,
"max_score":0.70710677,
"hits":[{
"_index":"library",
"_type":"book",
"_id":"4",
"_score":0.70710677
}]
}
}
Now let’s see the score of the partial queries alone. To do that, we will run the partial queries using the following commands:
curl -XGET 'localhost:9200/library/_search?pretty' -d '{
"fields":["_id","_score"],
"query":{
"match":{
"title":"crime punishment"
}
}
}'
The response for the preceding query is as follows:
{
"took":4,
"timed_out":false,
"_shards":{
"total":5,
"successful":5,
"failed":0
},
"hits":{
"total":1,
"max_score":0.70710677,
"hits":[{
"_index":"library",
"_type":"book",
"_id":"4",
"_score":0.70710677
}]
}
}
The following is the next command:
curl -XGET 'localhost:9200/library/_search?pretty' -d '{
"fields":["_id","_score"],
"query":{
"match":{
"characters":"raskolnikov"
}
}
}'
The response is as follows:
{
"took":2,
"timed_out":false,
"_shards":{
"total":5,
"successful":5,
"failed":0
},
"hits":{
"total":1,
"max_score":0.5,
"hits":[{
"_index":"library",
"_type":"book",
"_id":"4",
"_score":0.5
}]
}
}
As you can see, the score of the document returned by our dis_max query is equal to the score of the highest scoring partial query (the first partial query). That is because we set the tie_breaker property to 0.0.
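The effect of tie_breaker can be checked by hand. The following is a minimal Python sketch, not Elasticsearch code, that assumes the usual disjunction-max combination: the best partial score plus tie_breaker times the sum of the remaining partial scores. The function name dis_max_score is made up for this illustration, and Python is used only so the sketch runs without a cluster.

```python
def dis_max_score(partial_scores, tie_breaker=0.0):
    """Combine partial query scores in the way dis_max is described above.

    Assumption: final score = best score + tie_breaker * (sum of the other scores).
    """
    if not partial_scores:
        return 0.0
    best_index = max(range(len(partial_scores)), key=lambda i: partial_scores[i])
    best = partial_scores[best_index]
    # Scores of all partial queries except the highest scoring one.
    others = sum(s for i, s in enumerate(partial_scores) if i != best_index)
    return best + tie_breaker * others


# Scores of the two partial queries from the responses above.
title_score = 0.70710677
characters_score = 0.5

# With tie_breaker set to 0.0 only the best partial query counts.
print(dis_max_score([title_score, characters_score], tie_breaker=0.0))
# A non-zero tie_breaker lets the lower scoring partial query contribute as well.
print(dis_max_score([title_score, characters_score], tie_breaker=0.5))
```

With tie_breaker set to 0.0 the sketch returns the score of the title query unchanged, which agrees with the dis_max response shown earlier.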
Using Lucene query syntax in queries
Having a simple search syntax is very useful for users and we already have one: the Lucene query syntax. The query_string query is an example of where we can leverage it by allowing the users to type in queries with additional control characters. For example, if we would like to find books having the terms crime and punishment in their title and the fyodor dostoevsky phrase in the author field, and not being published between 2000 (exclusive) and 2015 (inclusive), we would use the following command:
curl -XGET 'localhost:9200/library/_search?pretty' -d '{
"query":{
"query_string":{
"query":"+title:crime +title:punishment +author:\"fyodor dostoevsky\" -year:{2000 TO 2015]"
}
}
}'
As you can see, we used the Lucene query syntax to pass all the matching requirements and we let the query parser construct the appropriate query.
Handling user queries without errors
Using the query_string query is very handy, but it is not error tolerant. If our user provides incorrect Lucene syntax, the query will return an error. Because of that, Elasticsearch exposes a second query that supports analysis and a simplified, error-tolerant query syntax: the simple_query_string query. Using such a query allows us to run the user queries and not care about the parsing errors at all. For example, let’s look at the following query:
curl -XGET 'localhost:9200/library/_search?pretty' -d '{
"query":{
"query_string":{
"query":"+crime +punishment \"",
"default_field":"title"
}
}
}'
The response will contain:
{
"error":{
"root_cause":[{
"type":"query_parsing_exception",
"reason":"Failed to parse query [+crime +punishment \"]",
"index":"library",
"line":6,
"col":3
}],
"type":"search_phase_execution_exception",
"reason":"all shards failed",
"phase":"query",
"grouped":true,
"failed_shards":[{
"shard":0,
"index":"library",
"node":"7jznW07BRrqjG-aJ7iKeaQ",
"reason":{
"type":"query_parsing_exception",
"reason":"Failed to parse query [+crime +punishment \"]",
"index":"library",
"line":6,
"col":3,
"caused_by":{
"type":"parse_exception",
"reason":"Cannot parse '+crime +punishment \"': Lexical error at line 1, column 21. Encountered: <EOF> after : \"\"",
"caused_by":{
"type":"token_mgr_error",
"reason":"Lexical error at line 1, column 21. Encountered: <EOF> after : \"\""
}
}
}
}]
},
"status":400
}
This means that the query was not properly constructed and a parse error happened. That’s why the simple_query_string query was introduced. It uses a query parser that tries to handle user mistakes and tries to guess how the query should look. Our query using that parser will look as follows:
curl -XGET 'localhost:9200/library/_search?pretty' -d '{
"query":{
"simple_query_string":{
"query":"+crime +punishment \"",
"fields":["title"]
}
}
}'
If you run the preceding query, you will see that the proper document is returned by Elasticsearch even though the query is not properly constructed.
Autocomplete using prefixes
A very common use case is to provide autocomplete functionality on the indexed data. As we know, the prefix query is not analyzed and works on the basis of terms indexed in the field. So the actual functionality depends on which tokens are produced during indexing. For example, let’s assume that we would like to provide autocomplete functionality on any token in the title field and the user provided the wes prefix. A query that would match such a requirement looks as follows:
curl -XGET 'localhost:9200/library/_search?pretty' -d '{
"query":{
"prefix":{
"title":"wes"
}
}
}'
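To see why the result depends on the tokens produced during indexing, the following Python sketch mimics the behavior locally. It is only an illustration: the analyze function is a hypothetical stand-in for an analyzer that lowercases and splits on whitespace, and prefix_matches is a made-up helper, not an Elasticsearch API.

```python
def analyze(text):
    # Hypothetical stand-in for index-time analysis: lowercase, split on whitespace.
    return text.lower().split()


def prefix_matches(indexed_tokens, prefix):
    # The prefix is compared as given (it is not analyzed), just like in the prefix query.
    return [token for token in indexed_tokens if token.startswith(prefix)]


tokens = analyze("All Quiet on the Western Front")
print(prefix_matches(tokens, "wes"))  # ['western']
print(prefix_matches(tokens, "Wes"))  # []
```

The second call returns nothing because the indexed tokens were lowercased while the prefix was not, which is why user input usually has to be normalized in the application before it is sent as a prefix.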
Finding terms similar to a given one
A very simple example is using the fuzzy query to find documents having a term similar to a given one. For example, if we want to find all the documents having a value similar to crimea, we will run the following query:
curl -XGET 'localhost:9200/library/_search?pretty' -d '{
"query":{
"fuzzy":{
"title":{
"value":"crimea",
"fuzziness":2,
"max_expansions":50
}
}
}
}'
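The notion of similarity used here is edit distance. The following Python sketch is a simplification that uses the classic Levenshtein distance (insertions, deletions, and substitutions only) to show why crime is within a fuzziness of 2 from crimea. The helper names and the small term list are made up for this example, and the real query may count some edits (such as transpositions) differently.

```python
def edit_distance(a, b):
    # Classic dynamic-programming Levenshtein distance.
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(
                previous[j] + 1,         # delete char_a
                current[j - 1] + 1,      # insert char_b
                previous[j - 1] + cost,  # substitute (or keep when equal)
            ))
        previous = current
    return previous[-1]


def fuzzy_candidates(indexed_terms, value, fuzziness):
    # Keep the indexed terms that are at most `fuzziness` edits away from the value.
    return [term for term in indexed_terms if edit_distance(term, value) <= fuzziness]


title_terms = ["crime", "and", "punishment"]  # illustrative tokens of one title
print(edit_distance("crimea", "crime"))            # 1
print(fuzzy_candidates(title_terms, "crimea", 2))  # ['crime']
```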
Matching phrases
The simplest position-aware query, the phrase query, allows us to find documents not with a single term but with terms positioned one after another, that is, terms that form a phrase. For example, a query that would only match documents that have the westen nichts neues phrase in the otitle field would look as follows:
curl -XGET 'localhost:9200/library/_search?pretty' -d '{
"query":{
"match_phrase":{
"otitle":"westen nichts neues"
}
}
}'
Spans, spans everywhere
The last use case we would like to discuss is a more complicated example of position-aware queries called span queries. Imagine that we would like to run a query to find documents that have the western front phrase not more than three positions after the term quiet, and all of that just after the all term. This can be done with span queries and the following command shows how such a query will look:
curl -XGET 'localhost:9200/library/_search?pretty' -d '{
"query":{
"span_near":{
"clauses":[
{
"span_term":{
"title":"all"
}
},
{
"span_near":{
"clauses":[
{
"span_term":{
"title":"quiet"
}
},
{
"span_near":{
"clauses":[
{
"span_term":{
"title":"western"
}
},
{
"span_term":{
"title":"front"
}
}
],
"slop":0,
"in_order":true
}
}
],
"slop":3,
"in_order":true
}
}
],
"slop":0,
"in_order":true
}
}
}'
Note that the span queries are not analyzed. We can see that by looking at the response of the Explain API. To see that response, we should send the same request body (our query) to the /library/book/1/_explain REST endpoint. The interesting part of the output looks as follows:
"description":"weight(spanNear([title:all, spanNear([title:quiet, spanNear([title:western, title:front], 0, true)], 3, true)], 0, true) in 0) [PerFieldSimilarity], result of:",
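Deeply nested span queries are easier to read when they are composed from small pieces. The following Python sketch builds the same request body as the span query above from two helper functions. The helpers span_term and span_near are hypothetical names introduced for this illustration; they only create dictionaries and do not call Elasticsearch.

```python
import json


def span_term(field, value):
    # A single, non-analyzed term at some position in the field.
    return {"span_term": {field: value}}


def span_near(clauses, slop, in_order=True):
    # Clauses that must appear within `slop` positions of each other.
    return {"span_near": {"clauses": clauses, "slop": slop, "in_order": in_order}}


# "western front" as an exact phrase.
western_front = span_near(
    [span_term("title", "western"), span_term("title", "front")], slop=0)
# The phrase not more than three positions after "quiet".
after_quiet = span_near([span_term("title", "quiet"), western_front], slop=3)
# All of that directly after "all".
query = {"query": span_near([span_term("title", "all"), after_quiet], slop=0)}

print(json.dumps(query, indent=2))
```

Reading the three assignments from top to bottom follows the structure of the spanNear description in the Explain API output from the inside out.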
Summary
This chapter has been all about the querying process. We started by looking at how to query Elasticsearch and what Elasticsearch does when it needs to handle the query. We also learned about the basic and compound queries, so we are now able to use both simple queries as well as the ones that group multiple small queries together. Finally, we discussed how to choose the right query for a given use case.
In the next chapter, we will extend our query knowledge. We will start with filtering our queries and move to highlighting possibilities and a way to validate our queries using the Elasticsearch API. We will discuss sorting of search results and query rewrite, which will show us what happens to some queries in Elasticsearch internals.
Chapter 4. Extending Your Querying Knowledge
In the previous chapter, we dived into Elasticsearch querying capabilities. We discussed how to query Elasticsearch in detail and we learned how Elasticsearch querying works. We now know the basic and compound queries of this great search engine and what the configuration options are for each query type. We also got to know when to use our queries and we discussed a few use cases and which queries can be used to handle them. This chapter is dedicated to extending our querying knowledge. By the end of this chapter, you will have learned the following topics:
What filtering is and how to use it
What highlighting is and how to use it
What the highlighter types are and what benefits they bring
How to validate your queries
How to sort your query results
What query rewrite is and how to control it
Filtering your results
In the previous chapter, we talked about various types of queries. The common part was that we always wanted to get the best results first. This is the main difference from the standard database approach where every document either matches the query or not. In the database world, we do not ask how good the document is; our only interest lies in the results returned. When talking about full text search engines this is different: we are interested not only in the results, but also in their quality. The reason is obvious: we are searching in unstructured data, using text fields that use language analysis, stemming, and so on. Because of that, the initial results of our queries are, in most cases, far from optimal. This is why when we talk about searching, we talk about precision and document recall.
On the other hand, sometimes we want to limit the whole set of documents to a chosen part. For example, in a library, we may want to search only the available books, the rest being unimportant. Sometimes the score, busily calculated for the given fields, only interferes with the overall score and has no meaning in terms of accuracy. In such cases, filters should be used to limit the results of the query, but not interfere with the calculated score.
Prior to Elasticsearch 2.0, filters were independent entities from queries. In practice, almost every query had its own counterpart in filters. There was the term query and the term filter, the bool query and the bool filter, the range query and the range filter, and so on. From the user point of view, the most important difference between the queries and the filters was scoring. The filter didn’t calculate score, which resulted in the filter being easily cached and more efficient. But this difference was very inconvenient for users. With the release of Elasticsearch 2.0 and its usage of Lucene 5.3, filter queries were deprecated along with some types of queries that allowed us to use filters. Let’s discuss how filtering works now and what we can do to achieve the same or better performance as before in Elasticsearch 2.0.
The context is the key
In Elasticsearch 2.0, queries can calculate score or omit it by choosing a more efficient way of execution. This behavior, in many cases, is chosen automatically based on the context where the query is used. This applies to queries that include filter sections, which remove documents based on some criteria. These documents are unnecessary in the returned results and should be skipped as quickly as possible without affecting the overall score. Thanks to this, after discarding some documents we can focus only on the rest of the documents, calculating their scores, and sorting them before returning. An example of this is the must_not clause of a Boolean query. A document that matches the must_not clause will be removed from the returned result set, so calculating the score for the documents matched by this part of the bool query would be additional, unnecessary, and inefficient work.
The best thing about all the changes is that we don’t need to care about whether we want to use filtering or not. Elasticsearch and the underlying Apache Lucene library take care of choosing the right execution method for us.
Explicit filtering with bool query
As we mentioned in the Compound queries section in Chapter 3, Searching Your Data, the bool query in Elasticsearch 2.0 allows us to add a filter explicitly by adding the filter section and including a query in that section. This is very convenient if we want to have a part of the query that needs to match, but we are not interested in the score for those documents.
Let’s look at the following query:
curl -XGET 'localhost:9200/library/book/_search?pretty' -d '{
"query":{
"term":{
"available":true
}
}
}'
We see a simple query that should return all the books in our library available for borrowing, which means the documents with the available field set to true. Now let’s compare it with the following query:
curl -XGET 'localhost:9200/library/book/_search?pretty' -d '{
"query":{
"bool":{
"must":{
"match_all":{}
},
"filter":{
"term":{
"available":true
}
}
}
}
}'
This query returns all the books, but it also contains the filter section, which tells Elasticsearch that we are only interested in the available books. The query will return the same results as the previous query we’ve seen, at least when looking only at the number of documents and which documents are returned. The difference is the score. For our example data, both the queries return two books. The results returned for the first query look as follows:
{
"took":2,
"timed_out":false,
"_shards":{
"total":5,
"successful":5,
"failed":0
},
"hits":{
"total":2,
"max_score":1.0,
"hits":[{
"_index":"library",
"_type":"book",
"_id":"4",
"_score":1.0,
"_source":{
"title":"Crime and Punishment",
"otitle":"Преступлéние и наказáние",
"author":"Fyodor Dostoevsky",
"year":1886,
"characters":["Raskolnikov","Sofia Semyonovna Marmeladova"],
"tags":[],
"copies":0,
"available":true
}
},{
"_index":"library",
"_type":"book",
"_id":"1",
"_score":0.30685282,
"_source":{
"title":"All Quiet on the Western Front",
"otitle":"Im Westen nichts Neues",
"author":"Erich Maria Remarque",
"year":1929,
"characters":["Paul Bäumer","Albert Kropp","Haie Westhus","Fredrich Müller","Stanislaus Katczinsky","Tjaden"],
"tags":["novel"],
"copies":1,
"available":true,
"section":3
}
}]
}
}
The results for the second query look as follows:
{
"took":2,
"timed_out":false,
"_shards":{
"total":5,
"successful":5,
"failed":0
},
"hits":{
"total":2,
"max_score":1.0,
"hits":[{
"_index":"library",
"_type":"book",
"_id":"4",
"_score":1.0,
"_source":{
"title":"Crime and Punishment",
"otitle":"Преступлéние и наказáние",
"author":"Fyodor Dostoevsky",
"year":1886,
"characters":["Raskolnikov","Sofia Semyonovna Marmeladova"],
"tags":[],
"copies":0,
"available":true
}
},{
"_index":"library",
"_type":"book",
"_id":"1",
"_score":1.0,
"_source":{
"title":"All Quiet on the Western Front",
"otitle":"Im Westen nichts Neues",
"author":"Erich Maria Remarque",
"year":1929,
"characters":["Paul Bäumer","Albert Kropp","Haie Westhus","Fredrich Müller","Stanislaus Katczinsky","Tjaden"],
"tags":["novel"],
"copies":1,
"available":true,
"section":3
}
}]
}
}
If you look at the score for the documents in each query, you’ll notice the difference. In the simple term query, Elasticsearch (the Lucene library, in fact) has a score of 1.0 for the first document and a score of 0.30685282 for the second one. This is not a perfect solution because the availability check is more or less binary and we don’t want it to interfere with the score. That’s why the second query is better in this case. With the bool query and filtering, the score for the filter element is not calculated and the score for both the documents is the same, that is 1.0.
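When request bodies are built in application code, wrapping a non-scoring condition in the filter section can be reduced to a small helper. The following Python sketch shows one way to do that. The with_filter function is a hypothetical helper for this illustration; it only builds the dictionary that would be sent as the request body.

```python
import json


def with_filter(filter_query, scoring_query=None):
    # The scoring part goes to must; the yes/no condition goes to filter,
    # so it limits the results without contributing to the score.
    if scoring_query is None:
        scoring_query = {"match_all": {}}
    return {"query": {"bool": {"must": scoring_query, "filter": filter_query}}}


# Same body as the second query above: all books, limited to the available ones.
body = with_filter({"term": {"available": True}})
print(json.dumps(body, indent=2))

# A scored title search limited to the available books.
scored = with_filter({"term": {"available": True}}, {"match": {"title": "crime"}})
print(json.dumps(scored, indent=2))
```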
Highlighting
You have probably heard of highlighting or seen it. You may not even know that you are actually using highlighting when you are using the bigger and smaller public search engines on the World Wide Web (WWW). When we talk about highlighting in the context of full text search, we usually mean showing which words or phrases from the query were matched in the resulting documents. For example, if we use Google and search for the word lucene, we would see that word bolded in the search results.
It is even more visible on the Microsoft Bing search engine.
In this chapter, we will see how to use Elasticsearch highlighting capabilities to enhance our application with highlighted results.
Getting started with highlighting
There is no better way of showing how highlighting works than making a query and looking at the results returned by Elasticsearch. So let’s do that. We assume that we would like to highlight the terms that are matched in the title field of our documents to improve the search experience of our users. By now you know the example data from top to bottom, so let’s again reuse the same data set. We want to match the term crime in the title field and we want to get highlighting results. One of the simplest queries that can achieve this looks as follows:
curl -XGET 'localhost:9200/library/book/_search?pretty' -d '{
"query":{
"match":{
"title":"crime"
}
},
"highlight":{
"fields":{
"title":{}
}
}
}'
The response for the preceding query is as follows:
{
"took":16,
"timed_out":false,
"_shards":{
"total":5,
"successful":5,
"failed":0
},
"hits":{
"total":1,
"max_score":0.5,
"hits":[{
"_index":"library",
"_type":"book",
"_id":"4",
"_score":0.5,
"_source":{
"title":"Crime and Punishment",
"otitle":"Преступлéние и наказáние",
"author":"Fyodor Dostoevsky",
"year":1886,
"characters":["Raskolnikov","Sofia Semyonovna Marmeladova"],
"tags":[],
"copies":0,
"available":true
},
"highlight":{
"title":["<em>Crime</em> and Punishment"]
}
}]
}
}
As you can see, apart from the standard information about the documents that matched the query, we got a new section called highlight. Elasticsearch used the <em> HTML tag as the beginning of the highlighting section and its closing counterpart to close the highlighted section. This is the default behavior of Elasticsearch, but we will learn how to change that.
Field configuration
In order to perform highlighting, the original content of the field needs to be present. We have to set the fields we will use for highlighting. This is done by either marking a field to be stored or using the _source field with those fields included. If the field is set to be stored in the mappings, the stored version will be used, otherwise Elasticsearch will try to use the _source field and extract the field that needs to be highlighted.
Under the hood
Elasticsearch uses Apache Lucene under the hood and highlighting is one of the features of that library. Lucene provides three types of highlighting implementation: the standard one, which we just used; the second one called FastVectorHighlighter (https://lucene.apache.org/core/5_4_0/highlighter/org/apache/lucene/search/vectorhighlight/FastVectorHighlighter.html), which needs term vectors and positions to be able to work; and the third one called PostingsHighlighter (http://lucene.apache.org/core/5_4_0/highlighter/org/apache/lucene/search/postingshighlight/PostingsHighlighter.html).
Elasticsearch chooses the right highlighter implementation automatically. If the field is configured with the term_vector property set to with_positions_offsets, FastVectorHighlighter will be used. If the field is configured with the index_options property set to offsets, PostingsHighlighter will be used. Otherwise, the standard highlighter will be used by Elasticsearch.
Which highlighter to use depends on your data, your queries, and the needed performance. The standard highlighter is a general use case one. However, if you want to highlight fields with lots of data, FastVectorHighlighter is the recommended one. The thing to remember about it is that it requires term vectors to be present and that will make your index slightly larger. Finally, the fastest highlighter, which is also recommended for natural language highlighting, is PostingsHighlighter. However, the thing to remember is that PostingsHighlighter doesn’t support complex queries such as the match_phrase_prefix query and in such cases highlighting won’t be returned.
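The automatic selection rule described above can be written down as a small decision function. The following Python sketch restates only that rule; choose_highlighter is a hypothetical helper, and the dictionaries stand in for the relevant part of a field mapping.

```python
def choose_highlighter(field_mapping):
    # Rule from the text: term vectors with positions and offsets win,
    # then offsets stored in the postings, otherwise the standard highlighter.
    if field_mapping.get("term_vector") == "with_positions_offsets":
        return "FastVectorHighlighter"
    if field_mapping.get("index_options") == "offsets":
        return "PostingsHighlighter"
    return "standard highlighter"


print(choose_highlighter({"type": "string"}))
print(choose_highlighter({"type": "string", "index_options": "offsets"}))
print(choose_highlighter({"type": "string", "term_vector": "with_positions_offsets"}))
```

The sketch only models the selection order. Whether a given highlighter suits the data and the queries is still the trade-off discussed in the preceding paragraph.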
Forcing highlighter type
While Elasticsearch chooses the highlighter type for us, we can also enforce the highlighting type if we really want to. To do that, we need to set the type property to one of the following values:
plain: When this value is set, Elasticsearch will use the standard highlighter.
fvh: When this value is set, Elasticsearch will try using FastVectorHighlighter. It will require term vectors to be turned on for the field used for highlighting.
postings: When this value is set, Elasticsearch will try using PostingsHighlighter. It will require offsets to be turned on for the field used for highlighting.
For example, to use the standard highlighter, we will run the following query:
curl -XGET 'localhost:9200/library/book/_search?pretty' -d '{
"query":{
"term":{
"title":"crime"
}
},
"highlight":{
"fields":{
"title":{"type":"plain"}
}
}
}'
Configuring HTML tags
The default behavior of the highlighting mechanism may not be suited for everyone; not all of us would like to have the <em> and </em> tags used for highlighting. Because of that, Elasticsearch allows us to change the default behavior and change the tags that are used for that purpose. To do that, we should set the pre_tags and post_tags properties to the code snippets we want the highlighting to start from and end at; for example, <b> and </b>. The pre_tags and post_tags properties are arrays and because of that we can provide more than a single opening and closing tag and Elasticsearch will use each of the defined tags to highlight different words. For example, if we want to use <b> as the opening tag and </b> as the closing tag, our query will look like this:
curl -XGET 'localhost:9200/library/book/_search?pretty' -d '{
"query":{
"term":{
"title":"crime"
}
},
"highlight":{
"pre_tags":["<b>"],
"post_tags":["</b>"],
"fields":{
"title":{}
}
}
}'
The result returned by Elasticsearch to the preceding query will be as follows:
{
"took":3,
"timed_out":false,
"_shards":{
"total":5,
"successful":5,
"failed":0
},
"hits":{
"total":1,
"max_score":0.5,
"hits":[{
"_index":"library",
"_type":"book",
"_id":"4",
"_score":0.5,
"_source":{
"title":"Crime and Punishment",
"otitle":"Преступлéние и наказáние",
"author":"Fyodor Dostoevsky",
"year":1886,
"characters":["Raskolnikov","Sofia Semyonovna Marmeladova"],
"tags":[],
"copies":0,
"available":true
},
"highlight":{
"title":["<b>Crime</b> and Punishment"]
}
}]
}
}
As you can see, the term Crime in the title field was surrounded by the tags of our choice.
Controlling highlighted fragments
Elasticsearch allows us to control the number of highlighted fragments returned and their sizes by exposing two properties. The first one is number_of_fragments, which defines the number of fragments returned by Elasticsearch (defaults to 5). Setting this property to 0 causes the whole field to be returned, which can be handy for short fields but expensive for longer fields. The second property is fragment_size, which lets us specify the maximum length of the highlighted fragments in characters and defaults to 100.
An example query using these properties will look as follows:
curl -XGET 'localhost:9200/library/book/_search?pretty' -d '{
"query":{
"term":{
"title":"crime"
}
},
"highlight":{
"fields":{
"title":{"fragment_size":200,"number_of_fragments":0}
}
}
}'
Global and local settings
The highlighting properties we discussed previously can be set both on a global basis and on a per-field basis. The global ones will be used for all the fields that don’t overwrite them and should be placed on the same level as the fields section of your highlighting, for example, like this:
curl -XGET 'localhost:9200/library/book/_search?pretty' -d '{
"query":{
"term":{
"title":"crime"
}
},
"highlight":{
"pre_tags":["<b>"],
"post_tags":["</b>"],
"fields":{
"title":{}
}
}
}'
You can also set the properties for each field. For example, if we would like to keep the default behavior for all the fields except our title field, we would do the following:
curl -XGET 'localhost:9200/library/book/_search?pretty' -d '{
"query":{
"term":{
"title":"crime"
}
},
"highlight":{
"fields":{
"title":{"pre_tags":["<b>"],"post_tags":["</b>"]}
}
}
}'
As you can see, instead of placing the properties on the same level as the fields section, we placed them inside the previously empty JSON object that specifies the title field behavior. Of course, each field can be configured using different properties.
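The way global and per-field properties combine can be pictured as a dictionary merge in which field-level values win. The following Python sketch models that precedence under this assumption. The effective_highlight_settings helper and the author field settings are made up for this illustration and are not part of Elasticsearch.

```python
def effective_highlight_settings(highlight):
    # Everything next to "fields" is treated as a global property.
    global_settings = {key: value for key, value in highlight.items() if key != "fields"}
    # Per-field properties overwrite the global ones with the same name.
    return {
        field: {**global_settings, **local_settings}
        for field, local_settings in highlight.get("fields", {}).items()
    }


highlight = {
    "pre_tags": ["<b>"],
    "post_tags": ["</b>"],
    "fields": {
        "title": {},  # inherits the global tags
        "author": {"pre_tags": ["<i>"], "post_tags": ["</i>"]},  # hypothetical override
    },
}

for field, settings in effective_highlight_settings(highlight).items():
    print(field, settings)
```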
Require matching
Sometimes there may be a need (especially when using multiple highlighted fields) to show only the fields that matched our query. In order to have such behavior, we need to set the require_field_match property to true. Setting this property to false will cause all the terms to be highlighted even if a field didn’t match the query.
To see how that works, let’s create a new index called users and let’s index a single document there. We will do that by sending the following command:
curl -XPUT 'http://localhost:9200/users/user/1' -d '{
"name":"Test user",
"description":"Test document"
}'
So, let’s assume we want to highlight the hits in both of the preceding fields. Our command sending the query to our new index will look like this:
curl -XGET 'localhost:9200/users/_search?pretty' -d '{
"query":{
"term":{
"name":"test"
}
},
"highlight":{
"fields":{
"name":{"pre_tags":["<b>"],"post_tags":["</b>"]},
"description":{"pre_tags":["<b>"],"post_tags":["</b>"]}
}
}
}'
The result of the preceding query will be as follows:
{
"took":3,
"timed_out":false,
"_shards":{
"total":5,
"successful":5,
"failed":0
},
"hits":{
"total":1,
"max_score":0.19178301,
"hits":[{
"_index":"users",
"_type":"user",
"_id":"1",
"_score":0.19178301,
"_source":{
"name":"Test user",
"description":"Test document"
},
"highlight":{
"name":["<b>Test</b> user"]
}
}]
}
}
Note that we only got highlighting on the name field. This is because our query matched only that field. Let’s see what will happen if we set the require_field_match property to false and use a command similar to the following one:
curl -XGET 'localhost:9200/users/_search?pretty' -d '{
"query":{
"term":{
"name":"test"
}
},
"highlight":{
"require_field_match":false,
"fields":{
"name":{"pre_tags":["<b>"],"post_tags":["</b>"]},
"description":{"pre_tags":["<b>"],"post_tags":["</b>"]}
}
}
}'
Now let’s look at the modified query results:
{
"took":2,
"timed_out":false,
"_shards":{
"total":5,
"successful":5,
"failed":0
},
"hits":{
"total":1,
"max_score":0.19178301,
"hits":[{
"_index":"users",
"_type":"user",
"_id":"1",
"_score":0.19178301,
"_source":{
"name":"Test user",
"description":"Test document"
},
"highlight":{
"name":["<b>Test</b> user"],
"description":["<b>Test</b> document"]
}
}]
}
}
As you can see, Elasticsearch returned highlighting in both the fields now.
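The effect of require_field_match can be imitated with a few lines of Python. The sketch below is a deliberately naive model (a case-insensitive whole-word match instead of real text analysis), and highlight_fields is a hypothetical helper introduced only for this illustration.

```python
import re


def highlight_fields(doc, query_field, term, fields, require_field_match=True):
    """Wrap whole-word, case-insensitive occurrences of `term` in <b> tags."""
    pattern = re.compile(r"\b(%s)\b" % re.escape(term), re.IGNORECASE)
    result = {}
    for field in fields:
        if require_field_match and field != query_field:
            continue  # only the field targeted by the query is highlighted
        marked, count = pattern.subn(r"<b>\1</b>", doc[field])
        if count:
            result[field] = [marked]
    return result


doc = {"name": "Test user", "description": "Test document"}
fields = ["name", "description"]

print(highlight_fields(doc, "name", "test", fields))
print(highlight_fields(doc, "name", "test", fields, require_field_match=False))
```

With the property left at true only the name field is returned; with false both fields carry highlighted fragments, just like in the two responses shown above.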
Custom highlighting query
There are use cases where your queries are complicated and not really suitable for highlighting, but you still want to use the highlighting functionality. In such cases, Elasticsearch allows us to highlight results on the basis of a different query provided using the highlight_query property. An example of using a different highlighting query looks as follows:
curl -XGET 'localhost:9200/library/book/_search?pretty' -d '{
"query":{
"term":{
"title":"crime"
}
},
"highlight":{
"fields":{
"title":{
"highlight_query":{
"term":{
"title":"punishment"
}
}
}
}
}
}'
The preceding query will result in highlighting the term punishment in the title field, instead of the term crime.
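The selection rule itself is simple and can be written down as a small Python sketch: when a field defines highlight_query, that query drives the highlighting; otherwise the main query is used. The query_used_for_highlighting function is a hypothetical name and only mirrors the request structure shown above.

```python
def query_used_for_highlighting(search_body, field):
    """Return the query that drives highlighting for the given field."""
    field_settings = search_body.get("highlight", {}).get("fields", {}).get(field, {})
    # Fall back to the main query when no highlight_query is given.
    return field_settings.get("highlight_query", search_body["query"])


custom_body = {
    "query": {"term": {"title": "crime"}},
    "highlight": {
        "fields": {
            "title": {"highlight_query": {"term": {"title": "punishment"}}}
        }
    },
}
plain_body = {
    "query": {"term": {"title": "crime"}},
    "highlight": {"fields": {"title": {}}},
}

print(query_used_for_highlighting(custom_body, "title"))
print(query_used_for_highlighting(plain_body, "title"))
```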
The Postings highlighter
It is time to talk about the third available highlighter. It was added in Elasticsearch 0.90.6 and is slightly different from the previous ones. The Postings Highlighter is automatically used when the field definition has index_options set to offsets. To illustrate how the Postings Highlighter works, we will create a simple index with a proper configuration that allows that highlighter to work. We will do that by using the following commands:
curl -XPUT 'localhost:9200/hl_test'
curl -XPOST 'localhost:9200/hl_test/doc/_mapping' -d '{
"doc":{
"properties":{
"contents":{
"type":"string",
"fields":{
"ps":{"type":"string","index_options":"offsets"}
}
}
}
}
}'
If everything goes well, we should have a new index and the mappings. The mappings have two fields defined: one named contents and the second one named contents.ps. In this second case, we turned on the offsets by using the index_options property. This means that Elasticsearch will use the standard highlighter for the contents field and the postings highlighter for the contents.ps field.
To see the difference, we will index a single document with a fragment from Wikipedia describing the history of Birmingham. We do that by running the following command:
curl -XPUT localhost:9200/hl_test/doc/1 -d '{
"contents":"Birmingham''s early history is that of a remote and marginal area. The main centres of population, power and wealth in the pre-industrial English Midlands lay in the fertile and accessible river valleys of the Trent, the Severn and the Avon. The area of modern Birmingham lay in between, on the upland Birmingham Plateau and within the densely wooded and sparsely populated Forest of Arden."
}'
The last step is to send a query using both the highlighters. We can do it in a single request by using the following command:
curl 'localhost:9200/hl_test/_search?pretty' -d '{
"query":{
"term":{
"contents.ps":"modern"
}
},
"highlight":{
"require_field_match":false,
"fields":{
"contents":{},
"contents.ps":{}
}
}
}'
If everything goes well, you will find the following snippet in the response returned by Elasticsearch:
"highlight":{
"contents":["valleys of the Trent, the Severn and the Avon. The area of <em>modern</em> Birmingham lay in between, on the upland"],
"contents.ps":["The area of <em>modern</em> Birmingham lay in between, on the upland Birmingham Plateau and within the densely wooded and sparsely populated Forest of Arden."]
}
As you see, both the highlighters found the occurrence of the desired word. The difference is that the postings highlighter returns the smarter snippet: it respects the sentence boundaries.
Let’s try one more query:
curl 'localhost:9200/hl_test/_search?pretty' -d '{
"query":{
"match_phrase":{
"contents.ps":"centres of"
}
},
"highlight":{
"require_field_match":false,
"fields":{
"contents":{},
"contents.ps":{}
}
}
}'
We searched for the phrase centres of. As you may expect, the results for the two highlighters will differ. For the standard highlighter, run on the contents field, we will find the following phrase in the response:
"Birminghams early history is that of a remote and marginal area. The main <em>centres</em> <em>of</em> population"
As you can clearly see, the standard highlighter divided the given phrase and highlighted individual terms. Also, not all occurrences of the terms centres and of were highlighted, but only the ones that formed the phrase.
On the other hand, the Postings Highlighter returned the following highlighted fragment:
"Birminghams early history is that <em>of</em> a remote and marginal area.",
"The main <em>centres</em> <em>of</em> population, power and wealth in the pre-industrial English Midlands lay in the fertile and accessible river valleys <em>of</em> the Trent, the Severn and the Avon.",
"The area <em>of</em> modern Birmingham lay in between, on the upland Birmingham Plateau and within the densely wooded and sparsely populated Forest <em>of</em> Arden."
This is the significant difference. The Postings Highlighter highlighted all the terms matching the query, not only those that formed the phrase, and returned whole sentences. This is a very nice feature, especially when you want to display the highlighting results to the user in the UI of your application.
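The idea of sentence-based snippets can be sketched in Python. This is not the Postings Highlighter algorithm, only a rough approximation under simple assumptions: sentences end with a period, exclamation mark, or question mark followed by whitespace, and terms are matched as whole words. The sentence_snippets function is a hypothetical helper.

```python
import re


def sentence_snippets(text, term, pre="<em>", post="</em>"):
    """Return every sentence containing `term`, with the term wrapped in tags."""
    # Naive sentence split: break after '.', '!' or '?' followed by whitespace.
    sentences = re.split(r"(?<=[.!?])\s+", text)
    pattern = re.compile(r"\b%s\b" % re.escape(term), re.IGNORECASE)
    return [
        pattern.sub(lambda m: pre + m.group(0) + post, sentence)
        for sentence in sentences
        if pattern.search(sentence)
    ]


text = (
    "Birminghams early history is that of a remote and marginal area. "
    "The main centres of population, power and wealth in the pre-industrial "
    "English Midlands lay in the fertile and accessible river valleys of the "
    "Trent, the Severn and the Avon. The area of modern Birmingham lay in "
    "between, on the upland Birmingham Plateau and within the densely wooded "
    "and sparsely populated Forest of Arden."
)

for snippet in sentence_snippets(text, "modern"):
    print(snippet)
```

Searching for modern yields one complete sentence, while searching for of yields all three sentences, each with every occurrence of the term marked.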
Validating your queries
There are times when you are not in total control of the queries that you send to Elasticsearch. The queries can be generated from multiple criteria, making them a monster, or even worse, they can be generated by some kind of a wizard, which makes it hard to troubleshoot and find the part that is faulty and makes the query fail. Because of such use cases, Elasticsearch exposes the Validate API, which helps us validate our queries and diagnose potential problems.
Using the Validate API
The usage of the Validate API is very simple. Instead of sending the query to the _search REST endpoint, we send it to the _validate/query one. And that’s it. Let’s look at the following command:
curl -XGET 'localhost:9200/library/_validate/query?pretty' --data-binary '{
"query":{
"bool":{
"must":{
"term":{
"title":"crime"
}
},
"should":{
"range:{
"year":{
"from":1900,
"to":2000
}
}
},
"must_not":{
"term":{
"otitle":"nothing"
}
}
}
}
}'
A similar query was already used in this book in Chapter 3, Searching Your Data. The preceding command will tell Elasticsearch to validate it and return the information about its validity. The response of Elasticsearch to the preceding command will be similar to the following one:
{
"valid":false,
"_shards":{
"total":1,
"successful":1,
"failed":0
}
}
Look at the valid attribute. It is set to false. Something went wrong. Let’s execute the query validation once again with the explain parameter added to the request:
curl -XGET 'localhost:9200/library/_validate/query?pretty&explain' --data-binary '{
"query":{
"bool":{
"must":{
"term":{
"title":"crime"
}
},
"should":{
"range:{
"year":{
"from":1900,
"to":2000
}
}
},
"must_not":{
"term":{
"otitle":"nothing"
}
}
}
}
}'
Now the result returned from Elasticsearch is more verbose:
{
"valid":false,
"_shards":{
"total":1,
"successful":1,
"failed":0
},
"explanations":[{
"index":"library",
"valid":false,
"error":"[library] QueryParsingException[Failed to parse]; nested: JsonParseException[Illegal unquoted character ((CTRL-CHAR, code 10)): has to be escaped using backslash to be included in name\n at [Source: org.elasticsearch.transport.netty.ChannelBufferStreamInput@1110d090; line: 10, column: 18]];; com.fasterxml.jackson.core.JsonParseException: Illegal unquoted character ((CTRL-CHAR, code 10)): has to be escaped using backslash to be included in name\n at [Source: org.elasticsearch.transport.netty.ChannelBufferStreamInput@1110d090; line: 10, column: 18]"
}]
}
Now everything is clear. In our example, we have improperly quoted the range attribute: its closing quotation mark is missing on line 10 of the request body.
Note
You may wonder why in our curl query we used the --data-binary parameter. This parameter properly preserves the newline character when sending a query to Elasticsearch. This means that the line and the column number remain intact and it’s easier to find errors. In the other cases, the -d parameter is more convenient because it’s shorter.
The Validate API can also detect other errors, for example, incorrect format of a number or other mapping-related issues. Unfortunately, for our application, it is not easy to detect what the problem is because of a lack of structure in the error messages.
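Purely syntactic mistakes, such as the missing quotation mark above, can also be caught on the client side before the request is sent. The following Python sketch uses the standard json module to report the line and column of the first syntax error. It also shows why the line number is only useful when the newlines are preserved: once they are removed (which is what the note above says happens without --data-binary), every error is reported on line 1. The first_json_error helper is a hypothetical name, and the whitespace in BROKEN_QUERY is our own, so the column will not match the one reported by Elasticsearch.

```python
import json

# The faulty request body from the example, one element per line.
BROKEN_QUERY = "\n".join([
    '{',
    ' "query": {',
    '  "bool": {',
    '   "must": {',
    '    "term": {',
    '     "title": "crime"',
    '    }',
    '   },',
    '   "should": {',
    '    "range: {',  # closing quotation mark after range is missing
    '     "year": { "from": 1900, "to": 2000 }',
    '    }',
    '   }',
    '  }',
    ' }',
    '}',
])


def first_json_error(body):
    """Return (line, column, message) of the first syntax error, or None."""
    try:
        json.loads(body)
    except json.JSONDecodeError as err:
        return err.lineno, err.colno, err.msg
    return None


print(first_json_error(BROKEN_QUERY))                    # error located on line 10
print(first_json_error(BROKEN_QUERY.replace("\n", "")))  # error located on line 1
```

Such a check finds only JSON syntax problems; mapping-related issues still require the Validate API.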
The Validate API supports most of the parameters that are supported by standard Elasticsearch queries, which include: explain, ignore_unavailable, allow_no_indices, expand_wildcards, operation_threading, analyzer, analyze_wildcard, default_operator, df, lenient, lowercase_expanded_terms, and rewrite.
Sorting data
So far we’ve run our queries and got the results in the order determined by the score of each document. However, it is not enough for all the use cases. It is really handy to be able to sort our results on the basis of the field values. For example, when you are searching logs or time-based data in general, you probably want to have the most recent data first. In addition to that, Elasticsearch allows us to control how the documents should be sorted not only using field values, but also using more sophisticated sorting, such as sorting that uses scripts or sorting on fields that have multiple values. We will cover all that in this section.
Default sorting
Let’s look at the following query that returns all the books with at least one of the specified words:
curl -XGET 'localhost:9200/library/book/_search?pretty' -d '{
"query":{
"terms":{
"title":["crime","front","punishment"]
}
}
}'
Under the hood, we can imagine that Elasticsearch sees the preceding query as follows:
curl -XGET 'localhost:9200/library/book/_search?pretty' -d '{
"query":{
"terms":{
"title":["crime","front","punishment"]
}
},
"sort":{"_score":"desc"}
}'
Look at the sort section in the preceding query. This is the default sorting used by Elasticsearch. For better visibility, we can change the formatting slightly and show that fragment as follows:
"sort":[
{"_score":"desc"}
]
The preceding section defines how the documents should be sorted in the results list. In this case, Elasticsearch will show the documents with the highest score on top of the results list. The simplest modification is to reverse the ordering by changing the sort section to the following one:
"sort":[
{"_score":"asc"}
]
Selecting fields used for sorting
Default sorting is boring, isn’t it? So, let’s change it to sort on the basis of the values of the fields present in the documents. Let’s choose the title field, which means that the sort section of our query will look as follows:
"sort":[
{"title":"asc"}
]
Unfortunately, this doesn’t work as expected. Although Elasticsearch sorted the documents, the ordering is somewhat strange. Look closely at the response. With every document, Elasticsearch returns information about the sorting; for example, for the Crime and Punishment book, the returned document looks like the following code:
{
"_index":"library",
"_type":"book",
"_id":"4",
"_score":null,
"_source":{
"title":"Crime and Punishment",
"otitle":"Преступлéние и наказáние",
"author":"Fyodor Dostoevsky",
"year":1886,
"characters":["Raskolnikov","Sofia Semyonovna Marmeladova"],
"tags":[],
"copies":0,
"available":true
},
"sort":["punishment"]
}
If you compare the title field and the returned sorting information, everything should be clear. Elasticsearch, during the analysis process, splits the field into several tokens. Since sorting is done using a single token, Elasticsearch chooses one of the produced tokens. It does the best that it can by sorting these tokens alphabetically and choosing the first one. This is the reason why, in the sorting value, we find only a single word instead of the whole content of the title field. If you would like to see how Elasticsearch behaves when using different fields for sorting, you can try fields such as copies:
curl -XGET 'localhost:9200/library/book/_search?pretty' -d '{
"query":{
"terms":{
"title":["crime","front","punishment"]
}
},
"sort":[
{"copies":"asc"}
]
}'
In general, it is a good idea to have a not analyzed field for sorting. We can use fields with multiple values for sorting, but, in most cases, it doesn’t make much sense and has limited usage.
As an example of using two different fields, one for sorting and another for searching, let’s change our title field. The changed title field definition will look as follows:
"title":{
"type":"string",
"fields":{
"sort":{"type":"string","index":"not_analyzed"}
}
}
After changing the title field in the mappings (we’ve used the same mappings as in Chapter 3, Searching Your Data) and re-indexing the data, we can try sorting on the title.sort field and see whether it works. To do this, we will need to send the following query:
{
"query":{
"match_all":{}
},
"sort":[
{"title.sort":"asc"}
]
}
Now, it works properly. As you can see, we used the new field, the title.sort one. We set it as not to be analyzed, so there is a single token for that field in the index of Elasticsearch.
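The difference between the two fields can be imitated in Python. The sketch below assumes a very simple analysis (lowercasing and splitting on non-word characters) and the ascending case, where the alphabetically first token is used; apart from All Quiet on the Western Front, the titles are made up for the illustration, and both functions are hypothetical helpers.

```python
import re


def analyzed_sort_value(text):
    """Imitate an analyzed field: lowercase tokens, keep the alphabetically first."""
    tokens = re.findall(r"\w+", text.lower())
    return min(tokens)


def not_analyzed_sort_value(text):
    """Imitate a not_analyzed field: the whole value is a single token."""
    return text


titles = ["The Trial", "All Quiet on the Western Front", "Ulysses Annotated"]

print(sorted(titles, key=analyzed_sort_value))
print(sorted(titles, key=not_analyzed_sort_value))
```

Sorting by the analyzed value puts Ulysses Annotated before The Trial, because the tokens annotated and the are compared; sorting by the whole value gives the order a reader would expect.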
Sorting mode
In the response from Elasticsearch, every document contains information about the value used for sorting. For example, let’s look at one of the documents returned by the query in which we used the title field for sorting:
{
"_index":"library",
"_type":"book",
"_id":"1",
"_score":null,
"_source":{
"title":"All Quiet on the Western Front",
"otitle":"Im Westen nichts Neues",
"author":"Erich Maria Remarque",
"year":1929,
"characters":["Paul Bäumer","Albert Kropp","Haie Westhus","Fredrich Müller","Stanislaus Katczinsky","Tjaden"],
"tags":["novel"],
"copies":1,
"available":true,
"section":3
},
"sort":["all"]
}
The sorting used in the query to get the preceding document was as follows:
"sort":[
{"title":"asc"}
]
However, because we are sorting on an analyzed field, which contains more than a single value, the sorting definition is in fact equivalent to the longer form, which looks as follows:
"sort":[
{"title":{"order":"asc","mode":"min"}}
]
mode defines which token should be used for comparison when sorting on a field which has more than one value. The available values we can choose from are:
min: Sorting will use the lowest value (or the first alphabetical value on the text based fields)
max: Sorting will use the highest value (or the last alphabetical value on the text based fields)
avg: Sorting will use the average value
median: Sorting will use the median value
sum: Sorting will use the sum of all the values in the field
Note
The modes such as median, avg, and sum are useful for numerical multivalued fields, but don’t make much sense when it comes to text based fields.
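The modes listed above can be expressed in a few lines of Python. The sketch assumes a hypothetical multivalued numeric field called ratings, and sort_value is a helper name of our own; it only shows which single value each mode would pick for comparison.

```python
from statistics import mean, median

# One reducer per sorting mode described above.
MODES = {"min": min, "max": max, "avg": mean, "median": median, "sum": sum}


def sort_value(values, mode="min"):
    """Reduce a multivalued field to the single value used for comparison."""
    return MODES[mode](values)


docs = [
    {"id": "a", "ratings": [1, 9]},
    {"id": "b", "ratings": [4, 5]},
]

for mode in MODES:
    ordered = sorted(docs, key=lambda d: sort_value(d["ratings"], mode))
    print(mode, [d["id"] for d in ordered])
```

In ascending order, document a comes first only in the min mode; every other mode places b first, which shows how much the chosen mode can change the result order.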
Note that sort, in request and response, is given as an array. This suggests that we can use several different orderings. Elasticsearch will use the next element in the sorting definition list to determine ordering between the documents that have the same value of the previous sorting clause. So, if we have the same value in the title field, the documents will be sorted by the next field that we specify. For example, if we would like to get the documents that have the most copies and then sort by the title, we will run the following query:
curl -XGET 'localhost:9200/library/book/_search?pretty' -d '{
"query":{
"terms":{
"title":["crime","front","punishment"]
}
},
"sort":[
{"copies":"desc"},{"title":"asc"}
]
}'
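The tie-breaking rule can be modeled with Python’s stable sort: applying the sort clauses from the last one to the first gives the same result as comparing by the first clause and falling back to the next ones on ties. The documents below are made up, the whole title is used as the sort value (as a not analyzed field would provide), and multi_sort is a hypothetical helper.

```python
def multi_sort(docs, clauses):
    """Sort by a list of (field, order) clauses; later clauses break ties."""
    result = list(docs)
    # Python's sort is stable, so sorting by the least significant clause
    # first and the most significant clause last yields the combined order.
    for field, order in reversed(clauses):
        result.sort(key=lambda d: d[field], reverse=(order == "desc"))
    return result


books = [
    {"title": "crime and punishment", "copies": 0},
    {"title": "catch", "copies": 1},
    {"title": "all quiet", "copies": 1},
]

for book in multi_sort(books, [("copies", "desc"), ("title", "asc")]):
    print(book["copies"], book["title"])
```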
Specifying behavior for missing fields
What about when some of the documents that match the query don’t have the field we want to sort on? By default, documents without the given field are returned first in the case of ascending order and last in the case of descending order. However, sometimes this is not exactly what we want to achieve.
When we use sorting on numeric fields, we can change the default Elasticsearch behavior for documents with missing fields. For example, let’s take a look at the following query:
curl -XGET 'localhost:9200/library/book/_search?pretty' -d '{
"query":{
"match_all":{}
},
"sort":[
{
"section":{
"order":"asc",
"missing":"_last"
}
}
]
}'
Note the extended form of the sort section of our query. We’ve added the missing parameter to it. When we set the missing parameter to _last, Elasticsearch will place the documents without the given field at the bottom of the results list. Setting the missing parameter to _first will result in Elasticsearch placing documents without the given field at the top of the results list. It is worth mentioning that besides the _last and _first values, Elasticsearch also allows us to use any number. In such a case, a document without a defined field will be treated as the document with this given value.
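The three variants of the missing parameter can be modeled as follows. This is a simplified Python sketch of the behavior described above and not Elasticsearch code; sort_with_missing is a hypothetical helper, and the documents contain only the fields needed for the example.

```python
def sort_with_missing(docs, field, order="asc", missing="_last"):
    """Sort docs on a numeric field, handling documents that lack the field."""
    reverse = order == "desc"
    if missing in ("_first", "_last"):
        present = sorted((d for d in docs if field in d),
                         key=lambda d: d[field], reverse=reverse)
        absent = [d for d in docs if field not in d]
        return absent + present if missing == "_first" else present + absent
    # A number: documents without the field are treated as having that value.
    return sorted(docs, key=lambda d: d.get(field, missing), reverse=reverse)


docs = [{"id": 1, "section": 3}, {"id": 2}, {"id": 3, "section": 12}]

print([d["id"] for d in sort_with_missing(docs, "section", missing="_last")])
print([d["id"] for d in sort_with_missing(docs, "section", missing="_first")])
print([d["id"] for d in sort_with_missing(docs, "section", missing=5)])
```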
Dynamic criteria
As we mentioned in the previous section, Elasticsearch allows us to sort using fields that have multiple values. We can control how the comparison is made using scripts. We do that by showing Elasticsearch how to calculate the value that should be used for sorting. Let’s assume that we want to sort by the first value indexed in the tags field. Let’s take a look at the following example query (note that running the following query requires the script.inline property set to on in the elasticsearch.yml file):
curl -XGET 'localhost:9200/library/book/_search?pretty' -d '{
"query":{
"match_all":{}
},
"sort":{
"_script":{
"script":"doc[\"tags\"].values.size() > 0 ? doc[\"tags\"].values[0] : \"\u19999\"",
"type":"string",
"order":"asc"
}
}
}'
In the preceding example, we replaced every nonexistent value with the Unicode code of a character that should be low enough in the list. The main idea of this code is to check if our array contains at least a single element. If it does, then the first value from the array is returned. If the array is empty, we return the Unicode character that should be placed at the bottom of the results list. Besides the script parameter, this option of sorting requires us to specify the order (ascending, in our case) and type parameters that will be used for the comparison (we return string from our script).
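The logic of the script is easier to follow when written as a regular function. The Python sketch below mirrors it: it returns the first tag when there is one and a high sentinel string otherwise. The sample documents and the script_sort_key name are assumptions made for the illustration.

```python
# Same characters as the "\u19999" literal used in the script:
# the character U+1999 followed by the digit 9.
SENTINEL = "\u19999"


def script_sort_key(doc):
    """Return the first tag, or a sentinel that sorts after ordinary words."""
    tags = doc.get("tags", [])
    return tags[0] if len(tags) > 0 else SENTINEL


docs = [
    {"id": "3", "tags": []},
    {"id": "1", "tags": ["novel"]},
    {"id": "2", "tags": ["classics", "novel"]},
]

print([d["id"] for d in sorted(docs, key=script_sort_key)])
```

Because the sentinel compares higher than any lowercase Latin word, the document without tags ends up at the bottom of the ascending list.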
Calculate scoring when sorting
By default, Elasticsearch assumes that when you use sorting, the score is completely unimportant. Usually it is a good assumption; why do additional computations when the importance of the documents is given by the sorting formula? Sometimes, however, you want to know how good the document is in relation to the current query, even if the documents are presented in a different order. This is when the track_scores parameter should be used and set to true. An example query using it looks as follows:
curl -XGET 'localhost:9200/library/book/_search?pretty' -d '{
"query":{
"match_all":{}
},
"track_scores":true,
"sort":[
{"title":{"order":"asc"}}
]
}'
The preceding query calculates the score for every document. In fact, in our example, the score is boring and is always equal to 1.0 because of the match_all query which treats all the documents as equal.
Query rewrite
When debugging your queries, it is very valuable to know how all the queries are executed. Because of that, we decided to include the section on how query rewrite works in Elasticsearch, why it is used, and how to control it. If you have ever used queries such as the prefix query and the wildcard query, basically any query that is said to be multiterm (a query that is built of multiple terms), you’ve used query rewriting even though you may not have known about it. Elasticsearch does rewrite for performance reasons. The rewrite process is about changing the original, expensive query into a set of queries that are far less expensive from an Apache Lucene point of view, thus speeding up the query execution.
Prefix query as an example
The best way to illustrate how the rewrite process is done internally is to look at an example and see which terms are used instead of the original query term. We will index three documents to our library_it index by using the following commands:
curl -XPOST 'localhost:9200/library_it/book/1' -d '{"title":"Solr 4 Cookbook"}'
curl -XPOST 'localhost:9200/library_it/book/2' -d '{"title":"Solr 3.1 Cookbook"}'
curl -XPOST 'localhost:9200/library_it/book/3' -d '{"title":"Mastering Elasticsearch"}'
What we would like is to find all the documents with a title that starts with the letter s. Simple as that, we run the following query against our library_it index:
curl -XGET 'localhost:9200/library_it/_search?pretty' -d '{
"query":{
"prefix":{
"title":{
"prefix":"s",
"rewrite":"constant_score_boolean"
}
}
}
}'
We’ve used a simple prefix query; we’ve said that we would like to find all the documents with terms starting with the letter s in the title field. We’ve also used the rewrite property to specify the query rewrite method, but let’s skip it for now as we will discuss the possible values of this parameter in the later part of this section.
As the response to the previous query, we get the following:
{
"took":13,
"timed_out":false,
"_shards":{
"total":5,
"successful":5,
"failed":0
},
"hits":{
"total":2,
"max_score":1.0,
"hits":[{
"_index":"library_it",
"_type":"book",
"_id":"2",
"_score":1.0,
"_source":{
"title":"Solr 3.1 Cookbook"
}
},{
"_index":"library_it",
"_type":"book",
"_id":"1",
"_score":1.0,
"_source":{
"title":"Solr 4 Cookbook"
}
}]
}
}
As you can see, in response we got the two documents that had the contents of the title field starting with the desired character. We didn’t specify the mappings explicitly, so we relied on Elasticsearch’s ability to choose the mapping type for us. As we already know, for the text field, Elasticsearch uses the default analyzer. This means that the terms in our documents will be lowercased and, because of that, we used the lowercased letter in our prefix query (remember that the prefix query is not analyzed).
Getting back to Apache Lucene
Now let’s take a step back and look at Apache Lucene again. If you recall what the Lucene inverted index is built from, you can tell that it contains a term, count, and document pointer (if you don’t recall, refer to the Full text searching section in Chapter 1, Getting Started with Elasticsearch Cluster). So, let’s see how the simplified view of the index may look for the preceding data we’ve put to the library_it index:
What you see in the column with the Term text is quite important. If you look at Elasticsearch and Apache Lucene internals, you can see that our prefix query was rewritten to the following Lucene query:
ConstantScore(title:solr)
We can check the portions of the rewrite using the Elasticsearch API. First of all, we can use the Explain API by running the following command:
curl -XGET 'localhost:9200/library_it/book/1/_explain?pretty' -d '{
"query":{
"prefix":{
"title":{
"prefix":"s",
"rewrite":"constant_score_boolean"
}
}
}
}'
The result will be as follows:
{
"_index":"library_it",
"_type":"book",
"_id":"1",
"matched":true,
"explanation":{
"value":1.0,
"description":"sum of:",
"details":[{
"value":1.0,
"description":"ConstantScore(title:solr), product of:",
"details":[{
"value":1.0,
"description":"boost",
"details":[]
},{
"value":1.0,
"description":"queryNorm",
"details":[]
}]
},{
"value":0.0,
"description":"match on required clause, product of:",
"details":[{
"value":0.0,
"description":"# clause",
"details":[]
},{
"value":1.0,
"description":"_type:book, product of:",
"details":[{
"value":1.0,
"description":"boost",
"details":[]
},{
"value":1.0,
"description":"queryNorm",
"details":[]
}]
}]
}]
}
}
We can see that Elasticsearch used a constant score query with the term solr against the title field.
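The rewrite step can be pictured as a lookup in the term dictionary followed by the creation of one clause per matching term. The Python sketch below shows that idea; the term set is what we assume the default analyzer produced for our three titles, and rewrite_prefix is a hypothetical function, not a Lucene or Elasticsearch API.

```python
def rewrite_prefix(field, prefix, term_dictionary):
    """Expand a prefix into one should clause per indexed term that matches it."""
    matching = sorted(t for t in term_dictionary if t.startswith(prefix))
    return {"bool": {"should": [{"term": {field: t}} for t in matching]}}


# Assumed terms of the title field after analysis of the three documents.
title_terms = {"solr", "4", "3.1", "cookbook", "mastering", "elasticsearch"}

print(rewrite_prefix("title", "s", title_terms))
print(rewrite_prefix("title", "c", title_terms))
```

For the prefix s only the term solr matches, which agrees with the ConstantScore(title:solr) query reported by the Explain API.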
Query rewrite properties
We can control how the queries are rewritten internally. To do that, we place the rewrite parameter inside the JSON object responsible for the actual query. For example:
curl -XGET 'localhost:9200/library/book/_search?pretty' -d '{
"query":{
"prefix":{
"title":{
"prefix":"s",
"rewrite":"constant_score_boolean"
}
}
}
}'
The rewrite property can take the following values:
scoring_boolean: This rewrite method translates each generated term into a Boolean should clause in the Boolean query. This rewrite method causes the score to be calculated for each document. Because of that, this method may be CPU demanding. Please also note that, for queries that have many terms, it may exceed the Boolean query limit, which is set to 1024. The default Boolean query limit can be changed by setting the index.query.bool.max_clause_count property in the elasticsearch.yml file. However, remember that the more Boolean queries produced, the lower the query performance may be.
constant_score: This rewrite method chooses constant_score_boolean or constant_score_filter depending on the query and taking performance into consideration. This is also the default behavior when the rewrite property is not set at all.
constant_score_boolean: This rewrite method is similar to the scoring_boolean rewrite method described previously, but less CPU demanding because the scoring is not computed and, instead of that, each term receives a score equal to the query boost (one by default, and which can be set using the boost property). Because this rewrite method also results in Boolean should clauses being created, similar to the scoring_boolean rewrite method, this method can also hit the maximum Boolean clauses limit.
top_terms_N: A rewrite method that translates each generated term into a Boolean should clause in a Boolean query and keeps the scores as computed by the query. However, unlike the scoring_boolean rewrite method, it only keeps an N number of top scoring terms to avoid hitting the maximum Boolean clauses limit and increase the final query performance.
top_terms_blended_freqs_N: A rewrite method that translates each term into a Boolean query and treats the terms as if they had the same term frequency.
top_terms_boost_N: A rewrite method similar to the top_terms_N one, but the scores are not computed. Instead, the documents are given a score equal to the value of the boost property (one by default).
For example, if we would like our example query to use top_terms_N with N equal to 2, our query would look like this:
curl -XGET 'localhost:9200/library/book/_search?pretty' -d '{
"query":{
"prefix":{
"title":{
"prefix":"s",
"rewrite":"top_terms_2"
}
}
}
}'
If you look at the results returned by Elasticsearch, you’ll notice that, unlike our initial query, the documents were given a score different than the default 1.0:
{
"took":4,
"timed_out":false,
"_shards":{
"total":5,
"successful":5,
"failed":0
},
"hits":{
"total":1,
"max_score":0.15342641,
"hits":[{
"_index":"library",
"_type":"book",
"_id":"3",
"_score":0.15342641,
"_source":{
"title":"The Complete Sherlock Holmes",
"author":"Arthur Conan Doyle",
"year":1936,
"characters":["Sherlock Holmes","Dr. Watson","G. Lestrade"],
"tags":[],
"copies":0,
"available":false,
"section":12
}
}]
}
}
The score is different than the default 1.0 because we’ve used the top_terms_N rewrite type and this type of query rewrite keeps the score for N top scoring terms.
Before we finish the Query rewrite section of this chapter, we should ask ourselves one last question: when to use which rewrite type? The answer to this question greatly depends on your use case, but, to summarize, if you can live with lower precision and relevancy (but higher performance), you can go for the top N rewrite method. If you need high precision and thus more relevant queries (but lower performance), choose the Boolean approach.
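If we build such requests from application code, a small helper can keep the rewrite name in one place. The following is only a sketch in Python: the prefix_query function and its validation rules are our own assumptions for illustration, not part of Elasticsearch or of any client library, and the accepted names are simply the ones listed in this section.

```python
import json
import re

# Rewrite names listed in this section; N must be a positive integer.
FIXED_REWRITES = {"scoring_boolean", "constant_score", "constant_score_boolean"}
TOP_N_PATTERN = re.compile(r"^top_terms(_blended_freqs|_boost)?_[1-9][0-9]*$")


def prefix_query(field, prefix, rewrite=None):
    """Build the body of a prefix query (hypothetical helper).

    When rewrite is None the property is left out, so Elasticsearch
    falls back to its default rewrite behavior.
    """
    params = {"prefix": prefix}
    if rewrite is not None:
        if rewrite not in FIXED_REWRITES and not TOP_N_PATTERN.match(rewrite):
            raise ValueError("unknown rewrite method: %s" % rewrite)
        params["rewrite"] = rewrite
    return {"query": {"prefix": {field: params}}}


if __name__ == "__main__":
    # Same request body as the top_terms_2 example in this section.
    print(json.dumps(prefix_query("title", "s", rewrite="top_terms_2"), indent=1))
```

The printed JSON has the same structure as the body of the curl example shown earlier, so it could be sent with any HTTP client.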
Summary
The chapter you just finished was again focused on querying. We used filters and saw what highlighting is and how to use it. We learned what the highlighter types are and how they can help us. We validated our queries and we learned how Elasticsearch can help us when it comes to sorting our results. Finally, we discussed query rewriting, what that brings us, and how we can control it.
In the next chapter, we will get back to the topic of indexing. We will discuss indexing complex JSON objects such as tree-like structures and indexing data that is not flat. We will prepare Elasticsearch to handle relationships between documents and we will use the Elasticsearch API to update the structure of our indices.
Chapter 5. Extending Your Index Structure
We started the previous chapter by learning how to deal with revised filtering in Elasticsearch 2.x and what to expect from it now. We also explored highlighting and how it can help us in improving the users’ search experience. We discovered query validation in Elasticsearch and learned the ways of data sorting in Elasticsearch. Finally, we discussed query rewriting and how that affects our queries. By the end of this chapter, you will have learned the following topics:
Indexing tree-like structures
Indexing data that is not flat
Handling document relationships by using nested object and parent-child features
Modifying index structure by using Elasticsearch API
Indexing tree-like structures
Trees are everywhere. If you develop an e-commerce shop application, your products will probably be described with the use of categories. The thing about categories is that in most cases they are hierarchical. There are top categories, such as electronics, music, books, and so on. Each of the top level categories can have numerous children categories, such as fiction and science, and those can get even deeper into science fiction, romance, and so on. If you look at the file system, the files and directories are arranged in tree-like structures as well. This book can also be represented as a tree: chapters contain topics and topics are divided into subtopics. So the data around us is arranged into tree-like structures and as you can imagine, Elasticsearch is capable of indexing tree-like structures so that we can represent the data in an easier manner. Let’s check how we can navigate through this type of data using path_analyzer.
Data structure
To begin with, let’s create a simple index structure by using the following command:
curl -XPUT 'localhost:9200/path?pretty' -d '{
"settings":{
"index":{
"analysis":{
"analyzer":{
"path_analyzer":{"tokenizer":"path_hierarchy"}
}
}
}
},
"mappings":{
"category":{
"properties":{
"category":{
"type":"string",
"fields":{
"name":{"type":"string","index":"not_analyzed"},
"path":{"type":"string","analyzer":"path_analyzer",
"store":true}
}
}
}
}
}
}'
As you can see, we have a single type created: the category type. We will use it to store and index the information about the location of our document in the tree structure. The idea is simple: we can show the location of the document as a path, in the exact same manner as the files and directories are presented on your hard disk drive. For example, in an automotive shop, we can have /cars/passenger/sport, /cars/passenger/camper, or /cars/delivery_truck/. However, to achieve that, we need to index this path in two different ways. First of all, we will use a not-analyzed field called name to store and index the path in its original form. We will also use a field called path, which will use the path_analyzer analyzer that we’ve defined to process the path so it is easier to search.
Analysis
Now, let’s see what Elasticsearch will do with the category path during the analysis process. To see this, we will use the following command line, which uses the analysis API discussed in the Understanding the explain information section of Chapter 6, Make Your Search Better:
curl -XGET 'localhost:9200/path/_analyze?field=category.path&pretty' -d '/cars/passenger/sport'
The following results will be returned by Elasticsearch:
{
"tokens":[{
"token":"/cars",
"start_offset":0,
"end_offset":5,
"type":"word",
"position":0
},{
"token":"/cars/passenger",
"start_offset":0,
"end_offset":15,
"type":"word",
"position":0
},{
"token":"/cars/passenger/sport",
"start_offset":0,
"end_offset":21,
"type":"word",
"position":0
}]
}
As we can see, our category path /cars/passenger/sport was processed by Elasticsearch and divided into three tokens. Thanks to this, we can simply find every document that belongs to a given category or its subcategories using the term filter. For the example to be complete, let’s index a simple document by using the following command:
curl -XPUT 'localhost:9200/path/category/1' -d '{"category": "/cars/passenger/sport"}'
An example of using filters is as follows:
curl -XGET 'localhost:9200/path/_search?pretty' -d '{
"query":{
"bool":{
"filter":{
"term":{
"category.path":"/cars"
}
}
}
}
}'
Note that we also have the original value indexed in the category.name field. This is handy when we want to find documents from a particular path, ignoring the documents that are deeper in the hierarchy.
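To get a feeling for why a term filter on category.path finds whole subtrees while category.name matches only exact paths, we can imitate the tokenization in plain Python. This is a simplified model, not the real path_hierarchy tokenizer: the function names and the second and third sample documents are made up for illustration, and edge cases such as trailing delimiters are not handled.

```python
def path_hierarchy_tokens(path, delimiter="/"):
    """Return every ancestor prefix of a path, shortest first (simplified model)."""
    parts = path.split(delimiter)
    tokens = []
    for i in range(1, len(parts) + 1):
        token = delimiter.join(parts[:i])
        if token:  # skip the empty piece in front of a leading delimiter
            tokens.append(token)
    return tokens


def term_filter_on_path(docs, value):
    """Model of a term filter on category.path: any token may match."""
    return sorted(doc_id for doc_id, path in docs.items()
                  if value in path_hierarchy_tokens(path))


def term_filter_on_name(docs, value):
    """Model of a term filter on category.name: only the whole path matches."""
    return sorted(doc_id for doc_id, path in docs.items() if path == value)


# Document 1 comes from the text; documents 2 and 3 are hypothetical.
docs = {
    "1": "/cars/passenger/sport",
    "2": "/cars/delivery_truck",
    "3": "/bikes/road",
}

if __name__ == "__main__":
    print(path_hierarchy_tokens("/cars/passenger/sport"))
    print(term_filter_on_path(docs, "/cars"))
    print(term_filter_on_name(docs, "/cars"))
```

In this model the filter on the path field returns documents 1 and 2 for /cars, while the filter on the name field returns nothing, because no document sits exactly at /cars.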
Indexing data that is not flat
Not all data is flat like the examples we have used in the book until now. Most of the data you will encounter will have some structure and nested objects inside the root JSON object. Of course, if we are building our system that Elasticsearch will be a part of and we are in control of all the pieces of it, we can create a structure that is convenient for Elasticsearch. But even in such cases, flat data is not always an option. Thankfully, Elasticsearch allows us to index data that is not flat and this section will show us how to do that.
Data
Let’s assume that we have the following data (we store it in the file called structured_data.json):
{
"author":{
"name":{
"firstName":"Fyodor",
"lastName":"Dostoevsky"
}
},
"isbn":"123456789",
"englishTitle":"Crime and Punishment",
"year":1886,
"characters":[
{
"name":"Raskolnikov"
},
{
"name":"Sofia"
}
],
"copies":0
}
As you can see, the data is not flat: it contains arrays and nested objects. If we want to create mappings and use the knowledge that we’ve got so far, we will have to flatten the data. However, as we already said, Elasticsearch allows some degree of structure and we should be able to create mappings that will work for the preceding example.
Objects
The preceding example data shows the structured JSON file. As you can see in the example, our root object has some additional, simple properties, such as englishTitle, isbn, year, and copies. These will be indexed as normal fields in the index and we already know how to deal with them (we discussed that in the Mappings configuration section of Chapter 2, Indexing Your Data). In addition to that, it has the characters array type and the author object. The author object has another object nested within it: the name object, which has two properties, firstName and lastName. So as you can see, we can have multiple nested objects inside each other.
Arrays
We have already used array type data, but we didn’t talk about it. By default, all the fields in Lucene and thus in Elasticsearch are multivalued, which means that they can store multiple values. In order to send such fields to Elasticsearch for indexing, we use the JSON array type, which is nested within the opening and closing square brackets []. As you can see in the preceding example, we used the array type for the characters of our book.
Mappings
Let’s now look at what our mappings would look like for the book object we showed earlier. We already said that to index arrays we don’t need anything special. So, in our case, to index the characters data we will need to add a field definition similar to the following one:
"characters":{
"properties":{
"name":{"type":"string"}
}
}
Nothing strange! We just nest the properties section inside the array’s name (which is characters in our case) and we define the fields there. As the result of the preceding mappings, we will get the characters.name multivalued field in the index.
We do a similar thing for our author object. We call the section with the same name as it is present in the data. We have the author object, but it also has the name object nested in it, so we do the same: we just nest another object inside it. So, our mappings for the author field would look as follows:
"author":{
"properties":{
"name":{
"properties":{
"firstName":{"type":"string"},
"lastName":{"type":"string"}
}
}
}
}
The firstName and lastName fields appear in the index as author.name.firstName and author.name.lastName.
The rest of the fields are simple core types, so I’ll skip discussing them as they were already discussed in the Mappings configuration section of Chapter 2, Indexing Your Data.
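The naming rule for object and array fields can be illustrated with a few lines of Python. This is only a model of how the dotted field names come about, not what Elasticsearch runs internally, and the flatten function is a hypothetical helper written for this illustration.

```python
def flatten(value, prefix=""):
    """Collect leaf values of a JSON-like object under dotted field names."""
    fields = {}
    if isinstance(value, dict):
        for key, child in value.items():
            name = prefix + "." + key if prefix else key
            for field, values in flatten(child, name).items():
                fields.setdefault(field, []).extend(values)
    elif isinstance(value, list):
        # An array adds no level of its own: its items share the same name,
        # which is what makes the resulting field multivalued.
        for item in value:
            for field, values in flatten(item, prefix).items():
                fields.setdefault(field, []).extend(values)
    else:
        fields[prefix] = [value]
    return fields


# The example document from the Data section.
book = {
    "author": {"name": {"firstName": "Fyodor", "lastName": "Dostoevsky"}},
    "isbn": "123456789",
    "englishTitle": "Crime and Punishment",
    "year": 1886,
    "characters": [{"name": "Raskolnikov"}, {"name": "Sofia"}],
    "copies": 0,
}

if __name__ == "__main__":
    for field, values in sorted(flatten(book).items()):
        print(field, values)
```

Running the sketch lists author.name.firstName and author.name.lastName with one value each, and characters.name with two values, which matches the field names described above.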
Final mappings
So our final mappings file, which we’ve called structured_mapping.json, looks like the following:
{
"book":{
"properties":{
"author":{
"type":"object",
"properties":{
"name":{
"type":"object",
"properties":{
"firstName":{"type":"string"},
"lastName":{"type":"string"}
}
}
}
},
"isbn":{"type":"string"},
"englishTitle":{"type":"string"},
"year":{"type":"integer"},
"characters":{
"properties":{
"name":{"type":"string"}
}
},
"copies":{"type":"integer"}
}
}
}
Sending the mappings to Elasticsearch
Now that we have our mappings done, we would like to test if all the work we did actually works. This time we will use a slightly different technique of creating an index and putting the mappings. First, let’s create the library index with the following command (you need to delete the library index if you already have it):
curl -XPUT 'localhost:9200/library'
Now, let’s send our mappings for the book type:
curl -XPUT 'localhost:9200/library/book/_mapping' -d @structured_mapping.json
Now we can index our example data:
curl -XPOST 'localhost:9200/library/book/1' -d @structured_data.json
To be or not to be dynamic
As we already know, Elasticsearch is schema-less, which means that it can index data without the need of creating the mappings upfront. What Elasticsearch will do in the background when a new field is encountered in the data is a mapping update: it will try to guess the field type and add it to the mappings. The dynamic behavior of Elasticsearch is turned on by default, but there may be situations where you may want to turn it off for some parts of your index. In order to do that, one should add the dynamic property to the given field and set it to false. This should be done on the same level of nesting as the type property for the object that shouldn’t be dynamic. For example, if we want our author and name objects to not be dynamic, we should modify the relevant part of the mappings file so that it looks as follows:
"author":{
"type":"object",
"dynamic":false,
"properties":{
"name":{
"type":"object",
"dynamic":false,
"properties":{
"firstName":{"type":"string","index":"analyzed"},
"lastName":{"type":"string","index":"analyzed"}
}
}
}
}
However, remember that in order to add new fields for such objects, we would have to update the mappings.
Note
You can also turn off the dynamic mappings functionality by adding the index.mapper.dynamic property to your elasticsearch.yml configuration file and setting it to false.
Disabling object indexing
There is one additional thing that we would like to mention when it comes to object handling: we can disable indexing a particular object by using the enabled property and setting it to false. There may be various reasons for that, such as not wanting a field to be indexed or not wanting a whole JSON object to be indexed. For example, if we want to omit an object called information from our author object, we will have the author object definition look as follows:
"author":{
"type":"object",
"properties":{
"name":{
"type":"object",
"dynamic":false,
"properties":{
"firstName":{"type":"string","index":"analyzed"},
"lastName":{"type":"string","index":"analyzed"},
"information":{"type":"object","enabled":false}
}
}
}
}
The dynamic parameter can also be set to strict. This means that new fields won’t be added to the mappings when they appear, and the indexing of a document that contains such fields will fail.
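The three values of the dynamic property can be summarized with a small model. This is a sketch of the behavior described in this section, not Elasticsearch code: apply_dynamic, StrictMappingError, and the middleName field are hypothetical, and the model works on a flat dictionary for brevity.

```python
class StrictMappingError(Exception):
    """Raised by the model when dynamic is 'strict' and a field is unknown."""


def apply_dynamic(mapped_fields, document, dynamic=True):
    """Return (indexed_fields, mapping) after indexing one flat document.

    dynamic=True     unknown fields are added to the mapping and indexed
    dynamic=False    unknown fields are skipped; the mapping stays unchanged
    dynamic='strict' an unknown field makes the whole indexing request fail
    """
    if dynamic not in (True, False, "strict"):
        raise ValueError("dynamic must be True, False or 'strict'")
    mapping = set(mapped_fields)
    indexed = {}
    for field, value in document.items():
        if field not in mapping:
            if dynamic == "strict":
                raise StrictMappingError("field not in mappings: %s" % field)
            if dynamic is False:
                continue  # ignored for indexing
            mapping.add(field)  # dynamic mapping update
        indexed[field] = value
    return indexed, mapping


if __name__ == "__main__":
    known = {"firstName", "lastName"}
    doc = {"firstName": "Fyodor", "middleName": "Mikhailovich"}
    print(apply_dynamic(known, doc, dynamic=True))
    print(apply_dynamic(known, doc, dynamic=False))
```

With dynamic set to false in this model, middleName is silently left out of the indexed fields, while strict raises an error for the same document.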
Using nested objects
Nested objects can come in handy in certain situations. Basically, with nested objects Elasticsearch allows us to connect multiple documents together: one main document and multiple dependent ones. The main document and the nested ones are indexed together and they are placed in the same segment of the index (actually, in the same block inside the segment, near each other), which guarantees the best performance we can get for such a data structure. The same goes for changing the document; unless you are using the update API, you need to index the parent document and all the other nested ones at the same time.
Note
If you would like to read more about how nested objects work on the Apache Lucene level, there is a very good blog post written by Mike McCandless at http://blog.mikemccandless.com/2012/01/searching-relational-content-with.html.
Now let’s get on with our example use case. Imagine that we have a shop with clothes and we store the size and color of each t-shirt. Our standard, non-nested mappings will look like this (stored in cloth.json):
{
"cloth":{
"properties":{
"name":{"type":"string"},
"size":{"type":"string","index":"not_analyzed"},
"color":{"type":"string","index":"not_analyzed"}
}
}
}
To create the shop index with our cloth mapping, we run the following commands:
curl -XPOST 'localhost:9200/shop'
curl -XPUT 'localhost:9200/shop/cloth/_mapping' -d @cloth.json
Now imagine that we have a t-shirt in our shop that we only have in XXL size in red and in XL size in black. So our example document indexation command will look as follows:
curl -XPOST 'localhost:9200/shop/cloth/1' -d '{
"name":"Test shirt",
"size":["XXL","XL"],
"color":["red","black"]
}'
However, there is a problem with such a data structure. What if one of our clients searches our shop in order to find the XXL t-shirt in black? Let’s check that by running the following query (we assume that we’ve used our mappings to create the index and we’ve indexed our example document):
curl -XGET 'localhost:9200/shop/cloth/_search?pretty=true' -d '{
"query":{
"bool":{
"must":[
{
"term":{"size":"XXL"}
},
{
"term":{"color":"black"}
}
]
}
}
}'
We should get no results, right? But in fact Elasticsearch returned the following document:
{
(…)
"hits":{
"total":1,
"max_score":0.4339554,
"hits":[{
"_index":"shop",
"_type":"cloth",
"_id":"1",
"_score":0.4339554,
"_source":{
"name":"Test shirt",
"size":["XXL","XL"],
"color":["red","black"]
}
}]
}
}
This is because the document was matched: we have the values we are searching for in the size field and in the color field. Of course, this is not what we would like to get.
So, let’s modify our mappings to use the nested objects to separate color and size to different nested documents. The final mapping looks as follows (we store these mappings in the cloth_nested.json file):
{
"cloth":{
"properties":{
"name":{"type":"string","index":"analyzed"},
"variation":{
"type":"nested",
"properties":{
"size":{"type":"string","index":"not_analyzed"},
"color":{"type":"string","index":"not_analyzed"}
}
}
}
}
}
Now, we will create a second index called shop_nested using our modified mappings by running the following commands:
curl -XPOST 'localhost:9200/shop_nested'
curl -XPUT 'localhost:9200/shop_nested/cloth/_mapping' -d @cloth_nested.json
As you can see, we’ve introduced a new object inside our cloth type, variation, which is a nested one (the type property is set to nested). It basically says that we will want to index the nested documents. Now, let’s modify our document. We will add the variation object to it and that object will store the objects with two properties: size and color. So the index command for our modified example product will look like the following:
curl -XPOST 'localhost:9200/shop_nested/cloth/1' -d '{
"name":"Test shirt",
"variation":[
{"size":"XXL","color":"red"},
{"size":"XL","color":"black"}
]
}'
We’ve structured the document so that each size and its matching color is a separate document. However, if you run our previous query, it won’t return any documents. This is because in order to query for nested documents, we need to use a specialized query. So now our query looks as follows:
curl -XGET 'localhost:9200/shop_nested/cloth/_search?pretty=true' -d '{
"query":{
"nested":{
"path":"variation",
"query":{
"bool":{
"must":[
{"term":{"variation.size":"XXL"}},
{"term":{"variation.color":"black"}}
]
}
}
}
}
}'
And now, the preceding query will not return the indexed document, because we don’t have a nested document that has the size equal to XXL and color black.
Let’s get back to the query for a second to discuss it briefly. As you can see, we used the nested query in order to search in the nested documents. The path property specifies the name of the nested object (yes, we can have multiple of them). We just included a standard query section under the nested type. Also note that we specified the full path for the field names in the nested objects, which is handy when you have multilevel nesting, which is also possible.
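The difference between the two mappings comes down to where the two term conditions are evaluated. The following Python sketch models that difference only; the function names are hypothetical and nothing here talks to Elasticsearch.

```python
# The two variations of our test shirt, as in the nested document above.
variations = [
    {"size": "XXL", "color": "red"},
    {"size": "XL", "color": "black"},
]


def matches_flat(variations, size, color):
    """Model of the non-nested mapping: sizes and colors are two independent
    multivalued fields, so the pairing between them is lost."""
    sizes = {v["size"] for v in variations}
    colors = {v["color"] for v in variations}
    return size in sizes and color in colors


def matches_nested(variations, size, color):
    """Model of the nested query: both conditions must hold inside the
    same nested document."""
    return any(v["size"] == size and v["color"] == color for v in variations)


if __name__ == "__main__":
    print(matches_flat(variations, "XXL", "black"))
    print(matches_nested(variations, "XXL", "black"))
    print(matches_nested(variations, "XXL", "red"))
```

In the model, the flat check accepts the XXL and black combination that does not exist in our shop, while the nested check rejects it and still accepts XXL in red.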
Scoring and nested queries
There is one additional property when it comes to handling nested documents during query. In addition to the path property, there is the score_mode property, which allows us to define how the scoring is calculated from the nested queries. Elasticsearch allows us to set the score_mode property to one of the following values:
avg: This is the default value. Using it for the score_mode property will result in Elasticsearch taking the average value calculated from the scores of the defined nested queries. Calculated average will be included in the score of the main query.
sum: Using this value for the score_mode property will result in Elasticsearch taking a sum of the scores for each nested query and including it in the score of the main query.
min: Using this value for the score_mode property will result in Elasticsearch taking the score of the minimum scoring nested query and including it in the score of the main query.
max: Using this value for the score_mode property will result in Elasticsearch taking the score of the maximum scoring nested query and including it in the score of the main query.
none: Using this value for the score_mode property will result in no score being taken from the nested query.
Using the parent-child relationship
In the previous section, we discussed using Elasticsearch to index the nested documents along with the parent one. However, even though the nested documents are indexed as separate documents in the index, we can’t change a single nested document (unless we use the update API). Elasticsearch allows us to have a real parent-child relationship and we will look at it in the following section.
Index structure and data indexing
Let’s use the same example that we used when discussing the nested documents: the hypothetical cloth store. What we would like to have is the ability to update the sizes and colors without the need to index the whole parent document after each change. We will see how to achieve that using Elasticsearch parent-child functionality.
Child mappings
First we have to create a child index definition. To create child mappings, we need to add the _parent property with the name of the parent type, which will be cloth in our case. In the child documents, we want to have the size and the color of the cloth. So, the commands that will create the shop index and the variation type will look as follows:
curl -XPOST 'localhost:9200/shop'
curl -XPUT 'localhost:9200/shop/variation/_mapping' -d '{
"variation":{
"_parent":{"type":"cloth"},
"properties":{
"size":{"type":"string","index":"not_analyzed"},
"color":{"type":"string","index":"not_analyzed"}
}
}
}'
And that’s all. You don’t need to specify which field will be used to connect the child documents to the parent ones. By default, Elasticsearch will use the documents’ unique identifier for that. If you remember from the previous chapters, the information about a unique identifier is present in the index by default.
Parent mappings
The only field we need to have in our parent document is name. We don’t need anything more than that. So, in order to create our cloth type in the shop index, we will run the following command:
curl -XPUT 'localhost:9200/shop/cloth/_mapping' -d '{
"cloth":{
"properties":{
"name":{"type":"string"}
}
}
}'
The parent document
Now we are going to index our parent document. As we want to store the information about the size and the color in the child documents, the only thing we need to have in the parent documents is the name. Of course, there is one thing to remember: our parent documents need to be of type cloth, because of the _parent property value in the child mappings. The indexing command for our parent document is very simple and looks as follows:
curl -XPOST 'localhost:9200/shop/cloth/1' -d '{
"name":"Test shirt"
}'
If you look at the preceding command, you’ll notice that our document will be given the identifier 1.
Child documents
To index the child documents, we need to provide information about the parent document with the use of the parent request parameter. The value of the parent parameter should point to the identifier of the parent document. So, to index two child documents to our parent document, we need to run the following command lines:
curl -XPOST 'localhost:9200/shop/variation/1000?parent=1' -d '{
"color":"red",
"size":"XXL"
}'
curl -XPOST 'localhost:9200/shop/variation/1001?parent=1' -d '{
"color":"black",
"size":"XL"
}'
And that’s all. We’ve indexed two additional documents, which are of our variation type, but we’ve specified that our documents have a parent, the document with an identifier of 1.
Querying
We’ve indexed our data and now we need to use appropriate queries to match the documents with the data stored in their children. This is because, by default, Elasticsearch searches on the documents without looking at the parent-child relations. For example, the following query will match all three documents that we’ve indexed (two children and one parent):
curl -XGET 'localhost:9200/shop/_search?q=*&pretty'
This is not what we would like to achieve, at least in most cases. Usually, we are interested in parent documents that have children matching the query. Of course Elasticsearch provides such functionalities with specialized types of queries.
Note
The thing to remember though is that, when running queries against parents, the children documents won’t be returned, and vice versa.
Querying data in the child documents
Imagine that we want to get clothes that are of the XXL size and are red. As you recall, the size and the color of the cloth are indexed in the child documents, so we need a specialized has_child query, to check which parent documents have children with the desired size and color. So an example query that matches our requirement looks as follows:
curl -XGET 'localhost:9200/shop/_search?pretty' -d '{
"query":{
"has_child":{
"type":"variation",
"query":{
"bool":{
"must":[
{"term":{"size":"XXL"}},
{"term":{"color":"red"}}
]
}
}
}
}
}'
The query is quite simple; it is of the has_child type, which tells Elasticsearch that we want to search in the child documents. In order to specify which type of children we are interested in, we specify the type property with the name of the child type. The query is provided using the query property. We’ve used a standard bool query, which we’ve already discussed. The result of the query will contain only those parent documents that have children matching our bool query. In our case, the single document returned looks as follows:
{
"took":16,
"timed_out":false,
"_shards":{
"total":5,
"successful":5,
"failed":0
},
"hits":{
"total":1,
"max_score":1.0,
"hits":[{
"_index":"shop",
"_type":"cloth",
"_id":"1",
"_score":1.0,
"_source":{
"name":"Test shirt"
}
}]
}
}
The has_child query allows us to provide additional parameters to control its behavior. Every parent document found may be connected with one or more child documents. This means that every child document can influence the resulting score. By default, the query doesn’t care about the children documents, how many of them matched, and what is their content; it only matters if they match the query or not. This can be changed by using the score_mode parameter, which controls the score calculation of the has_child query. The values this parameter can take are:
none: The default one, the score generated by the relation is 1.0
min: The score is taken from the lowest scored child
max: The score is taken from the highest scored child
sum: The score is calculated as the sum of the child scores
avg: The score is taken as the average of the child scores
Let’s see an example:
curl -XGET 'localhost:9200/shop/_search?pretty' -d '{
"query":{
"has_child":{
"type":"variation",
"score_mode":"sum",
"query":{
"bool":{
"must":[
{"term":{"size":"XXL"}},
{"term":{"color":"red"}}
]
}
}
}
}
}'
We used sum as score_mode, which results in children contributing to the final score of the parent document: the contribution is the sum of the scores of every child document matching the query.
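To make the score_mode values listed above concrete, here is a minimal Python sketch of how the scores of the matching children could be folded into a single parent score. It only models the arithmetic described in the text; the function name combine_child_scores and the sample scores are hypothetical and are not part of Elasticsearch.

```python
def combine_child_scores(child_scores, score_mode="none"):
    """Return the parent's score given the scores of its matching children."""
    if not child_scores:
        # A parent is only returned when at least one child matched.
        raise ValueError("no matching children")
    if score_mode == "none":
        return 1.0  # child scores are ignored; the relation itself scores 1.0
    if score_mode == "min":
        return min(child_scores)
    if score_mode == "max":
        return max(child_scores)
    if score_mode == "sum":
        return sum(child_scores)
    if score_mode == "avg":
        return sum(child_scores) / len(child_scores)
    raise ValueError("unknown score_mode: %r" % score_mode)


# Hypothetical scores of three child documents matching the inner query.
scores = [0.5, 1.5, 1.0]
for mode in ("none", "min", "max", "sum", "avg"):
    print(mode, combine_child_scores(scores, mode))
```

With sum, a parent with many well-matching children rises in the results, while none keeps every matching parent at the same score.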
And finally, we can limit the number of child documents that need to be matched; we can specify both the maximum number of child documents allowed to be matched (the max_children property) and the minimum number of child documents (the min_children property) that need to be matched. The query illustrating the usage of these parameters is as follows:
curl -XGET 'localhost:9200/shop/_search?pretty' -d '{
"query":{
"has_child":{
"type":"variation",
"min_children":1,
"max_children":3,
"query":{
"bool":{
"must":[
{"term":{"size":"XXL"}},
{"term":{"color":"red"}}
]
}
}
}
}
}'
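The effect of min_children and max_children can be pictured as a simple range check on the number of children that matched the inner query. The following Python sketch illustrates that idea only; the function name is hypothetical, and using None to mean "no upper limit" is an assumption of this sketch rather than the way Elasticsearch expresses it.

```python
def parent_is_returned(matching_children, min_children=1, max_children=None):
    """Decide whether a parent qualifies, given how many of its children matched."""
    # At least one child must always match for has_child to return the parent.
    if matching_children < max(min_children, 1):
        return False
    # None stands for "no upper limit" in this sketch.
    if max_children is not None and matching_children > max_children:
        return False
    return True


# With min_children=1 and max_children=3, as in the query above:
for count in (0, 1, 3, 4):
    print(count, parent_is_returned(count, min_children=1, max_children=3))
```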
Querying data in the parent documents
Sometimes, we are not interested in the parent documents but in the child documents. If you would like to return the child documents that match given data in the parent document, Elasticsearch has a query for us: the has_parent query. It is similar to the has_child query; however, instead of the type property, we specify the parent_type property with the value of the parent document type. For example, the following query will return both the child documents that we've indexed, but not the parent document:
curl -XGET 'localhost:9200/shop/_search?pretty' -d '{
"query":{
"has_parent":{
"parent_type":"cloth",
"query":{
"term":{"name":"test"}
}
}
}
}'
The response from Elasticsearch will be similar to the following one:
{
"took":3,
"timed_out":false,
"_shards":{
"total":5,
"successful":5,
"failed":0
},
"hits":{
"total":2,
"max_score":1.0,
"hits":[{
"_index":"shop",
"_type":"variation",
"_id":"1000",
"_score":1.0,
"_routing":"1",
"_parent":"1",
"_source":{
"color":"red",
"size":"XXL"
}
},{
"_index":"shop",
"_type":"variation",
"_id":"1001",
"_score":1.0,
"_routing":"1",
"_parent":"1",
"_source":{
"color":"black",
"size":"XL"
}
}]
}
}
Similar to the has_child query, the has_parent query also gives us the possibility of tuning the score calculation of the query. In this case, score_mode has only two options: none, the default one, where the score calculated by the query is equal to 1.0, and score, which calculates the score of the document on the basis of the parent document contents. An example that uses score_mode in the has_parent query looks as follows:
curl -XGET 'localhost:9200/shop/_search?pretty' -d '{
"query":{
"has_parent":{
"parent_type":"cloth",
"score_mode":"score",
"query":{
"term":{"name":"test"}
}
}
}
}'
The only difference from the previous example is score_mode. If you check the results of these queries, you'll notice only a single difference. The score of all the documents from the first example is 1.0, while the score for the results returned by the preceding query is equal to 0.8784157. In this case, all the documents found have the same score, because they have a common parent document.
Performance considerations
When using Elasticsearch parent-child functionality, you have to be aware of the performance impact that it has. The first thing you need to remember is that the parent and the child documents need to be stored in the same shard in order for the queries to work. If you happen to have a high number of children for a single parent, you may end up with shards not having a similar number of documents. Because of that, your query performance can be lower on one of the nodes, resulting in the whole query being slower. Also, remember that parent-child queries will be slower than ones that run against documents that don't have a relationship between them. There is a way of speeding up joins for the parent-child queries at the cost of memory by eagerly loading the so-called global ordinals; however, we will discuss that method in the Elasticsearch caches section of Chapter 9, Elasticsearch Cluster in Detail.
Finally, the first query will preload and cache the document identifiers using the doc values. This takes time. In order to improve the performance of initial queries that use the parent-child relationship, the Warmer API can be used. You can find more information about how to add warming queries to Elasticsearch in the Warming up section of Chapter 10, Administrating Your Cluster.
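The same-shard requirement and the resulting risk of unbalanced shards can be illustrated with a small Python sketch. It assumes that a document's shard is chosen as hash(routing value) modulo the number of shards and that a child is routed by its parent's identifier (as the _routing and _parent values in the earlier response suggest). CRC32 is only a stand-in for the hash function Elasticsearch uses internally, and all function names here are hypothetical.

```python
import zlib
from collections import Counter

NUMBER_OF_SHARDS = 5  # the sample responses above report five shards


def shard_for(routing_value, number_of_shards=NUMBER_OF_SHARDS):
    # CRC32 is a stand-in hash for illustration, not the real implementation.
    return zlib.crc32(routing_value.encode("utf-8")) % number_of_shards


def routing_for(doc_id, parent_id=None):
    # A child is routed by its parent's identifier, a parent by its own.
    return parent_id if parent_id is not None else doc_id


parent_shard = shard_for(routing_for("1"))
child_shards = [shard_for(routing_for(cid, parent_id="1")) for cid in ("1000", "1001")]
print(parent_shard, child_shards)  # the children land on the parent's shard

# One parent with a thousand children puts all of them on a single shard.
docs_per_shard = Counter(
    shard_for(routing_for(str(i), parent_id="1")) for i in range(2000, 3000)
)
print(docs_per_shard)
```

Because every child follows its parent, a handful of parents with very many children is enough to make one shard much larger than the others.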
Modifying your index structure with the update API
In the previous chapters, we discussed how to create index mappings and index the data. But what if you already have the mappings created and data indexed, but you want to modify the structure of the index? Of course, one could say that we could just create a new index with new mappings, but that is not always a possibility, especially in a production environment. Modifying the structure of an existing index is possible to some extent. For example, by default, if we index a document with a new field, Elasticsearch will add that field to the index structure. Let's now look at how to modify the index structure manually.
Note
For situations where mapping changes are needed and they are not possible because of conflicts with the current index structure, it is a good idea to use aliases, both read and write ones. We will discuss aliasing in the Index aliasing section of Chapter 10, Administrating Your Cluster.
The mappings
Let's assume that we have the following mappings for our users index stored in the user.json file:
{
"user":{
"properties":{
"name":{"type":"string"}
}
}
}
As you can see, it is very simple. It just has a single property that will hold the user name. Now let's create an index called users and let's use the preceding mappings to create our type. To do that, we will run the following commands:
curl -XPOST 'localhost:9200/users'
curl -XPUT 'localhost:9200/users/user/_mapping' -d @user.json
If everything goes well, we will have our index (called users) and type (called user) created. So now let's try to add a new field to the mappings.
Adding a new field to the existing index
In order to illustrate how to add a new field to our mappings, we assume that we want to add a phone number to the data stored for each user. In order to do that, we need to send an HTTP PUT command to the /index_name/type_name/_mapping REST endpoint with the proper body that will include our new field. For example, to add the mentioned phone field, we will run the following command:
curl -XPUT 'http://localhost:9200/users/user/_mapping' -d '{
"user":{
"properties":{
"phone":{"type":"string","index":"not_analyzed"}
}
}
}'
Similar to the previous command we ran, if everything goes well, we should have a new field added to our index structure.
Note
Of course, Elasticsearch won't reindex our data or populate the newly added field automatically. It will just alter the mappings held by the master node and propagate the mappings to all the other nodes in the cluster, and that's all. Reindexing the data must be done by us or by the application that indexes the data in our environment. Until then, the old documents won't have the newly added field. This is crucial to remember. If you don't have the original documents, you can use the _source field to get the original data from Elasticsearch and index it once again.
To ensure everything is okay, we can run a GET HTTP request to the _mapping REST endpoint and Elasticsearch will return the appropriate mappings. An example command to get the mappings for our user type in the users index will look as follows:
curl -XGET 'localhost:9200/users/user/_mapping?pretty'
Modifying fields of an existing index
Our users index structure contains two fields: name and phone. Let's imagine that we indexed some data, but after a while we decided that we want to search on the phone field and we would like to change its index property from not_analyzed to analyzed. Because we already know how to alter the index structure, we will run the following command:
curl -XPUT 'http://localhost:9200/users/user/_mapping?pretty' -d '{
"user":{
"properties":{
"phone":{"type":"string","store":"yes","index":"analyzed"}
}
}
}'
What Elasticsearch will return is a response indicating an error, which looks as follows:
{
"error":{
"root_cause":[{
"type":"illegal_argument_exception",
"reason":"Mapper for [phone] conflicts with existing mapping in other types:\n[mapper [phone] has different [index] values, mapper [phone] has different [store] values, mapper [phone] has different [omit_norms] values, cannot change from disable to enabled, mapper [phone] has different [analyzer]]"
}],
"type":"illegal_argument_exception",
"reason":"Mapper for [phone] conflicts with existing mapping in other types:\n[mapper [phone] has different [index] values, mapper [phone] has different [store] values, mapper [phone] has different [omit_norms] values, cannot change from disable to enabled, mapper [phone] has different [analyzer]]"
},
"status":400
}
This is because we can't change a field that was set to be not_analyzed to one that is analyzed. And not only that, in most cases you won't be able to update the field's mapping. This is a good thing, because if we were allowed to change such settings, we would confuse Elasticsearch and Lucene. Imagine that we already have many documents with the phone field set to not_analyzed and we are allowed to change the mappings to analyzed. Elasticsearch wouldn't change the data that was already indexed, but the queries that are analyzed would be processed with a different logic and thus you wouldn't be able to properly find your data.
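The mismatch described above can be reproduced in a few lines of Python. This is only a rough sketch: the analyze function is a crude stand-in for an analyzer (lowercase, split on anything that is not a letter or digit), and the phone number is a hypothetical value.

```python
import re


def analyze(text):
    """Crude stand-in for an analyzer: lowercase and split on non-alphanumerics."""
    return [token for token in re.split(r"[^0-9a-z]+", text.lower()) if token]


phone = "555 123 456"  # hypothetical value indexed while the field was not_analyzed

# not_analyzed: the whole value went into the index as one term.
indexed_terms = {phone}

# If the mapping could be switched to analyzed, queries would be tokenized...
query_terms = analyze(phone)

# ...and none of the resulting tokens equals the single term already in the index.
matching_terms = [term for term in query_terms if term in indexed_terms]

print(query_terms)     # ['555', '123', '456']
print(matching_terms)  # []
```

The data in the index and the terms produced at query time no longer agree, which is exactly the situation the mapping conflict error prevents.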
However, to give you some examples of what is prohibited and what is not, we decided to mention some of the operations for both cases. For example, the following modifications can be safely made:
Adding a new type definition
Adding a new field
Adding a new analyzer
The following modifications are prohibited or will not work:
Enabling norms for a field
Changing a field from stored to not stored and vice versa
Changing the type of the field (for example, from text to numeric)
Changing the value of the index property
Changing the analyzer of an already indexed document
Remember that the preceding examples of allowed and prohibited updates do not cover all the possible uses of the update API; you have to check for yourself whether the update you are trying to make will work.
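As a rough pre-flight check before sending a mapping update, one can compare the existing properties with the proposed ones. The Python sketch below is a deliberate simplification and not Elasticsearch's actual merge logic: it treats new fields as safe and flags any change to an existing field definition, which is stricter than the real rules. The function name and the email field are hypothetical.

```python
def conflicting_fields(existing_properties, updated_properties):
    """Return the names of existing fields whose definition would change."""
    conflicts = []
    for name, new_definition in updated_properties.items():
        old_definition = existing_properties.get(name)
        # A field that does not exist yet is simply added; no conflict.
        if old_definition is not None and old_definition != new_definition:
            conflicts.append(name)
    return sorted(conflicts)


existing = {
    "name": {"type": "string"},
    "phone": {"type": "string", "index": "not_analyzed"},
}

# Adding a brand new (hypothetical) field is fine.
print(conflicting_fields(existing, {"email": {"type": "string"}}))

# Changing the index property of phone is flagged.
print(conflicting_fields(
    existing,
    {"phone": {"type": "string", "store": "yes", "index": "analyzed"}},
))
```

A check like this only narrows down the candidates; the response of the _mapping endpoint remains the final word on whether an update is accepted.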
Summary
The chapter you just finished reading concentrated on indexing operations and handling data that is not flat or has relationships between the documents. We started with indexing tree-like structures and objects in Elasticsearch. We also used nested objects and learned when they can be used. We also used parent-child functionality and learned how this approach differs from nested documents. Finally, we modified our index structure with an API call and learned when this is possible.
In the next chapter, we will get back to query-related topics. We will learn how Lucene scoring works, how to use scripts in Elasticsearch, and how to handle multilingual data. We will affect scoring using boosts and we will use synonyms to improve users' search results. Finally, we will look at what we can do to see how our documents were scored.
Chapter 6. Make Your Search Better
In the previous chapter, we were focused on indexing operations; we learned how to handle structured data. We started with indexing tree-like structures and JSON objects. We used nested objects and indexed documents using parent-child functionality. Finally, at the end of the chapter, we used the Elasticsearch API to modify our index structures. By the end of this chapter, you will have learned the following topics:
Understanding how Apache Lucene scoring works
Using scripting
Handling multilingual data
Using boosting to affect document scoring
Using synonyms
Understanding how your documents were scored
Introduction to Apache Lucene scoring
When talking about queries and their relevance, we can't omit the information about the scoring and where it comes from. But what is a score? The score is a property that describes the relevance of a document in the context of a query. In the following section, we will talk about the default Apache Lucene scoring mechanism, the TF/IDF algorithm, and how it affects the returned document.
Note
TF/IDF is not the only available algorithm exposed by Elasticsearch. For more information about the available models, refer to the Available similarity models section in Chapter 2, Indexing Your Data. You can also refer to the books Mastering Elasticsearch and Mastering Elasticsearch Second Edition published by Packt Publishing.
When a document is matched
When a document is returned by Lucene, it means that it matched the query we sent to it. In most cases, each of the resulting documents in the response is given a score. The higher the score, the more relevant the document is from the search engine's point of view, of course, in the context of a given query. This means that the score calculated for the same document on two different queries will be different. Because of that, comparing scores between queries usually doesn't make much sense. However, let's get back to the scoring. To calculate the score property for a document, multiple factors are taken into account:
document boost: The boost value given for a document during indexing.
field boost: The boost value given for a field during querying and indexing.
coord: The coordination factor that is based on the number of query terms the document has. It is responsible for giving more value to the documents that contain more search terms compared to the other documents.
inverse document frequency: The term-based factor that tells the scoring formula how rare the given term is. The higher the inverse document frequency, the less common the term is.
length norm: The field-based factor for normalization based on the number of terms the given field contains. The longer the field, the smaller the boost this factor will give. It basically means that the shorter documents will be favored.
term frequency: The term-based factor describing how many times the given term occurs in a document. The higher the term frequency, the higher the score of the document will be.
query norm: The query-based normalization factor that is calculated from the sum of the squared weights of the query terms. Query norm is used to allow score comparison between queries, which we said is not always easy or possible.
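Default scoring formula
The practical formula for the TF/IDF algorithm looks as follows:
\[ \mathrm{score}(q,d) = \mathrm{coord}(q,d) \cdot \mathrm{queryNorm}(q) \cdot \sum_{t \in q} \Big( \mathrm{tf}(t \in d) \cdot \mathrm{idf}(t)^2 \cdot \mathrm{boost}(t) \cdot \mathrm{norm}(t,d) \Big) \]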
To adjust your query relevance, you don't need to remember the details of the equation, but it is very important to know how it works, or at least to be aware that there is an equation you can analyze. We can see that the score for the document is a function of query q and document d. There are also two factors that are not dependent directly on query terms: coord and queryNorm. These two elements of the formula are multiplied by the sum calculated for each term in the query. The sum, on the other hand, is calculated by multiplying the term frequency for the given term, its inverse document frequency, term boost, and the norm, which is the length norm we discussed previously.
Note
Note that the preceding formula is a practical one. You can find more information about the conceptual formula in Lucene Javadocs at http://lucene.apache.org/core/5_4_0/core/org/apache/lucene/search/similarities/TFIDFSimilarity.html
The good thing about the preceding equation is that you don't need to remember all of it. What you should be aware of is what matters when it comes to the document score. Basically, there are a few rules which come from the preceding equation:
The rarer the matched term is, the higher the score the document will have
The shorter the document fields are (the fewer terms they have), the higher the score the document will have
The higher the boost for the fields is, the higher the score the document will have
As we can see, Lucene gives a higher score to the documents that have many query terms matched and have shorter fields (fewer terms indexed) that were used for matching, and it also favors rarer terms over common ones (of course, the ones that matched).
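The rules above can be checked numerically with a small Python sketch of the practical scoring formula. The individual factors follow the defaults documented for Lucene's classic TF/IDF similarity (tf as the square root of the frequency, idf as 1 + ln(numDocs / (docFreq + 1)), length norm as 1 / sqrt(number of terms)). The sketch ignores the lossy single-byte encoding of norms, and all function names and statistics are hypothetical, so treat the numbers as illustrative only.

```python
import math


def tf(freq):
    return math.sqrt(freq)


def idf(doc_freq, num_docs):
    return 1.0 + math.log(num_docs / (doc_freq + 1.0))


def length_norm(num_terms, field_boost=1.0):
    return field_boost / math.sqrt(num_terms)


def score(query_terms, doc_term_freqs, field_length, doc_freqs, num_docs, boosts=None):
    """coord * queryNorm * sum over matched terms of tf * idf^2 * boost * norm."""
    boosts = boosts or {}
    matched = [t for t in query_terms if doc_term_freqs.get(t, 0) > 0]
    if not matched:
        return 0.0
    coord = len(matched) / len(query_terms)
    sum_of_squared_weights = sum(
        (idf(doc_freqs.get(t, 0), num_docs) * boosts.get(t, 1.0)) ** 2
        for t in query_terms
    )
    query_norm = 1.0 / math.sqrt(sum_of_squared_weights)
    norm = length_norm(field_length)
    total = sum(
        tf(doc_term_freqs[t])
        * idf(doc_freqs.get(t, 0), num_docs) ** 2
        * boosts.get(t, 1.0)
        * norm
        for t in matched
    )
    return coord * query_norm * total


# Hypothetical index statistics: 100 documents, one rare and one common term.
stats = {"rare": 1, "common": 50}
rare_score = score(["rare"], {"rare": 1}, 10, stats, 100)
common_score = score(["common"], {"common": 1}, 10, stats, 100)
short_field = score(["rare"], {"rare": 1}, 5, stats, 100)
long_field = score(["rare"], {"rare": 1}, 50, stats, 100)
both_terms = score(["rare", "common"], {"rare": 1, "common": 1}, 10, stats, 100)
one_term = score(["rare", "common"], {"common": 1}, 10, stats, 100)
print(rare_score, common_score, short_field, long_field, both_terms, one_term)
```

The rarer term, the shorter field, and the document matching more query terms each come out ahead, in line with the rules listed above.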
Relevancy matters
In most cases, we want to get the best matching documents. However, the most relevant documents don't always mean the same as the best matches. Some use cases define very strict rules on why a given document should be higher on the results list. For example, one could say that, in addition to the document being a perfect match in terms of TF/IDF similarity, we have paying customers to consider. Depending on the customer plan, we want to give more importance to such documents. In such cases, we could want the documents for the customers that pay the most to be on top of the search results. Of course, this is not relevant in TF/IDF.
The other example is yellow pages, where customers pay for more information describing the document. Such large documents may not be the most relevant ones according to TF/IDF, so you may want to adjust the scoring if you are working with such data.
These are very simple examples and Elasticsearch queries can become really complicated. We will talk about such queries in the Influencing scores with query boosts section in this chapter.
When working on search relevance, you should always remember that it is not a one-time process. Your data will change with time and your queries will need to be adjusted. In most cases, tuning the query relevancy will be constant work. You will need to react to your business rules and needs, to how the users behave, and so on.
Scripting capabilities of Elasticsearch
Elasticsearch has a few functionalities where scripts can be used. You've already seen examples such as updating documents and searching. We will also use the scripting capabilities of Elasticsearch when we discuss aggregations. Even though scripts seem to be a rather advanced topic, we will look at the possibilities offered by Elasticsearch. That's because scripts are priceless in certain situations.
Elasticsearch can use several languages for scripting. When not explicitly declared, it assumes that Groovy (www.groovy-lang.org/) is used. Other languages available out of the box are the Lucene expression language and Mustache (https://mustache.github.io/). Of course, we can use plugins, which will make Elasticsearch understand additional scripting languages, such as JavaScript, MVEL, and Python. The thing worth mentioning is that, regardless of the scripting language we choose, Elasticsearch exposes objects that we can use in our scripts. Let's start by briefly looking at what type of information we are allowed to use in our scripts.
Objects available during script execution
During different operations, Elasticsearch allows us to use different objects in our scripts. To develop a script that fits our use case, we should be familiar with these objects.
For example, during a search operation, the following objects are available:
_doc (also available as doc): This is an instance of the org.elasticsearch.search.lookup.LeafDocLookup object. It gives us access to the current document found with the calculated score and field values.
_source: This is an instance of the org.elasticsearch.search.lookup.SourceLookup object. It provides access to the source of the current document and the values defined in the source.
_fields: This is an instance of the org.elasticsearch.search.lookup.LeafFieldsLookup object. It can be used to access the values of the document fields.
On the other hand, during a document update operation, the preceding variables are not accessible. Elasticsearch exposes only the ctx object with the _source property, which provides access to the document currently processed in the update request.
As we have previously seen, several methods are mentioned in the context of document fields and their values. Let's now look at examples of how to get the value for a particular field using the previously mentioned objects available during the search operation. In the parentheses after each script expression, you can see what Elasticsearch will return for one of our example documents from the library index (we will use the document with identifier 4):
_doc.title.value (and)
_source.title (crime and punishment)
_fields.title.value (null)
A bit confusing, isn't it? During indexing, the original document is by default stored in the _source field. Of course, by default, all the fields are present in that _source field. In addition to that, the document is parsed and every field may be stored in the index if it is marked as stored (that is, if the store property is set to true; otherwise, by default, the fields are not stored). Finally, the field value may be configured as indexed. This means that the field value is analyzed and placed in the index. To sum up, one field may land in the Elasticsearch index in the following ways:
As a part of the _source document
As a stored and unparsed original value
As an indexed value that is processed by an analyzer
In scripts, we have access to all these field representations. The only exception is the update operation, which, as we've mentioned before, gives us access only to the document _source as part of the ctx variable. You may wonder which version you should use. Well, if you want access to the processed form, the answer will be simple: use the _doc object. What about _source and _fields? In most cases, _source is a good choice. It is usually fast and needs fewer disk operations than reading the original field values from the index. This is especially true when you need to read the values of multiple fields in your scripts; fetching a single _source field is faster than fetching multiple independent fields from the index.
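Why the three objects return such different values becomes clearer when the three representations are modeled side by side. The Python sketch below makes several assumptions: analysis is reduced to lowercasing and splitting on whitespace, the _doc view is assumed to expose the analyzed terms in sorted order (which would explain why the term and came back for the title above), and the title used here is a hypothetical one. None of these names are Elasticsearch APIs.

```python
def analyze(text):
    """Crude stand-in for analysis: lowercase and split on whitespace."""
    return text.lower().split()


def field_views(source_value, stored=False):
    """Model what _source, _fields and _doc would expose for one string field."""
    terms = sorted(analyze(source_value))
    return {
        # The original JSON value, kept as part of the _source document.
        "_source": source_value,
        # Only available when the field was explicitly marked as stored.
        "_fields": source_value if stored else None,
        # Assumed here: the first of the analyzed terms in sorted order.
        "_doc": terms[0] if terms else None,
    }


views = field_views("War and Peace")  # hypothetical title, field not stored
print(views)
```

For a not stored, analyzed field this gives the original text from _source, nothing from _fields, and a single analyzed term from _doc, mirroring the pattern of the three results shown earlier.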
Script types
Elasticsearch allows us to use scripts in three different ways:
Inline scripts: The source of the script is directly defined in the query
In-file scripts: The source is defined in an external file placed in the Elasticsearch config/scripts directory
As a document in the dedicated index: The source of the script is defined as a document in a special index available by using the /_scripts API endpoint
Choosing the way to define scripts depends on several factors. If you have scripts which you will use in many different queries, the file or the dedicated index seem to be the best solutions. Scripts in files are probably less convenient, but they are preferred from the security point of view: they can't be overwritten or injected into your query, causing a security breach.
In-file scripts
This is the only way to use scripts if we don't want to enable dynamic scripting in Elasticsearch. The idea is that every script used by the queries is defined in its own file placed in the config/scripts directory. We will now look at this method of using scripts. Let's create an example file called tag_sort.groovy and place it in the config/scripts directory of our Elasticsearch instance (or instances if we run a cluster). The content of the mentioned file should look like this:
_doc.tags.values.size()>0?_doc.tags.values[0]:'\u19999'
After a few seconds, Elasticsearch will automatically load the new file. You should see something like the following in the Elasticsearch logs:
[2015-08-30 13:14:33,005][INFO ][script] [Alex Wilder] compiling script file [/Users/negativ/Developer/ES/es-current/config/scripts/tag_sort.groovy]
Note
If you have a multi-node cluster, you have to make sure that the script is available on every node.
Now we are ready to use this script in our queries. You may remember that we used exactly the same script in the Sorting data section in Chapter 4, Extending Your Querying Knowledge. Now the modified query that uses our script stored in the file looks as follows:
curl -XGET 'localhost:9200/library/_search?pretty' -d '{
"query":{
"match_all":{}
},
"sort":{
"_script":{
"script":{
"file":"tag_sort"
},
"type":"string",
"order":"asc"
}
}
}'
We will return to this, but first, let's look at the next possible way of defining scripts: inline scripts.
Inline scripts
Inline scripts are a more convenient way of using scripts, especially for constantly changing queries and for ad-hoc queries. The main drawback of such an approach is security. If we allow users to run any kind of query, including scripts, we can expose our Elasticsearch instance to attackers. Such attacks can execute arbitrary code on the server running Elasticsearch with rights equal to the ones given to the user running Elasticsearch. In the worst-case scenario, the attacker could use security holes to gain superuser rights. This is the reason why inline scripts are disabled by default. After careful consideration, you can enable them by adding the following line to the elasticsearch.yml file:
script.inline: on
After allowing inline scripts to be executed, we can run a query that looks as follows:
curl -XGET 'localhost:9200/library/_search?pretty' -d '{
"query":{
"match_all":{}
},
"sort":{
"_script":{
"script":{
"inline":"_doc.tags.values.size() > 0 ? _doc.tags.values[0] : \"\u19999\""
},
"type":"string",
"order":"asc"
}
}
}'
Indexed scripts
The last option for defining scripts is storing them in the dedicated Elasticsearch index. For the same security reasons, dynamic execution of the indexed scripts is disabled by default. To enable the indexed scripts, we have to add a configuration option similar to the one we added to be able to use the inline scripts. We need to add the following line to the elasticsearch.yml file:
script.indexed: on
After adding the preceding property to all the nodes and restarting the cluster, we will be ready to start using the indexed scripts. Elasticsearch provides an additional, dedicated endpoint for this purpose. Let's store our script:
curl -XPOST 'localhost:9200/_scripts/groovy/tag_sort' -d '{
"script":"_doc.tags.values.size() > 0 ? _doc.tags.values[0] : \"\u19999\""
}'
The script is ready, but let's discuss what we just did. We sent an HTTP POST request to the special _scripts REST endpoint. We also specified the language of the script (groovy in our case) and the name of the script (tag_sort). The body of the request is the script itself.
We can now move on to the query, which looks as follows:
curl -XGET 'localhost:9200/library/_search?pretty' -d '{
"query":{
"match_all":{}
},
"sort":{
"_script":{
"script":{
"id":"tag_sort"
},
"type":"string",
"order":"asc"
}
}
}'
As we can see, the query is practically identical to the query used with the script defined in a file. The only difference is that we provided the identifier of the script using the id parameter instead of providing the file name.
Querying with scripts
If we look at any request made to Elasticsearch that uses scripts, we will notice some similar properties, which are as follows:
script: This property wraps the script definition.
inline: This property holds the code of the script itself.
id: This property defines the identifier of the indexed script.
file: The file name of the script without the extension.
lang: This property defines the language of the script. If it is omitted, Elasticsearch assumes groovy.
params: This object contains the parameters and their values. Every defined parameter can be used inside the script by specifying that parameter's name. The parameters allow us to write cleaner code which will be executed in a more efficient manner. Scripts using the parameters are executed faster than code with embedded constants because of caching.
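Since the three script types differ only in which of the inline, file, or id properties is set, the request bodies shown earlier can be generated from one helper. The following Python sketch only builds the JSON body and does not contact a cluster; the helper name script_sort_body is hypothetical, and the property names are the ones listed above.

```python
import json


def script_sort_body(kind, value, params=None, lang=None, order="asc"):
    """Build a match_all search body sorted by a script of the given kind."""
    if kind not in ("inline", "file", "id"):
        raise ValueError("kind must be one of: inline, file, id")
    script = {kind: value}
    if lang is not None:
        script["lang"] = lang  # when omitted, the text says groovy is assumed
    if params:
        script["params"] = params
    return {
        "query": {"match_all": {}},
        "sort": {"_script": {"script": script, "type": "string", "order": order}},
    }


file_body = script_sort_body("file", "tag_sort")
indexed_body = script_sort_body("id", "tag_sort")
param_body = script_sort_body("file", "tag_sort_with_param", params={"tvalue": "000"})

print(json.dumps(param_body, indent=2))
```

The printed JSON can be passed as the body of the same _search request used in the curl examples.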
Scripting with parameters
As our scripts become more and more complicated, the need for creating multiple, almost identical scripts can appear. These scripts usually differ in the values used, with the logic behind them being exactly the same. In our simple example, we used a hardcoded value to mark documents with an empty tags list. Let's change this so that the value can be passed in as a parameter. Let's use an in-file script definition and create a tag_sort_with_param.groovy file with the following contents:
_doc.tags.values.size()>0?_doc.tags.values[0]:tvalue
The only change we've made is the introduction of the parameter named tvalue, which can be set in the query in the following way:
curl -XGET 'localhost:9200/library/_search?pretty' -d '{
"query":{
"match_all":{}
},
"sort":{
"_script":{
"script":{
"file":"tag_sort_with_param",
"params":{
"tvalue":"000"
}
},
"type":"string",
"order":"asc"
}
}
}'
Theparamssectiondefinesallthescriptparameters.Inoursimpleexample,we’veonlyusedasingleparameter,butofcoursewecanhavemultipleparametersinasinglequery.
Script languages
As we already said, the default language for scripting is Groovy. However, we are not limited to only a single scripting language when using Elasticsearch. In fact, if you would like to, you can even use Java to write your scripts. In addition to that, the community behind Elasticsearch provides support for additional languages as plugins. So if you are willing to install plugins, you can extend the list of scripting languages that Elasticsearch supports even further.
You may wonder why you would even consider using a scripting language other than the default Groovy. The first reason is your own preferences. If you are a Python enthusiast, you are probably now thinking about how to use Python for your Elasticsearch scripts. The other reason could be security. When we talked about the inline scripts, we told you that they are turned off by default. This is not exactly true for all the scripting languages available out of the box. Inline scripts are disabled by default when using Groovy, but you can use Lucene expressions and Mustache without any issues. This is because those languages are sandboxed, which means that the security-sensitive functions are turned off. And of course, the last factor when choosing a language is performance. Theoretically, native scripts (in Java) should have better performance than others, but you should remember that the difference can be insignificant. You should always consider the cost of development and measure performance.
Using other than embedded languages
Using Groovy for scripting is a simple and sufficient solution for most use cases. However, you may have a different preference and you may like to use something different, such as JavaScript, Python, or Mvel. Before using other languages, we must install an appropriate plugin. You can read more details about plugins in the Elasticsearch plugins section of Chapter 9, Elasticsearch Cluster. For now, we'll just run the following command from the Elasticsearch directory:
bin/plugin install lang-javascript
The preceding command will install a plugin that will allow the usage of JavaScript as the scripting language. The only change we should make in the request is to add the additional information about the language we are using for scripting and, of course, modify the script itself to correctly use the new language. Look at the following example:
curl -XGET 'localhost:9200/library/_search?pretty' -d '{
  "query": {
    "match_all": {}
  },
  "sort": {
    "_script": {
      "script": {
        "inline": "_doc.tags.values.length > 0 ? _doc.tags.values[0] : \"\u19999\";",
        "lang": "javascript"
      },
      "type": "string",
      "order": "asc"
    }
  }
}'
As you can see, we've used JavaScript for scripting instead of the default Groovy. The lang parameter informs Elasticsearch about the language being used.
Using native code
In case the scripts are too slow or you don't like scripting languages, Elasticsearch allows you to write Java classes and use them instead of scripts. There are two possible ways of adding native scripts: adding classes defining scripts to the Elasticsearch classpath, or adding a script as a functionality provided by a plugin. We will describe the second solution as it is more elegant.
The factory implementation
We need to implement at least two classes to create a new native script. The first one is a factory for our script. For now, let's focus on it. The following sample code illustrates the factory for our script:
package pl.solr.elasticsearch.examples.scripts;
import java.util.Map;
import org.elasticsearch.common.Nullable;
import org.elasticsearch.script.ExecutableScript;
import org.elasticsearch.script.NativeScriptFactory;
public class HashCodeSortNativeScriptFactory implements NativeScriptFactory {
  @Override
  public ExecutableScript newScript(@Nullable Map<String, Object> params) {
    return new HashCodeSortScript(params);
  }
  @Override
  public boolean needsScores() {
    return false;
  }
}
This class should implement the org.elasticsearch.script.NativeScriptFactory interface. The interface forces us to implement two methods. The newScript() method takes the parameters defined in the API call and returns an instance of our script. Finally, needsScores() informs Elasticsearch if we want to use scoring and whether it should be calculated.
Implementing the native script
Now let's look at the implementation of our script. The idea is simple: our script will be used for sorting. Documents will be ordered by the hashCode() value of the chosen field. The documents without a value in the defined field will be first on the results list. We know the logic doesn't make too much sense, but it is good for presentation as it is simple. The source code for our native script looks as follows:
package pl.solr.elasticsearch.examples.scripts;
import java.util.Map;
import org.elasticsearch.script.AbstractSearchScript;
public class HashCodeSortScript extends AbstractSearchScript {
  private String field = "name";
  public HashCodeSortScript(Map<String, Object> params) {
    if (params != null && params.containsKey("field")) {
      this.field = params.get("field").toString();
    }
  }
  @Override
  public Object run() {
    Object value = source().get(field);
    if (value != null) {
      return value.hashCode();
    }
    return 0;
  }
}
First of all, our class inherits from the org.elasticsearch.script.AbstractSearchScript class and implements the run() method. This is where we get the appropriate values from the current document, process them according to our strange logic, and return the result. You may notice the source() call. It gives access to the same _source that we used when dealing with non-native scripts. The doc() and fields() methods are also available and they follow the same logic we described earlier.
The thing worth looking at is how we've used the parameters. We assume that a user can pass the field parameter, telling us which document field will be used for manipulation. We also provide a default value for this parameter.
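To get a feeling for the ordering this logic produces without running a cluster, the run() method can be mimicked in plain Java. The following is only a sketch under stated assumptions: HashCodeSortSketch, sortValue(), and sort() are hypothetical names, the documents are simple maps standing in for _source, the sample titles are made up, and values are compared numerically (in a real request the sort type decides how Elasticsearch compares them).

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class HashCodeSortSketch {
    // Mirrors run(): hash code of the field value, or 0 when the field is missing.
    public static int sortValue(Map<String, Object> source, String field) {
        Object value = source.get(field);
        return value != null ? value.hashCode() : 0;
    }

    // Orders documents ascending by the computed value (numeric comparison assumed).
    public static List<Map<String, Object>> sort(List<Map<String, Object>> docs, String field) {
        List<Map<String, Object>> sorted = new ArrayList<>(docs);
        sorted.sort(Comparator.comparingInt((Map<String, Object> d) -> sortValue(d, field)));
        return sorted;
    }

    // Builds a stand-in document; a null otitle means the field is absent.
    private static Map<String, Object> doc(String id, Object otitle) {
        Map<String, Object> d = new HashMap<>();
        d.put("id", id);
        if (otitle != null) {
            d.put("otitle", otitle);
        }
        return d;
    }

    public static void main(String[] args) {
        List<Map<String, Object>> docs = new ArrayList<>();
        docs.add(doc("1", "Im Westen nichts Neues")); // hypothetical sample value
        docs.add(doc("2", null));                     // no otitle, so its sort value is 0
        docs.add(doc("3", "Catch-22"));               // hypothetical sample value
        for (Map<String, Object> d : sort(docs, "otitle")) {
            System.out.println(d.get("id") + " -> " + sortValue(d, "otitle"));
        }
    }
}
```

One detail worth keeping in mind: hashCode() may return a negative number, so under a purely numeric comparison a document without the field (value 0) is not guaranteed to come before every other document.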
The plugin definition
We said that we will install our script as a part of a plugin. This is why we need additional files. The first file is the plugin initialization class where we tell Elasticsearch about our new script:
package pl.solr.elasticsearch.examples.scripts;
import org.elasticsearch.plugins.Plugin;
import org.elasticsearch.script.ScriptModule;
public class ScriptPlugin extends Plugin {
  @Override
  public String description() {
    return "The example of native sort script";
  }
  @Override
  public String name() {
    return "naive-sort-plugin";
  }
  public void onModule(final ScriptModule module) {
    module.registerScript("native_sort", HashCodeSortNativeScriptFactory.class);
  }
}
The implementation is easy. The description() and name() methods are only for information, so let's focus on the onModule() method. In our case, we need access to the script module, the Elasticsearch service that manages scripts and scripting languages. This is why we define onModule() with one ScriptModule argument. Thanks to Elasticsearch magic, we can use this module and register our script so it can be found by the engine. We have used the registerScript() method, which takes the script name and the previously defined factory class.
The second needed file is a plugin descriptor file: plugin-descriptor.properties. It defines the constants used by the Elasticsearch plugin subsystem. Without further ado, let's look at the contents of this file:
jvm=true
classname=pl.solr.elasticsearch.examples.scripts.ScriptPlugin
elasticsearch.version=2.2.0
version=0.0.1-SNAPSHOT
name=native_script
description=Example Native Scripts
java.version=1.7
The appropriate lines have the following meaning:
jvm: Tells Elasticsearch that our file contains Java code
classname: Describes the main class with the plugin definition
elasticsearch.version and java.version: Tell us about the Elasticsearch version that is supported by the plugin and the Java version that is needed
name and description: Informative name and short description of our plugin
And that's it. We have all the files needed to run our script. Please note that you can have more than a single script packed as a single plugin.
Installing the plugin
Now it's time to install our native script embedded in the plugin. After packing the compiled classes as a JAR archive, we should put it in the Elasticsearch plugins/native-script directory. The native-script part is a root directory for our plugin and you may name it as you wish. In this directory you also need the prepared plugin-descriptor.properties file. This makes our plugin visible to Elasticsearch.
Running the script
After restarting Elasticsearch (or the whole cluster if you run more than a single node), we can start sending the queries that use our native script. For example, we will send a query that uses our previously indexed data from the library index. This example query looks as follows:
curl -XGET 'localhost:9200/library/_search?pretty' -d '{
  "query": {
    "match_all": {}
  },
  "sort": {
    "_script": {
      "script": {
        "script": "native_sort",
        "lang": "native",
        "params": {
          "field": "otitle"
        }
      },
      "type": "string",
      "order": "asc"
    }
  }
}'
Note the params part of the query. In this call, we want to sort on the otitle field. We provide the script name native_sort and the script language native. This is required. If everything goes well, we should see our results sorted by our custom sort logic. If we look at the response from Elasticsearch, we will see that the documents without the otitle field are at the first few positions of the results list and their sort value is 0.
Searching content in different languages
Until now, when discussing language analysis, we've talked mostly about theory. We didn't see an example regarding language analysis, handling multiple languages that our data can consist of, and so on. Now this will change, as this section is dedicated to information about how we can handle data in multiple languages.
Handling languages differently
As you already know, Elasticsearch allows us to choose different analyzers for our data. We can have our data divided on the basis of whitespaces, or have them lowercased, and so on. This can usually be done regardless of the language: the same tokenization on the basis of whitespaces will work for English, German, and Polish, although it won't work for Chinese. However, what if you want to find documents that contain words such as cat and cats by only sending the word cat to Elasticsearch? This is where language analysis comes into play with stemming algorithms for different languages, which allow the analyzed words to be reduced to their root forms. And now the worst part: we can't use one general stemming algorithm for all the languages in the world; we have to choose one that is appropriate for the given language. The following sections in the chapter will help you with some parts of the language analysis process.
Handling multiple languages
There are a few ways of handling multiple languages in Elasticsearch and all of them have some pros and cons. We won't be discussing everything, but just for the purpose of giving you an idea, a few of those methods are as follows:
Storing documents in different languages as different types
Storing documents in different languages in separate indices
Storing language data in different fields of a single document
For the purpose of the book, we will focus on a single method, the one that allows storing documents in different languages in a single index. We will focus on a problem where we have a single type of document, but each document may come from anywhere in the world and thus can be written in any of several languages. Also, we would like to enable our users to use all the analysis capabilities, such as stemming and stop words for different languages, not only for English.
Note
Note that the stemming algorithms perform differently for different languages, both in terms of analysis performance and the resulting terms. For example, English stemmers are very good, but you can run into issues with other European languages, such as German.
Detecting the language of the document
Before we continue with showing you how to solve our problem with handling multiple languages in Elasticsearch, we would like to tell you about one additional thing, that is, language detection. There are situations where you just don't know what language your document or query is in. In such cases, language detection libraries may be a good choice, especially when using Java as your programming language of choice. Some of the libraries are as follows:
Apache Tika (http://tika.apache.org/)
Language detection (https://github.com/shuyo/language-detection)
The language detection library claims to have over 99 percent precision for 53 languages; that's a lot if you ask us.
You should remember, though, that language detection is more precise for longer text. Because the text of queries is usually short, you can expect to have some degree of error during query language identification.
Sample document
Let's start with introducing a sample document, which is as follows:
{
  "title": "First test document",
  "content": "This is a test document"
}
As you can see, the document is pretty simple; it contains the following two fields:
title: This field holds the title of the document
content: This field holds the actual content of the document
This document is quite simple, but, from the search point of view, the information about document language is missing. What we should do is enrich the document by adding the needed information. We can do that by using one of the previously mentioned libraries, which will try to detect the language.
After we have the language detected, we inform Elasticsearch which analyzer should be used and modify the document to directly show the language of each field. Each of the fields would have to be analyzed by a language analyzer dedicated to the detected language.
Note
A full list of these language analyzers can be found at https://www.elastic.co/guide/en/elasticsearch/reference/current/analysis-lang-analyzer.html.
If a document is written in a language that we are not supporting, we will just fall back to some default field with the default analyzer. For example, our document, processed and prepared for indexing, could look like this:
{
  "title_english": "First test document",
  "content_english": "This is a test document"
}
The thing is that all this processing we've mentioned would have to be done outside of Elasticsearch or in some kind of custom plugin that would implement the mentioned logic.
Note
In the previous versions of Elasticsearch, there was a possibility of choosing an analyzer based on the value of an additional field, which contained the analyzer name. This was a more convenient and elegant way but introduced some uncertainty about the field contents. You always had to deliver a proper analyzer when using the given field or strange things happened. The Elasticsearch team made the difficult decision and removed this feature.
There is also a simpler way: we can take our first document and index it in several ways independently from the input language. Let's focus on this solution.
The mappings
To handle our solution, which will process the document using several defined languages, we need new mappings. Let's look at the mappings we've created to index our documents (we've stored them in the mappings.json file):
{
  "mappings": {
    "doc": {
      "properties": {
        "title": {
          "type": "string",
          "index": "analyzed",
          "fields": {
            "english": {
              "type": "string",
              "index": "analyzed",
              "analyzer": "english"
            },
            "russian": {
              "type": "string",
              "index": "analyzed",
              "analyzer": "russian"
            },
            "german": {
              "type": "string",
              "index": "analyzed",
              "analyzer": "german"
            }
          }
        },
        "content": {
          "type": "string",
          "index": "analyzed",
          "fields": {
            "english": {
              "type": "string",
              "index": "analyzed",
              "analyzer": "english"
            },
            "russian": {
              "type": "string",
              "index": "analyzed",
              "analyzer": "russian"
            },
            "german": {
              "type": "string",
              "index": "analyzed",
              "analyzer": "german"
            }
          }
        }
      }
    }
  }
}
In the preceding mappings, we've shown the definition for the title and content fields (if you are not familiar with any aspect of mappings definition, refer to the Mappings configuration section of Chapter 2, Indexing Your Data). We have used the multi-field feature of Elasticsearch: each field can be indexed in several ways using various language analyzers (in our example, those analyzers are English, Russian, and German).
In addition, the base field uses the default analyzer, which we may use at query time when the language is unknown. So, each field will actually have four fields: the default one and three language-oriented fields.
In order to create a sample index called docs that uses our mappings, we will use the following command:
curl -XPUT 'localhost:9200/docs' -d @mappings.json
Querying
Now let's see how we can query our data to use the newly created language fields. We can divide the querying situation into two different cases. Of course, to start querying we need documents. Let's index our example document by running the following command:
curl -XPOST 'localhost:9200/docs/doc/1' -d '{"title": "First test document", "content": "This is a test document"}'
Queries with an identified language
The first case is when we have our query language identified. Let's assume that the identified language is English. In such cases, our query is as follows:
curl 'localhost:9200/docs/_search?pretty' -d '{
  "query": {
    "match": {
      "content.english": "documents"
    }
  }
}'
The thing to put emphasis on in the preceding query is the field used for querying and the query type. The field used is content.english, which also indicates which analyzer we want to use. We used that field because we had identified our language before running the query. Thanks to this, the English analyzer can find our document even though the document contains the singular form of the word and we searched for the plural. The response returned by Elasticsearch will be as follows:
{
  "took": 2,
  "timed_out": false,
  "_shards": {
    "total": 5,
    "successful": 5,
    "failed": 0
  },
  "hits": {
    "total": 1,
    "max_score": 0.19178301,
    "hits": [{
      "_index": "docs",
      "_type": "doc",
      "_id": "1",
      "_score": 0.19178301,
      "_source": {
        "title": "First test document",
        "content": "This is a test document"
      }
    }]
  }
}
The thing to note is also the query type: the match query. We used the match query because it analyzes its body with the analyzer used by the field that it is run against. We need that to properly match the data in the query and the data in the index.
Queries with an unknown language
Now let's look at the second situation: handling queries when we couldn't identify the language of the query. In such cases, we can't use the field name pointing to one of the languages, such as content.german. Instead, we send the query to the content field, which uses the default analyzer. The query will look as follows:
curl 'localhost:9200/docs/_search?pretty' -d '{
  "query": {
    "match": {
      "content": "documents"
    }
  }
}'
However, we didn't get any results this time because the default analyzer can't match the singular form of a word in the document when we are searching with the plural form.
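The difference between the two fields can be mimicked with a toy model. The sketch below is not how Elasticsearch or Lucene implements analysis: plainAnalyze() and toyEnglishAnalyze() are hypothetical stand-ins for the default and the English analyzer, and the "stemmer" is a deliberately naive rule that just strips a trailing s. It only illustrates why the same query text hits on the stemmed field and misses on the plain one.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

public class ToyAnalysisSketch {
    // Stand-in for the default analyzer: lowercase and split on non-letters, no stemming.
    public static List<String> plainAnalyze(String text) {
        List<String> terms = new ArrayList<>();
        for (String token : text.toLowerCase(Locale.ROOT).split("[^a-z]+")) {
            if (!token.isEmpty()) {
                terms.add(token);
            }
        }
        return terms;
    }

    // Stand-in for a language analyzer: additionally strips a trailing "s".
    // Real stemmers are far more elaborate; this rule is intentionally naive.
    public static List<String> toyEnglishAnalyze(String text) {
        List<String> terms = new ArrayList<>();
        for (String token : plainAnalyze(text)) {
            boolean plural = token.length() > 3 && token.endsWith("s");
            terms.add(plural ? token.substring(0, token.length() - 1) : token);
        }
        return terms;
    }

    // The document is found when any analyzed query term is among the indexed terms.
    public static boolean matches(List<String> indexedTerms, List<String> queryTerms) {
        for (String term : queryTerms) {
            if (indexedTerms.contains(term)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        String content = "This is a test document";
        String query = "documents";
        // Same analysis on both sides, as the match query does per field.
        System.out.println("content.english: "
                + matches(toyEnglishAnalyze(content), toyEnglishAnalyze(query)));
        System.out.println("content: "
                + matches(plainAnalyze(content), plainAnalyze(query)));
    }
}
```

The important point is that the query text and the indexed text go through the same analysis for a given field, which is why the choice of field decides whether documents and document end up as the same term.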
Combining queries
To additionally boost the documents that perfectly match with our default analyzer, we can combine the two preceding queries with the bool query. Such a combined query will look as follows:
curl -XGET 'localhost:9200/docs/_search?pretty=true' -d '{
  "query": {
    "bool": {
      "minimum_should_match": 1,
      "should": [
        {
          "match": {
            "content.english": "documents"
          }
        },
        {
          "match": {
            "content": "documents"
          }
        }
      ]
    }
  }
}'
For the document to be returned, at least one of the defined queries must match. If they both match, the document will have a higher score value and will be placed higher in the results.
There is one additional advantage of the preceding combined query. If our language analyzer doesn't find a document (for example, when the analysis is different from the one used during indexing), the second query has a chance to find the terms that are tokenized only by whitespace characters and lowercased.
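On the application side, choosing between the language field and the base field usually ends up in a small helper. The following is a minimal sketch, assuming the three language sub-fields from our mappings and a language name delivered by some external detection step; the class LanguageQuerySketch and its methods are hypothetical, and the query text is assumed to contain no characters that need JSON escaping.

```java
public class LanguageQuerySketch {
    // Returns the sub-field for a supported language, or the base field otherwise.
    public static String fieldFor(String baseField, String language) {
        switch (language == null ? "" : language) {
            case "english":
            case "russian":
            case "german":
                return baseField + "." + language;
            default:
                return baseField;
        }
    }

    // Builds a single match clause (no JSON escaping; see the assumption above).
    private static String match(String field, String text) {
        return "{\"match\":{\"" + field + "\":\"" + text + "\"}}";
    }

    // Builds a bool query with the language field (if any) and the base field as should clauses.
    public static String combinedQuery(String baseField, String language, String text) {
        String languageField = fieldFor(baseField, language);
        StringBuilder should = new StringBuilder();
        if (!languageField.equals(baseField)) {
            should.append(match(languageField, text)).append(",");
        }
        should.append(match(baseField, text));
        return "{\"query\":{\"bool\":{\"minimum_should_match\":1,\"should\":["
                + should + "]}}}";
    }

    public static void main(String[] args) {
        // Language identified: both clauses are sent.
        System.out.println(combinedQuery("content", "english", "documents"));
        // Language unknown: only the base field is queried.
        System.out.println(combinedQuery("content", null, "documents"));
    }
}
```

The resulting string can be sent as the request body of the _search call. In a real application a JSON library or an Elasticsearch client would be a safer choice than string concatenation.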
Influencing scores with query boosts
In the beginning of this chapter, we learned what scoring is and how Elasticsearch uses the scoring formula. When an application grows, the need for improving the quality of search, which we call the search experience, also increases. We need to gain knowledge about what is more important to the user and see how the users use the search functionality. This leads to various conclusions; for example, we see that some parts of the documents are more important than others or that particular queries emphasize one field at the cost of others. We need to include such information in our data and queries so that both sides of the scoring equation are closer to our business needs. This is where boosting can be used.
The boost
Boost is an additional value used in the process of scoring. We already know it can be applied to:
Query: When used, we inform the search engine that the given query is a part of a complex query and is more significant than the other parts.
Document: When used during indexing, we tell Elasticsearch that a document is more important than the others in the index. For example, when indexing blog posts, we are probably more interested in the posts themselves than pingbacks or comments.
The values we assign to a query or a document are not the only factors used when calculating the resulting score. We will now look at a few examples of query boosting.
Adding the boost to queries
Let's imagine that our index has two documents and we've used the following commands to index them:
curl -XPOST 'localhost:9200/messages/email/1' -d '{
  "id": 1,
  "to": "John Smith",
  "from": "David Jones",
  "subject": "Top secret!"
}'
curl -XPOST 'localhost:9200/messages/email/2' -d '{
  "id": 2,
  "to": "David Jones",
  "from": "John Smith",
  "subject": "John, read this document"
}'
This data is trivial, but it should describe our problem very well. Now let's assume we have the following query:
curl -XGET 'localhost:9200/messages/_search?pretty' -d '{
  "query": {
    "query_string": {
      "query": "john",
      "use_dis_max": false
    }
  }
}'
In this case, Elasticsearch will create a query to the _all field and will find documents that contain the desired words. We also said that we don't want the disjunction query to be used by setting the use_dis_max parameter to false (if you don't remember what a disjunction query is, refer to The dis_max query section in Chapter 3, Searching Your Data). As we can easily guess, both our records will be returned. The record with identifier equal to 2 will be first because the word John occurs two times: once in the from field and once in the subject field. Let's check this out in the following result:
"hits": {
  "total": 2,
  "max_score": 0.13561106,
  "hits": [{
    "_index": "messages",
    "_type": "email",
    "_id": "2",
    "_score": 0.13561106,
    "_source": {
      "id": 2,
      "to": "David Jones",
      "from": "John Smith",
      "subject": "John, read this document"
    }
  }, {
    "_index": "messages",
    "_type": "email",
    "_id": "1",
    "_score": 0.11506981,
    "_source": {
      "id": 1,
      "to": "John Smith",
      "from": "David Jones",
      "subject": "Top secret!"
    }
  }]
}
Is everything all right? Technically, yes. But we think that the second document (the one with identifier 1) should be positioned as the first one in the result list, because when searching for something, the most important factor (in many cases) is matching people rather than the subject of the message. You can disagree, but this is exactly why full-text searching relevance is a difficult topic; sometimes it is hard to tell which ordering is better for a particular case. What can we do? First, let's rewrite our query to explicitly inform Elasticsearch what fields should be used for searching:
curl -XGET 'localhost:9200/messages/_search?pretty' -d '{
  "query": {
    "query_string": {
      "fields": ["from", "to", "subject"],
      "query": "john",
      "use_dis_max": false
    }
  }
}'
This is not exactly the same query as the previous one. If we run it, we will get the same results (in our case). However, if you look carefully, you will notice differences in scoring. In the previous example, Elasticsearch only used one field, that is, the default _all field. The query that we are using now is using three fields for matching. This means that several factors, such as field lengths, are changed. Anyway, this is not so important in our case. Elasticsearch under the hood generates a complex query made of three queries, one to each field. Of course, the score contributed by each query depends on the number of terms found in this field and the length of this field.
Let's introduce some differences between the fields and their importance. Compare the following query to the last one:
curl -XGET 'localhost:9200/messages/_search?pretty' -d '{
  "query": {
    "query_string": {
      "fields": ["from^5", "to^10", "subject"],
      "query": "john",
      "use_dis_max": false
    }
  }
}'
Look at the boost markers in the fields list (^5 and ^10). By using that notation (the ^ character followed by a number), we can inform Elasticsearch how important a given field is. We see that the most important field is the to field (because of the highest boost value). Next we have the from field, which is less important. The subject field has the default value for boost, which is 1.0, and is the least important field when it comes to score calculation. Always remember that this value is only one of the various factors. You may be wondering why we chose 5 and not 1000 or 1.23. Well, this value depends on the effect we want to achieve, what query we have, and, most importantly, what data we have in our index. Typically, when the data changes in meaningful ways, we should probably check and tune our relevance once again.
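The effect of the two boosts on our two e-mails can be checked with simple arithmetic. The sketch below is a deliberately simplified model and an assumption of ours, not the Lucene formula: every field that contains the term contributes the same base score multiplied by the field boost, and term frequency, field length, and normalization are ignored. FieldBoostSketch and score() are hypothetical names.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class FieldBoostSketch {
    // Simplified model: each field containing the term contributes baseScore * boost.
    // Real scoring also involves term frequency, field length, and normalization.
    public static double score(Map<String, Boolean> fieldMatches,
                               Map<String, Double> boosts,
                               double baseScore) {
        double total = 0.0;
        for (Map.Entry<String, Boolean> entry : fieldMatches.entrySet()) {
            if (entry.getValue()) {
                total += baseScore * boosts.getOrDefault(entry.getKey(), 1.0);
            }
        }
        return total;
    }

    public static void main(String[] args) {
        Map<String, Double> boosts = new LinkedHashMap<>();
        boosts.put("from", 5.0);
        boosts.put("to", 10.0); // subject keeps the default boost of 1.0

        // Document 1: "john" appears only in the to field.
        Map<String, Boolean> doc1 = new LinkedHashMap<>();
        doc1.put("from", false);
        doc1.put("to", true);
        doc1.put("subject", false);

        // Document 2: "john" appears in the from and subject fields.
        Map<String, Boolean> doc2 = new LinkedHashMap<>();
        doc2.put("from", true);
        doc2.put("to", false);
        doc2.put("subject", true);

        System.out.println("doc 1: " + score(doc1, boosts, 1.0));
        System.out.println("doc 2: " + score(doc2, boosts, 1.0));
    }
}
```

In this model the document with identifier 1 collects 10 units from the to field, while the document with identifier 2 collects 5 + 1 from the from and subject fields, so the first one moves to the top of the list.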
Finally, let's look at a similar example, but using the bool query:
curl -XGET 'localhost:9200/messages/_search?pretty' -d '{
  "query": {
    "bool": {
      "should": [
        { "term": { "from": { "value": "john", "boost": 5 } } },
        { "term": { "to": { "value": "john", "boost": 10 } } },
        { "term": { "subject": { "value": "john" } } }
      ]
    }
  }
}'
The preceding query will yield the same results, which means that the first document on the results list will be the one with the identifier 1, but the scores will be slightly different. This is because the Lucene queries made from the last two examples are slightly different and thus the scores are different.
Modifying the score
The preceding example shows how to affect the result list by boosting particular query components: the fields. Another technique is to run a query and affect the score of the matched documents. In the following sections, we will summarize the possibilities offered by Elasticsearch. In the examples, we will use our library data that we have already used in the previous chapters.
Constant score query
A constant_score query allows us to take any query and explicitly set the value that should be used as the score given to each matching document by using the boost parameter.
At first, this query doesn't seem to be practical. But when we think about building complex queries, this query allows us to control how much the documents matching this query can affect the total score. Look at the following example:
curl -XGET 'localhost:9200/library/_search?pretty' -d '{
  "query": {
    "constant_score": {
      "query": {
        "query_string": {
          "query": "available:false author:heller"
        }
      }
    }
  }
}'
In our data, we have two documents with the available field set to false. One of these documents has an additional value in the author field. If we used a different query, the document with an additional value in the author field (a book with identifier 2) would be given a higher score, but, thanks to the constant score query, Elasticsearch will ignore that information during scoring. Both documents will be given a score equal to 1.0.
Boosting query
The next type of query that can be used with boosting is the boosting query. The idea is to allow us to define a part of the query that will cause matched documents to have their scores lowered. The following example returns all the available books (available field set to true), but the books written by E. M. Remarque will have a negative boost of 0.1 (which means about ten times lower score):
curl -XGET 'localhost:9200/library/_search?pretty' -d '{
"query":{
"boosting":{
"positive":{
"term":{
"available":true
}
},
"negative":{
"match":{
"author":"remarque"
}
},
"negative_boost":0.1
}
}
}'
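The effect of negative_boost can be pictured with a few lines of Python. This is only a sketch of the arithmetic, assuming (as the description above suggests) that a document matching the negative part keeps its place in the results but has its score multiplied by negative_boost. The function name and the sample scores are hypothetical and are not part of Elasticsearch.

```python
def boosting_score(positive_score, matches_negative, negative_boost=0.1):
    """Return the score of a document matched by the positive query.

    Simplified model: a document that also matches the negative query is
    still returned, but its score is multiplied by negative_boost.
    """
    if matches_negative:
        return positive_score * negative_boost
    return positive_score


# Hypothetical scores for two available books.
print(boosting_score(1.0, matches_negative=False))  # book by another author
print(boosting_score(1.0, matches_negative=True))   # book by Remarque
```

With a negative_boost of 0.1, the second book ends up with roughly a tenth of the score it would otherwise get, which pushes it down the results list without removing it.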
The function score query
Till now we’ve seen two examples of queries that allowed us to alter the score of the returned documents. The third example we wanted to talk about, the function_score query, is way more complicated than the previously discussed queries. The function_score query is very useful when the score calculation is more complicated than giving a single boost to all the documents; boosting more recent documents is an example of a perfect use case for the function_score query.
Structure of the function query
The structure of the function query is quite simple and looks as follows:
{
"query":{
"function_score":{
"query":{...},
"functions":[
{
"filter":{...},
"FUNCTION":{...}
}
],
"boost_mode":"...",
"score_mode":"...",
"max_boost":"...",
"min_score":"...",
"boost":"..."
}
}
}
In general, the function score query can use a query, one or several functions, and additional parameters. Each function can have a filter defined to filter the results on which it will be applied. If no filter is given for a function, it will be applied to all the documents.
The logic behind the function score query is quite simple. First of all, the functions are matched against the documents and the score is calculated based on score_mode. After that, the query score for the document is combined with the score calculated for the functions on the basis of boost_mode.
Let’s now discuss the parameters:
Boost mode: The boost_mode parameter allows us to define how the score computed by the function queries will be combined with the score of the query. The following values are allowed:
multiply: The default behavior, which results in the query score being multiplied by the score computed from the functions
replace: The query score will be totally ignored and the document score will be equal to the score calculated by the functions
sum: The document score will be calculated as the sum of the query and the function scores
avg: The score of the document will be an average of the query score and the function score
max: The document will be given a maximum of query score and function score
min: The document will be given a minimum of query score and function score
Score mode: The score_mode parameter defines how the scores computed by the functions are combined together. The following score_mode parameter values are defined:
multiply: The default behavior, which results in the scores returned by the functions being multiplied
sum: The scores returned by the defined functions are summed
avg: The score returned by the functions is an average of all the scores of the matching functions
first: The score of the first function with a filter matching the document is returned
max: The maximum score of the functions is returned
min: The minimum score of the functions is returned
There is one thing to remember: we can limit the maximum calculated score value by using the max_boost parameter in the function score query. By default, that parameter is set to Float.MAX_VALUE, which means the maximum float value.
The boost parameter allows us to set a query-wide boost for the documents.
Of course, there is one thing we should remember: the score calculated doesn’t affect which documents matched the query. Because of that, the min_score property has been introduced. It allows us to define the minimum score of the documents. Documents that have a score lower than the min_score property will be excluded from the results.
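To make the interplay of score_mode, boost_mode, max_boost, and min_score more tangible, the following Python sketch mimics the described logic on plain numbers. It is a simplified model and not the Elasticsearch implementation: the function names are hypothetical, the avg mode is a plain arithmetic mean, the query-wide boost parameter is left out, and we assume that a document matched by no function gets a neutral function score of 1.

```python
from functools import reduce

# Java's Float.MAX_VALUE, used here as the default for max_boost.
FLOAT_MAX = 3.4028235e38


def combine_functions(scores, score_mode="multiply"):
    """Combine the scores of all functions that matched a document."""
    if not scores:
        return 1.0  # assumption: no matching function means a neutral score
    if score_mode == "multiply":
        return reduce(lambda a, b: a * b, scores, 1.0)
    if score_mode == "sum":
        return sum(scores)
    if score_mode == "avg":
        return sum(scores) / len(scores)  # plain mean, a simplification
    if score_mode == "first":
        return scores[0]
    if score_mode == "max":
        return max(scores)
    if score_mode == "min":
        return min(scores)
    raise ValueError("unknown score_mode: " + score_mode)


def function_score(query_score, function_scores, score_mode="multiply",
                   boost_mode="multiply", max_boost=FLOAT_MAX, min_score=None):
    """Return the final score, or None if min_score excludes the document."""
    # Step 1: combine the function scores and cap the result with max_boost.
    func = min(combine_functions(function_scores, score_mode), max_boost)
    # Step 2: combine the capped function score with the query score.
    if boost_mode == "multiply":
        final = query_score * func
    elif boost_mode == "replace":
        final = func
    elif boost_mode == "sum":
        final = query_score + func
    elif boost_mode == "avg":
        final = (query_score + func) / 2.0
    elif boost_mode == "max":
        final = max(query_score, func)
    elif boost_mode == "min":
        final = min(query_score, func)
    else:
        raise ValueError("unknown boost_mode: " + boost_mode)
    # Step 3: drop documents that fall below min_score.
    if min_score is not None and final < min_score:
        return None
    return final


print(function_score(2.0, [3.0, 4.0]))                 # defaults: multiply, multiply
print(function_score(2.0, [3.0, 4.0], max_boost=5.0))  # function score capped at 5
```

Trying different mode combinations on the same numbers is a quick way to see why the two parameters are kept separate: score_mode only decides how the functions are merged, while boost_mode decides how much of the original query score survives.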
What we haven’t talked about yet are the function scores that we can include in the functions section of our query. The currently available functions are:
weight factor
field value factor
script score
random
decay
The weight factor function
The weight factor function allows us to multiply the score of the document by a given value. The value of the weight parameter is not normalized and is taken as is. An example using the weight function, where we multiply the score of the document by 20, looks as follows:
curl -XGET 'localhost:9200/library/_search?pretty' -d '{
"query":{
"function_score":{
"query":{
"term":{
"available":true
}
},
"functions":[
{"weight":20}
]
}
}
}'
Field value factor function
The field_value_factor function allows us to influence the score of the document by using a value of the field in that document. For example, to multiply the score of the document by the value of the year field, we run the following query:
curl -XGET 'localhost:9200/library/_search?pretty' -d '{
"query":{
"function_score":{
"query":{
"term":{
"available":true
}
},
"functions":[
{
"field_value_factor":{
"field":"year",
"missing":1
}
}
]
}
}
}'
In addition to choosing the field whose value should be used, we can also control the behavior of the field value factor function by using the following properties:
factor: The multiplication factor that will be used along with the field value. It defaults to 1.
modifier: The modifier that will be applied to the field value. It defaults to none. It can take the value of log, log1p, log2p, ln, ln1p, ln2p, square, sqrt, and reciprocal.
missing: The value that should be used when a document doesn’t have any value in the field specified in the field property.
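The three properties can be combined as in the following Python sketch. It assumes, in line with our reading of the official documentation, that the field value is first multiplied by factor and the modifier is applied to the product, that log means a base-10 logarithm and ln a natural one, and that the 1p and 2p variants add 1 or 2 before taking the logarithm. The helper name and the sample documents are hypothetical.

```python
import math

# Modifier names from the list above mapped to plain Python functions.
MODIFIERS = {
    "none": lambda x: x,
    "log": math.log10,
    "log1p": lambda x: math.log10(x + 1),
    "log2p": lambda x: math.log10(x + 2),
    "ln": math.log,
    "ln1p": lambda x: math.log(x + 1),
    "ln2p": lambda x: math.log(x + 2),
    "square": lambda x: x * x,
    "sqrt": math.sqrt,
    "reciprocal": lambda x: 1.0 / x,
}


def field_value_factor(doc, field, factor=1.0, modifier="none", missing=None):
    """Compute modifier(factor * field value) for a document given as a dict."""
    value = doc.get(field, missing)
    if value is None:
        raise ValueError("document has no value in field " + field)
    return MODIFIERS[modifier](factor * value)


# Hypothetical documents: one with a year, one without.
print(field_value_factor({"year": 1961}, "year"))
print(field_value_factor({}, "year", missing=1))
print(field_value_factor({"copies": 99}, "copies", modifier="log1p"))
```

A logarithmic or square root modifier is usually a safer choice than none for fields with large values such as year, because the raw value would otherwise dominate the query score completely.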
The script score function
The script_score function allows us to use a script to calculate the score that will be used as the score returned by a function (and thus will fall into behavior defined by the boost_mode parameter). An example of script_score usage is as follows (for the following example to work, inline scripting needs to be allowed, which means adding the script.inline property and setting it to on in elasticsearch.yml):
curl -XGET 'localhost:9200/library/_search?pretty' -d '{
"query":{
"function_score":{
"query":{
"term":{
"available":true
}
},
"functions":[
{
"script_score":{
"script":{
"inline":"_score*_source.copies*parameter1",
"params":{
"parameter1":12
}
}
}
}
]
}
}
}'
The random score function
By using the random_score function, we can generate a pseudo random score by specifying a seed. In order to simulate randomness, we should specify a new seed every time. The random number will be generated by using the _uid field and the provided seed. If a seed is not provided, the current timestamp will be used. An example of using this is as follows:
curl -XGET 'localhost:9200/library/_search?pretty' -d '{
"query":{
"function_score":{
"query":{
"term":{
"available":true
}
},
"functions":[
{
"random_score":{
"seed":12345
}
}
]
}
}
}'
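The key property of random_score is that the same seed and the same document always give the same number, while a different seed reshuffles the results. The following Python sketch only illustrates that idea: it is not the algorithm Elasticsearch uses internally, and the hashing scheme, the function name, and the sample identifiers are all our own assumptions.

```python
import hashlib


def pseudo_random_score(uid, seed):
    """Map a (seed, uid) pair to a repeatable number in the range [0, 1)."""
    digest = hashlib.sha256("{}:{}".format(seed, uid).encode("utf-8")).digest()
    # Keep the top 53 bits of the first 8 bytes so the value fits a float exactly.
    top_bits = int.from_bytes(digest[:8], "big") >> 11
    return top_bits / 2.0 ** 53


# Hypothetical document identifiers in the type#id form.
for uid in ("book#1", "book#2", "book#3"):
    print(uid, pseudo_random_score(uid, seed=12345))
```

Keeping the seed fixed for a given user session is a common way to get results that look random but stay stable while that user pages through them.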
Decay functions
In addition to the earlier mentioned scoring functions, Elasticsearch exposes additional ones, called the decay functions. The difference from the previously described functions is that the score given by those functions lowers with distance. The distance is calculated on the basis of a single-valued numeric field (such as a date, a geographical point, or a standard numeric field). The simplest example that comes to mind is boosting documents on the basis of distance from a given point or boosting on the basis of document date.
For example, let’s assume that we have a point field that stores the location and we want our document’s score to be affected by the distance from a point where the user stands (for example, our user sends a query from a mobile device). Assuming the user is at 52, 21, we could send the following query:
{
"query":{
"function_score":{
"query":{
"term":{
"available":true
}
},
"functions":[
{
"linear":{
"point":{
"origin":"52,21",
"scale":"1km",
"offset":0,
"decay":0.2
}
}
}
]
}
}
}
In the preceding example, linear is the name of the decay function. The value will decay linearly when using it. The other possible values are gauss and exp. We’ve chosen the linear decay function because it sets the score to 0 once the field value lies far enough from the origin (at twice the scale value when the default decay of 0.5 is used). This is useful when you want to lower the value of the documents that are too far away.
Note
Note that the geographical searching capabilities of Elasticsearch will be discussed in the Geo section of Chapter 8, Beyond Full-text Searching.
Now let’s discuss the rest of the query structure. The point is the name of the field we want to use for score calculation. If the document doesn’t have a value in the defined field, the function will return a value of 1 for that document.
In addition to that, we’ve provided additional parameters. The origin and scale are required. The origin parameter is the central point from which the calculation will be done and the scale is the rate of decay. By default, the offset is set to 0. If defined, the decay function will only be computed for the documents whose distance from the origin is greater than the value of this parameter. The decay parameter tells Elasticsearch what the score should be at the distance given by scale and is set to 0.5 by default. In our case, we’ve said that, at the distance of 1 kilometer, the score should be reduced to 20% (0.2) of its original value.
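The roles of origin, scale, offset, and decay are easiest to see when the three curves are written out. The following Python sketch follows the formulas given in the official function_score documentation as we understand them, applied to a plain one-dimensional numeric value (a geographical field would first need a real distance calculation). The function name is hypothetical. With these formulas, every curve returns exactly the decay value at a distance of scale from the origin.

```python
import math


def decay_score(value, origin, scale, offset=0.0, decay=0.5, kind="linear"):
    """Score in the range [0, 1] that gets lower as value moves away from origin."""
    # Inside the offset the distance counts as 0, so the score stays at 1.
    distance = max(0.0, abs(value - origin) - offset)
    if kind == "linear":
        # The straight line reaches 0 at scale / (1 - decay).
        zero_at = scale / (1.0 - decay)
        return max(0.0, (zero_at - distance) / zero_at)
    if kind == "exp":
        return math.exp(math.log(decay) / scale * distance)
    if kind == "gauss":
        sigma_squared = -scale ** 2 / (2.0 * math.log(decay))
        return math.exp(-distance ** 2 / (2.0 * sigma_squared))
    raise ValueError("unknown decay function: " + kind)


# Scores at increasing distances for scale=1 and decay=0.2, as in the example query.
for d in (0.0, 0.5, 1.0, 1.25, 2.0):
    print(d, [round(decay_score(d, 0.0, 1.0, decay=0.2, kind=k), 3)
              for k in ("linear", "exp", "gauss")])
```

The linear variant is the only one of the three that reaches exactly 0, while exp and gauss only approach it. That is the practical reason for choosing linear when far away documents should lose their function score completely.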
Note
We expect the function_score query to be modified and extended with the next versions of Elasticsearch (just as it was with Elasticsearch version 1.x). We suggest following the official documentation and the page dedicated to the function_score query at https://www.elastic.co/guide/en/elasticsearch/reference/current/query-dsl-function-score-query.html.
When does index-time boosting make sense?
In the previous section, we discussed boosting queries. This kind of approach to handling differences in the weight of documents is very handy, powerful, and easy to use. It is also sufficient in most situations. However, there are cases when index-time boosting is a more convenient way of boosting documents. One such use case is the situation when we know which documents are important during the indexing phase. In such a case, we can prepare the document boost and include it as part of the document. We gain a boost that is independent from a query at the cost of reindexing the documents when the boost value is changed (because we need to apply the changed boost). In addition to that, the performance gets slightly better because some parts needed in the boosting process are already calculated at index time, which can matter when your indices have a large number of documents. Information about the boost is stored as a part of the normalization factor and because of that it is important to keep the norms turned on. This means that we can’t set norms.enabled to false, because then we wouldn’t be able to use index-time boosting.
Defining boosting in the mappings
It is also possible to directly define the field’s boost in our mappings. This will result in Elasticsearch giving a boost for all the documents having a value in such a field. Of course, that will also happen during indexing time. The following example illustrates that:
{
"mappings":{
"book":{
"properties":{
"title":{"type":"string"},
"author":{"type":"string","boost":10.0}
}
}
}
}
Thanks to the preceding boost, all queries will favor values found in the field named author. This also applies to queries using the _all field, because Elasticsearch will apply the boost to values copied between the fields.
Words with the same meaning
You may have heard about synonyms, words that have the same or similar meaning. Sometimes you would want to have some words matched when one of those words is entered into the search box. Let’s recall our sample data from Chapter 3, Searching Your Data. There was a book called Crime and Punishment. What if we want that book to be matched not only when the words crime or punishment are used, but also when using words such as criminality and abuse? At first glance, this may not sound like good behavior, but sometimes this is really needed, especially in use cases where there are multiple words meaning the same thing (like in medicine). To handle such use cases, we will use synonyms.
Synonym filter
Synonyms in Elasticsearch are handled on the analysis level, at both index and query time, by a dedicated synonyms filter. To use the synonym filter, we need to define our own analyzer. For example, let’s define an analyzer that will be called synonym and will use the whitespace tokenizer and a single filter called synonym. Our filter’s type property needs to be set to synonym, which tells Elasticsearch that this filter is a synonym filter.
In addition to that, we want to ignore case, so that the uppercased and lowercased synonyms are treated equally (set the ignore_case property to true). To define our custom synonym analyzer that uses a synonym filter when creating a new index, we would use the following command:
curl -XPOST 'localhost:9200/test' -d '{
"index":{
"analysis":{
"analyzer":{
"synonym":{
"tokenizer":"whitespace",
"filter":[
"synonym"
]
}
},
"filter":{
"synonym":{
"type":"synonym",
"ignore_case":true,
"synonyms":[
"crime=>criminality"
]
}
}
}
}
}'
Synonyms in the mappings
In the definition you’ve just seen, we’ve specified the synonym rule in the mappings we send to Elasticsearch. To do that, we needed to add the synonyms property, which is an array of synonym rules. For example, the following part of the mappings definition defines a single synonym rule:
"synonyms":[
"crime=>criminality"
]
The preceding rule tells Elasticsearch to change the crime term to the criminality term when the crime term is encountered during analysis.
Synonyms stored on the filesystem
Apart from storing the synonym rules in the mappings, Elasticsearch allows us to use a file-based synonym rule set. To use a file, we need to specify the synonyms_path property instead of the synonyms one. The synonyms_path property should be set to the name of the file that holds the synonym definitions, and the specified file path is relative to the Elasticsearch config directory. So, if we store our synonyms in the synonyms.txt file and we save that file in the config directory, then, in order to use it, we should set synonyms_path to the value of synonyms.txt.
For example, this is what our synonym filter would look like if we wanted to use the synonyms stored in a file:
"filter":{
"synonym":{
"type":"synonym",
"synonyms_path":"synonyms.txt"
}
}
Defining synonym rules
So far we have discussed what we have to do in order to use synonym expansions in Elasticsearch. Now let’s see what formats of synonyms are allowed.
Using Apache Solr synonyms
The most common synonym structure in the Apache Lucene world is probably the one used by Apache Solr (http://lucene.apache.org/solr/), the search engine built on top of Lucene, just like Elasticsearch is. This is the default way of handling synonyms in Elasticsearch and the possibilities of defining a new synonym are discussed in the following sections.
Explicit synonyms
A simple mapping allows us to map a list of words onto other words. So, in our case, if we want the word criminality to be mapped to crime and the word abuse to be mapped to punishment, we need to define the following entries:
criminality=>crime
abuse=>punishment
Of course, a single word can be mapped into multiple ones and multiple ones can be mapped into a single one. For example:
star wars, wars => starwars
The preceding example means that star wars and wars will be changed to starwars by the synonym filter.
Equivalent synonyms
In addition to the explicit mapping, Elasticsearch allows us to use equivalent synonyms. For example, the following definition will make all the words exchangeable so that you can use any of them to match a document that has one of them in its contents:
star, wars, star wars, starwars
Expanding synonyms
A synonym filter allows us to use one additional property when it comes to Apache Solr format synonyms: the expand property. When the expand property is set to true (which is the default), all synonyms will be expanded by Elasticsearch to all equivalent forms. For example, let’s say we have the following filter configuration:
"filter":{
"synonym":{
"type":"synonym",
"expand":false,
"synonyms":[
"one,two,three"
]
}
}
Elasticsearch will map the preceding synonym definition to the following:
one,two,three=>one
This means that the words one, two, and three will be changed to one. However, if we set the expand property to true, the same synonym definition will be interpreted in the following way:
one,two,three=>one,two,three
This basically means that each of the words from the left side of the definition will be expanded to all the words on the right side.
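The way explicit rules, equivalent rules, and the expand property interact can be sketched in Python. The code below is a simplified model under our own assumptions: it handles single-word terms only (no multi-word entries such as star wars), it ignores case handling, and the function names are hypothetical rather than anything provided by Elasticsearch or Lucene.

```python
def parse_rule(rule, expand):
    """Turn one Solr-style rule into a {term: [replacement, ...]} mapping."""
    if "=>" in rule:
        # Explicit rule: everything on the left maps to everything on the right.
        left, right = rule.split("=>", 1)
        sources = [term.strip() for term in left.split(",")]
        targets = [term.strip() for term in right.split(",")]
    else:
        # Equivalent rule: expand decides whether all forms or only the first is kept.
        sources = [term.strip() for term in rule.split(",")]
        targets = sources if expand else sources[:1]
    return {source: list(targets) for source in sources}


def apply_synonyms(tokens, rules, expand):
    """Replace every token that has a rule; leave the other tokens untouched."""
    mapping = {}
    for rule in rules:
        mapping.update(parse_rule(rule, expand))
    output = []
    for token in tokens:
        output.extend(mapping.get(token, [token]))
    return output


print(parse_rule("one,two,three", expand=False))
print(parse_rule("one,two,three", expand=True))
print(apply_synonyms(["crime", "and", "punishment"], ["crime=>criminality"], expand=True))
```

Note that expand only matters for equivalent rules. An explicit rule already states both sides, so it is applied in the same way regardless of the setting.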
Using WordNet synonyms
If we want to use WordNet-structured synonyms (to learn more about WordNet, visit http://wordnet.princeton.edu/), we need to provide an additional property for our synonym filter. The property name is format and we should set its value to wordnet in order for Elasticsearch to understand that format.
Query or index-time synonym expansion
As with all the analyzers, one can wonder when to use the synonym filter: during indexing, during querying, or maybe during indexing and querying. Of course, it depends on your needs. However, remember that using index-time synonyms requires data reindexing after each synonym change. That’s because they need to be reapplied to all the documents. If we use only the query-time synonyms, we can update the synonym lists and have them applied without reindexing the data.
Understanding the explain information
Compared to databases, using systems capable of performing full-text search can often be anything other than obvious. We can search in many fields simultaneously and the data in the index can vary from the ones provided as the values of the document fields (because of the analysis process, synonyms, abbreviations, and others). It’s even worse! By default, search engines sort data by relevance, which means that each document is given a number indicating how similar the document is to the query. The key point here is understanding the how similar phrase. As we discussed in the beginning of the chapter, scoring takes many factors into account: how many searched words were found in the document, how frequent the word is, how many terms are in the field, and so on. This seems complicated and finding out why a document was found and why another document is better is not easy. Fortunately, Elasticsearch provides us with tools that can answer these questions and we will look at them in this section.
Understanding field analysis
One of the common questions asked when analyzing the returned documents is why a given document was not found. In many cases, the problem lies in the mappings definition and the analysis process configuration. For debugging the analysis process, Elasticsearch provides a dedicated REST API endpoint: the _analyze one.
Using it is very simple. Let’s see how it is used by running a request to Elasticsearch to give us information on how the crime and punishment phrase is analyzed. To do that, we will run a command using HTTP GET to the _analyze REST endpoint and we will provide the phrase as the request body. The following command does that:
curl -XGET 'localhost:9200/_analyze?pretty' -d 'Crime and Punishment'
In response, we get the following data:
{
"tokens":[{
"token":"crime",
"start_offset":0,
"end_offset":5,
"type":"<ALPHANUM>",
"position":0
},{
"token":"and",
"start_offset":6,
"end_offset":9,
"type":"<ALPHANUM>",
"position":1
},{
"token":"punishment",
"start_offset":10,
"end_offset":20,
"type":"<ALPHANUM>",
"position":2
}]
}
As we can see, Elasticsearch divided the input phrase into three tokens. During processing, the phrase was divided into tokens on the basis of whitespace characters and was lowercased. This shows us exactly what would be happening during the analysis process. We can also provide the name of the analyzer. For example, we can change the preceding command to something like this:
curl -XGET 'localhost:9200/_analyze?analyzer=standard&pretty' -d 'Crime and Punishment'
The preceding command will allow us to check how the standard analyzer analyzes the data.
It is worth noting that there is another form of analysis API available, one which allows us to provide tokenizers and filters. It is very handy when we want to experiment with configuration before creating the target mappings. Instead of specifying the analyzer parameter in the request, we provide the tokenizer and the filters parameters. We can provide a single tokenizer and a list of filters (separated by the comma character). For example, to illustrate how tokenization using the whitespace tokenizer works with the lowercase and kstem filters, we would run the following request:
curl -XGET 'localhost:9200/library/_analyze?tokenizer=whitespace&filters=lowercase,kstem&pretty' -d 'John Smith'
As we can see, the analysis API can be very useful for tracking down bugs in the mapping configuration. It is also priceless when we want to solve problems with queries and matching. It can show us how our analyzers work, what terms they produce, and what the attributes of those terms are. With such information, query problems will be easier to track down.
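To get a feeling for what the token attributes in the _analyze response mean, we can imitate a very small analysis chain in Python. This is only a rough sketch under our own assumptions: it splits on whitespace and applies a list of per-token filters, which is far simpler than what the standard analyzer or the real token filters do, and the function name is hypothetical.

```python
import re


def analyze(text, filters=()):
    """Split text on whitespace and run every token through the given filters."""
    tokens = []
    for position, match in enumerate(re.finditer(r"\S+", text)):
        token = match.group()
        for token_filter in filters:
            token = token_filter(token)
        tokens.append({
            "token": token,
            # Offsets always point into the original, unfiltered text.
            "start_offset": match.start(),
            "end_offset": match.end(),
            "position": position,
        })
    return tokens


for token in analyze("Crime and Punishment", filters=[str.lower]):
    print(token)
```

For this particular phrase, the sketch produces the same tokens, offsets, and positions as the response shown earlier. The offsets refer to the original text, which is what allows features such as highlighting to work even after the tokens themselves have been changed by filters.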
Explaining the query
In addition to looking at what happened during analysis, Elasticsearch allows us to explain how the score was calculated for a particular query and document. Let’s look at the following example:
curl -XGET 'localhost:9200/library/book/1/_explain?pretty&q=quiet'
The preceding request specifies a document and a query to run. The document is specified in the URI and the query is passed using the q parameter. Using the _explain endpoint, we ask Elasticsearch for an explanation about how the document was matched by Elasticsearch (or not matched). The response returned by Elasticsearch for the preceding request looks as follows:
{
"_index":"library",
"_type":"book",
"_id":"1",
"matched":true,
"explanation":{
"value":0.057534903,
"description":"sum of:",
"details":[{
"value":0.057534903,
"description":"weight(_all:quiet in 0) [PerFieldSimilarity], result of:",
"details":[{
"value":0.057534903,
"description":"fieldWeight in 0, product of:",
"details":[{
"value":1.0,
"description":"tf(freq=1.0), with freq of:",
"details":[{
"value":1.0,
"description":"termFreq=1.0",
"details":[]
}]
},{
"value":0.30685282,
"description":"idf(docFreq=1,maxDocs=1)",
"details":[]
},{
"value":0.1875,
"description":"fieldNorm(doc=0)",
"details":[]
}]
}]
},{
"value":0.0,
"description":"match on required clause, product of:",
"details":[{
"value":0.0,
"description":"#clause",
"details":[]
},{
"value":3.2588913,
"description":"_type:book, product of:",
"details":[{
"value":1.0,
"description":"boost",
"details":[]
},{
"value":3.2588913,
"description":"queryNorm",
"details":[]
}]
}]
}]
}
}
It can look slightly complicated and, well, it is complicated. It is even worse if we realize that this is only a simple query! Elasticsearch, and more specifically the Lucene library, shows the internal information about the scoring process. We will only scratch the surface and will explain the most important things about the preceding response.
The first thing that you can notice is that, for the particular query, Elasticsearch provided the information on whether the document was a match or not. If the matched property is set to true, it means that the document was a match for the provided query.
The next important thing is the explanation object. It contains three properties: the value, the description, and the details. The value is the score calculated for the given part of the query. The description is the simplified text representation of the internal score calculation, and the details object contains detailed information about the score calculation. The nice thing is that the details object will again contain the same three properties and this is how Elasticsearch provides us with information on how the score is calculated.
For example, let’s analyze the following part of the response:
"value":0.057534903,
"description":"sum of:",
"details":[{
"value":0.057534903,
"description":"weight(_all:quiet in 0) [PerFieldSimilarity], result of:",
"details":[{
"value":0.057534903,
"description":"fieldWeight in 0, product of:",
"details":[{
"value":1.0,
"description":"tf(freq=1.0), with freq of:",
"details":[{
"value":1.0,
"description":"termFreq=1.0",
"details":[]
}]
},{
"value":0.30685282,
"description":"idf(docFreq=1,maxDocs=1)",
"details":[]
},{
"value":0.1875,
"description":"fieldNorm(doc=0)",
"details":[]
}]
}]
The score of the element is 0.057534903 (the value property) and it is a sum of (we see that in the description property) all the inner elements. In the description on the first level of nesting of the preceding fragment, we can see that PerFieldSimilarity has been used and that the score of that element is the result of the inner elements, that is, the second level of nesting.
On the second level of details nesting, we can see the fieldWeight element, whose score is the product of the scores of the three elements below it. These are internal statistics retrieved from the index: the term frequency, which informs us how often the term occurs in the field (termFreq=1.0); the inverse document frequency, which reflects how rare the term is across the documents (idf(docFreq=1, maxDocs=1)); and the field normalization factor (fieldNorm(doc=0)).
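The numbers in the fragment can be checked by hand. The following Python sketch assumes the classic Lucene TF/IDF similarity, where tf is the square root of the term frequency and idf is 1 + ln(maxDocs / (docFreq + 1)). The fieldNorm value is simply copied from the response, because it is stored in the index in an encoded form. The function names are ours.

```python
import math


def tf(freq):
    """Term frequency factor of the classic TF/IDF similarity."""
    return math.sqrt(freq)


def idf(doc_freq, max_docs):
    """Inverse document frequency factor of the classic TF/IDF similarity."""
    return 1.0 + math.log(max_docs / (doc_freq + 1.0))


field_norm = 0.1875  # copied from fieldNorm(doc=0) in the response

# fieldWeight is the product of the three factors shown in the explanation.
field_weight = tf(1.0) * idf(doc_freq=1, max_docs=1) * field_norm

print("tf          =", tf(1.0))
print("idf         =", idf(doc_freq=1, max_docs=1))
print("fieldWeight =", field_weight)
```

Up to floating point precision, the results agree with the idf value of 0.30685282 and the fieldWeight value of 0.057534903 reported by Elasticsearch, which confirms that the element is a plain product of its three children.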
The Explain API supports the following parameters: analyze_wildcard, analyzer, default_operator, df, fields, lenient, lowercase_expanded_terms, parent, preference, routing, _source, _source_exclude, and _source_include. To learn more about all these parameters, refer to the official Elasticsearch documentation regarding Explain API, which is available at https://www.elastic.co/guide/en/elasticsearch/reference/current/search-explain.html.
Summary
The chapter we just finished was focused on querying: not on the matching part of it but mostly on scoring. We learned how Apache Lucene TF/IDF scoring works. We saw the scripting capabilities of Elasticsearch and we handled multilingual data. We used boosting to influence how the scores of the returned documents were calculated and we used synonyms. Finally, we used explain information to see how the document scores were calculated by the query.
In the next chapter, we will fully focus on Elasticsearch data analysis capabilities: the aggregations, their types, and how they can be used.
Chapter 7. Aggregations for Data Analysis
In the previous chapter, we discussed the querying side of Elasticsearch again. We learned how the Lucene TF/IDF algorithm works and how to use Elasticsearch scripting capabilities. We handled multilingual data and influenced document scores with boosts. We used synonyms to match words that have the same meaning and we used Elasticsearch Explain API to see how document scores were calculated. By the end of this chapter, you will have learned the following topics:
What are aggregations
How the Elasticsearch aggregation engine works
How to use metrics aggregations
How to use buckets aggregations
How to use pipeline aggregations
www.EBooksWorld.ir
![Page 412: dl.ebooksworld.irdl.ebooksworld.ir/motoman/Packt.Elasticsearch.Server.3rd.Edition.w… · Table of Contents Elasticsearch Server Third Edition Credits About the Authors About the](https://reader035.vdocuments.mx/reader035/viewer/2022070908/5f87a74e61533a78dd166ada/html5/thumbnails/412.jpg)
Aggregations
Introduced in Elasticsearch 1.0, aggregations are the heart of data analytics in Elasticsearch. Highly flexible and performant, aggregations brought Elasticsearch 1.0 to a new position as a full-featured analysis engine. Extended through the life of Elasticsearch 1.x, in 2.x they are yet more powerful, less memory demanding, and faster. With this framework, you can use Elasticsearch as the analysis engine for data extraction and visualization. Let’s see how that functionality works and what we can achieve by using it.
General query structure
To use aggregations, we need to add an additional section in our query. In general, our queries with aggregations look like this:
{
"query":{…},
"aggs":{
"aggregation_name":{
"aggregation_type":{
...
}
}
}
}
In the aggs property (you can use aggregations if you want; aggs is just an abbreviation), you can define any number of aggregations. Each aggregation is defined by its name and one of the types of aggregations that are provided by Elasticsearch. One thing to remember though is that the key defines the name of the aggregation (you will need it to distinguish particular aggregations in the server response). Let’s take our library index and create the first query using aggregations. A command sending such a query looks like this:
curl 'localhost:9200/library/_search?search_type=query_then_fetch&size=0&pretty' -d '{
"aggs":{
"years":{
"stats":{
"field":"year"
}
},
"words":{
"terms":{
"field":"copies"
}
}
}
}'
This query defines two aggregations. The aggregation named years shows statistics for the year field. The words aggregation contains information about the terms used in a given field (copies in our case).
Note
In our examples we assumed that we perform aggregation in addition to searching. If we don’t need found documents, a better idea is to use the size parameter and set it to 0. This omits some unnecessary work and is more efficient. In such a case, the endpoint should be /library/_search?size=0. You can read more about search types in Chapter 3, Understanding the Querying Process.
Let’s now look at the response returned by Elasticsearch for the preceding query:
{
"took":2,
"timed_out":false,
"_shards":{
"total":5,
"successful":5,
"failed":0
},
"hits":{
"total":4,
"max_score":0.0,
"hits":[]
},
"aggregations":{
"words":{
"doc_count_error_upper_bound":0,
"sum_other_doc_count":0,
"buckets":[{
"key":0,
"doc_count":2
},{
"key":1,
"doc_count":1
},{
"key":6,
"doc_count":1
}]
},
"years":{
"count":4,
"min":1886.0,
"max":1961.0,
"avg":1928.0,
"sum":7712.0
}
}
}
As you see, both the aggregations (years and words) were returned. The first aggregation we defined in our query (years) returned general statistics for the given field gathered across all the documents that matched our query. The second of the defined aggregations (words) was a bit different. It created several sets called buckets that were calculated on the returned documents and each of the aggregated values was within one of these sets. As you can see, there are multiple aggregation types available and they return different results. We will see the differences in the later part of this section.
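To make the difference between the two result shapes concrete, here is a minimal Python sketch that parses an abridged copy of the response shown above and reads each aggregation by its name. It only uses the standard json module; the variable names are illustrative and not part of any Elasticsearch client.

```python
import json

# Abridged copy of the response shown above (only the "aggregations" part).
response_text = """
{
  "aggregations": {
    "words": {
      "doc_count_error_upper_bound": 0,
      "sum_other_doc_count": 0,
      "buckets": [
        {"key": 0, "doc_count": 2},
        {"key": 1, "doc_count": 1},
        {"key": 6, "doc_count": 1}
      ]
    },
    "years": {"count": 4, "min": 1886.0, "max": 1961.0, "avg": 1928.0, "sum": 7712.0}
  }
}
"""

response = json.loads(response_text)
aggs = response["aggregations"]

# Metrics aggregation: a flat object of named values.
years_avg = aggs["years"]["avg"]

# Bucket aggregation: a list of buckets, each with a key and a document count.
copies_counts = {bucket["key"]: bucket["doc_count"] for bucket in aggs["words"]["buckets"]}

print(years_avg)
print(copies_counts)
```

The aggregation names (years and words) are the keys we chose in the request, which is why they are needed to find each result in the response.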
The great thing about the aggregation engine is that it allows you to have multiple aggregations and that aggregations can be nested. This means that you can nest aggregations to an arbitrary depth and define any number of aggregations in general. The extended structure of the query is shown next:
{
"query":{…},
"aggs":{
"first_aggregation_name":{
"aggregation_type":{
...
},
"aggregations":{
"first_nested_aggregation":{
...
},
.
.
.
"nth_nested_aggregation":{
...
}
}
},
.
.
.
"nth_aggregation_name":{
...
}
}
}
Inside the aggregations engine
Aggregations work on the basis of results returned by the query. This is very handy as we get the information that we are interested in, both from the query as well as the data analysis perspective. So what does Elasticsearch do when we include the aggregation part of the query in the request that we send to Elasticsearch? First of all, the aggregation is executed on each relevant shard and the results are returned to the node that is responsible for running that query. That node waits for the partial results to be calculated; after it gets all the results, it merges them, producing the final results.
This approach is nothing new when it comes to distributed systems and how they work and communicate, but it can cause issues when it comes to the precision of the results. In most cases this is not a problem, but you should be aware of what to expect. Let’s imagine the following example:
The preceding image shows a simplified view of three shards, each containing documents having only Elasticsearch and Solr terms in them. Now imagine that we are interested in a single term for our index. The terms aggregation, when run using size=1, would return a single term, the one that is the most frequent (of course limited to the query we’ve run). So our aggregator node would see partial results telling us that Elasticsearch is present in 19 documents in Shard 1 and the Solr term is present in 10 documents in each of Shard 2 and Shard 3, which means that the merged top term is Solr, which is not true. This is an extreme case, but there are use cases (such as accounting) where precision is key and you should be aware of such situations.
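The effect can be reproduced with a small Python simulation. This is only a sketch: the per-shard counts below are hypothetical (they are not the numbers from the figure), and the function and parameter names (merged_top_terms, per_shard_limit) are made up for illustration.

```python
from collections import Counter

# Hypothetical per-shard document counts for each term.
shards = [
    Counter({"elasticsearch": 19, "solr": 1}),
    Counter({"elasticsearch": 9, "solr": 10}),
    Counter({"elasticsearch": 9, "solr": 10}),
]

def merged_top_terms(shards, per_shard_limit, size):
    """Merge only the top `per_shard_limit` terms reported by each shard."""
    merged = Counter()
    for shard in shards:
        for term, count in shard.most_common(per_shard_limit):
            merged[term] += count
    return merged.most_common(size)

def exact_top_terms(shards, size):
    """Ground truth: sum every term count from every shard."""
    total = Counter()
    for shard in shards:
        total.update(shard)
    return total.most_common(size)

# Each shard reports only its single most frequent term.
approximate = merged_top_terms(shards, per_shard_limit=1, size=1)
exact = exact_top_terms(shards, size=1)

print(approximate)
print(exact)
```

With these assumed counts, the merged partial results put solr on top, while summing all the counts shows that elasticsearch is the most frequent term. Letting every shard report two terms instead of one removes the error in this toy data set.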
Note
Compared to queries, aggregations are heavier for Elasticsearch in terms of both CPU cycles and memory consumption. We will discuss this in more detail in the Caching Aggregations section of this chapter.
Aggregation types
Elasticsearch 2.x allows us to use three types of aggregation: metrics, buckets, and pipeline. The metrics aggregations return a metric, just like the stats aggregation we used for the year field. The bucket aggregations return buckets, the key and the number of documents sharing the same values, ranges, and so on, just like the terms aggregation we used for the copies field. Finally, the pipeline aggregations introduced in Elasticsearch 2.0 aggregate the output of the other aggregations and their metrics, which allows us to do even more sophisticated data analysis. Knowing all that, let’s now look at all the aggregations we can use in Elasticsearch 2.x.
Metrics aggregations
We will start with the metrics aggregations, which can aggregate values from documents into a single metric. This is always the case with metrics aggregations: you can expect them to be a single metric on the basis of the data. Let’s now take a look at the metrics aggregations available in Elasticsearch 2.x.
Minimum, maximum, average, and sum
The first group of metrics aggregations that we want to show you is the one that calculates the basic value from the given documents. These aggregations are:
min: This calculates the minimum value from the given numeric field in the returned documents
max: This calculates the maximum value from the given numeric field in the returned documents
avg: This calculates an average from the given numeric field in the returned documents
sum: This calculates the sum from the given numeric field in the returned documents
As you can see, the aggregations mentioned above are pretty self-explanatory. So, let’s try to calculate the average value on our data. For example, let’s assume that we want to calculate the average number of copies for our books. The query to do that will look as follows:
{
"aggs":{
"avg_copies":{
"avg":{
"field":"copies"
}
}
}
}
The results returned by Elasticsearch after running the preceding query will be as follows:
{
"took":5,
"timed_out":false,
"_shards":{
"total":5,
"successful":5,
"failed":0
},
"hits":{
"total":4,
"max_score":0.0,
"hits":[]
},
"aggregations":{
"avg_copies":{
"value":1.75
}
}
}
So, we have an average of 1.75 copies per book. It is very easy to calculate: (6 + 0 + 1 + 0) / 4 is equal to 1.75. Seems that we got it right.
Missing values
The nice thing about the previously mentioned aggregations is that we can control what value Elasticsearch uses if the fields we’ve specified don’t have any. For example, if we wanted Elasticsearch to use 0 as the value for the copies field in our previous example, we would add the missing property to our query and set it to 0. For example:
{
"aggs":{
"avg_copies":{
"avg":{
"field":"copies",
"missing":0
}
}
}
}
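The effect of the missing property on an average can be illustrated with a few lines of Python. This is a sketch over hypothetical documents (one of them has no copies value); the avg helper is made up for the illustration and only mimics the behavior described here.

```python
# Hypothetical documents: the third one has no value in the copies field.
docs = [{"copies": 6}, {"copies": 1}, {}, {"copies": 0}]

def avg(docs, field, missing=None):
    """Average of `field`; documents without it are skipped unless `missing` is given."""
    values = []
    for doc in docs:
        if field in doc:
            values.append(doc[field])
        elif missing is not None:
            values.append(missing)
    return sum(values) / len(values) if values else None

# Without a default, only three documents contribute: (6 + 1 + 0) / 3.
avg_skipping = avg(docs, "copies")

# With missing=0, the document without a value counts as 0: (6 + 1 + 0 + 0) / 4.
avg_with_default = avg(docs, "copies", missing=0)

print(avg_skipping, avg_with_default)
```

The choice matters: substituting 0 lowers the average because the document without a value now takes part in the calculation.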
Using scripts
The input values can also be generated by a script. For example, if we want to find the minimum value from all the values in the year field, but we want to subtract 1000 from those values, we will send an aggregation like the following one:
{
"aggs":{
"min_year":{
"min":{
"script":"doc['year'].value-1000"
}
}
}
}
Note
Note that the preceding query requires inline scripts to be allowed. This means that the query requires the script.inline property set to on in the elasticsearch.yml file.
In this case, the value the aggregation will use will be the original year field value reduced by 1000.
We can also use the value script capabilities of Elasticsearch. For example, to achieve the same as the previous script, we can use the following query:
{
"aggs":{
"min_year":{
"min":{
"field":"year",
"script":{
"inline":"_value-factor",
"params":{
"factor":1000
}
}
}
}
}
}
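What a value script does can be pictured as a function applied to every field value before the metric is computed. The following Python sketch mimics that for the min aggregation. The year values are an assumption: they were chosen to agree with the statistics shown earlier for the year field (minimum 1886, maximum 1961, sum 7712), but they are not listed in the text.

```python
# Assumed year values, consistent with the stats response shown earlier.
years = [1886, 1929, 1936, 1961]

def min_with_value_script(values, script, params):
    """Apply `script` to each field value (the role of _value), then take the minimum."""
    return min(script(value, params) for value in values)

# Counterpart of the inline script "_value-factor" with params {"factor": 1000}.
min_year = min_with_value_script(
    years,
    lambda _value, params: _value - params["factor"],
    {"factor": 1000},
)

print(min_year)
```

Passing the factor as a parameter rather than hard-coding it mirrors the params section of the query, so the same script text can be reused with different values.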
If you are not familiar with Elasticsearch scripting capabilities, you can read more about them in the Scripting capabilities of Elasticsearch section of Chapter 6, Make Your Search Better.
One thing worth remembering is that using the command line may require proper escaping of the values in the doc array. For example, the command that executes the first scripted query would look as follows:
curl -XGET 'localhost:9200/library/_search?size=0&pretty' -d '{
"aggs":{
"min_year":{
"min":{
"script":"doc[\"year\"].value-1000"
}
}
}
}'
Field value statistics and extended statistics
The next aggregations we will discuss are the ones that provide us with statistical information about the numeric field we are running the aggregation on: the stats and extended_stats aggregations.
For example, the following query provides extended statistics for the year field:
{
"aggs":{
"extended_statistics":{
"extended_stats":{
"field":"year"
}
}
}
}
The response to the preceding query will be as follows:
{
"took":1,
"timed_out":false,
"_shards":{
"total":5,
"successful":5,
"failed":0
},
"hits":{
"total":4,
"max_score":0.0,
"hits":[]
},
"aggregations":{
"extended_statistics":{
"count":4,
"min":1886.0,
"max":1961.0,
"avg":1928.0,
"sum":7712.0,
"sum_of_squares":1.4871654E7,
"variance":729.5,
"std_deviation":27.00925767213901,
"std_deviation_bounds":{
"upper":1982.018515344278,
"lower":1873.981484655722
}
}
}
}
As you can see, in the response we got information about the number of documents with a value in the year field, the minimum value, the maximum value, the average, and the sum. These are the values that we will get if we run the stats aggregation instead of extended_stats. The extended_stats aggregation provides additional information, such as the sum of squares, variance, and standard deviation. Elasticsearch provides two types of aggregations because extended_stats is slightly more expensive when it comes to processing power.
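The extended statistics can be checked by hand. The Python sketch below recomputes them for four assumed year values; the values are not listed in the text, but they were chosen so that count, min, max, sum, and sum of squares match the response above. The variance is the population variance, and the bounds in the response are consistent with the average plus and minus two standard deviations, which is what the sigma argument of this hypothetical helper models.

```python
import math

# Assumed year values, consistent with the response above.
years = [1886, 1929, 1936, 1961]

def extended_stats(values, sigma=2):
    """Recompute the fields of the extended_stats response for a list of numbers."""
    count = len(values)
    total = sum(values)
    avg = total / count
    sum_of_squares = sum(v * v for v in values)
    # Population variance: mean of the squares minus the square of the mean.
    variance = sum_of_squares / count - avg * avg
    std_deviation = math.sqrt(variance)
    return {
        "count": count,
        "min": min(values),
        "max": max(values),
        "avg": avg,
        "sum": total,
        "sum_of_squares": sum_of_squares,
        "variance": variance,
        "std_deviation": std_deviation,
        "std_deviation_bounds": {
            "upper": avg + sigma * std_deviation,
            "lower": avg - sigma * std_deviation,
        },
    }

stats = extended_stats(years)
print(stats)
```

Everything beyond count, min, max, avg, and sum needs the running sum of squares, which is the extra work that the plain stats aggregation avoids.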
Note
The stats and extended_stats aggregations, similar to the min, max, avg, and sum aggregations, support scripting and allow us to specify which value should be used for the documents that don’t have a value in the specified field.
Value count
The value_count aggregation is a simple aggregation which allows counting values in aggregated documents. This is quite useful when used with nested aggregations. We are not focusing on that topic right now, but it is something to keep in mind. For example, to use the value_count aggregation on the copies field, we will run the following query:
{
"aggs":{
"count":{
"value_count":{
"field":"copies"
}
}
}
}
Note
The value_count aggregation allows us to use scripts, discussed earlier in this chapter when we described the min, max, avg, and sum aggregations. Please refer to the beginning of the Metrics aggregations section earlier in the current chapter for further reference.
Field cardinality
The cardinality aggregation calculates the count of distinct values in a given field. It is one of the aggregations that allows us to control how resource hungry it will be by controlling its precision. However, one thing needs to be remembered: the calculated count is an approximation, not the exact value. Elasticsearch uses the HyperLogLog++ algorithm (http://static.googleusercontent.com/media/research.google.com/fr//pubs/archive/40671.pdf) to calculate the value.
This aggregation has a wide variety of use cases, such as showing the number of distinct values in a field that is responsible for holding the status code for your indexed Apache access logs. One query, and you know the approximated count of the distinct values in that field.
For example, we can request the cardinality for our title field:
{
"aggs":{
"card_title":{
"cardinality":{
"field":"title"
}
}
}
}
To control the precision of the cardinality calculation, we can specify the precision_threshold property: the higher the value, the more precise the aggregation will be and the more resources it will need. The current maximum precision_threshold value is 40000 and the default depends on the parent aggregation. An example query using the precision_threshold property looks as follows:
{
"aggs":{
"card_title":{
"cardinality":{
"field":"title",
"precision_threshold":1000
}
}
}
}
Percentiles
The percentiles aggregation is another example of an aggregation in Elasticsearch that uses an algorithmic approximation approach to provide us with results. It uses the T-Digest algorithm (https://github.com/tdunning/t-digest/blob/master/docs/t-digest-paper/histo.pdf) from Ted Dunning and Otmar Ertl and allows us to calculate percentiles: metrics that show us the value below which a given percentage of the results fall. For example, the 99th percentile shows us the value that is greater than 99 percent of the other values.
Let’s go into an example and look at a query that will calculate percentiles for the year field in our data:
{
"aggs":{
"copies_percentiles":{
"percentiles":{
"field":"year"
}
}
}
}
The results returned by Elasticsearch for the preceding request will look as follows:
{
"took":26,
"timed_out":false,
"_shards":{
"total":5,
"successful":5,
"failed":0
},
"hits":{
"total":4,
"max_score":0.0,
"hits":[]
},
"aggregations":{
"copies_percentiles":{
"values":{
"1.0":1887.2899999999997,
"5.0":1892.4499999999998,
"25.0":1918.25,
"50.0":1932.5,
"75.0":1942.25,
"95.0":1957.25,
"99.0":1960.25
}
}
}
}
As you can see, the value that is higher than 99 percent of the values is 1960.25.
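To see where numbers such as 1960.25 come from, the following Python sketch computes exact percentiles by linear interpolation between the two closest ranks. It is not T-Digest, only a plain reference calculation, and the four year values are an assumption (chosen to match the statistics shown earlier in the chapter). For such a tiny data set, the results agree with the response above.

```python
def percentile(sorted_values, percent):
    """Exact percentile using linear interpolation between the two closest ranks."""
    position = (len(sorted_values) - 1) * percent / 100.0
    lower = int(position)
    upper = min(lower + 1, len(sorted_values) - 1)
    fraction = position - lower
    return sorted_values[lower] + (sorted_values[upper] - sorted_values[lower]) * fraction

# Assumed year values, consistent with the statistics shown earlier.
years = sorted([1886, 1929, 1936, 1961])

# The seven percentiles present in the response above.
result = {p: percentile(years, p) for p in (1, 5, 25, 50, 75, 95, 99)}

for percent, value in result.items():
    print(percent, round(value, 2))
```

On large data sets, keeping all the values sorted in memory is exactly what an approximation algorithm avoids, which is why the real aggregation trades a little precision for bounded memory usage.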
You may wonder why such an aggregation is important. It is very useful for performance metrics, for example, where we usually look at averages for some period of time. Imagine that the average response time of our queries for the last hour is 50 milliseconds, which is not bad. However, if the 95th percentile showed 2 seconds, that would mean that about 5 percent of the users had to wait two or more seconds for the search results, which is not that good.
By default, the percentiles aggregation calculates seven percentiles: 1, 5, 25, 50, 75, 95, and 99. We can control this by using the percents property and specifying which percentiles we are interested in. For example, if we want to get only the 95th and the 99th percentile, we change our query to the following one:
{
"aggs":{
"copies_percentiles":{
"percentiles":{
"field":"year",
"percents":["95","99"]
}
}
}
}
Note
Similar to the min, max, avg, and sum aggregations, the percentiles aggregation supports scripting and allows us to specify which value should be used for the documents that don’t have a value in the specified field.
We’ve mentioned earlier that the percentiles aggregation uses an algorithmic approach and is an approximation. As with all approximations, we can control the precision and memory usage of the algorithm. We do that by using the compression property, which defaults to 100. It is an internal property of Elasticsearch and its implementation details may change between versions. It is worth knowing that setting the compression to a value higher than 100 can increase the algorithm precision at the cost of memory usage.
Percentile ranks
The percentile_ranks aggregation is similar to the percentiles one that we just discussed. It allows us to show which percentile a given value has. For example, to show us which percentile year 1932 and year 1960 are, we run the following query:
{
"aggs":{
"copies_percentile_ranks":{
"percentile_ranks":{
"field":"year",
"values":["1932","1960"]
}
}
}
}
The response returned by Elasticsearch will be as follows:
{
"took":2,
"timed_out":false,
"_shards":{
"total":5,
"successful":5,
"failed":0
},
"hits":{
"total":4,
"max_score":0.0,
"hits":[]
},
"aggregations":{
"copies_percentile_ranks":{
"values":{
"1932.0":49.5,
"1960.0":61.5
}
}
}
}
Top hits aggregation
The top_hits aggregation keeps track of the most relevant documents being aggregated. This doesn’t sound very appealing, but it allows us to implement one of the most desired functionalities in Elasticsearch, called document grouping, field collapsing, or document folding. Such functionality is very useful in some use cases, for example, when we want to show a book catalog but only one book from each publisher. To do that without the top_hits aggregation, we would have to run multiple queries. With the top_hits aggregation, we need only a single query.
The top_hits aggregation was introduced in Elasticsearch 1.3. In fact, the mentioned document folding is more or less a side effect and only one of the possible usage examples of the top_hits aggregation.
The idea behind the top_hits aggregation is simple. Every document that is assigned to a particular bucket can be also remembered. By default, only three documents per bucket are remembered.
Note
Note that, in order to show the full potential of the top_hits aggregation, we decided to use one of the bucketing aggregations as well and nest them to show the document grouping functionality implementation. The bucketing aggregations are described in detail later in this chapter.
To show you a potential use case that leverages the top_hits aggregation, we have decided to use a simple example. We would like to get the most relevant book published in each 100-year period. To do that we use the following query:
{
"aggs":{
"when":{
"histogram":{
"field":"year",
"interval":100
},
"aggs":{
"book":{
"top_hits":{
"_source":{
"include":["title","available"]
},
"size":1
}
}
}
}
}
}
In the preceding example, we ran the histogram aggregation on year ranges. One bucket was created for every one hundred years. The nested top_hits aggregation remembers a single document with the greatest score from each bucket (because of the size property being set to 1). We added the include option only for simpler results, so that we only return the title and available fields for every aggregated document. The response returned by Elasticsearch will be as follows:
{
"took":8,
"timed_out":false,
"_shards":{
"total":5,
"successful":5,
"failed":0
},
"hits":{
"total":4,
"max_score":0.0,
"hits":[]
},
"aggregations":{
"when":{
"buckets":[{
"key":1800,
"doc_count":1,
"book":{
"hits":{
"total":1,
"max_score":1.0,
"hits":[{
"_index":"library",
"_type":"book",
"_id":"4",
"_score":1.0,
"_source":{
"available":true,
"title":"CrimeandPunishment"
}
}]
}
}
},{
"key":1900,
"doc_count":3,
"book":{
"hits":{
"total":3,
"max_score":1.0,
"hits":[{
"_index":"library",
"_type":"book",
"_id":"2",
"_score":1.0,
"_source":{
"available":false,
"title":"Catch-22"
}
}]
}
}
}]
}
}
}
We can see that, because of the top_hits aggregation, we have the highest-scoring document (from each bucket) included in the response. In our particular case, the query was the match_all one and all the documents had the same score, so the top-scoring document for every bucket was more or less random. However, you need to remember that this is the default behavior. If we want to have custom sorting, this is not a problem for Elasticsearch. We just need to add the sort property for our top_hits aggregator. For example, we can return the first book from a given century:
{
"aggs":{
"when":{
"histogram":{
"field":"year",
"interval":100
},
"aggs":{
"book":{
"top_hits":{
"sort":{
"year":"asc"
},
"_source":{
"include":["title","available"]
},
"size":1
}
}
}
}
}
}
We added sorting to the top_hits aggregation, so the results are sorted on the basis of the year field. This means that the first document will be the one with the lowest value in that field and this is the document that is going to be returned for each bucket.
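The combination of histogram bucketing and a sorted top_hits can be mimicked in plain Python. This is a sketch under assumptions: the two titles come from the earlier response, while Book C, Book D, and all four year values are hypothetical, and top_hits_per_bucket is a made-up helper rather than an Elasticsearch API.

```python
from collections import defaultdict

# Hypothetical documents: "Book C", "Book D", and all years are assumptions.
books = [
    {"title": "Crime and Punishment", "year": 1886},
    {"title": "Catch-22", "year": 1961},
    {"title": "Book C", "year": 1929},
    {"title": "Book D", "year": 1936},
]

def top_hits_per_bucket(docs, interval, sort_key, size=1):
    """Bucket documents by year interval, then keep the first `size` docs per bucket."""
    buckets = defaultdict(list)
    for doc in docs:
        # Same rounding as a histogram with the given interval: 1886 -> 1800.
        key = (doc["year"] // interval) * interval
        buckets[key].append(doc)
    return {
        key: sorted(members, key=sort_key)[:size]
        for key, members in sorted(buckets.items())
    }

# Counterpart of "sort": {"year": "asc"} with "size": 1.
first_books = top_hits_per_bucket(books, interval=100, sort_key=lambda doc: doc["year"])

for century, hits in first_books.items():
    print(century, hits[0]["title"], hits[0]["year"])
```

Doing this on the client would require fetching every document; the nested aggregation gives the same grouping in a single request.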
Additional parameters
Sorting and field inclusion are not everything that we can do inside the top_hits aggregation. Because this aggregation returns documents, we can also use functionalities such as:
highlighting
explain
scripting
fielddata field (uninverted representation of the fields)
version
We just need to include an appropriate section in the top_hits aggregation body, similar to what we do when we construct a query. For example:
{
"aggs":{
"when":{
"histogram":{
"field":"year",
"interval":100
},
"aggs":{
"book":{
"top_hits":{
"highlight":{
"fields":{
"title":{}
}
},
"explain":true,
"version":true,
"_source":{
"include":["title","available"]
},
"fielddata_fields":["title"],
"script_fields":{
"century":{
"script":"(doc[\"year\"].value/100).intValue()"
}
},
"size":1
}
}
}
}
}
}
Note
Note that the preceding query requires inline scripts to be allowed. This means that the query requires the script.inline property set to on in the elasticsearch.yml file.
Geo bounds aggregation
The geo_bounds aggregation is a simple aggregation that allows us to compute the bounding box that includes all the geo_point type field values from the aggregated documents.
Note
If you are interested in spatial searches, the section dedicated to it is called Geo and is included in Chapter 8, Beyond Full-text Searching.
We only need to provide the field (by using the field property; it needs to be of the geo_point type). We can also provide wrap_longitude (values true or false; it defaults to true) if the bounding box is allowed to overlap the international date line. In response, we get the latitude and longitude of the top-left and bottom-right corners of the bounding box. An example query using this aggregation looks as follows (using the hypothetical location field):
{
"aggs":{
"box":{
"geo_bounds":{
"field":"location"
}
}
}
}
Scripted metrics aggregation
The last metric aggregation we want to discuss is the scripted_metric aggregation, which allows us to define our own aggregation calculation using scripts. For this aggregation, we can provide the following scripts (map_script is the only required one, the rest are optional):
init_script: This script is run during initialization and allows us to set up an initial state of the calculation.
map_script: This is the only required script. It is executed once for every document and needs to store the calculation in an object called _agg.
combine_script: This script is executed once on each shard after Elasticsearch finishes document collection on that shard.
reduce_script: This script is executed once on the node that is coordinating a particular query execution. This script has access to the _aggs variable, which is an array of the values returned by combine_script.
For example, we can use the scripted_metric aggregation to calculate the total number of copies of all the books we have in our library by running the following request (we show the whole request to show how the names are escaped):
curl -XGET 'localhost:9200/library/_search?size=0&pretty' -d '{
"aggs":{
"all_copies":{
"scripted_metric":{
"init_script":"_agg[\"all_copies\"]=0",
"map_script":"_agg.all_copies+=doc.copies.value",
"combine_script":"return _agg.all_copies",
"reduce_script":"sum=0;for(number in _aggs){sum+=number};return sum"
}
}
}
}'
Of course, the preceding script is just a simple sum and we could use the sum aggregation, but we just wanted to show you a simple example of what you can do with the scripted_metric aggregation.
Note
Note that the preceding query requires inline scripts to be allowed. This means that the query requires the script.inline property set to on in the elasticsearch.yml file.
As you can see, the init_script part of the aggregation is used to initialize the all_copies variable. Next, we have map_script, which is executed once for every document and just adds the value of the copies field to the earlier initialized variable. The combine_script part, executed once on each shard, tells Elasticsearch to return the calculated variable. Finally, the reduce_script part, executed once for the whole query on the aggregator node, runs a for loop, which goes through all the returned values that are stored in the _aggs array and returns the sum of those. The final result returned by Elasticsearch for the preceding query looks as follows:
{
"took":2,
"timed_out":false,
"_shards":{
"total":5,
"successful":5,
"failed":0
},
"hits":{
"total":4,
"max_score":0.0,
"hits":[]
},
"aggregations":{
"all_copies":{
"value":7
}
}
}
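The four phases of the scripted_metric aggregation can be walked through with a small Python simulation. It is only a sketch: the way the four copies values are spread over three shards is hypothetical, and the Python functions merely stand in for the scripts of the same names in the request above.

```python
# Hypothetical distribution of the copies values across three shards.
shards = [[6, 0], [1], [0]]

def init_script():
    """Run once per shard: set up the initial state (the _agg object)."""
    return {"all_copies": 0}

def map_script(_agg, copies):
    """Run once per document: add the document value to the shard state."""
    _agg["all_copies"] += copies

def combine_script(_agg):
    """Run once per shard after collection: return the shard result."""
    return _agg["all_copies"]

def reduce_script(_aggs):
    """Run once on the coordinating node: merge all the shard results."""
    total = 0
    for number in _aggs:
        total += number
    return total

def run_scripted_metric(shards):
    per_shard_results = []
    for shard_docs in shards:
        _agg = init_script()
        for copies in shard_docs:
            map_script(_agg, copies)
        per_shard_results.append(combine_script(_agg))
    return reduce_script(per_shard_results)

all_copies = run_scripted_metric(shards)
print(all_copies)
```

Only the small per-shard results travel to the coordinating node, which is the reason for splitting the calculation into combine and reduce steps.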
Buckets aggregations
The second type of aggregations that we will discuss are the buckets aggregations. In comparison to metrics aggregations, a bucket aggregation returns data not as a single metric but as a list of key-value pairs called buckets. For example, the terms aggregation returns the number of documents associated with each term in a given field. The very powerful thing about buckets aggregations is that they can have sub-aggregations, which means that we can nest other aggregations inside the aggregations that return buckets (we will discuss this at the end of the buckets aggregation discussion). Let’s look at the bucket aggregations that are provided by Elasticsearch now.
Filter aggregation
The filter aggregation is a simple bucketing aggregation that allows us to filter the results to a single bucket. For example, let’s assume that we want to get a count and the average copies count of all the books that are novels, which means they have the term novel in the tags field. The query that will return such results looks as follows:
{
"aggs":{
"novels_count":{
"filter":{
"term":{
"tags":"novel"
}
},
"aggs":{
"avg_copies":{
"avg":{
"field":"copies"
}
}
}
}
}
}
As you can see, we defined the filter in the filter section of the aggregation definition and we defined a second nested aggregation. The nested aggregation is the one that will be run on the filtered documents.
The response returned by Elasticsearch looks as follows:
{
"took":13,
"timed_out":false,
"_shards":{
"total":5,
"successful":5,
"failed":0
},
"hits":{
"total":4,
"max_score":0.0,
"hits":[]
},
"aggregations":{
"novels_count":{
"doc_count":2,
"avg_copies":{
"value":3.5
}
}
}
}
In the returned bucket, we have information about the number of documents (represented by the doc_count property) and the average number of copies, which is all we wanted.
Filters aggregation
The second bucket aggregation we want to show you is the filters aggregation. While the previously discussed filter aggregation resulted in a single bucket, the filters aggregation returns multiple buckets: one for each of the defined filters. Let’s extend our previous example and assume that, in addition to the average number of copies for the novels, we also want to know the average number of copies for the books that are available. The query that will get us this information will use the filters aggregation and will look as follows:
{
"aggs":{
"count":{
"filters":{
"filters":{
"novels":{
"term":{
"tags":"novel"
}
},
"available":{
"term":{
"available":true
}
}
}
},
"aggs":{
"avg_copies":{
"avg":{
"field":"copies"
}
}
}
}
}
}
Let’s stop here and look at the definition of the aggregation. As you can see, we defined two filters using the filters section of the filters aggregation. Each filter has a name and the actual Elasticsearch filter; the first is called novels and the second is called available. Elasticsearch will use these names in the returned response. The thing to remember is that Elasticsearch will create a bucket for each defined filter and will calculate the nested aggregation that we defined; in our case, the one that calculates the average number of copies.
Note
The filters aggregation allows us to return one more bucket in addition to the defined ones: a bucket with all the documents that didn’t match the filters. In order to calculate such a bucket, we need to add the other_bucket property to the body of the aggregation and set it to true.
The results returned by Elasticsearch are as follows:
{
"took":4,
"timed_out":false,
"_shards":{
"total":5,
"successful":5,
"failed":0
},
"hits":{
"total":4,
"max_score":0.0,
"hits":[]
},
"aggregations":{
"count":{
"buckets":{
"novels":{
"doc_count":2,
"avg_copies":{
"value":3.5
}
},
"available":{
"doc_count":2,
"avg_copies":{
"value":0.5
}
}
}
}
}
}
As you can see, we got two buckets, which is what we expected.
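To make the bucket mechanics more concrete, the following minimal Python sketch mimics what the filters aggregation computes: one bucket per named filter, each with a nested average, plus an optional bucket for the unmatched documents. The sample documents, the filters_agg helper, and the _other_ bucket name are assumptions made for illustration; this is not the library index data, and the real work happens inside Elasticsearch.

```python
from statistics import mean

# Hypothetical documents, loosely shaped like the library index (not the real data).
docs = [
    {"tags": ["novel"], "available": True, "copies": 1},
    {"tags": ["novel", "classics"], "available": False, "copies": 6},
    {"tags": ["crime"], "available": True, "copies": 0},
    {"tags": ["classics"], "available": False, "copies": 0},
]

def filters_agg(docs, filters, field, other_bucket=False):
    """Build one bucket per named filter; optionally one more for unmatched docs."""
    buckets = {}
    matched = set()
    for name, predicate in filters.items():
        hits = [i for i, doc in enumerate(docs) if predicate(doc)]
        matched.update(hits)
        values = [docs[i][field] for i in hits]
        buckets[name] = {"doc_count": len(hits), "avg": mean(values) if values else None}
    if other_bucket:
        rest = [doc[field] for i, doc in enumerate(docs) if i not in matched]
        buckets["_other_"] = {"doc_count": len(rest), "avg": mean(rest) if rest else None}
    return buckets

result = filters_agg(
    docs,
    {"novels": lambda d: "novel" in d["tags"], "available": lambda d: d["available"]},
    field="copies",
    other_bucket=True,
)
print(result)
```

Note that a single document can land in more than one bucket (the first sample document is both a novel and available), which is why the bucket counts need not add up to the number of documents.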
Terms aggregation
One of the most commonly used bucket aggregations is the terms aggregation. It allows us to get information about the terms and the count of documents having those terms. For example, one of the simplest uses is getting the count of the books that are available and not available. We can do that by running the following query:
{
"aggs":{
"counts":{
"terms":{
"field":"available"
}
}
}
}
In the response, we will get two buckets (because a Boolean field can only have two values: true and false). Here, this will look as follows:
{
"took":7,
"timed_out":false,
"_shards":{
"total":5,
"successful":5,
"failed":0
},
"hits":{
"total":4,
"max_score":0.0,
"hits":[]
},
"aggregations":{
"counts":{
"doc_count_error_upper_bound":0,
"sum_other_doc_count":0,
"buckets":[{
"key":0,
"key_as_string":"false",
"doc_count":2
},{
"key":1,
"key_as_string":"true",
"doc_count":2
}]
}
}
}
By default, the data is sorted on the basis of document count, which means that the most common terms will be placed on top of the aggregation results. Of course, we can control this behavior by specifying the order property and providing the order just like we usually do when sorting by arbitrary field values. Elasticsearch allows us to sort by the document count (using the _count static value) and by the term (using the _term static value). For example, if we want to sort our preceding aggregation results by descending term, we can run the following query:
{
"aggs":{
"counts":{
"terms":{
"field":"available",
"order":{"_term":"desc"}}
}
}
}
However, that’s not all when it comes to sorting. We can also sort by the results of the nested aggregations that were included in the query.
Note
The terms aggregation, similar to the min, max, avg, and sum aggregations discussed in the metrics aggregations section of this chapter, supports scripting and allows us to specify which value should be used for the documents that don’t have a value in the specified field.
Counts are approximate
The thing to remember when discussing the terms aggregation is that the counts are approximate. This is because each shard provides its own counts and returns that aggregated information to the coordinating node. The coordinating node aggregates the information it got and returns the final information to the client. Because of that, depending on the data and how it is distributed between the shards, some information about the counts may be lost and the counts will not be exact. Of course, when dealing with low cardinality fields, the approximation will be closer to the exact numbers, but this is still something that should be considered when using the terms aggregation.
We can control how much information is returned from each of the shards to the coordinating node. We can do this by specifying the size and the shard_size properties. The size property specifies how many buckets will be returned at most. The higher the size property, the more accurate the calculation will be. However, that will cost us additional memory and CPU cycles, which means that the calculation will be more expensive and will put more pressure on the hardware. This is because the results returned to the coordinating node from each shard will be larger and the result merging process will be harder.
The shard_size property can be used to minimize the work that needs to be done by the coordinating node. When set, the coordinating node will fetch (from each shard) the number of buckets determined by the shard_size property. This allows us to increase the precision of the aggregation while avoiding the additional overhead on the coordinating node. Remember that the shard_size property cannot be smaller than the size property.
Finally, the size property can be set to 0, which will tell Elasticsearch not to limit the number of returned buckets. It is usually not wise to set the size property to 0 as it can result in high resource consumption. In particular, avoid setting the size property to 0 for high cardinality fields, as this can easily overload your Elasticsearch cluster.
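The effect of size and shard_size on accuracy is easier to see with numbers. The following Python sketch simulates the merge step described above under simplifying assumptions: each shard reports only its own top shard_size terms, and the coordinating node sums what it receives. The shard contents and the terms_agg helper are hypothetical, and the real Elasticsearch implementation is more involved.

```python
from collections import Counter

def terms_agg(shards, size, shard_size):
    """Merge per-shard top terms roughly the way a coordinating node would."""
    merged = Counter()
    for shard in shards:
        # Each shard reports only its own top `shard_size` terms.
        for term, count in Counter(shard).most_common(shard_size):
            merged[term] += count
    return merged.most_common(size)

# Hypothetical term values stored on two shards.
shards = [
    ["a"] * 5 + ["b"] * 4 + ["c"] * 3,
    ["c"] * 6 + ["b"] * 4 + ["a"] * 2,
]

approx = terms_agg(shards, size=2, shard_size=2)
exact = terms_agg(shards, size=2, shard_size=3)
print(approx)  # term "c" is undercounted because shard 1 never reported it
print(exact)   # with a larger shard_size every term is reported by every shard
```

With shard_size equal to 2, the first shard never reports its three occurrences of c, so the merged result both undercounts c and ranks it below b. Raising shard_size recovers the exact counts at the price of larger per-shard responses.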
Minimum document count
Elasticsearch provides us with two additional properties, which can be useful in certain situations: min_doc_count and shard_min_doc_count. The min_doc_count property defaults to 1 and specifies how many documents must match a term for it to be included in the aggregation results. One thing to remember is that setting the min_doc_count property to 0 will result in returning all the terms, no matter if they have a matching document or not. This can result in a very large result set for aggregation results. For example, if we want to return terms matched by 5 or more documents, we will run the following query:
{
"aggs":{
"counts":{
"terms":{
"field":"available",
"min_doc_count":5}
}
}
}
The shard_min_doc_count property is very similar and defines how many documents must match a term to be included in the aggregation’s results, but on the shard level.
Range aggregation
The range aggregation allows us to define one or more ranges and Elasticsearch calculates buckets for them. For example, if we want to check how many books were published in a given period of time, we create the following query:
{
"aggs":{
"years":{
"range":{
"field":"year",
"ranges":[
{"to":1850},
{"from":1851,"to":1900},
{"from":1901,"to":1950},
{"from":1951,"to":2000},
{"from":2001}
]
}
}
}
}
We specify the field we want the aggregation to be calculated on and the array of ranges. Each range is defined by one or two properties, to and from, similar to the range queries which we already discussed.
The result returned by Elasticsearch for our data looks as follows:
{
"took":23,
"timed_out":false,
"_shards":{
"total":5,
"successful":5,
"failed":0
},
"hits":{
"total":4,
"max_score":0.0,
"hits":[]
},
"aggregations":{
"years":{
"buckets":[{
"key":"*-1850.0",
"to":1850.0,
"to_as_string":"1850.0",
"doc_count":0
},{
"key":"1851.0-1900.0",
"from":1851.0,
"from_as_string":"1851.0",
"to":1900.0,
"to_as_string":"1900.0",
"doc_count":1
},{
"key":"1901.0-1950.0",
"from":1901.0,
"from_as_string":"1901.0",
"to":1950.0,
"to_as_string":"1950.0",
"doc_count":2
},{
"key":"1951.0-2000.0",
"from":1951.0,
"from_as_string":"1951.0",
"to":2000.0,
"to_as_string":"2000.0",
"doc_count":1
},{
"key":"2001.0-*",
"from":2001.0,
"from_as_string":"2001.0",
"doc_count":0
}]
}
}
}
For example, between 1901 and 1950 we had two books released.
Note
The range aggregation, similar to the min, max, avg, and sum aggregations discussed in the metrics aggregations section of this chapter, supports scripting and allows us to specify which value should be used for the documents that don’t have a value in the specified field.
Keyed buckets
One thing that we should mention when it comes to the range aggregation is that we can give the defined ranges names. For example, let’s assume that we want to use the names Before 18th century for the books released before 1799, 18th century for the books released between 1800 and 1899, 19th century for the books released between 1900 and 1999, and After 19th century for the books released in 2000 or later. We can do this by adding the key property to each defined range, giving it the name, and adding the keyed property set to true. Setting the keyed property to true will associate a unique string value with each bucket, and the key property defines the name for the bucket that will be used as the unique name. A query that does that will look as follows:
{
"aggs":{
"years":{
"range":{
"field":"year",
"keyed":true,
"ranges":[
{"key":"Before 18th century","to":1799},
{"key":"18th century","from":1800,"to":1899},
{"key":"19th century","from":1900,"to":1999},
{"key":"After 19th century","from":2000}
]
}
}
}
}
The response returned by Elasticsearch in such a case will look as follows:
{
"took":2,
"timed_out":false,
"_shards":{
"total":5,
"successful":5,
"failed":0
},
"hits":{
"total":4,
"max_score":0.0,
"hits":[]
},
"aggregations":{
"years":{
"buckets":{
"Before 18th century":{
"to":1799.0,
"to_as_string":"1799.0",
"doc_count":0
},
"18th century":{
"from":1800.0,
"from_as_string":"1800.0",
"to":1899.0,
"to_as_string":"1899.0",
"doc_count":1
},
"19th century":{
"from":1900.0,
"from_as_string":"1900.0",
"to":1999.0,
"to_as_string":"1999.0",
"doc_count":3
},
"After 19th century":{
"from":2000.0,
"from_as_string":"2000.0",
"doc_count":0
}
}
}
}
}
Note
An important and quite useful point about the range aggregation is that the defined ranges need not be disjoint. In such cases, Elasticsearch will properly count the document for multiple buckets.
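The note about overlapping ranges can be illustrated with a short Python sketch. It counts values per range, treating from as inclusive and to as exclusive, which is how Elasticsearch treats range boundaries. The publication years, the range names, and the range_agg helper are hypothetical and serve only as an illustration.

```python
def range_agg(values, ranges):
    """Count values per range; `from` is inclusive and `to` is exclusive."""
    buckets = []
    for r in ranges:
        low = r.get("from", float("-inf"))
        high = r.get("to", float("inf"))
        count = sum(1 for value in values if low <= value < high)
        buckets.append({"key": r["key"], "doc_count": count})
    return buckets

years = [1886, 1904, 1929, 1961]  # hypothetical publication years

# Two deliberately overlapping ranges: 1900-1949 belongs to both.
ranges = [
    {"key": "before 1950", "to": 1950},
    {"key": "1900 onwards", "from": 1900},
]

buckets = range_agg(years, ranges)
print(buckets)
```

The two buckets together hold six entries for four documents, because the years 1904 and 1929 fall into both ranges.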
Date range aggregation
The date_range aggregation is similar to the previously discussed range aggregation but it is designed for fields that use date-based types. However, in the library index, the documents have years, but the field is a number, not a date. For the purpose of showing how this aggregation works, let’s imagine that we want to extend our library index to support newspapers. To do this we will create a new index (called library2) by using the following command:
curl -XPOST localhost:9200/_bulk --data-binary '{"index":{"_index":"library2","_type":"book","_id":"1"}}
{"title":"Fishing news","published":"2010/12/03 10:00:00","copies":3,"available":true}
{"index":{"_index":"library2","_type":"book","_id":"2"}}
{"title":"Knitting magazine","published":"2010/11/07 11:32:00","copies":1,"available":true}
{"index":{"_index":"library2","_type":"book","_id":"3"}}
{"title":"The guardian","published":"2009/07/13 04:33:00","copies":0,"available":false}
{"index":{"_index":"library2","_type":"book","_id":"4"}}
{"title":"Hadoop World","published":"2012/01/01 04:00:00","copies":6,"available":true}
'
For the purpose of this example, we will leave the mappings definition to Elasticsearch and let it guess the field types; this is sufficient in this case. Let’s start with the first query using the date_range aggregation:
{
"aggs":{
"years":{
"date_range":{
"field":"published",
"ranges":[
{"to":"2009/12/31"},
{"from":"2010/01/01","to":"2010/12/31"},
{"from":"2011/01/01"}
]
}
}
}
}
Compared with the ordinary range aggregation, the only thing that changed is the aggregation type, which is now date_range. The dates can be passed as a string in a form recognized by Elasticsearch or as a number value (number of milliseconds since 1970-01-01). The response returned by Elasticsearch for the preceding query looks as follows:
{
"took":5,
"timed_out":false,
"_shards":{
"total":5,
"successful":5,
"failed":0
},
"hits":{
"total":4,
"max_score":0.0,
"hits":[]
},
"aggregations":{
"years":{
"buckets":[{
"key":"*-2009/12/31 00:00:00",
"to":1.2622176E12,
"to_as_string":"2009/12/31 00:00:00",
"doc_count":1
},{
"key":"2010/01/01 00:00:00-2010/12/31 00:00:00",
"from":1.262304E12,
"from_as_string":"2010/01/01 00:00:00",
"to":1.2937536E12,
"to_as_string":"2010/12/31 00:00:00",
"doc_count":2
},{
"key":"2011/01/01 00:00:00-*",
"from":1.29384E12,
"from_as_string":"2011/01/01 00:00:00",
"doc_count":1
}]
}
}
}
As you can see, the response is no different when compared to the response returned by the range aggregation. We have two attributes for each bucket, named from and to, which represent the number of milliseconds since 1970-01-01. The properties from_as_string and to_as_string present the same information as from and to, but in a human-readable form. Of course, the keyed parameter and key in the definition of the date range work in the already described way.
Elasticsearch also allows us to define the format of presented dates using the format attribute. In our example, we presented the dates with year resolution, so the day and time parts were unnecessary. If we want to show the month names, we can send a query such as the following one:
{
"aggs":{
"years":{
"date_range":{
"field":"published",
"format":"MMMM YYYY",
"ranges":[
{"to":"December 2009"},
{"from":"January 2010","to":"December 2010"},
{"from":"January 2011"}
]
}
}
}
}
Note that the dates in the to and from parameters also need to be provided in the specified format. One of the returned ranges looks as follows:
{
"key":"January 2010-December 2010",
"from":1.262304E12,
"from_as_string":"January 2010",
"to":1.2911616E12,
"to_as_string":"December 2010",
"doc_count":1
}
Note
The available formats we can use in format are defined in the Joda Time library. The full list is available at http://joda-time.sourceforge.net/apidocs/org/joda/time/format/DateTimeFormat.html.
There is one more thing about the date_range aggregation that we want to mention. Imagine that sometimes we may want to build an aggregation that can change with time. For example, we may want to see how many newspapers were published in the last 3, 6, 9, and 12 months. This is possible without the need to adjust the query every time, as we can use expressions such as now-9M. The following example shows this:
{
"aggs":{
"years":{
"date_range":{
"field":"published",
"format":"dd-MM-YYYY",
"ranges":[
{"to":"now-9M/M"},
{"to":"now-9M"},
{"from":"now-9M/M","to":"now-6M/M"},
{"from":"now-3M/M"}
]
}
}
}
}
The key here is expressions such as now-9M. Elasticsearch does the math and generates the appropriate value. In such expressions you can use y (year), M (month), w (week), d (day), h (hour), m (minute), and s (second). For example, the expression now+3d means three days from now. The /M in our example rounds the date down to the month. Thanks to such notation, we only count full months. The second advantage is that the calculated date is more cache-friendly: without the rounding, the date changes every millisecond, which makes every cache based on the range irrelevant and basically useless in most cases.
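To see what an expression such as now-9M/M evaluates to, consider the following Python sketch. It subtracts nine months from a fixed moment and then rounds the result down to the start of the month. The helper names are hypothetical, and clamping the day to the length of the target month is an assumption of this sketch rather than a documented description of how Elasticsearch resolves such cases.

```python
import calendar
from datetime import datetime

def minus_months(moment, months):
    """Subtract whole months; the day is clamped to the target month's length (assumption)."""
    index = moment.year * 12 + (moment.month - 1) - months
    year, month0 = divmod(index, 12)
    last_day = calendar.monthrange(year, month0 + 1)[1]
    return moment.replace(year=year, month=month0 + 1, day=min(moment.day, last_day))

def round_down_to_month(moment):
    """Equivalent of the /M suffix: the first instant of the month."""
    return moment.replace(day=1, hour=0, minute=0, second=0, microsecond=0)

now = datetime(2016, 3, 31, 14, 25, 7)   # a fixed "now" so the result is reproducible
exact = minus_months(now, 9)             # corresponds to now-9M
rounded = round_down_to_month(exact)     # corresponds to now-9M/M

print(exact)
print(rounded)
```

The unrounded value changes with every request, while the rounded one stays the same for a whole month, which is what makes the rounded form friendlier to caches.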
IPv4 range aggregation
A very interesting aggregation is the ip_range one, as it works on Internet addresses. It works on the fields defined with the ip type and allows defining ranges given either as an IP range or in CIDR notation (http://en.wikipedia.org/wiki/Classless_Inter-Domain_Routing). An example usage of the ip_range aggregation looks as follows:
{
"aggs":{
"access":{
"ip_range":{
"field":"ip",
"ranges":[
{"from":"192.168.0.1","to":"192.168.0.254"},
{"mask":"192.168.1.0/24"}
]
}
}
}
}
The response to the preceding query is as follows:
"access":{
"buckets":[
{
"from":3232235521,
"from_as_string":"192.168.0.1",
"to":3232235774,
"to_as_string":"192.168.0.254",
"doc_count":0
},
{
"key":"192.168.1.0/24",
"from":3232235776,
"from_as_string":"192.168.1.0",
"to":3232236032,
"to_as_string":"192.168.2.0",
"doc_count":4
}
]
}
Similar to the range aggregation, we define either both ends of the range or the mask. The rest is done by Elasticsearch itself.
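The numeric from and to values in the preceding response can be reproduced with Python's standard ipaddress module. The sketch below converts a CIDR mask into a pair of integer bounds; treating to as one past the last address of the network is an inference from the response shown above, and the cidr_bounds helper is a hypothetical name.

```python
import ipaddress

def cidr_bounds(mask):
    """Return (from, to) as integers; `to` is one past the last address in the network."""
    network = ipaddress.ip_network(mask)
    start = int(network.network_address)
    end = int(network.broadcast_address) + 1
    return start, end

start, end = cidr_bounds("192.168.1.0/24")
print(start, ipaddress.ip_address(start))
print(end, ipaddress.ip_address(end))

# Explicit range ends map to integers in the same way.
first = int(ipaddress.ip_address("192.168.0.1"))
last = int(ipaddress.ip_address("192.168.0.254"))
print(first, last)
```

The values match the from and to properties of both buckets in the response, which shows that the mask is only a shorthand for a numeric range.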
Missing aggregation
The missing aggregation allows us to create a bucket and see how many documents have no value in a specified field. For example, we can check how many of our books in the library index don’t have the original title defined (the otitle field). To do this, we run the following query:
{
"aggs":{
"missing_original_title":{
"missing":{
"field":"otitle"
}
}
}
}
The response returned by Elasticsearch in this case will look as follows:
{
"took":15,
"timed_out":false,
"_shards":{
"total":5,
"successful":5,
"failed":0
},
"hits":{
"total":4,
"max_score":0.0,
"hits":[]
},
"aggregations":{
"missing_original_title":{
"doc_count":2
}
}
}
As we can see, we have two documents without the otitle field.
Histogram aggregation
The histogram aggregation is an interesting one because of its automation. This aggregation defines buckets itself. We are only responsible for defining the field and the interval, and the rest is done automatically. The simplest form of a query that uses this aggregation looks as follows:
{
"aggs":{
"years":{
"histogram":{
"field":"year",
"interval":100
}
}
}
}
The new information we need to provide is interval, which defines the length of every range that will be used to create a bucket. We set the interval to 100, which in our case will result in buckets that are 100 years wide. The response to the preceding query that was sent to our library index is as follows:
{
"took":13,
"timed_out":false,
"_shards":{
"total":5,
"successful":5,
"failed":0
},
"hits":{
"total":4,
"max_score":0.0,
"hits":[]
},
"aggregations":{
"years":{
"buckets":[{
"key":1800,
"doc_count":1
},{
"key":1900,
"doc_count":3
}]
}
}
}
Similar to the range aggregation, the histogram aggregation allows us to use the keyed property to define named buckets. The other available option is min_doc_count, which allows us to specify the minimum number of documents required to create a bucket. If we set the min_doc_count property to zero, Elasticsearch will also include buckets with the document count of zero. We can also use the missing property to specify the value Elasticsearch should use when a document doesn’t have a value in the specified field.
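The way a histogram assigns a value to a bucket can be sketched in a few lines of Python: the bucket key is the value rounded down to the nearest multiple of the interval. The years used below and the histogram helper are hypothetical, and the gap filling for min_doc_count set to zero is a simplified reading of the behavior described above.

```python
import math
from collections import Counter

def histogram(values, interval, min_doc_count=1):
    """Bucket values by rounding each one down to a multiple of `interval`."""
    counts = Counter(math.floor(value / interval) * interval for value in values)
    if min_doc_count == 0 and counts:
        # Also emit the empty buckets between the smallest and the largest key.
        for key in range(min(counts), max(counts) + interval, interval):
            counts.setdefault(key, 0)
    return [
        {"key": key, "doc_count": count}
        for key, count in sorted(counts.items())
        if count >= min_doc_count
    ]

years = [1886, 1904, 1929, 1961]  # hypothetical publication years
print(histogram(years, 100))
print(histogram([1700, 1904], 100, min_doc_count=0))
```

The second call shows the effect of min_doc_count set to zero: the empty bucket for 1800 is returned even though no value falls into it.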
Date histogram aggregation
As the date_range aggregation is a specialized form of the range aggregation, date_histogram is an extension of the histogram aggregation that works on dates. For the purpose of this example, we will again use the data we indexed when discussing the date range aggregation. This means that we will run our queries against the index called library2. An example query using the date_histogram aggregation looks as follows:
{
"aggs":{
"years":{
"date_histogram":{
"field":"published",
"format":"yyyy-MM-dd HH:mm",
"interval":"10d",
"min_doc_count":1}
}
}
}
The difference between the histogram and date_histogram aggregations is the interval property. The value of this property is now a string describing the time interval, which in our case is 10 days. Of course, we can set it to anything we want. It uses the same suffixes we discussed while talking about date expressions in the date_range aggregation. It is worth mentioning that the number can be a float value. For example, 1.5m means that the length of the bucket will be one and a half minutes. The format attribute is the same as in the date_range aggregation. Thanks to it, Elasticsearch can add a human-readable date text according to the defined format. Of course, the format attribute is not required, but it is useful. In addition to that, similar to the other range aggregations, the keyed and min_doc_count attributes still work.
Time zones
Elasticsearch stores all the dates in the UTC time zone. You can define the time zone to be used by Elasticsearch by using the time_zone attribute. By setting this property, we basically tell Elasticsearch which time zone should be used to perform the calculations. There are three notations with which to set this attribute:
We can set the hours offset; for example, time_zone:5
We can use the time format; for example, time_zone:"-04:30"
We can use the name of the time zone; for example, time_zone:"Europe/Warsaw"
Note
Look at http://joda-time.sourceforge.net/timezones.html to see the available time zones.
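Why the time zone matters for date buckets can be shown with a small Python sketch. It takes one of the publication dates from the library2 data, interpreted as UTC, and computes the daily bucket it would fall into with and without a -04:30 offset. The day_bucket helper is hypothetical, and this is only an illustration of the idea, not of the internal Elasticsearch calculation.

```python
from datetime import datetime, timedelta, timezone

def day_bucket(moment, tz):
    """Return the calendar day (as text) that `moment` falls into in time zone `tz`."""
    return moment.astimezone(tz).strftime("%Y-%m-%d")

# "2012/01/01 04:00:00" from the library2 data, interpreted as UTC.
published = datetime(2012, 1, 1, 4, 0, tzinfo=timezone.utc)

# The same idea as time_zone:"-04:30".
offset = timezone(timedelta(hours=-4, minutes=-30))

utc_day = day_bucket(published, timezone.utc)
local_day = day_bucket(published, offset)
print(utc_day, local_day)
```

The same document ends up in a different day, month, and even year depending on the time zone used for the calculation.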
Geo distance aggregations
The next two aggregations are connected with maps and spatial searches. We will talk about geo types and queries in the Elasticsearch spatial capabilities section of Chapter 8, Beyond Full-text Searching, so feel free to skip these two topics now and return to them later.
Look at the following query:
{
"aggs":{
"neighborhood":{
"geo_distance":{
"field":"location",
"origin":[-0.1275,51.507222],
"ranges":[
{"to":1200},
{"from":1201}
]
}
}
}
}
You can see that the query is similar to the range aggregation. The preceding aggregation will calculate the number of documents that fall into two buckets: one closer than 1200 km and the second one further than 1200 km from the geographical point defined by the origin property (in the preceding case, the origin is London). The aggregation section of the response returned by Elasticsearch looks as follows:
"neighborhood":{
"buckets":[
{
"key":"*-1200.0",
"from":0,
"to":1200,
"doc_count":1
},
{
"key":"1201.0-*",
"from":1201,
"doc_count":4
}
]
}
The keyed and the key attributes work in the geo_distance aggregation as well, so we can easily modify the response to our needs and create named buckets.
The geo_distance aggregation supports a few additional parameters that are shown in the following query:
{
"aggs":{
"neighborhood":{
"geo_distance":{
"field":"location",
"origin":{"lon":-0.1275,"lat":51.507222},
"unit":"m",
"distance_type":"plane",
"ranges":[
{"to":1200},
{"from":1201}
]
}
}
}
}
Three things are worth noting in the preceding query. The first change is how we defined the origin point. This time we specified the location by providing the latitude and longitude explicitly.
The second change is the unit attribute. It defines the units used in the ranges array. The possible values are: km (the default, kilometers), mi (miles), in (inches), yd (yards), m (meters), cm (centimeters), and mm (millimeters).
The last attribute, distance_type, specifies how Elasticsearch calculates the distance. The possible values are (from the fastest but least accurate to the slowest but the most accurate): plane, sloppy_arc (the default), and arc.
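The trade-off between the distance types can be illustrated with two textbook formulas in Python: the haversine formula for a great-circle (arc) distance and a flat-map approximation for a plane distance. These formulas are assumptions chosen for illustration and are not claimed to be the exact calculations used by Elasticsearch; the second point and the Earth radius constant are likewise illustrative.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius; an approximation

def arc_distance_km(lat1, lon1, lat2, lon2):
    """Great-circle distance using the haversine formula."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))

def plane_distance_km(lat1, lon1, lat2, lon2):
    """Flat-map (equirectangular) approximation: cheaper, but less accurate."""
    x = math.radians(lon2 - lon1) * math.cos(math.radians((lat1 + lat2) / 2))
    y = math.radians(lat2 - lat1)
    return EARTH_RADIUS_KM * math.hypot(x, y)

london = (51.507222, -0.1275)   # the origin used in the query (lat, lon)
paris = (48.8566, 2.3522)       # a hypothetical document location

arc = arc_distance_km(*london, *paris)
plane = plane_distance_km(*london, *paris)
print(round(arc, 1), round(plane, 1))
print("closer than 1200 km" if arc < 1200 else "further than 1200 km")
```

For points a few hundred kilometers apart the two results differ only slightly, which is why the cheaper calculation is often good enough for coarse distance buckets.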
Geohash grid aggregation
The second aggregation related to geographical analysis is based on grids and is called geohash_grid. It organizes areas into grids and assigns every location to a cell in such a grid. To do this efficiently, Elasticsearch uses Geohash (http://en.wikipedia.org/wiki/Geohash), which encodes the location into a string. The longer the string is, the more accurate the description of a particular location. For example, one letter is sufficient to declare a box of about five thousand by five thousand kilometers, and five letters are enough to increase the accuracy to about five by five kilometers. Let’s look at the following query:
{
"aggs":{
"neighborhood":{
"geohash_grid":{
"field":"location",
"precision":5
}
}
}
}
We defined the geohash_grid aggregation with buckets that have a precision of about five by five kilometers (the precision attribute describes the number of letters used in the geohash string). The table with resolutions versus the length of geohash can be found at https://www.elastic.co/guide/en/elasticsearch/reference/master/search-aggregations-bucket-geohashgrid-aggregation.html.
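To show how a location turns into a geohash string, here is a compact Python sketch of the standard geohash encoding: the longitude and latitude ranges are halved alternately, each decision yields one bit, and every five bits become one base-32 character. The function name and the sample coordinates are chosen for illustration; inside Elasticsearch this encoding is done for us.

```python
BASE32 = "0123456789bcdefghjkmnpqrstuvwxyz"

def geohash(lat, lon, precision):
    """Encode a point as a geohash string with `precision` characters."""
    lat_range, lon_range = [-90.0, 90.0], [-180.0, 180.0]
    chars, bits, bit_count, use_lon = [], 0, 0, True
    while len(chars) < precision:
        # Bits alternate between longitude and latitude, starting with longitude.
        rng, value = (lon_range, lon) if use_lon else (lat_range, lat)
        mid = (rng[0] + rng[1]) / 2
        if value >= mid:
            bits = (bits << 1) | 1
            rng[0] = mid   # keep the upper half
        else:
            bits <<= 1
            rng[1] = mid   # keep the lower half
        use_lon = not use_lon
        bit_count += 1
        if bit_count == 5:
            chars.append(BASE32[bits])
            bits, bit_count = 0, 0
    return "".join(chars)

cell = geohash(57.64911, 10.40744, 5)   # sample coordinates
print(cell)
print(geohash(57.64911, 10.40744, 3))   # a shorter hash is a prefix of the longer one
```

Because a shorter geohash is always a prefix of a longer one for the same point, lowering the precision simply merges neighboring cells into a larger one, which is what makes the grid convenient for bucketing.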
Of course, the more accurate we want the aggregation to be, the more resources Elasticsearch will consume, because of the number of buckets that the aggregation has to calculate. By default, Elasticsearch does not generate more than 10,000 buckets. You can change this behavior by using the size attribute, but keep in mind that the performance may suffer for very wide queries consisting of thousands of buckets.
Global aggregation
The global aggregation is an aggregation that defines a single bucket containing all the documents from a given index and type, not influenced by the query itself. The thing that differentiates the global aggregation from all the others is that the global aggregation has an empty body. For example, look at the following query:
{
"query":{
"term":{
"available":"true"
}
},
"aggs":{
"all_books":{
"global":{}
}
}
}
In our library index, we only have two available books, but the response to the preceding query looks as follows:
{
"took":1,
"timed_out":false,
"_shards":{
"total":5,
"successful":5,
"failed":0
},
"hits":{
"total":3,
"max_score":0.0,
"hits":[]
},
"aggregations":{
"all_books":{
"doc_count":4
}
}
}
As you can see, the global aggregation is not bound by the query. Because the result of the global aggregation is a single bucket containing all the documents (not narrowed down by the query itself), it is a perfect candidate for use as a top-level parent aggregation for nesting aggregations.
Significant terms aggregation
The significant_terms aggregation allows us to get the terms that are relevant and probably the most significant for a given query. The good thing is that it doesn’t only show the top terms from the results of the given query, but also the ones that seem to be the most important.
The use cases for this aggregation type can vary from finding the most troublesome server working in your application environment to suggesting nicknames from text. Whenever Elasticsearch sees a significant change in the popularity of a term, such a term is a candidate for being significant.
Note
Remember that the significant_terms aggregation is very expensive when it comes to resources and running against large indices. Work is being done to provide a lightweight version of that aggregation; as a result, the API for the significant_terms aggregation may change in the future.
The best way to describe the significant_terms aggregation type is to use an example. Let’s start with indexing 12 simple documents, which represent reviews of work done by interns:
curl -XPOST 'localhost:9200/interns/review/1' -d '{"intern":"Richard","grade":"bad","type":"grade"}'
curl -XPOST 'localhost:9200/interns/review/2' -d '{"intern":"Ralf","grade":"perfect","type":"grade"}'
curl -XPOST 'localhost:9200/interns/review/3' -d '{"intern":"Richard","grade":"bad","type":"grade"}'
curl -XPOST 'localhost:9200/interns/review/4' -d '{"intern":"Richard","grade":"bad","type":"review"}'
curl -XPOST 'localhost:9200/interns/review/5' -d '{"intern":"Richard","grade":"good","type":"grade"}'
curl -XPOST 'localhost:9200/interns/review/6' -d '{"intern":"Ralf","grade":"good","type":"grade"}'
curl -XPOST 'localhost:9200/interns/review/7' -d '{"intern":"Ralf","grade":"perfect","type":"review"}'
curl -XPOST 'localhost:9200/interns/review/8' -d '{"intern":"Richard","grade":"medium","type":"review"}'
curl -XPOST 'localhost:9200/interns/review/9' -d '{"intern":"Monica","grade":"medium","type":"grade"}'
curl -XPOST 'localhost:9200/interns/review/10' -d '{"intern":"Monica","grade":"medium","type":"grade"}'
curl -XPOST 'localhost:9200/interns/review/11' -d '{"intern":"Ralf","grade":"good","type":"grade"}'
curl -XPOST 'localhost:9200/interns/review/12' -d '{"intern":"Ralf","grade":"good","type":"grade"}'
Of course, to show the real power of the significant_terms aggregation, we should use a way larger dataset. However, for the purpose of this book, we will concentrate on this example, so it is easier to illustrate how this aggregation works.
Now let’s try finding the most significant grade for Richard. To do this we will use the following query:
curl -XGET 'localhost:9200/interns/_search?size=0&pretty' -d '{
"query":{
"match":{
"intern":"Richard"
}
},
"aggregations":{
"description":{
"significant_terms":{
"field":"grade"
}
}
}
}'
The result of the preceding query looks as follows:
{
"took":2,
"timed_out":false,
"_shards":{
"total":5,
"successful":5,
"failed":0
},
"hits":{
"total":5,
"max_score":0.0,
"hits":[]
},
"aggregations":{
"description":{
"doc_count":5,
"buckets":[{
"key":"bad",
"doc_count":3,
"score":0.84,
"bg_count":3
}]
}
}
}
As you can see, for our query Elasticsearch informed us that the most significant grade for Richard is bad. Maybe it wasn’t the best internship for him; who knows.
Choosing significant terms
To calculate significant terms, Elasticsearch looks for terms that show a significant change in their popularity between two sets of data: the foreground set and the background set. The foreground set is the data returned by our query, while the background set is the data in our index (or indices, depending on how we run our queries). If a term exists in 10 documents out of one million indexed, but appears in 5 documents from the 10 returned, then such a term is definitely significant and worth concentrating on.
Let’s get back to our preceding example now to analyze it a bit. Richard got three distinct grades from the reviewers: bad three times, medium one time, and good one time. In general, the bad grade appeared in three documents (the bg_count property) out of the 12 documents in the index (this is our background set). This gives us 25 percent of the indexed documents. On the other hand, the bad grade appeared in three out of the five documents matching the query (this is our foreground set), which gives us 60 percent of the documents. As you can see, the change in popularity is significant for the bad grade and that’s why Elasticsearch has returned it in the significant_terms aggregation results.
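The percentages above can be turned into a score. The sketch below assumes the JLH heuristic, which multiplies the absolute change in popularity by the relative change; under that assumption, the numbers from our example reproduce the score of 0.84 reported in the response. The jlh_score function is a hypothetical helper, not part of any Elasticsearch client.

```python
def jlh_score(fg_count, fg_total, bg_count, bg_total):
    """Score a term by its absolute and relative change in popularity (JLH, assumed)."""
    foreground = fg_count / fg_total   # share of matching documents containing the term
    background = bg_count / bg_total   # share of all indexed documents containing the term
    if foreground <= background:
        return 0.0                     # the term is not more popular in the foreground
    return (foreground - background) * (foreground / background)

# The "bad" grade for Richard: 3 of 5 matching documents, 3 of 12 indexed documents.
score = jlh_score(fg_count=3, fg_total=5, bg_count=3, bg_total=12)
print(round(score, 2))
```

Multiplying the two changes favors terms that are both noticeably more frequent in the foreground set and rare in the background set.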
Multiple value analysis
The significant_terms aggregation can be nested and provide us with nice data analysis capabilities that connect multiple sets of data. For example, let’s try to find a significant grade for each of the interns that we have information about. To do this we will nest the significant_terms aggregation inside the terms aggregation. The query that does that looks as follows:
curl -XGET 'localhost:9200/interns/_search?size=0&pretty' -d '{
"aggregations":{
"grades":{
"terms":{
"field":"intern"
},
"aggregations":{
"significantGrades":{
"significant_terms":{
"field":"grade"
}
}
}
}
}
}'
The results returned by Elasticsearch for the preceding query are as follows:
{
"took":2,
"timed_out":false,
"_shards":{
"total":5,
"successful":5,
"failed":0
},
"hits":{
"total":12,
"max_score":0.0,
"hits":[]
},
"aggregations":{
"grades":{
"doc_count_error_upper_bound":0,
"sum_other_doc_count":0,
"buckets":[{
"key":"ralf",
"doc_count":5,
"significantGrades":{
"doc_count":5,
"buckets":[{
"key":"good",
"doc_count":3,
"score":0.48,
"bg_count":4
}]
}
},{
"key":"richard",
"doc_count":5,
"significantGrades":{
"doc_count":5,
"buckets":[{
"key":"bad",
"doc_count":3,
"score":0.84,
"bg_count":3
}]
}
},{
"key":"monica",
"doc_count":2,
"significantGrades":{
"doc_count":2,
"buckets":[]
}
}]
}
}
}
Sampler aggregation
The sampler aggregation is one of the experimental aggregations in Elasticsearch. It allows us to limit the sub-aggregation processing to a sample of the top-scoring documents. This allows filtering and potential removal of garbage in the data. It is a very nice candidate for a top-level aggregation that limits the amount of data the significant_terms aggregation runs on. The simplest example of using this aggregation is as follows:
{
"aggs":{
"sampler_example":{
"sampler":{
"field":"tags",
"max_docs_per_value":1,
"shard_size":10
},
"aggs":{
"best_terms":{
"terms":{
"field":"title"
}
}
}
}
}
}
To see the real power of sampling, we will have to play with it on a larger data set, but for now we will discuss the preceding example. The sampler aggregation was defined with three properties: field, max_docs_per_value, and shard_size. The first two properties allow us to control the diversity of the sampling. We tell Elasticsearch the maximum number of documents (the value of the max_docs_per_value property) that can be collected on a shard with the same value in the defined field (the value of the field property).
The shard_size property tells Elasticsearch how many documents (at most) to collect from each shard.
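To make the interplay of the three properties more concrete, here is a conceptual Python simulation of the selection on a single shard. It is not how Elasticsearch implements sampling; the function name, the document layout, and the scores are all hypothetical, and the field is assumed to be single-valued.

```python
from collections import Counter


def sample_shard(docs, field, max_docs_per_value, shard_size):
    """Pick top-scoring docs from one shard, limiting repeats of the same field value."""
    picked = []
    seen = Counter()
    for doc in sorted(docs, key=lambda d: d["score"], reverse=True):
        value = doc[field]
        if seen[value] >= max_docs_per_value:
            continue  # this value is already represented often enough
        seen[value] += 1
        picked.append(doc)
        if len(picked) == shard_size:
            break  # the per-shard limit has been reached
    return picked


# Hypothetical documents living on one shard.
docs = [
    {"id": 1, "tags": "novel", "score": 2.0},
    {"id": 2, "tags": "novel", "score": 1.5},
    {"id": 3, "tags": "classics", "score": 1.0},
]
sample = sample_shard(docs, "tags", max_docs_per_value=1, shard_size=10)
print([d["id"] for d in sample])
```

With max_docs_per_value set to 1, the second novel document is skipped even though it scores higher than the classics one, which is exactly the diversity effect described above.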
Children aggregation
The children aggregation is a single-bucket aggregation that creates a bucket with all the children of the specified type. Let’s get back to the Using the parent-child relationship section in Chapter 5, Extending Your Index Structure, and let’s use the created shop index. To create a bucket of all children documents with the variation type in the shop index, we run the following query:
{
"aggs":{
"variation_children":{
"children":{
"type":"variation"
}
}
}
}
The response returned by Elasticsearch is as follows:
{
"took":4,
"timed_out":false,
"_shards":{
"total":5,
"successful":5,
"failed":0
},
"hits":{
"total":3,
"max_score":0.0,
"hits":[]
},
"aggregations":{
"variation_children":{
"doc_count":2
}
}
}
Note
Because the children aggregation uses parent-child functionality, it relies on the _parent field, which needs to be present.
Nested aggregation
In the Using nested objects section of Chapter 5, Extending Your Index Structure, we learned about nested documents. Let’s use that data to look into the next type of aggregation: the nested one. Let’s create the simplest working query, which looks like this (we use the shop_nested index created in the mentioned chapter):
{
"aggs":{
"variations":{
"nested":{
"path":"variation"
}
}
}
}
The preceding query is similar in structure to any other aggregation. However, instead of providing the field name on which the aggregation should be calculated, it contains a single parameter, path, which points to the nested document. In the response we get the number of nested documents:
{
"took":4,
"timed_out":false,
"_shards":{
"total":5,
"successful":5,
"failed":0
},
"hits":{
"total":1,
"max_score":0.0,
"hits":[]
},
"aggregations":{
"variations":{
"doc_count":2
}
}
}
The preceding response means that we have two nested documents in the index under the provided variation path.
Reverse nested aggregation
The reverse_nested aggregation is a special, single-bucket aggregation that allows aggregation on parent documents from the nested documents. Similar to the global aggregation, the reverse_nested aggregation is defined with an empty body. It sounds quite complicated, but it is not. Let’s look at the following query that we run against the shop_nested index created in the Using nested objects section of Chapter 5, Extending Your Index Structure:
{
"aggs":{
"variations":{
"nested":{
"path":"variation"
},
"aggs":{
"sizes":{
"terms":{
"field":"variation.size"
},
"aggs":{
"product_name_terms":{
"reverse_nested":{},
"aggs":{
"product_name_terms_per_size":{
"terms":{
"field":"name"
}
}
}
}
}
}
}
}
}
}
We start with the top-level aggregation, which is the same nested aggregation that we used when discussing the nested aggregation. However, we include a sub-aggregation that uses reverse_nested to be able to show terms from the name field for each size returned by the terms aggregation running inside the top-level nested aggregation. This is possible because, when the reverse_nested aggregation is used, Elasticsearch calculates the data on the basis of the parent documents instead of using the nested documents.
Note
Remember that the reverse_nested aggregation must be used inside the nested aggregation.
The response to the preceding query will look as follows:
{
"took":7,
"timed_out":false,
"_shards":{
"total":5,
"successful":5,
"failed":0
},
"hits":{
"total":1,
"max_score":0.0,
"hits":[]
},
"aggregations":{
"variations":{
"doc_count":2,
"sizes":{
"doc_count_error_upper_bound":0,
"sum_other_doc_count":0,
"buckets":[{
"key":"XL",
"doc_count":1,
"product_name_terms":{
"doc_count":1,
"product_name_terms_per_size":{
"doc_count_error_upper_bound":0,
"sum_other_doc_count":0,
"buckets":[{
"key":"shirt",
"doc_count":1
},{
"key":"test",
"doc_count":1
}]
}
}
},{
"key":"XXL",
"doc_count":1,
"product_name_terms":{
"doc_count":1,
"product_name_terms_per_size":{
"doc_count_error_upper_bound":0,
"sum_other_doc_count":0,
"buckets":[{
"key":"shirt",
"doc_count":1
},{
"key":"test",
"doc_count":1
}]
}
}
}]
}
}
}
}
Nesting aggregations and ordering buckets
When talking about bucket aggregations, we need to get back to the topic of nesting aggregations. This is a very powerful technique, because it allows you to further process the data for documents in the buckets. For example, the terms aggregation will return a bucket for each term and the stats aggregation can show us the statistics for documents in each bucket. Let’s look at the following query:
{
"aggs":{
"copies":{
"terms":{
"field":"copies"
},
"aggs":{
"years":{
"stats":{
"field":"year"
}
}
}
}
}
}
This is an example of nested aggregations. The terms aggregation will return buckets for each term from the copies field (three buckets in the case of our data), and the stats aggregation will calculate statistics for the year field for the documents falling into each bucket returned by the top aggregation. The response from Elasticsearch for the preceding query looks as follows:
{
"took":3,
"timed_out":false,
"_shards":{
"total":5,
"successful":5,
"failed":0
},
"hits":{
"total":4,
"max_score":0.0,
"hits":[]
},
"aggregations":{
"copies":{
"doc_count_error_upper_bound":0,
"sum_other_doc_count":0,
"buckets":[{
"key":0,
"doc_count":2,
"years":{
"count":2,
"min":1886.0,
"max":1936.0,
"avg":1911.0,
"sum":3822.0
}
},{
"key":1,
"doc_count":1,
"years":{
"count":1,
"min":1929.0,
"max":1929.0,
"avg":1929.0,
"sum":1929.0
}
},{
"key":6,
"doc_count":1,
"years":{
"count":1,
"min":1961.0,
"max":1961.0,
"avg":1961.0,
"sum":1961.0
}
}]
}
}
}
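The grouping performed by the preceding query can be mimicked in plain Python, which may help to see what the nesting does. This is a sketch only: the (copies, year) pairs are reconstructed from the response above, the function name is ours, and the buckets are ordered by key here rather than by the ordering Elasticsearch applies.

```python
from collections import defaultdict

# (copies, year) pairs reconstructed from the response above.
books = [(0, 1886), (0, 1936), (1, 1929), (6, 1961)]


def terms_with_stats(rows):
    """Group rows by the first value (terms) and compute stats on the second."""
    buckets = defaultdict(list)
    for copies, year in rows:
        buckets[copies].append(year)
    return {
        key: {
            "count": len(years),
            "min": min(years),
            "max": max(years),
            "avg": sum(years) / len(years),
            "sum": sum(years),
        }
        for key, years in sorted(buckets.items())
    }


result = terms_with_stats(books)
for key, stats in result.items():
    print(key, stats)
```

The outer grouping plays the role of the terms aggregation, and the dictionary computed per group plays the role of the nested stats aggregation.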
This is a powerful feature that allows us to build very complex data processing pipelines. Of course, we are not limited to a single nested aggregation: we can define several of them side by side and even nest an aggregation inside another nested aggregation. For example:
{
"aggs":{
"popular_tags":{
"terms":{
"field":"copies"
},
"aggs":{
"years":{
"terms":{
"field":"year"
},
"aggs":{
"available_by_year":{
"stats":{
"field":"available"
}
}
}
},
"available":{
"stats":{
"field":"available"
}
}
}
}
}
}
As you can see, the possibilities are almost unlimited, if you have enough memory and CPU power to handle very complicated aggregations.
Buckets ordering
There is one more feature about nested aggregations and the ordering of aggregation results. Elasticsearch can use values from the nested aggregations to sort the parent buckets. For example, let’s look at the following query:
{
"aggs":{
"availability":{
"terms":{
"field":"copies",
"order":{"numbers.avg":"desc"}
},
"aggs":{
"numbers":{"stats":{}}
}
}
}
}
In the previous example, the order in the availability aggregation is based on the average value from the numbers aggregation. The notation numbers.avg is required in this case, because stats is a multi-valued aggregation that returns several metrics and we are interested in the average. If it were the sum aggregation, the name of the aggregation would be sufficient.
Pipeline aggregations
The last type of aggregation we will discuss is pipeline aggregations. Until now we’ve learned about metrics aggregations and bucket aggregations. The first type returns metrics, while the second type returns buckets, and both work on the basis of the returned documents. Pipeline aggregations are different. They work on the output of the other aggregations and their metrics, allowing functionalities such as moving-average calculations (https://en.wikipedia.org/wiki/Moving_average).
Note
Remember that pipeline aggregations were introduced in Elasticsearch 2.0 and are considered experimental. This means that the API can change in the future, breaking backwards-compatibility.
Available types
There are two types of pipeline aggregation. The so-called parent aggregations work on the output of their parent aggregation. They are able to produce new buckets or new aggregations to add to existing buckets. The second type is called sibling aggregations; these work on the output of an aggregation defined at the same level and produce a new aggregation on that level.
Referencing other aggregations
Because of their nature, the pipeline aggregations need to be able to access the results of the other aggregations. We can do that via the buckets_path property, which is defined using a specific format. A few special characters allow us to tell Elasticsearch exactly which aggregation and metric we are interested in. The > character separates the aggregations and the . character separates the aggregation from its metrics. For example, my_sum.sum means that we take the sum metric of an aggregation called my_sum. Another example is popular_tags>my_sum.sum, which means that we are interested in the sum metric of a sub-aggregation called my_sum, which is nested inside the popular_tags aggregation. In addition to this, we can use a special path called _count. This can be used to calculate the pipeline aggregations on the document count instead of on specified metrics.
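The path format described above can be illustrated with a small Python parser. It is a simplified sketch that covers only the forms mentioned in this section (the > separator, the . metric separator, and _count); the function name is hypothetical and this is not the parser Elasticsearch uses.

```python
def parse_buckets_path(path):
    """Split a buckets_path into (list of aggregation names, metric or None)."""
    *parents, last = path.split(">")
    if last == "_count":
        # Special path: use the document count of the bucket.
        return parents, "_count"
    name, _, metric = last.partition(".")
    return parents + [name], metric or None


print(parse_buckets_path("my_sum.sum"))
print(parse_buckets_path("popular_tags>my_sum.sum"))
print(parse_buckets_path("periods_histogram>copies_per_100_years"))
```

A path without a metric, such as the last one, refers to a single-value aggregation whose only value is used directly.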
Gaps in the data
Our data can contain gaps: situations where the data doesn’t exist. For such use cases, we have the ability to specify the gap_policy property and set it to skip or insert_zeros. The skip value tells Elasticsearch to ignore the missing data and continue from the next available value, while insert_zeros replaces the missing values with zero.
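The difference between the two policies can be shown on a plain list of bucket values, where None stands for a bucket with no data. This is a conceptual sketch with a hypothetical function name; it only shows which values a pipeline calculation would see under each policy.

```python
def apply_gap_policy(values, gap_policy="skip"):
    """Return the values a pipeline calculation would work with."""
    if gap_policy == "skip":
        # Missing buckets are ignored; the calculation continues with the next value.
        return [v for v in values if v is not None]
    if gap_policy == "insert_zeros":
        # Missing buckets are treated as if their value were zero.
        return [0.0 if v is None else v for v in values]
    raise ValueError(f"unknown gap_policy: {gap_policy}")


series = [1.0, None, 3.0]  # hypothetical bucket values with one gap
print(apply_gap_policy(series, "skip"))
print(apply_gap_policy(series, "insert_zeros"))
```

Which policy is appropriate depends on the data: for counts, a missing bucket often really means zero, while for measurements such as temperatures, inserting zeros would distort the result.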
Pipeline aggregation types
Most of the aggregations we will show in this section are very similar to the ones we’ve already seen in the sections about metrics and buckets aggregations. Because of that, we won’t discuss them in depth. There are also new, specific pipeline aggregations that we want to talk about in a little more detail.
Min, max, sum, and average bucket aggregations
The min_bucket, max_bucket, sum_bucket, and avg_bucket aggregations are sibling aggregations, similar in what they return to the min, max, sum, and avg aggregations. However, instead of working on the data returned by the query, they work on the results of the other aggregations.
To show you a simple example of how these aggregations work, let’s calculate the sum of the values from all the buckets returned by another aggregation. The query that will do that looks as follows:
{
"aggs":{
"periods_histogram":{
"histogram":{
"field":"year",
"interval":100
},
"aggs":{
"copies_per_100_years":{
"sum":{
"field":"copies"
}
}
}
},
"sum_copies":{
"sum_bucket":{
"buckets_path":"periods_histogram>copies_per_100_years"
}
}
}
}
As you can see, we used the histogram aggregation and we included a nested aggregation that calculates the sum of the copies field. Our sum_bucket sibling aggregation is used outside the main aggregation and refers to it using the buckets_path property. It tells Elasticsearch that we are interested in summing the values of metrics returned by the copies_per_100_years aggregation. The result returned by Elasticsearch for this query looks as follows:
{
"took":2,
"timed_out":false,
"_shards":{
"total":5,
"successful":5,
"failed":0
},
"hits":{
"total":4,
"max_score":0.0,
"hits":[]
},
"aggregations":{
"periods_histogram":{
"buckets":[{
"key":1800,
"doc_count":1,
"copies_per_100_years":{
"value":0.0
}
},{
"key":1900,
"doc_count":3,
"copies_per_100_years":{
"value":7.0
}
}]
},
"sum_copies":{
"value":7.0
}
}
}
As you can see, Elasticsearch added another entry to the aggregation results, called sum_copies, which holds the value we were interested in.
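What the sibling aggregation computes can be checked by hand against the buckets from the response. The following Python sketch uses a trimmed copy of those buckets; the helper name is ours and merely mirrors the name of the Elasticsearch aggregation.

```python
# Buckets copied (and trimmed) from the periods_histogram part of the response.
response_buckets = [
    {"key": 1800, "copies_per_100_years": {"value": 0.0}},
    {"key": 1900, "copies_per_100_years": {"value": 7.0}},
]


def sum_bucket(buckets, metric_name):
    """Sum one single-value metric over all buckets of a sibling aggregation."""
    return sum(bucket[metric_name]["value"] for bucket in buckets)


sum_copies = sum_bucket(response_buckets, "copies_per_100_years")
print(sum_copies)
```

Replacing sum with min, max, or an average over the same values gives the equivalents of min_bucket, max_bucket, and avg_bucket.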
Cumulative sum aggregation
The cumulative_sum aggregation is a parent pipeline aggregation that allows us to calculate the cumulative sum of a metric in the histogram or date_histogram aggregation. A simple example of the aggregation looks as follows:
{
"aggs":{
"periods_histogram":{
"histogram":{
"field":"year",
"interval":100
},
"aggs":{
"copies_per_100_years":{
"sum":{
"field":"copies"
}
},
"cumulative_copies_sum":{
"cumulative_sum":{
"buckets_path":"copies_per_100_years"
}
}
}
}
}
}
Because this aggregation is a parent pipeline aggregation, it is defined in the sub-aggregations. The returned result looks as follows:
{
"took":2,
"timed_out":false,
"_shards":{
"total":5,
"successful":5,
"failed":0
},
"hits":{
"total":4,
"max_score":0.0,
"hits":[]
},
"aggregations":{
"periods_histogram":{
"buckets":[{
"key":1800,
"doc_count":1,
"copies_per_100_years":{
"value":0.0
},
"cumulative_copies_sum":{
"value":0.0
}
},{
"key":1900,
"doc_count":3,
"copies_per_100_years":{
"value":7.0
},
"cumulative_copies_sum":{
"value":7.0
}
}]
}
}
}
The first cumulative_copies_sum is 0 because the sum calculated for the first bucket is 0. The second is the sum of all the previous ones and the current bucket, which means 7. Each subsequent value would be the sum of all the previous buckets and the current one.
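The running total can be reproduced with itertools.accumulate from the Python standard library. The first two values come from the response above; the third value in the second series is hypothetical and only shows how the total would continue.

```python
from itertools import accumulate

# copies_per_100_years values from the response above.
copies_per_100_years = [0.0, 7.0]
print(list(accumulate(copies_per_100_years)))

# With one more (hypothetical) bucket holding 3 copies, the total keeps growing.
print(list(accumulate([0.0, 7.0, 3.0])))
```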
Bucket selector aggregation
The bucket_selector aggregation is another parent pipeline aggregation. It allows using a script to decide if a bucket should be retained in the parent multi-bucket aggregation. For example, to keep only buckets that have more than one copy per period, we can run the following query (it needs the script.inline property to be set to on in the elasticsearch.yml file):
{
"aggs":{
"periods_histogram":{
"histogram":{
"field":"year",
"interval":100
},
"aggs":{
"copies_per_100_years":{
"sum":{
"field":"copies"
}
},
"remove_empty_buckets":{
"bucket_selector":{
"buckets_path":{
"sum_copies":"copies_per_100_years"
},
"script":"sum_copies>1"
}
}
}
}
}
}
There are two important things here. The first is the buckets_path property, which is different to what we’ve used so far. Now it uses a key and a value. The key is used to reference the value in the script. The second important thing is the script property, which defines the script that decides if the processed bucket should be retained. The results returned by Elasticsearch in this case are as follows:
{
"took":330,
"timed_out":false,
"_shards":{
"total":5,
"successful":5,
"failed":0
},
"hits":{
"total":4,
"max_score":0.0,
"hits":[]
},
"aggregations":{
"periods_histogram":{
"buckets":[{
"key":1900,
"doc_count":3,
"copies_per_100_years":{
"value":7.0
}
}]
}
}
}
As we can see, the bucket with the copies_per_100_years value equal to 0 has been removed.
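The effect of the selector is the same as filtering a list of buckets with a predicate. The sketch below applies the condition from the script (sum_copies>1) to a trimmed copy of the unfiltered buckets; the helper name is hypothetical.

```python
# Buckets as they would look before the selector is applied.
buckets = [
    {"key": 1800, "copies_per_100_years": {"value": 0.0}},
    {"key": 1900, "copies_per_100_years": {"value": 7.0}},
]


def bucket_selector(all_buckets, keep):
    """Retain only the buckets for which the predicate returns True."""
    return [bucket for bucket in all_buckets if keep(bucket)]


# Equivalent of the "sum_copies>1" script from the query.
kept = bucket_selector(buckets, lambda b: b["copies_per_100_years"]["value"] > 1)
print([bucket["key"] for bucket in kept])
```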
Bucket script aggregation
The bucket_script aggregation (a parent pipeline aggregation) allows us to define multiple bucket paths and use them inside a script. The used metrics must be of the numeric type and the returned value also needs to be numeric. An example of using this aggregation follows (the following query needs the script.inline property to be set to on in the elasticsearch.yml file):
{
"aggs":{
"periods_histogram":{
"histogram":{
"field":"year",
"interval":100
},
"aggs":{
"copies_per_100_years":{
"sum":{
"field":"copies"
}
},
"stats_per_100_years":{
"stats":{
"field":"copies"
}
},
"example_bucket_script":{
"bucket_script":{
"buckets_path":{
"sum_copies":"copies_per_100_years",
"count":"stats_per_100_years.count"
},
"script":"sum_copies/count*1000"
}
}
}
}
}
}
There are two things here. The first is that we’ve defined two entries in the buckets_path property. We are allowed to do that in the bucket_script aggregation. Each entry is a key and a value. The key is the name that we can use in the script to refer to the value, and the value is the path to the aggregation metric we are interested in. The second is the script property, which defines the script that returns the value.
The returned results for the preceding query are as follows:
{
"took":5,
"timed_out":false,
"_shards":{
"total":5,
"successful":5,
"failed":0
},
"hits":{
"total":4,
"max_score":0.0,
"hits":[]
},
"aggregations":{
"periods_histogram":{
"buckets":[{
"key":1800,
"doc_count":1,
"copies_per_100_years":{
"value":0.0
},
"stats_per_100_years":{
"count":1,
"min":0.0,
"max":0.0,
"avg":0.0,
"sum":0.0
},
"example_bucket_script":{
"value":0.0
}
},{
"key":1900,
"doc_count":3,
"copies_per_100_years":{
"value":7.0
},
"stats_per_100_years":{
"count":3,
"min":0.0,
"max":6.0,
"avg":2.3333333333333335,
"sum":7.0
},
"example_bucket_script":{
"value":2333.3333333333335
}
}]
}
}
}
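The example_bucket_script values in the response can be verified by evaluating the script expression by hand. In the following Python sketch, the function mirrors the sum_copies/count*1000 script, with its parameters named after the keys defined in buckets_path; the input numbers are taken from the response.

```python
def example_bucket_script(sum_copies, count):
    """Python equivalent of the script: sum_copies / count * 1000."""
    return sum_copies / count * 1000


# (sum of copies, stats count) for the 1800 and 1900 buckets in the response.
bucket_1800 = example_bucket_script(0.0, 1)
bucket_1900 = example_bucket_script(7.0, 3)
print(bucket_1800, bucket_1900)
```

The second value is the average number of copies in the bucket (about 2.33) multiplied by 1000, which agrees with the avg metric reported by the stats aggregation.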
Serial differencing aggregation
The serial_diff aggregation is a parent pipeline aggregation that implements a technique where the values in time series data (such as a histogram or date histogram) are subtracted from themselves at different time periods. This technique allows us to plot the changes between time periods instead of the absolute values. For example, the population of a city grows with time. If we use the serial differencing aggregation with a period of one day, we can see the daily growth.
To calculate the serial_diff aggregation, we need the parent aggregation, which is a histogram or a date_histogram, and we need to provide it with buckets_path, which points to the metric we are interested in, and lag (a positive, non-zero integer value), which tells Elasticsearch which previous bucket to subtract from the current one. We can omit lag, in which case Elasticsearch will set it to 1.
Let’s now look at a simple query that uses the discussed aggregation:
{
"aggs":{
"periods_histogram":{
"histogram":{
"field":"year",
"interval":100
},
"aggs":{
"copies_per_100_years":{
"sum":{
"field":"copies"
}
},
"first_difference":{
"serial_diff":{
"buckets_path":"copies_per_100_years",
"lag":1
}
}
}
}
}
}
The response to the preceding query looks as follows:
{
"took":68,
"timed_out":false,
"_shards":{
"total":5,
"successful":5,
"failed":0
},
"hits":{
"total":4,
"max_score":0.0,
"hits":[]
},
"aggregations":{
"periods_histogram":{
"buckets":[{
"key":1800,
"doc_count":1,
"copies_per_100_years":{
"value":0.0
}
},{
"key":1900,
"doc_count":3,
"copies_per_100_years":{
"value":7.0
},
"first_difference":{
"value":7.0
}
}]
}
}
}
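As you can see, the first_difference value appears starting with the second bucket (and it would appear in every bucket after that), because the first bucket has no previous bucket to subtract. The calculated value is 7 because the current value of copies_per_100_years is 7 and the previous is 0. Subtracting 0 from 7 gives us 7.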
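The subtraction can be sketched in Python as follows. The function name mirrors the aggregation, but the code is only an illustration; None marks the buckets that have no earlier bucket to subtract, which is how we model the missing first_difference in the first bucket.

```python
def serial_diff(values, lag=1):
    """Subtract from each value the value found `lag` buckets earlier."""
    if lag < 1:
        raise ValueError("lag must be a positive, non-zero integer")
    return [
        None if index < lag else values[index] - values[index - lag]
        for index in range(len(values))
    ]


# copies_per_100_years values from the response above.
print(serial_diff([0.0, 7.0], lag=1))
```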
Derivative aggregation
The derivative aggregation is another example of a parent pipeline aggregation. As its name suggests, it calculates a derivative (https://en.wikipedia.org/wiki/Derivative) of a given metric from a histogram or date histogram. The only thing we need to provide is buckets_path, which points to the metric we are interested in. An example query using this aggregation looks as follows:
{
"aggs":{
"periods_histogram":{
"histogram":{
"field":"year",
"interval":100
},
"aggs":{
"copies_per_100_years":{
"sum":{
"field":"copies"
}
},
"derivative_example":{
"derivative":{
"buckets_path":"copies_per_100_years"
}
}
}
}
}
}
Moving avg aggregation
The last pipeline aggregation that we want to discuss is the moving_avg one. It calculates the moving average metric (https://en.wikipedia.org/wiki/Moving_average) over the buckets of the parent aggregation (yes, this is a parent pipeline aggregation). Similar to the few previously discussed aggregations, it needs to be run on the parent histogram or date histogram aggregation.
When calculating the moving average, Elasticsearch will take the window (specified by the window property and set to 5 by default), calculate the average for buckets in the window, move the window one bucket further, and repeat. Of course we also need to provide buckets_path, which points to the metric that the moving average should be calculated for.
An example of using this aggregation looks as follows:
{
"aggs":{
"periods_histogram":{
"histogram":{
"field":"year",
"interval":10
},
"aggs":{
"copies_per_10_years":{
"sum":{
"field":"copies"
}
},
"moving_avg_example":{
"moving_avg":{
"buckets_path":"copies_per_10_years"
}
}
}
}
}
}
We will omit the response for the preceding query as it is quite large.
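Since the response is omitted, a small Python sketch of a simple (unweighted) moving average may help to picture the sliding window. It assumes a trailing window that includes the current bucket, which is a simplification and may differ from the exact window handling in a given Elasticsearch version; the input series is hypothetical.

```python
def moving_avg(values, window=5):
    """Simple trailing moving average; early buckets use a shorter window."""
    averages = []
    for index in range(len(values)):
        # Take up to `window` values ending at the current bucket (an assumption).
        chunk = values[max(0, index - window + 1): index + 1]
        averages.append(sum(chunk) / len(chunk))
    return averages


# Hypothetical per-bucket sums, smoothed with a window of two buckets.
print(moving_avg([2.0, 4.0, 6.0, 8.0], window=2))
```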
Predicting future buckets
The very nice thing about the moving average aggregation is that it supports predictions; it can attempt to extrapolate the data it has and create future buckets. To force the aggregation to predict buckets, we just need to add the predict property to any moving average aggregation and set it to the number of predictions we want to get. For example, if we want to add five predictions to the preceding query, we will change it to look as follows:
{
"aggs":{
"periods_histogram":{
"histogram":{
"field":"year",
"interval":10
},
"aggs":{
"copies_per_10_years":{
"sum":{
"field":"copies"
}
},
"moving_avg_example":{
"moving_avg":{
"buckets_path":"copies_per_10_years",
"predict":5
}
}
}
}
}
}
If you look at the results and compare the response returned for the previous query with the one with predictions, you will notice that the last bucket in the previous query ends on the key property equal to 1960, while the query with predictions ends on the key property equal to 2010, which is exactly what we wanted to achieve.
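The keys of the predicted buckets follow directly from the histogram interval, as the following sketch shows. The function name is hypothetical; the numbers (last key 1960, interval 10, five predictions) come from the example above.

```python
def predicted_keys(last_key, interval, predict):
    """Keys of the future buckets appended after the last real bucket."""
    return [last_key + interval * step for step in range(1, predict + 1)]


print(predicted_keys(last_key=1960, interval=10, predict=5))
```

The last predicted key is 1960 + 5 * 10 = 2010, which matches the key observed in the response with predictions.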
The models
By default, Elasticsearch uses the simplest model for calculating the moving averages aggregation, but we can control that by specifying the model property. This property holds the name of the model, while the settings object lets us provide model-specific properties.
The possible models are: simple, linear, ewma, holt, and holt_winters. Discussing each of the models in detail is beyond the scope of the book, so if you are interested in details about the different models, refer to the official Elasticsearch documentation regarding the moving averages aggregation available at https://www.elastic.co/guide/en/elasticsearch/reference/master/search-aggregations-pipeline-movavg-aggregation.html.
An example query using a different model looks as follows:
{
"aggs":{
"periods_histogram":{
"histogram":{
"field":"year",
"interval":10},
"aggs":{
"copies_per_10_years":{
"sum":{
"field":"copies"
}},
"moving_avg_example":{
"moving_avg":{
"buckets_path":"copies_per_10_years",
"model":"holt",
"settings":{
"alpha":0.6,
"beta":0.4
}
}
}
}
}
}
}
Summary
The chapter we just finished was all about data analysis in Elasticsearch: the aggregations engine. We learned what the aggregations are and how they work. We used metrics, buckets, and newly introduced pipeline aggregations, and learned what we can do with them.
In the next chapter, we’ll go beyond full text searching. We will use suggesters to build efficient autocomplete functionality and correct the users’ spelling mistakes. We will see what percolation is and how to use it in our application. We will use the geospatial abilities of Elasticsearch and we’ll learn how to efficiently fetch large amounts of data from Elasticsearch.
Chapter 8. Beyond Full-text Searching
The previous chapter was fully dedicated to data analysis and how we can perform it with Elasticsearch. We learned how to use aggregations, what types of aggregation are available, and what aggregations are available within each type and how to use them. In this chapter, we will get back to query-related topics. By the end of this chapter, you will have learned the following topics:
What is percolator and how to use it
What are the geospatial capabilities of Elasticsearch
How to use and build functionalities using Elasticsearch suggesters
How to use the Scroll API to efficiently fetch large numbers of results
PercolatorHaveyoueverwonderedwhatwouldhappenifwereversethetraditionalmodelofusingqueriestofinddocumentsinElasticsearch?Doesitmakesensetohaveadocumentandsearchforqueriesmatchingit?Itisnotsurprisingthatthereisawholerangeofsolutionswherethismodelisveryuseful.Wheneveryouoperateonanunboundedstreamofinputdata,whereyousearchfortheoccurrencesofparticularevents,youcanusethisapproach.Thiscanbeusedforthedetectionoffailuresinamonitoringsystemorforthe“Tellmewhenaproductwiththedefinedcriteriawillbeavailableinthisshop”functionality.Inthissection,wewilllookathowanElasticsearchpercolatorworksandhowwecanuseittoimplementoneoftheaforementionedusecases.
The index
In all the examples used when discussing percolator functionality, we will use an index called notifier. This index is created by using the following command:
curl -XPOST 'localhost:9200/notifier' -d '{
"mappings":{
"book":{
"properties":{
"title":{
"type":"string"
},
"otitle":{
"type":"string"
},
"year":{
"type":"integer"
},
"available":{
"type":"boolean"
},
"tags":{
"type":"string",
"index":"not_analyzed"
}
}
}
}
}'
It is quite simple. It contains a single type and five fields, which we will use throughout the percolator examples.
Percolator preparation
Elasticsearch exposes a special type called .percolator that is treated differently from other types. Even so, we can store documents in it and search them as we would with an ordinary type in any index. If you look at any Elasticsearch query, you will notice that each is a valid JSON document, which means that we can index and store it as a document as well. The percolator allows us to invert the search logic and search for queries that match a given document. This is possible because of the two features just discussed: the special .percolator type and the fact that queries in Elasticsearch are valid JSON documents.
Let’s get back to the library example from Chapter 2, Indexing Your Data, and try to index one of the queries in the percolator. We assume that our users need to be informed when any book matching the criteria defined by the query is available.
Look at the following query1.json file that contains an example query generated by the user:
{
"query":{
"bool":{
"must":{
"term":{
"title":"crime"
}
},
"should":{
"range":{
"year":{
"gt":1900,
"lt":2000
}
}
},
"must_not":{
"term":{
"otitle":"nothing"
}
}
}
}
}
To enhance the example, we also assume that our users are allowed to define filters using our hypothetical user interface. For example, our user may be interested in the available books that were written before the year 2010. An example query that could have been constructed by such a user interface would look as follows (the query was written to the query2.json file):
{
"query":{
"bool":{
"must":{
"range":{
"year":{
"lt":2010
}
}
},
"filter":{
"term":{
"available":true
}
}
}
}
}
Now, let’s register both queries in the percolator (note that we are registering the queries and haven’t indexed any documents). In order to do this, we will run the following commands:
curl -XPUT 'localhost:9200/notifier/.percolator/1' -d @query1.json
curl -XPUT 'localhost:9200/notifier/.percolator/old_books' -d @query2.json
In the preceding examples, we used two completely different identifiers. We did that in order to show that we can use an identifier that best describes the query. It is up to us to decide under which name we would like the query to be registered.
We are now ready to use our percolator. Our application will provide documents to the percolator and check if any of the already registered queries match the document. This is exactly what a percolator allows us to do: to reverse the search logic. Instead of indexing the documents and running queries against them, we store the queries and send the documents to find the matching queries.
Let’s use an example document that will match both stored queries; it will have the required title and the release date, and will mention whether it is currently available. The command to send such a document to the percolator looks as follows:
curl -XGET 'localhost:9200/notifier/book/_percolate?pretty' -d '{
"doc":{
"title":"Crime and Punishment",
"otitle":"Преступлéние и наказáние",
"author":"Fyodor Dostoevsky",
"year":1886,
"characters":["Raskolnikov","Sofia Semyonovna Marmeladova"],
"tags":[],
"copies":0,
"available":true
}
}'
As we expected, both queries matched and the Elasticsearch response includes the identifiers of the matching queries. Such a response looks as follows:
{
"took":36,
"_shards":{
"total":5,
"successful":5,
"failed":0
},
"total":2,
"matches":[{
"_index":"notifier",
"_id":"old_books"
},{
"_index":"notifier",
"_id":"1"
}]
}
This works like a charm. One very important thing to note is the endpoint used in this query: _percolate. Using this endpoint is required when we want to use the percolator. The index name corresponds to the index where the queries were stored, and the type is equal to the type defined in the mappings.
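To make the reversed logic concrete, the following is a minimal Python sketch of a toy, in-memory percolator. It is not how Elasticsearch implements percolation. The tokens and percolate helpers and the REGISTERED table are names invented for this illustration, and the tokenizer is only a rough stand-in for the standard analyzer. The two predicates mirror the required clauses of query1.json and query2.json:

```python
import re


def tokens(text):
    """Split text into lowercase word tokens (a rough stand-in for the standard analyzer)."""
    return set(re.findall(r"\w+", str(text).lower()))


# Registered queries keyed by identifier. Each predicate mirrors only the
# required clauses of the stored query; the optional "should" range in
# query1.json does not decide whether the query matches, so it is left out.
REGISTERED = {
    "1": lambda doc: "crime" in tokens(doc.get("title", ""))
    and "nothing" not in tokens(doc.get("otitle", "")),
    "old_books": lambda doc: "year" in doc
    and doc["year"] < 2010
    and doc.get("available") is True,
}


def percolate(doc):
    """Return the identifiers of all registered queries that match the document."""
    return sorted(query_id for query_id, matches in REGISTERED.items() if matches(doc))


book = {
    "title": "Crime and Punishment",
    "otitle": "Преступлéние и наказáние",
    "year": 1886,
    "available": True,
}
matching = percolate(book)
print(matching)
```

With the example book, both stored queries are reported, which agrees with the response shown earlier. A document whose title lacks the term crime and whose year is 2010 or later would match neither.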
Note
The response format contains information about the index and the query identifier. This information is included for cases when we search against multiple indices at once. When using a single index, adding an additional query parameter, percolate_format=ids, will change the response as follows:
"matches":["old_books","1"]
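Client code that may receive either response shape can normalize both to a plain list of identifiers. The following Python sketch shows one way to do it; matching_ids is a hypothetical helper name, and the two sample dictionaries simply restate the responses shown in this section:

```python
def matching_ids(response):
    """Return query identifiers from either percolator response format."""
    return [
        match["_id"] if isinstance(match, dict) else match
        for match in response.get("matches", [])
    ]


# Default format: one object per match, carrying the index and the identifier.
full = {
    "total": 2,
    "matches": [
        {"_index": "notifier", "_id": "old_books"},
        {"_index": "notifier", "_id": "1"},
    ],
}
# Format returned when percolate_format=ids is used: identifiers only.
compact = {"total": 2, "matches": ["old_books", "1"]}

print(matching_ids(full), matching_ids(compact))
```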
Getting deeper
Because the queries registered in a percolator are in fact documents, we can use a normal query sent to Elasticsearch in order to choose which queries stored in the .percolator type should be used in the percolation process. This may sound weird, but it really gives a lot of possibilities. In our library, we can have several groups of users. Let’s assume that some of them have permissions to borrow very rare books, or that we have several branches in the city and the user can declare where he or she would like to get the book from.
Let’s see how such use cases can be implemented by using the percolator. To do this, we will need to update our mapping and include the branch information. We do that by running the following command:
curl -XPOST 'localhost:9200/notifier/.percolator/_mapping' -d '{
".percolator":{
"properties":{
"branches":{
"type":"string",
"index":"not_analyzed"
}
}
}
}'
Now, in order to register a query, we use the following command:
curl -XPUT 'localhost:9200/notifier/.percolator/3' -d '{
"query":{
"term":{
"title":"crime"
}
},
"branches":["brA","brB","brD"]
}'
In the preceding example, we registered a query that shows a user’s interest. Our hypothetical user is interested in any book with the term crime in the title field (the term query is responsible for this). He or she wants to borrow this book from one of the three listed branches. When specifying the mappings, we defined that the branches field is a non-analyzed string field. We can now include a query along with the document we sent previously. Let’s look at how to do this.
Our book system just got the book, and it is ready to report the book and check whether the book is of interest to anyone. To check this, we send the document that describes the book and add an additional query to the request: one that limits the users to only those interested in the brB branch. Such a request looks as follows:
curl -XGET 'localhost:9200/notifier/book/_percolate?pretty' -d '{
"doc":{
"title":"Crime and Punishment",
"otitle":"Преступлéние и наказáние",
"author":"Fyodor Dostoevsky",
"year":1886,
"characters":["Raskolnikov","Sofia Semyonovna Marmeladova"],
"tags":[],
"copies":0,
"available":true
},
"size":10,
"filter":{
"term":{
"branches":"brB"
}
}
}'
If everything was executed correctly, the response returned by Elasticsearch should look as follows (we indexed our query with 3 as an identifier):
{
"took":27,
"_shards":{
"total":5,
"successful":5,
"failed":0
},
"total":1,
"matches":[{
"_index":"notifier",
"_id":"3"
}]
}
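The effect of the additional filter can be pictured as a two-step check: first narrow down the registered queries by their own metadata, then test the document against the ones that remain. The Python sketch below illustrates only that idea and is not the Elasticsearch implementation. The has_term and percolate helpers are invented names, and the registration with identifier 4 is hypothetical, added so that the filter has something to exclude:

```python
def has_term(field, term):
    """Build a predicate that checks for a lowercase term in a whitespace-split field."""
    return lambda doc: term in str(doc.get(field, "")).lower().split()


REGISTERED = [
    {"id": "3", "branches": ["brA", "brB", "brD"], "matches": has_term("title", "crime")},
    # Hypothetical second registration, present only to show the filter at work.
    {"id": "4", "branches": ["brC"], "matches": has_term("title", "crime")},
]


def percolate(doc, branch=None, size=10):
    """Return up to `size` identifiers of matching queries, optionally limited to one branch."""
    hits = []
    for entry in REGISTERED:
        # Step 1: skip registered queries whose metadata fails the filter.
        if branch is not None and branch not in entry["branches"]:
            continue
        # Step 2: run the stored query against the incoming document.
        if entry["matches"](doc):
            hits.append(entry["id"])
    return hits[:size]


book = {"title": "Crime and Punishment", "year": 1886, "available": True}
print(percolate(book, branch="brB"))
```

Without the branch argument both registrations match; with brB only the query registered as 3 is returned, as in the response above.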
Controlling the size of returned results
The number of results matters when it comes to the percolator. The more queries a single document matches, the more results will be returned and the more memory will be needed by Elasticsearch. Because of this, there is one additional thing to note: the size parameter. It allows us to limit the number of matches returned.
Percolator and score calculation
In the previous examples, we filtered our queries using a single term query, but we didn’t think about the scoring process at all. Elasticsearch allows us to calculate the score when using the percolator. Let’s change the previously used document sent to the percolator and adjust it so that scoring is used:
curl -XGET 'localhost:9200/notifier/book/_percolate?pretty' -d '{
"doc":{
"title":"Crime and Punishment",
"otitle":"Преступлéние и наказáние",
"author":"Fyodor Dostoevsky",
"year":1886,
"characters":["Raskolnikov","Sofia Semyonovna Marmeladova"],
"tags":[],
"copies":0,
"available":true
},
"size":10,
"query":{
"term":{
"branches":"brB"
}
},
"track_scores":true,
"sort":{
"_score":"desc"
}
}'
As you can see, we used the query section and included an additional track_scores attribute set to true. This is needed because, by default, Elasticsearch won’t calculate the score, for performance reasons. If we need scores in the percolation process, we should be aware that such queries will be slightly more demanding when it comes to CPU processing power than the ones that omit calculating the score.
Note
In the preceding example, we told Elasticsearch to sort our result on the basis of the score in descending order. This is the default behavior when track_scores is turned on, so we can omit the sort declaration. At the time of writing, sorting on score in descending direction is the only available option.
Combining percolators with other functionalities
If we are allowed to use queries along with the documents sent for percolation, why can we not use other Elasticsearch functionalities? Of course, this is possible. For example, the following document is sent along with an aggregation and the results will include the aggregation calculation:
curl -XGET 'localhost:9200/notifier/book/_percolate?pretty' -d '{
"doc":{
"title":"Crime and Punishment",
"available":true
},
"aggs":{
"test":{
"terms":{
"field":"branches"
}
}
}
}'
As we can see, percolator allows us to run both query and aggregations. Look at the following example document:
curl -XGET 'localhost:9200/notifier/book/_percolate?pretty' -d '{
"doc":{
"title":"Crime and Punishment",
"year":1886,
"available":true
},
"size":10,
"highlight":{
"fields":{
"title":{}
}
}
}'
As you can see, it contains a highlighting section. A fragment of the response returned by Elasticsearch looks as follows:
{
"_index":"notifier",
"_id":"3",
"highlight":{
"title":["<em>Crime</em> and Punishment"]
}
}
Note
There are some limitations when it comes to the query types supported by the percolator functionality. In the current implementation, parent-child relations are not available in the percolator, so you can’t use queries such as has_child, top_children, and has_parent.
Getting the number of matching queries
Sometimes you don’t care about the matched queries and you only want the number of matched queries. In such cases, sending a document against the standard percolator endpoint is not efficient. Elasticsearch exposes the _percolate/count endpoint to handle such cases in an efficient way. An example of such a command follows:
curl -XGET 'localhost:9200/notifier/book/_percolate/count?pretty' -d '{
"doc":{...}
}'
Indexed document percolation
In the final, closing paragraph of the percolation section, we want to show you one more thing: the possibility of percolating a document that is already indexed. To do this, we need to use the GET operation on the document and provide information about which percolator index should be used. Let’s look at the following command:
curl -XGET 'localhost:9200/library/book/1/_percolate?percolate_index=notifier'
This command checks the document with the 1 identifier from our library index against the percolator index defined by the percolate_index parameter. Remember that, by default, Elasticsearch will use the percolator in the same index as the document; that’s why we’ve specified the percolate_index parameter.
Elasticsearch spatial capabilities
Search servers such as Elasticsearch are usually looked at from the perspective of full-text searching. Elasticsearch, because of its marketing as being part of ELK (Elasticsearch, Logstash, and Kibana), is also well known for being able to handle large amounts of time series data. However, this is only a part of the whole view. Sometimes both of the mentioned use cases are not enough. Imagine searching for local services. For the end user, the most important thing is the accuracy of the results. By accuracy, we not only mean the proper results of the full-text search, but also the results being as near as they can in terms of location. In several cases, this is the same as a text search on geographical names such as cities or streets, but in other cases we can find it very useful to be able to search on the basis of the geographical coordinates of our indexed documents. And this is also a functionality that Elasticsearch is capable of handling.
With the release of Elasticsearch 2.2, the geo_point type received a lot of changes, especially internally, where all the optimizations were done. Prior to 2.2, the geo_point type was stored in the index as two not-analyzed string values; this has changed. With the release of Elasticsearch 2.2, the geo_point type got all the great improvements from the Apache Lucene library and is now more efficient.
Mapping preparation for spatial searches
In order to discuss the spatial search functionality, let’s prepare an index with a list of cities. This will be a very simple index with one type named poi (which stands for the point of interest), the name of the city, and its coordinates. The mappings are as follows:
{
"mappings":{
"poi":{
"properties":{
"name":{"type":"string"},
"location":{"type":"geo_point"}
}
}
}
}
Assuming that we put this definition into the mapping1.json file, we can create an index by running the following command:
curl -XPUT localhost:9200/map -d @mapping1.json
The only new thing in the preceding mappings is the geo_point type, which is used for the location field. By using it, we can store the geographical position of our city and use spatial-based functionalities.
Example data
Our example documents1.json file with documents looks as follows:
{"index":{"_index":"map","_type":"poi","_id":1}}
{"name":"New York","location":"40.664167,-73.938611"}
{"index":{"_index":"map","_type":"poi","_id":2}}
{"name":"London","location":[-0.1275,51.507222]}
{"index":{"_index":"map","_type":"poi","_id":3}}
{"name":"Moscow","location":{"lat":55.75,"lon":37.616667}}
{"index":{"_index":"map","_type":"poi","_id":4}}
{"name":"Sydney","location":"-33.859972,151.211111"}
{"index":{"_index":"map","_type":"poi","_id":5}}
{"name":"Lisbon","location":"eycs0p8ukc7v"}
In order to perform a bulk request, we added information about the index name, type, and unique identifiers of our documents; so, we can now easily import this data using the following command:
curl -XPOST localhost:9200/_bulk --data-binary @documents1.json
One thing that we should take a closer look at is the location field. We can use various notations for coordinates. We can provide the latitude and longitude values as a string, as a pair of numbers, or as an object. Note that the string and array methods of providing the geographical location have different orders for the latitude and longitude parameters: the string is "lat,lon", whereas the array is [lon, lat]. The last record shows that there is also a possibility to give the coordinates as a Geohash value (the notation is described in detail at http://en.wikipedia.org/wiki/Geohash).
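Because the ordering differs between notations, it can help to see all four forms reduced to the same (latitude, longitude) pair. The following Python sketch does that for the values in our example data. The to_lat_lon and decode_geohash functions are hypothetical helpers written for this illustration, not part of any Elasticsearch client; the geohash decoder follows the publicly documented algorithm (base-32 characters, bits alternating between longitude and latitude):

```python
BASE32 = "0123456789bcdefghjkmnpqrstuvwxyz"


def decode_geohash(geohash):
    """Decode a geohash into the (lat, lon) center of the cell it describes."""
    lat_range, lon_range = [-90.0, 90.0], [-180.0, 180.0]
    is_lon = True  # bits alternate, starting with longitude
    for char in geohash:
        value = BASE32.index(char)
        for shift in range(4, -1, -1):
            bit = (value >> shift) & 1
            rng = lon_range if is_lon else lat_range
            mid = (rng[0] + rng[1]) / 2
            if bit:
                rng[0] = mid  # keep the upper half
            else:
                rng[1] = mid  # keep the lower half
            is_lon = not is_lon
    return (sum(lat_range) / 2, sum(lon_range) / 2)


def to_lat_lon(location):
    """Normalize the notations used in documents1.json to a (lat, lon) tuple."""
    if isinstance(location, dict):           # {"lat": ..., "lon": ...}
        return (float(location["lat"]), float(location["lon"]))
    if isinstance(location, (list, tuple)):  # [lon, lat]
        lon, lat = location
        return (float(lat), float(lon))
    if "," in location:                      # "lat,lon"
        lat, lon = location.split(",")
        return (float(lat), float(lon))
    return decode_geohash(location)          # anything else is treated as a geohash


for value in ("40.664167,-73.938611", [-0.1275, 51.507222],
              {"lat": 55.75, "lon": 37.616667}, "eycs0p8ukc7v"):
    print(to_lat_lon(value))
```

The last value decodes to a point in Lisbon, at roughly 38.71 degrees north and 9.14 degrees west.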
Additional geo_field properties
With the release of Elasticsearch 2.2, the number of parameters that the geo_point type can accept has been reduced and is as follows:
geohash: Boolean parameter telling Elasticsearch whether the .geohash field should be created. Defaults to false unless geohash_prefix is used.
geohash_precision: Maximum size of geohash and geohash_prefix.
geohash_prefix: Boolean parameter telling Elasticsearch to index the geohash and its prefixes. Defaults to false.
ignore_malformed: Boolean parameter telling Elasticsearch to ignore a badly written geo_field point instead of rejecting the whole document. Defaults to false, which means that the badly formatted geo_field data will result in an indexing error for the whole document.
lat_lon: Boolean parameter telling Elasticsearch to index the spatial data in two separate fields called .lat and .lon. Defaults to false.
precision_step: Parameter allowing control over how our numeric geographical points will be indexed.
Keep in mind that the geohash-related and lat_lon-related properties were not removed for backward-compatibility reasons. The users can still use them. However, the queries will not use them but will instead use the highly optimized data structure that is built during indexing by the geo_point type.
Sample queries
Now let’s look at several examples of using coordinates and solving common requirements in modern applications that require geographical data searching along with full-text searching.
Note
If you are interested in all the geospatial queries that are available for Elasticsearch users, refer to the official documentation available at https://www.elastic.co/guide/en/elasticsearch/reference/current/geo-queries.html.
Distance-based sorting
Let’s start with a very common requirement: sorting the returned results by distance from a given point. In our example, we want to get all the cities and sort them by their distances from the capital of France, Paris. To do this, we send the following query to Elasticsearch:
curl -XGET 'localhost:9200/map/_search?pretty' -d '{
"query":{
"match_all":{}
},
"sort":[{
"_geo_distance":{
"location":"48.8567,2.3508",
"unit":"km"
}
}]
}'
If you remember the Sorting data section from Chapter 4, Extending Your Querying Knowledge, you’ll notice that the format is slightly different. We are using the _geo_distance key to indicate sorting by distance. We must give the base location (the location attribute, which holds the location of Paris in our case), and we need to specify the unit to be used in the results. The available values are km and mi, which stand for kilometers and miles, respectively. The result of such a query will be as follows:
{
"took":5,
"timed_out":false,
"_shards":{
"total":5,
"successful":5,
"failed":0
},
"hits":{
"total":5,
"max_score":null,
"hits":[{
"_index":"map",
"_type":"poi",
"_id":"2",
"_score":null,
"_source":{
"name":"London",
"location":[-0.1275,51.507222]
},
"sort":[343.17487356850313]
},{
"_index":"map",
"_type":"poi",
"_id":"5",
"_score":null,
"_source":{
"name":"Lisbon",
"location":"eycs0p8ukc7v"
},
"sort":[1452.9506736367805]
},{
"_index":"map",
"_type":"poi",
"_id":"3",
"_score":null,
"_source":{
"name":"Moscow",
"location":{
"lat":55.75,
"lon":37.616667
}
},
"sort":[2483.837565935267]
},{
"_index":"map",
"_type":"poi",
"_id":"1",
"_score":null,
"_source":{
"name":"New York",
"location":"40.664167,-73.938611"
},
"sort":[5832.645958617513]
},{
"_index":"map",
"_type":"poi",
"_id":"4",
"_score":null,
"_source":{
"name":"Sydney",
"location":"-33.859972,151.211111"
},
"sort":[16978.094780773998]
}]
}
}
As with the other examples of sorting, Elasticsearch shows information about the value used for sorting. Let’s look at the first record. As we can see, the distance between Paris and London is about 343 km, and if you check a traditional map, you will see that this is true.
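The ordering can be checked independently with the haversine great-circle formula. The Python sketch below is only a sanity check under stated assumptions: it uses a mean Earth radius of 6371 km, whereas the distance calculation inside Elasticsearch may differ slightly, so the numbers agree only approximately. The Lisbon coordinates are an approximate decoding of the geohash in our data, and haversine_km is a name chosen for this illustration:

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius; an assumption of this sketch


def haversine_km(a, b):
    """Great-circle distance in kilometers between two (lat, lon) points in degrees."""
    lat1, lon1 = map(math.radians, a)
    lat2, lon2 = map(math.radians, b)
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(h))


PARIS = (48.8567, 2.3508)
CITIES = {
    "New York": (40.664167, -73.938611),
    "London": (51.507222, -0.1275),
    "Moscow": (55.75, 37.616667),
    "Sydney": (-33.859972, 151.211111),
    "Lisbon": (38.71, -9.14),  # approximate decoding of the geohash eycs0p8ukc7v
}

# Sort city names by their distance from Paris, nearest first.
by_distance = sorted(CITIES, key=lambda name: haversine_km(PARIS, CITIES[name]))
for name in by_distance:
    print(name, round(haversine_km(PARIS, CITIES[name])), "km")
```

The resulting order (London, Lisbon, Moscow, New York, Sydney) is the same as in the response, and the London distance lands within a kilometer or so of the 343 km reported by Elasticsearch.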
Bounding box filtering
The next example that we want to show is narrowing down the results to a selected area that is bounded by a given rectangle. This is very handy if we want to show results on the map or when we allow a user to mark the map area for searching. You already read about filters in the Filtering your results section of Chapter 4, Extending Your Querying Knowledge, but there we didn’t mention spatial filters. The following query shows how we can filter by using the bounding box:
curl -XGET 'localhost:9200/map/_search?pretty' -d '{
"query":{
"bool":{
"must":{"match_all":{}},
"filter":{
"geo_bounding_box":{
"location":{
"top_left":"52.4796,-1.903",
"bottom_right":"48.8567,2.3508"
}
}
}
}
}
}'
In the preceding example, we selected a map fragment between Birmingham and Paris by providing the top-left and bottom-right corner coordinates. These two corners are enough to specify any rectangle we want, and Elasticsearch will do the rest of the calculation for us. The following screenshot shows the specified rectangle on the map:
[Screenshot: the selected rectangle between Birmingham and Paris on the map]
As we can see, the only city from our data that meets the criteria is London. So, let’s check whether Elasticsearch knows this by running the preceding query. Let’s now look at the returned results:
{
"took":38,
"timed_out":false,
"_shards":{
"total":5,
"successful":5,
"failed":0
},
"hits":{
"total":1,
"max_score":1.0,
"hits":[{
"_index":"map",
"_type":"poi",
"_id":"2",
"_score":1.0,
"_source":{
"name":"London",
"location":[-0.1275,51.507222]
}
}]
}
}
As you can see, again Elasticsearch agrees with the map.
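The bounding box test itself is just a pair of range comparisons. The following Python sketch applies it to our five cities. It is a simplification written for illustration: in_bounding_box is a hypothetical helper, it does not handle boxes that cross the 180th meridian, and the Lisbon coordinates are an approximate decoding of the geohash in our data:

```python
def in_bounding_box(point, top_left, bottom_right):
    """Check whether a (lat, lon) point lies inside the box given by two corners."""
    lat, lon = point
    top, left = top_left
    bottom, right = bottom_right
    return bottom <= lat <= top and left <= lon <= right


TOP_LEFT = (52.4796, -1.903)      # Birmingham
BOTTOM_RIGHT = (48.8567, 2.3508)  # Paris

CITIES = {
    "New York": (40.664167, -73.938611),
    "London": (51.507222, -0.1275),
    "Moscow": (55.75, 37.616667),
    "Sydney": (-33.859972, 151.211111),
    "Lisbon": (38.71, -9.14),  # approximate decoding of the geohash eycs0p8ukc7v
}

inside = [name for name, point in CITIES.items()
          if in_bounding_box(point, TOP_LEFT, BOTTOM_RIGHT)]
print(inside)
```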
Limiting the distance
The last example shows the next common requirement: limiting the results to the places that are located no further than the defined distance from a given point. For example, if we want to limit our results to all the cities within the 500 km radius from Paris, we can use the following query:
curl -XGET 'localhost:9200/map/_search?pretty' -d '{
"query":{
"bool":{
"must":{"match_all":{}},
"filter":{
"geo_distance":{
"location":"48.8567,2.3508",
"distance":"500km"
}
}
}
}
}'
If everything goes well, Elasticsearch should only return a single record for the preceding query, and the record should be London again. However, we will leave it for you as a reader to check.
Arbitrary geo shapes
Sometimes, using a single geographical point or a single rectangle is just not enough. In such cases something more sophisticated is needed, and Elasticsearch addresses this by giving you the possibility to define shapes. In order to show you how we can leverage custom shape-limiting in Elasticsearch, we need to modify our index or create a new one and introduce the geo_shape type. Our new mapping looks as follows (we will use this to create an index called map2):
{
"mappings":{
"poi":{
"properties":{
"name":{"type":"string","index":"not_analyzed"},
"location":{"type":"geo_shape"}
}
}
}
}
Assuming we wrote the preceding mapping definition to the mapping2.json file, we can create an index by using the following command:
curl -XPUT localhost:9200/map2 -d @mapping2.json
Note
Elasticsearch allows us to set several attributes for the geo_shape type. The most commonly used is the precision parameter. During indexing, the shapes have to be converted to a set of terms. The more accuracy required, the more terms should be generated, which is directly reflected in the index size and performance. Precision can be defined in the following units: in, inch, yd, yard, mi, miles, km, kilometers, m, meters, cm, centimeters, or mm, millimeters. By default, the precision is set to 50m.
Next, let’s change our example data to match our new index structure and create the documents2.json file with the following contents:
{"index":{"_index":"map2","_type":"poi","_id":1}}
{"name":"New York","location":{"type":"point","coordinates":[-73.938611,40.664167]}}
{"index":{"_index":"map2","_type":"poi","_id":2}}
{"name":"London","location":{"type":"point","coordinates":[-0.1275,51.507222]}}
{"index":{"_index":"map2","_type":"poi","_id":3}}
{"name":"Moscow","location":{"type":"point","coordinates":[37.616667,55.75]}}
{"index":{"_index":"map2","_type":"poi","_id":4}}
{"name":"Sydney","location":{"type":"point","coordinates":[151.211111,-33.865143]}}
{"index":{"_index":"map2","_type":"poi","_id":5}}
{"name":"Lisbon","location":{"type":"point","coordinates":[-9.142685,38.736946]}}
The structure of a field of the geo_shape type is different from geo_point. The format it uses is called GeoJSON (http://en.wikipedia.org/wiki/GeoJSON). It allows us to define various geographical types. Now it’s time to index our data:
curl -XPOST localhost:9200/_bulk --data-binary @documents2.json
Let’s sum up the types that we can use during querying, at least the ones that we think are the most useful ones.
Point
A point is defined by an array in which the first element is the longitude and the second is the latitude. An example of such a shape is as follows:
{
"type":"point",
"coordinates":[-0.1275,51.507222]
}
Envelope
An envelope defines a box given by the coordinates of the upper-left and bottom-right corners of the box. An example of such a shape is as follows:
{
"type":"envelope",
"coordinates":[[-0.087890625,51.50874245880332],[2.4169921875,
48.80686346108517]]
}
Polygon
A polygon defines a list of points that are connected to create our polygon. The first and the last point in the array must be the same so that the shape is closed. An example of such a shape is as follows:
{
"type":"polygon",
"coordinates":[[
[-5.756836,49.991408],
[-7.250977,55.124723],
[1.845703,51.500194],
[-5.756836,49.991408]
]]
}
If you look closely at the shape definition, you will find an additional level of arrays. Thanks to this, you can define more than a single polygon. In such a case, the first polygon defines the base shape and the rest of the polygons are the shapes that will be excluded from the base shape.
Multipolygon
The multipolygon shape allows us to create a shape that consists of multiple polygons. An example of such a shape is as follows:
{
"type":"multipolygon",
"coordinates":[
[[
[-5.756836,49.991408],
[-7.250977,55.124723],
[1.845703,51.500194],
[-5.756836,49.991408]
]],[[
[-0.087890625,51.50874245880332],
[2.4169921875,48.80686346108517],
[3.88916015625,51.01375465718826],
[-0.087890625,51.50874245880332]
]]]
}
The multipolygon shape contains multiple polygons and follows the same rules as the polygon type. So, we can have multiple polygons and, in addition to this, we can include multiple exclusion shapes.
An example usage
Now that we have our index with the geo_shape fields, we can check which cities are located in the UK. The query that will allow us to do this looks as follows:
curl -XGET 'localhost:9200/map2/_search?pretty' -d '{
"query":{
"bool":{
"must":{"match_all":{}},
"filter":{
"geo_shape":{
"location":{
"shape":{
"type":"polygon",
"coordinates":[[
[-5.756836,49.991408],[-7.250977,55.124723],
[-3.955078,59.352096],[1.845703,51.500194],
[-5.756836,49.991408]
]]
}
}
}
}
}
}
}'
The polygon type defines the boundaries of the UK (in a very, very imprecise way), and Elasticsearch’s response is as follows:
{
"took":7,
"timed_out":false,
"_shards":{
"total":5,
"successful":5,
"failed":0
},
"hits":{
"total":1,
"max_score":1.0,
"hits":[{
"_index":"map2",
"_type":"poi",
"_id":"2",
"_score":1.0,
"_source":{
"name":"London",
"location":{
"type":"point",
"coordinates":[-0.1275,51.507222]
}
}
}]
}
}
As far as we know, the response is correct.
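We can double-check the answer with a classic ray-casting point-in-polygon test. The Python sketch below treats longitude and latitude as flat x and y coordinates, which is a simplification: it ignores the Earth's curvature and the term-based precision that Elasticsearch applies to geo_shape fields, so it should be read as an illustration only. The point_in_ring function is a name chosen for this example, and the data repeats the polygon from the query and the points from documents2.json:

```python
def point_in_ring(point, ring):
    """Ray casting: count how many ring edges a ray going east from the point crosses."""
    x, y = point
    inside = False
    for (x1, y1), (x2, y2) in zip(ring, ring[1:]):
        # Only edges that straddle the horizontal line through the point can be crossed.
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


# The rough UK outline from the query, as [lon, lat] pairs (closed ring).
UK = [
    [-5.756836, 49.991408], [-7.250977, 55.124723],
    [-3.955078, 59.352096], [1.845703, 51.500194],
    [-5.756836, 49.991408],
]

# City points from documents2.json, also as [lon, lat].
CITIES = {
    "New York": [-73.938611, 40.664167],
    "London": [-0.1275, 51.507222],
    "Moscow": [37.616667, 55.75],
    "Sydney": [151.211111, -33.865143],
    "Lisbon": [-9.142685, 38.736946],
}

in_uk = [name for name, point in CITIES.items() if point_in_ring(point, UK)]
print(in_uk)
```

Only London falls inside the outline, which agrees with the response returned by Elasticsearch.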
Storing shapes in the index
Usually, shape definitions are complex, and the defined areas don’t change too often (for example, the boundaries of the UK). In such cases, it is convenient to define the shapes in the index and use them in queries. This is possible, and we will now discuss how to do it. As usual, we will start with the appropriate mapping, which is as follows:
{
"mappings":{
"country":{
"properties":{
"name":{"type":"string","index":"not_analyzed"},
"area":{"type":"geo_shape"}
}
}
}
}
This mapping is similar to the mapping used previously. We have only changed the field name and saved it in the mapping3.json file. Let’s create a new index by running the following command:
curl -XPUT localhost:9200/countries -d @mapping3.json
The example data that we will use looks as follows (stored in the file called documents3.json):
{"index":{"_index":"countries","_type":"country","_id":1}}
{"name":"UK","area":{"type":"polygon","coordinates":[[[-5.756836,
49.991408],[-7.250977,55.124723],[-3.955078,59.352096],[1.845703,
51.500194],[-5.756836,49.991408]]]}}
{"index":{"_index":"countries","_type":"country","_id":2}}
{"name":"France","area":{"type":"polygon","coordinates":[[[
3.1640625,42.09822241118974],[-1.7578125,43.32517767999296],[
-4.21875,48.22467264956519],[2.4609375,50.90303283111257],[
7.998046875,48.980216985374994],[7.470703125,44.08758502824516],[
3.1640625,42.09822241118974]]]}}
{"index":{"_index":"countries","_type":"country","_id":3}}
{"name":"Spain","area":{"type":"polygon","coordinates":[[[
3.33984375,42.22851735620852],[-1.845703125,43.32517767999296],[
-9.404296875,43.19716728250127],[-6.6796875,41.57436130598913],[
-7.3828125,36.87962060502676],[-2.109375,36.52729481454624],[
3.33984375,42.22851735620852]]]}}
To index the data, we just need to run the following command:
curl -XPOST localhost:9200/_bulk --data-binary @documents3.json
As you can see in the data, each document contains a polygon type. The polygons define the area of the given countries (again, it is far from being accurate). If you remember, the first point of a shape needs to be the same as the last one so that the shape is closed. Now, let’s change our query to include the shapes from the index. Our new query looks as follows:
curl -XGET localhost:9200/map2/_search?pretty -d '{
"query":{
"bool":{
"must":{"match_all":{}},
"filter":{
"geo_shape":{
"location":{
"indexed_shape":{
"index":"countries",
"type":"country",
"path":"area",
"id":"1"
}
}
}
}
}
}
}'
When comparing these two queries, we can note that the shape object changed to indexed_shape. We need to tell Elasticsearch where to look for this shape. We can do this by defining the index (the index property, which defaults to shapes), the type (the type property), and the path (the path property, which defaults to shape). The last required item is the id property of the shape. In our case, this is 1. However, if you want to index more shapes, we advise you to index the shapes with their name as their identifier.
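The two points made in this section, that a polygon ring must end where it starts and that a query can reference a stored shape, can be sketched in a few lines of Python. This is only an illustration: the helper names is_closed_ring and indexed_shape_query are hypothetical, and the code builds the request body without contacting Elasticsearch.

```python
import json

# Rough UK outline copied from the example data; coordinates are [lon, lat].
UK_RING = [
    [-5.756836, 49.991408], [-7.250977, 55.124723],
    [-3.955078, 59.352096], [1.845703, 51.500194],
    [-5.756836, 49.991408],
]


def is_closed_ring(ring):
    """Return True when the ring has at least four points and ends where it starts."""
    return len(ring) >= 4 and ring[0] == ring[-1]


def indexed_shape_query(field, index, doc_type, shape_id, path):
    """Build a search body that filters `field` by a shape stored in another index."""
    return {
        "query": {
            "bool": {
                "must": {"match_all": {}},
                "filter": {
                    "geo_shape": {
                        field: {
                            "indexed_shape": {
                                "index": index,
                                "type": doc_type,
                                "path": path,
                                "id": shape_id,
                            }
                        }
                    }
                },
            }
        }
    }


body = indexed_shape_query("location", "countries", "country", "1", "area")
print(is_closed_ring(UK_RING))
print(json.dumps(body, indent=2))
```

The printed body has the same structure as the query shown earlier, so it could be passed to curl with the -d switch.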
Using suggesters
A long time ago, starting from Elasticsearch 0.90 (which was released on April 29, 2013), we got the ability to use so-called suggesters. We can define a suggester as a functionality allowing us to correct the user’s spelling mistakes and build autocomplete functionality while keeping performance in mind. This section is dedicated to these functionalities and will help you learn about them. We will discuss each available suggester type and show the most common properties that allow us to control them. However, keep in mind that this section is not a comprehensive guide describing each and every property. A description of all the details of suggesters is a very broad topic and is out of the scope of this book. If you want to dig into their functionality, refer to the official Elasticsearch documentation (https://www.elastic.co/guide/en/elasticsearch/reference/current/search-suggesters.html) or to the Mastering Elasticsearch Second Edition book published by Packt Publishing.
Available suggester types
These have changed since the initial introduction of the Suggest API to Elasticsearch. We are now able to use four types of suggesters:
term: A suggester returning corrections for each word passed to it. Useful for suggestions that are not phrases, such as single term queries.
phrase: A suggester working on phrases, returning a proper phrase.
completion: A suggester designed to provide fast and efficient autocomplete results.
context: Extension to the Suggest API of Elasticsearch. Allows us to handle parts of the suggest queries in memory and is thus very effective in terms of performance.
Including suggestions
Let’s now try getting suggestions along with the query results. For example, let’s use a match_all query and try getting a suggestion for a serlock holnes phrase, which has two terms spelled incorrectly. To do this, we run the following command:
curl -XGET 'localhost:9200/library/_search?pretty' -d '{
"query":{
"match_all":{}
},
"suggest":{
"first_suggestion":{
"text":"serlock holnes",
"term":{
"field":"_all"
}
}
}
}'
As you can see, we’ve introduced a new section to our query: the suggest one. We’ve specified the text we want to get the correction for by using the text property. We’ve specified the suggester we want to use (the term one) and configured it by specifying the name of the field that should be used for building suggestions with the field property. first_suggestion is the name we give to our suggester; we need to do this because multiple suggesters can be used in one request. This is how you send a request for suggestions in general.
If we want to get multiple suggestions for the same text, we can embed our suggestions in the suggest object and place the text property as the suggest object option. For example, if we want to get suggestions for the serlock holnes text for the title field and for the _all field, we run the following command:
curl -XGET 'localhost:9200/library/_search?pretty' -d '{
"query":{
"match_all":{}
},
"suggest":{
"text":"serlock holnes",
"first_suggestion":{
"term":{
"field":"_all"
}
},
"second_suggestion":{
"term":{
"field":"title"
}
}
}
}'
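When the number of fields grows, the shared-text form of the request is easy to generate programmatically. The following Python sketch builds the same body as the preceding command; build_term_suggest is a hypothetical helper name, and nothing is sent to Elasticsearch.

```python
import json


def build_term_suggest(text, fields_by_name):
    """Build a suggest section: one shared text, one term suggester per name."""
    suggest = {"text": text}
    for name, field in fields_by_name.items():
        suggest[name] = {"term": {"field": field}}
    return suggest


body = {
    "query": {"match_all": {}},
    "suggest": build_term_suggest(
        "serlock holnes",
        {"first_suggestion": "_all", "second_suggestion": "title"},
    ),
}
print(json.dumps(body, indent=2))
```

Note that with this layout a suggester cannot itself be named text, because that key is already taken by the shared input.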
Suggester response
Now let’s look at the response of the first query we sent. As you can guess, the response includes both the query results and the suggestions:
{
"took":10,
"timed_out":false,
"_shards":{
"total":5,
"successful":5,
"failed":0
},
"hits":{
"total":4,
"max_score":1.0,
"hits":[...]
},
"suggest":{
"first_suggestion":[{
"text":"serlock",
"offset":0,
"length":7,
"options":[{
"text":"sherlock",
"score":0.85714287,
"freq":1
}]
},{
"text":"holnes",
"offset":8,
"length":6,
"options":[{
"text":"holmes",
"score":0.8333333,
"freq":1
}]
}]
}
}
We can see that we got both the search results and the suggestions (we’ve omitted the results to make the example more readable) in the response. The term suggester returned a list of possible suggestions for each term that was present in the text parameter. For each term, the term suggester returns an array of possible suggestions. Looking at the data returned for the serlock term, we can see the original word (the text parameter), its offset in the original text parameter (the offset parameter), and its length (the length parameter).
The options array contains suggestions for the given word and will be empty if Elasticsearch doesn’t find any suggestions. Each entry in this array is a suggestion described by the following properties:
text: Text of the suggestion.
score: Suggestion score; the higher the score, the better the suggestion.
freq: Frequency of the suggestion. The frequency represents how many times the word appears in the documents in the index we are running the suggestion query against.
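One common use of the offset, length, and options values is to rebuild a corrected version of the user’s input. The following Python sketch does this for the suggest section of the response shown above. The function name apply_suggestions is hypothetical, and picking the highest-scoring option is only one possible policy.

```python
# Trimmed copy of the "suggest" section from the response above.
suggest = {
    "first_suggestion": [
        {"text": "serlock", "offset": 0, "length": 7,
         "options": [{"text": "sherlock", "score": 0.85714287, "freq": 1}]},
        {"text": "holnes", "offset": 8, "length": 6,
         "options": [{"text": "holmes", "score": 0.8333333, "freq": 1}]},
    ]
}


def apply_suggestions(original, entries):
    """Replace each token with its highest-scoring option, if it has one."""
    corrected = original
    # Work right to left so earlier offsets stay valid after a replacement.
    for entry in sorted(entries, key=lambda e: e["offset"], reverse=True):
        if not entry["options"]:
            continue  # no suggestion found: keep the original token
        best = max(entry["options"], key=lambda o: o["score"])
        start, end = entry["offset"], entry["offset"] + entry["length"]
        corrected = corrected[:start] + best["text"] + corrected[end:]
    return corrected


print(apply_suggestions("serlock holnes", suggest["first_suggestion"]))
```

Running it prints sherlock holmes, which is the text we could offer to the user as a "did you mean" hint.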
Term suggester
The term suggester works on the basis of string edit distance. This means that the suggestion with the fewest characters that need to be changed, added, or removed to make the suggestion look like the original word is the best one. For example, let’s take the words worl and work. To change the worl term to work, we need to change the l letter to k, so it means a distance of 1. The text provided to the suggester is of course analyzed, and then terms are chosen to be suggested.
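To make the notion of edit distance concrete, here is a minimal Python sketch of the classic Levenshtein distance. It is meant as an illustration of the idea only; the exact distance measure that Elasticsearch applies internally may differ from this textbook version.

```python
def edit_distance(a, b):
    """Levenshtein distance: minimum single-character inserts, deletes, or substitutions."""
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(
                previous[j] + 1,         # delete ca
                current[j - 1] + 1,      # insert cb
                previous[j - 1] + cost,  # substitute, or keep when equal
            ))
        previous = current
    return previous[-1]


print(edit_distance("worl", "work"))
print(edit_distance("serlock", "sherlock"))
print(edit_distance("holnes", "holmes"))
```

All three calls print 1: one substitution for worl and holnes, and one inserted letter for serlock.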
Term suggester configuration options
The common and most used term suggester options can be used for all the suggester implementations that are based on the term one. Currently, these are the phrase suggester and of course the base term one. The available options are:
text: The text we want to get the suggestions for. This parameter is required in order for the suggester to work.
field: Another required parameter that we need to provide. The field parameter allows us to set which field the suggestions should be generated for.
analyzer: The name of the analyzer which should be used to analyze the text provided in the text parameter. If not set, Elasticsearch utilizes the analyzer used for the field provided by the field parameter.
size: Defaults to 5 and specifies the maximum number of suggestions allowed to be returned for each term provided in the text parameter.
suggest_mode: Controls which suggestions will be included and for what terms the suggestions will be returned. The possible options are: missing, the default behavior, which means that the suggester will only provide suggestions for terms that are not present in the index; popular, which means that the suggestions will only be returned when they are more frequent than the provided term; and finally always, which means that suggestions will be returned every time.
sort: Allows us to specify how the suggestions are sorted in the result returned by Elasticsearch. By default, it is set to score, which tells Elasticsearch that the suggestions should be sorted by the suggestion score first, the suggestion document frequency next, and finally by the term. The second possible value is frequency, which means that the results are first sorted by the document frequency, then by the score, and finally by the term.
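The two sort orders described above differ only in which key comes first. The following Python sketch mimics that ordering on the client side. The candidate data is made up for illustration (it is not real Elasticsearch output), and sort_options is a hypothetical helper.

```python
# Hypothetical candidate suggestions for one term (not real Elasticsearch output).
options = [
    {"text": "work", "score": 0.75, "freq": 12},
    {"text": "world", "score": 0.75, "freq": 40},
    {"text": "word", "score": 0.5, "freq": 90},
]


def sort_options(options, sort="score"):
    """Order options the way the sort property is described above."""
    if sort == "score":
        key = lambda o: (-o["score"], -o["freq"], o["text"])
    elif sort == "frequency":
        key = lambda o: (-o["freq"], -o["score"], o["text"])
    else:
        raise ValueError("sort must be 'score' or 'frequency'")
    return sorted(options, key=key)


print([o["text"] for o in sort_options(options, "score")])
print([o["text"] for o in sort_options(options, "frequency")])
```

With score sorting, world comes before work because the scores tie and world is more frequent; with frequency sorting, word moves to the front.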
Additional term suggester options
In addition to the preceding common term suggest options, Elasticsearch allows us to use additional ones that only make sense for the term suggester itself. Some of these options are as follows:
lowercase_terms: When set to true, it tells Elasticsearch to lowercase all the terms that are produced from the text field after analysis.
max_edits: It defaults to 2 and specifies the maximum edit distance that the suggestion can have to be returned as a term suggestion. Elasticsearch allows us to set this value to 1 or 2.
prefix_len: The number of leading characters that a candidate must share with the original term. By default, it is set to 1. If we are struggling with suggester performance, increasing this value will improve the overall performance, because fewer suggestions will need to be processed.
min_word_len: It defaults to 4 and specifies the minimum number of characters a suggestion must have in order to be returned on the suggestions list.
shard_size: It defaults to the value specified by the size parameter and allows us to set the maximum number of suggestions that should be read from each shard. Setting this property to values higher than the size parameter can result in more accurate document frequency at the cost of degradation in suggester performance.
Note
The provided list of parameters does not contain all the options that are available for the term suggester. Refer to the official Elasticsearch documentation for reference, at https://www.elastic.co/guide/en/elasticsearch/reference/current/search-suggesters-term.html.
Phrase suggester
The term suggester provides a great way to correct user spelling mistakes on a per-term basis, but it is not great for phrases. That’s why the phrase suggester was introduced. It is built on top of the term suggester, but adds additional phrase calculation logic to it.
Let’s start with an example of how to use the phrase suggester. This time we will omit the query section in our query. We do that by running the following command:
curl -XGET 'localhost:9200/library/_search?pretty' -d '{
"suggest":{
"text":"sherlock holnes",
"our_suggestion":{
"phrase":{"field":"_all"}
}
}
}'
As you can see in the preceding command, it is almost the same as the one we sent when using the term suggester, but instead of specifying the term suggester type we’ve specified the phrase type. The response to the preceding command is as follows:
{
"took":24,
"timed_out":false,
"_shards":{
"total":5,
"successful":5,
"failed":0
},
"hits":{
"total":4,
"max_score":1.0,
"hits":[...]
},
"suggest":{
"our_suggestion":[{
"text":"sherlock holnes",
"offset":0,
"length":15,
"options":[{
"text":"sherlock holmes",
"score":0.12227806
}]
}]
}
}
As you can see, the response is very similar to the one returned by the term suggester but, instead of a single word being returned, it is already combined and returned as a phrase.
Configuration
Because the phrase suggester is based on the term suggester, it can also use some of the configuration options provided by it. Those options are: text, size, analyzer, and shard_size. In addition to the mentioned properties, the phrase suggester exposes additional options. Some of these options are:
max_errors: Specifies the maximum number (or percentage) of terms that can be erroneous in order to create a correction using it. The value of this property can be either an integer number, such as 1, or a float between 0 and 1 which will be treated as a percentage value. By default, it is set to 1, which means that at most a single term can be misspelled in a given correction.
separator: Defaults to a whitespace character and specifies the separator that will be used to divide the terms in the resulting bigram field.
Note
The provided list of parameters does not contain all the options that are available for the phrase suggester. In fact, the list is way more extensive than what we’ve provided. Refer to the official Elasticsearch documentation for reference, at https://www.elastic.co/guide/en/elasticsearch/reference/current/search-suggesters-phrase.html, or to Mastering Elasticsearch Second Edition published by Packt Publishing.
Completion suggester
The completion suggester allows us to create autocomplete functionality in a very performance-effective way, because it stores complicated structures in the index instead of calculating them during query time. We need to prepare Elasticsearch for that by using a dedicated field type called completion. Let’s assume that we want to create an autocomplete feature to allow us to show book authors. In addition to the author’s name, we want to return the identifiers of the books she/he wrote. We start with creating the authors index by running the following command:
curl -XPOST 'localhost:9200/authors' -d '{
"mappings":{
"author":{
"properties":{
"name":{"type":"string"},
"ac":{
"type":"completion",
"payloads":true,
"analyzer":"standard",
"search_analyzer":"standard"
}
}
}
}
}'
Our index will contain a single type called author. Each document will have two fields: the name and the ac field, which is the field we will use for autocomplete. We’ve defined the ac field using the completion type. In addition to that, we’ve used the standard analyzer for both the index and the query time. The last thing is the payload, the additional, optional information we will return along with the suggestion; in our case it will be an array of book identifiers.
Indexing data
To index the data, we need to provide some additional information along with the ones we usually provide during indexing. Let’s look at the following commands that index two documents describing the authors:
curl -XPOST 'localhost:9200/authors/author/1' -d '{
"name":"Fyodor Dostoevsky",
"ac":{
"input":["fyodor","dostoevsky"],
"output":"Fyodor Dostoevsky",
"payload":{"books":["123456","123457"]}
}
}'
curl -XPOST 'localhost:9200/authors/author/2' -d '{
"name":"Joseph Conrad",
"ac":{
"input":["joseph","conrad"],
"output":"Joseph Conrad",
"payload":{"books":["121211"]}
}
}'
Note the structure of the data for the ac field. We have provided the input, output, and payload properties. The optional payload property is used to provide the additional information that will be returned. The input property is used to provide the input information that will be used for building the completion used by the suggester. It will be used for user input matching. The optional output property is used to tell the suggester which data should be returned for the document.
We can also omit the additional parameters section and index data in the way we are used to, just like in the following example:
curl -XPOST 'localhost:9200/authors/author/1' -d '{
"name":"Fyodor Dostoevsky",
"ac":"Fyodor Dostoevsky"
}'
However, because the completion suggester uses an FST (finite state transducer) under the hood and matches input from its beginning, we won’t be able to find the preceding document by starting with the second part of the ac field. That’s why we think that indexing the data in the way we showed first is more convenient, because we can explicitly control what we want to match and what we want to show as an output.
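The effect of prefix-only matching can be reproduced with a toy model. The Python sketch below uses a sorted list and binary search as a stand-in for the real structure; it is not how Elasticsearch implements the completion field, and the PrefixIndex class is entirely hypothetical. It only shows why separate input values make the second part of a name findable.

```python
from bisect import bisect_left


class PrefixIndex:
    """Toy stand-in for the completion field: a sorted list of (input, output) pairs."""

    def __init__(self):
        self._entries = []

    def add(self, inputs, output):
        for text in inputs:
            self._entries.append((text.lower(), output))
        self._entries.sort()

    def suggest(self, prefix):
        prefix = prefix.lower()
        # Jump to the first entry that could start with the prefix.
        start = bisect_left(self._entries, (prefix, ""))
        outputs = []
        for text, output in self._entries[start:]:
            if not text.startswith(prefix):
                break
            if output not in outputs:
                outputs.append(output)
        return outputs


explicit = PrefixIndex()
explicit.add(["fyodor", "dostoevsky"], "Fyodor Dostoevsky")

plain = PrefixIndex()
plain.add(["Fyodor Dostoevsky"], "Fyodor Dostoevsky")

print(explicit.suggest("dos"))
print(plain.suggest("dos"))
print(plain.suggest("fyo"))
```

The first call finds the author, the second returns an empty list, and the third finds the author again, which mirrors the behavior described above.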
Querying indexed completion suggester data
If we want to find documents that have authors starting with fyo, we run the following command:
curl -XGET 'localhost:9200/authors/_suggest?pretty' -d '{
"authorsAutocomplete":{
"text":"fyo",
"completion":{
"field":"ac"
}
}
}'
Before we look at the results, let’s discuss the query. As you can see, we’ve run the command to the _suggest endpoint, because we don’t want to run a standard query; we are just interested in the autocomplete results. The query is quite simple. We set its name to authorsAutocomplete, we set the text we want to get the completion for (the text property), and we added the completion object with the configuration in it. The result of the preceding command looks as follows:
{
"_shards":{
"total":5,
"successful":5,
"failed":0
},
"authorsAutocomplete":[{
"text":"fyo",
"offset":0,
"length":3,
"options":[{
"text":"Fyodor Dostoevsky",
"score":1.0,
"payload":{
"books":["123456","123457"]
}
}]
}]
}
As you can see in the response, we get the document we were looking for along with the payload information, if it is available (for the preceding response, it is).
We can also use fuzzy searches, which allow us to tolerate spelling mistakes. We do that by including the additional fuzzy section in our query. For example, to enable fuzzy matching in the completion suggester and set the maximum edit distance to 2 (which means that a maximum of two errors is allowed), we send the following query:
curl -XGET 'localhost:9200/authors/_suggest?pretty' -d '{
"authorsAutocomplete":{
"text":"fio",
"completion":{
"field":"ac",
"fuzzy":{
"edit_distance":2
}
}
}
}'
Although we’ve made a spelling mistake, we will still get the same results as we got earlier.
Custom weights
By default, the term frequency is used to determine the weight of the document returned by the completion suggester. However, this may not be the best solution. In such cases, it is useful to define the weight of the suggestion by specifying the weight property for the field defined as completion. The weight property should be set to an integer value. The higher the weight property value, the more important the suggestion. For example, if we want to specify a weight for the first document in our example, we run the following command:
curl -XPOST 'localhost:9200/authors/author/1' -d '{
"name":"Fyodor Dostoevsky",
"ac":{
"input":["fyodor","dostoevsky"],
"output":"Fyodor Dostoevsky",
"payload":{"books":["123456","123457"]},
"weight":30
}
}'
Now if we run our example query, the results will be as follows:
{
...
"authorsAutocomplete":[{
"text":"fyo",
"offset":0,
"length":3,
"options":[{
"text":"Fyodor Dostoevsky",
"score":30.0,
"payload":{
"books":["123456","123457"]
}
}]
}]
}
Look how the score of the result changed. In our initial example, it was 1.0 and now it is 30.0. This is so because we set the weight parameter to 30 during indexing.
Context suggester
The context suggester is an extension to the Elasticsearch Suggest API for Elasticsearch 2.1 and older versions that we just discussed. When describing the completion suggester for Elasticsearch 2.1, we mentioned that this suggester allows us to handle suggester-related searches entirely in memory. Using this suggester, we can define the so-called context for the query that will limit the suggestions to a subset of documents. Because we define the context in the mappings, it is calculated during indexing, which makes query time calculations easier and less demanding in terms of performance.
Note
Remember that this section is related to Elasticsearch 2.1. Contexts in Elasticsearch 2.2 are handled differently and were covered when discussing the completion suggester.
Context types
Elasticsearch 2.1 supports two types of context: category and geo. The category type of context allows us to assign a document to one or more categories during the index time. Later, during the query time, we can tell Elasticsearch which category we are interested in and Elasticsearch will limit the suggestions to those categories. The geo context allows us to limit the documents returned by the suggesters to a given location or to a certain distance from a point. The nice thing about context is that we can have multiple contexts. For example, we can have both the category context and the geo context for the same document. Let’s now see what we need to do to use context in suggestions.
Using context
Using the geo and category context is very similar; they just differ in parameters. We will show you how to use contexts in an example using the simpler category context and later we will get back to the geo context and show you what we need to provide.
The first step when using the context suggester is creating a proper mapping. Let’s get back to our author mapping, but this time let’s assume that each author can be given one or more categories: the brand of books she/he is writing. This will be our context. The mappings using the context look as follows:
curl -XPOST 'localhost:9200/authors_context' -d '{
"mappings":{
"author":{
"properties":{
"name":{"type":"string"},
"ac":{
"type":"completion",
"analyzer":"simple",
"search_analyzer":"simple",
"context":{
"brand":{
"type":"category",
"default":["none"]
}
}
}
}
}
}
}'
We’ve introduced a new section in our ac field definition: context. Each context is given a name, which is brand in our case, and inside that object we provide configuration. We need to provide the type using the type property; we will be using the category context suggester now. In addition to that, we’ve set the default array, which provides us with the value or values that should be used as the default context. If we want, we can also provide the path property, which will point Elasticsearch to a field in the documents from which the context value should be taken.
We can now index a single author by modifying the commands we used earlier, because we need to provide the context:
curl -XPOST 'localhost:9200/authors_context/author/1' -d '{
"name":"Fyodor Dostoevsky",
"ac":{
"input":"Fyodor Dostoevsky",
"context":{
"brand":"drama"
}
}
}'
As you can see, the ac field definition is a bit different now; it is an object. The input property is used to provide the value for autocomplete and the context object is used to provide the values for each of the contexts defined in the mappings.
Finally, we can query the data. As you could imagine, we will again provide the context we are interested in. The query that does that looks as follows:
curl -XGET 'localhost:9200/authors_context/_suggest?pretty' -d '{
"authorsAutocomplete":{
"text":"fyo",
"completion":{
"field":"ac",
"context":{
"brand":"drama"
}
}
}
}'
As you can see, we’ve included the context object in the query inside the completion section and we’ve set the context we are interested in using the context name. The response returned by Elasticsearch is as follows:
{
"_shards":{
"total":5,
"successful":5,
"failed":0
},
"authorsAutocomplete":[{
"text":"fyo",
"offset":0,
"length":3,
"options":[{
"text":"Fyodor Dostoevsky",
"score":1.0
}]
}]
}
However, if we change the brand context to comedy, for example, Elasticsearch will return no results, because we don’t have authors with such a context. Let’s test it by running the following query:
curl -XGET 'localhost:9200/authors_context/_suggest?pretty' -d '{
"authorsAutocomplete":{
"text":"fyo",
"completion":{
"field":"ac",
"context":{
"brand":"comedy"
}
}
}
}'
This time Elasticsearch returns the following response:
{
"_shards":{
"total":5,
"successful":5,
"failed":0
},
"authorsAutocomplete":[{
"text":"fyo",
"offset":0,
"length":3,
"options":[]
}]
}
This is because no author with the brand context and the value of comedy is present in the authors_context index.
Using the geo location context
The geo context is similar to the category context when it comes to using it. However, instead of filtering by terms, we filter using geographical points and distances. When we use the geo context, we need to provide precision, which defines the precision of the calculated geohash. The second property that we provide is the neighbors one, which can be set to true or false. By default, it is set to true, which means that the neighboring geohashes will be included in the context.
In addition to that, similar to the category context, we can provide path, which specifies which field to use as the lookup for the geographical point, and the default property, specifying the default geopoint for the documents.
For example, let’s assume that we want to filter on the birth place of our authors. The mappings for such a suggester will look as follows:
curl -XPOST 'localhost:9200/authors_geo_context' -d '{
"mappings":{
"author":{
"properties":{
"name":{"type":"string"},
"ac":{
"type":"completion",
"analyzer":"simple",
"search_analyzer":"simple",
"context":{
"birth_location":{
"type":"geo",
"precision":["1000km"],
"neighbors":true,
"default":{
"lat":0.0,
"lon":0.0
}
}
}
}
}
}
}
}'
Now we can index the documents and provide the birth location. For our example author, it will look as follows (the centre of Moscow):
curl -XPOST 'localhost:9200/authors_geo_context/author/1' -d '{
"name":"Fyodor Dostoevsky",
"ac":{
"input":"Fyodor Dostoevsky",
"context":{
"birth_location":{
"lat":55.75,
"lon":37.61
}
}
}
}'
As you can see, we’ve provided the birth_location context for our author.
Now during query time, we need to provide the context that we are interested in and we can (but we are not obligated to) provide the precision as a subset of the precision values provided in the mappings. We’ve set the precision to 1000km, so let’s find all the authors starting with fyo that were born in Kazan, which is about 800 km from Moscow. We should find our example author.
The query that does that looks as follows:
curl -XGET 'localhost:9200/authors_geo_context/_suggest?pretty' -d '{
"authorsAutocomplete":{
"text":"fyo",
"completion":{
"field":"ac",
"context":{
"birth_location":{
"lat":55.45,
"lon":49.8
}
}
}
}
}'
The response returned by Elasticsearch looks as follows:
{
"_shards":{
"total":5,
"successful":5,
"failed":0
},
"authorsAutocomplete":[{
"text":"fyo",
"offset":0,
"length":3,
"options":[{
"text":"Fyodor Dostoevsky",
"score":1.0
}]
}]
}
However, if we run the same query but point to a location far from Moscow, such as latitude 0 and longitude 0, we will get no results:
curl -XGET 'localhost:9200/authors_geo_context/_suggest?pretty' -d '{
"authorsAutocomplete":{
"text":"fyo",
"completion":{
"field":"ac",
"context":{
"birth_location":{
"lat":0.0,
"lon":0.0
}
}
}
}
}'
The following is the response from Elasticsearch in this case:
{
"_shards":{
"total":5,
"successful":5,
"failed":0
},
"authorsAutocomplete":[{
"text":"fyo",
"offset":0,
"length":3,
"options":[]
}]
}
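As a sanity check on the distances involved in the two geo queries, we can compute the great-circle distance between the indexed point and each query point. The Python sketch below uses the haversine formula with a mean Earth radius of 6371 km. Keep in mind that it only illustrates the distances: the geo context itself matches on geohash cells of the configured precision (and their neighbors), not on an exact distance calculation.

```python
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_KM = 6371.0  # mean Earth radius


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometers between two points given in degrees."""
    phi1, phi2 = radians(lat1), radians(lat2)
    dphi = radians(lat2 - lat1)
    dlambda = radians(lon2 - lon1)
    a = sin(dphi / 2) ** 2 + cos(phi1) * cos(phi2) * sin(dlambda / 2) ** 2
    return 2 * EARTH_RADIUS_KM * asin(sqrt(a))


moscow = (55.75, 37.61)      # context indexed with the author
first_query = (55.45, 49.8)  # context sent with the first query
second_query = (0.0, 0.0)    # context sent with the second query

print(round(haversine_km(*moscow, *first_query)))
print(round(haversine_km(*moscow, *second_query)))
```

The first distance is below the 1000km precision used in the mappings, while the second is several thousand kilometers, which is consistent with the two responses shown above.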
The Scroll API
Let’s imagine that we have an index with several million documents. We already know how to build our query and so on. However, when trying to fetch a large number of documents, you see that when getting further and further with pages of the results, the queries slow down and finally time out or result in memory issues.
The reason for this is that full-text search engines, especially those that are distributed, don’t handle paging very well. Of course, getting a few hundred pages of results is not a problem for Elasticsearch, but for going through all the indexed documents or through a large result set, a specialized API has been introduced.
Problem definition
When Elasticsearch generates a response, it must determine the order of the documents that form the result. If we are on the first page, this is not a big problem. Elasticsearch just finds the set of documents and collects the first ones; let’s say, 20 documents. But if we are on the tenth page, Elasticsearch has to take all the documents from pages one to ten and then discard the ones that are on pages one to nine. This is even more complicated if we have a distributed environment, because we don’t know from which nodes the results will come. Because of that, each node needs to build the response and keep it in memory for some time. The problem is not Elasticsearch-specific; a similar situation can be found in database systems and, more generally, in every system that uses a so-called priority queue.
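The cost of deep paging can be illustrated with a small simulation. In the following Python sketch, each shard is just a list of random scores and fetch_page is a hypothetical function that mimics the steps described above: every shard supplies its best candidates for all pages up to the requested one, and the coordinating node merges them and throws away the earlier pages. It is a simplified model, not the actual Elasticsearch implementation.

```python
import heapq
import random


def fetch_page(shards, page, size):
    """Return (hits, candidates_examined) for a 1-based page over scored shards."""
    window = page * size  # every shard must supply enough for all pages so far
    candidates = []
    for shard in shards:
        # Each shard keeps only its own top `window` scores.
        candidates.extend(heapq.nlargest(window, shard))
    # The coordinating node merges all candidates and drops the earlier pages.
    merged = heapq.nlargest(window, candidates)
    return merged[(page - 1) * size:], len(candidates)


random.seed(0)
shards = [[random.random() for _ in range(1000)] for _ in range(5)]
for page in (1, 10, 50):
    hits, examined = fetch_page(shards, page, size=20)
    print(page, len(hits), examined)
```

Each page still returns 20 hits, but the number of candidates that has to be collected grows from 100 for the first page to 1000 for the tenth and 5000 for the fiftieth.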
Scrolling to the rescue
The solution is simple. Since Elasticsearch has to do some operations (determine the documents for the previous pages) for each request, we can ask Elasticsearch to store this information for subsequent queries. The drawback is that we cannot store this information forever due to limited resources. Elasticsearch assumes that we can declare how long we need this information to be available. Let’s see how it works in practice.
First of all, we query Elasticsearch as we usually do. However, in addition to all the known parameters, we add one more: the scroll parameter, which tells Elasticsearch that we want to use scrolling and how long it should keep the information about the results. We can do this by sending a query as follows:
curl 'localhost:9200/library/_search?pretty&scroll=5m' -d '{
  "size": 1,
  "query": {
    "match_all": {}
  }
}'
The content of this query is irrelevant. The important thing is how Elasticsearch modifies the response. Look at the first few lines of the response returned by Elasticsearch:
{
  "_scroll_id": "cXVlcnlUaGVuRmV0Y2g7NTsxNjo1RDNrYnlfb1JTeU1sX20yS0NRSUZ3OzE3OjVEM2tieV9vUlN5TWxfbTJLQ1FJRnc7MTg6NUQza2J5X29SU3lNbF9tMktDUUlGdzsxOTo1RDNrYnlfb1JTeU1sX20yS0NRSUZ3OzIwOjVEM2tieV9vUlN5TWxfbTJLQ1FJRnc7MDs=",
  "took": 3,
  "timed_out": false,
  "_shards": {
    "total": 5,
    "successful": 5,
    "failed": 0
  },
  "hits": {
    "total": 4,
    ...
The new part is the _scroll_id section. This is a handle that we will use in the queries that follow. Elasticsearch has a special endpoint for this: the _search/scroll endpoint. Let’s look at the following example:
curl -XGET 'localhost:9200/_search/scroll?pretty' -d '{
  "scroll": "5m",
  "scroll_id": "cXVlcnlUaGVuRmV0Y2g7NTsyNjo1RDNrYnlfb1JTeU1sX20yS0NRSUZ3OzI3OjVEM2tieV9vUlN5TWxfbTJLQ1FJRnc7Mjg6NUQza2J5X29SU3lNbF9tMktDUUlGdzsyOTo1RDNrYnlfb1JTeU1sX20yS0NRSUZ3OzMwOjVEM2tieV9vUlN5TWxfbTJLQ1FJRnc7MDs="
}'
Now every call to this endpoint with scroll_id returns the next page of results.
Remember that this handle is only valid for the defined time of inactivity.
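The whole exchange can be sketched as a small Python loop. This is only an illustration under stated assumptions: scroll_all is a hypothetical helper, and send(method, path, body) stands for whatever HTTP client we use; it is expected to perform the request and return the decoded JSON response. The loop stops when a page comes back with no hits.

```python
import json


def scroll_all(send, index, query, keep_alive="5m", size=100):
    """Yield every hit of a query by following the scroll handle.

    `send(method, path, body)` is a hypothetical transport callable that
    performs the HTTP request and returns the decoded JSON response.
    """
    body = json.dumps({"size": size, "query": query})
    response = send("POST", "/%s/_search?scroll=%s" % (index, keep_alive), body)
    while response["hits"]["hits"]:
        for hit in response["hits"]["hits"]:
            yield hit
        # Each follow-up call passes the latest handle and the keep-alive time.
        body = json.dumps({"scroll": keep_alive, "scroll_id": response["_scroll_id"]})
        response = send("GET", "/_search/scroll", body)
```

Passing the transport in as a parameter keeps the sketch independent of any particular HTTP library and makes it easy to try out against canned responses.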
Of course, this solution is not ideal, and it is not very appropriate when there are many requests to random pages of various results or when the time between the requests is difficult to determine. However, you can use this successfully for use cases where you want to get larger result sets, such as transferring data between several systems.
Summary
In the chapter that we just finished, we learned about some functionalities of Elasticsearch that we probably won’t use every day, or at least that not every one of us will use. We discussed the percolator, an upside down search functionality that allows us to index queries and find which documents match them. We learned about the spatial capabilities of Elasticsearch and we used suggesters to correct user spelling mistakes and build a highly efficient autocomplete functionality. We also used the Scroll API to efficiently fetch a large number of results from our Elasticsearch indices.
In the next chapter, we will focus on the cluster and its configuration. We will discuss the node discovery, gateway, and recovery modules: what they are responsible for and how to configure them to match our needs. We will use templates and dynamic templates, and we will see how to install plugins extending Elasticsearch’s out-of-the-box functionalities. We will learn what the Elasticsearch caches are and how to configure them efficiently to make the most out of them. Finally, we will use the update settings API to update the Elasticsearch configuration on live and running clusters.
Chapter 9. Elasticsearch Cluster in Detail
The previous chapter was fully dedicated to search functionalities that are not only about full text searching. We learned how to use the percolator, an inverse search that allows us to build alerting functionalities on top of Elasticsearch. We learned to use the spatial functionalities of Elasticsearch and we used the suggest API, which allowed us to correct users’ spelling mistakes as well as build very efficient autocomplete functionalities. But let’s now focus on running and administering Elasticsearch. By the end of this chapter, you will have learned the following topics:
How Elasticsearch finds new nodes that should join the cluster
What the gateway and recovery modules are
How templates work
How to use dynamic templates
How to use the Elasticsearch plugin mechanism
What the caches in Elasticsearch are and how to tune them
How to use the Update Settings API to update Elasticsearch settings on running clusters
Understanding node discovery
When starting your Elasticsearch node, one of the first things that happens is a search for a master node that has the same cluster name and is visible. If a master is found, the node joins the already formed cluster. If no master is found, then the node itself is elected as the master (if the configuration allows such behavior, of course). The process of forming a cluster and finding nodes is called discovery. The module responsible for discovery has two main purposes: electing a master and discovering new nodes within a cluster. In this section, we will discuss how we can configure and tune the discovery module.
Discovery types
By default, without installing additional plugins, Elasticsearch allows us to use Zen discovery, which provides us with unicast discovery. Unicast (http://en.wikipedia.org/wiki/Unicast) allows transmission of a single message over the network to a single host at once. An Elasticsearch node sends the message to the nodes defined in the configuration and waits for a response. When the node is accepted into the cluster, the recovery module kicks in and starts the recovery process if needed, or the master election process if the master is still not elected.
Note
Prior to Elasticsearch 2.0, the Zen discovery module allowed us to use multicast discovery. On a multicast capable network, Elasticsearch was able to automatically discover nodes without specifying any IP addresses of other Elasticsearch servers sharing the same cluster name. This was very error prone and not advised for production use, and thus it was deprecated and moved to a plugin.
The Elasticsearch architecture is designed to be peer to peer. When running operations such as indexing or searching, the master node doesn’t take part in the communication and the relevant nodes communicate with each other directly.
Node roles
Elasticsearch nodes can be configured to work in one of the following roles:
Master: The node responsible for maintaining the global cluster state, changing it depending on the needs, and handling the addition and removal of nodes. There can only be a single master node active in a single cluster.
Data: The node responsible for holding the data and executing data related operations (indexing and searching) on the shards that are present locally on the node.
Client: The node responsible for handling requests. For indexing requests, the client node forwards the request to the appropriate primary shard and, for search requests, it sends the request to all the relevant shards and aggregates the results.
By default, each node can work as a master, data, or client node. For example, it can be a data node and a client node at the same time. On large and highly loaded clusters, it is very important to divide the roles of the nodes in the cluster and have each node perform only a single role. When dealing with such clusters, you will often see at least three master nodes, multiple data nodes, and a few client-only nodes as part of the whole cluster.
Master node
This is the most important node type from the Elasticsearch cluster’s point of view. It handles the cluster state, changes it, manages the nodes joining and leaving the cluster, checks the health of the other nodes in the cluster (by running ping requests), and manages the shard relocation operations. If the master is somehow disconnected from the cluster, the remaining nodes will elect a new master from among themselves. All these processes are done automatically on the basis of the configuration values we provide. You usually want the master nodes to communicate only with the other Elasticsearch nodes, using the internal Java communication. To avoid hitting the master nodes by mistake, it is advised to turn off the HTTP module for them in the configuration.
Data node
The data node is responsible for holding the data in the indices. The data nodes are the ones that need the most disk space, and they carry the load of the indexing requests and of the searches that run on the data they hold locally. The data nodes, similar to the master nodes, can have the HTTP module disabled.
Client node
The client nodes are in most cases nodes that don’t have any data and are not master nodes. The client nodes are the ones that communicate with the outside world and with all the nodes in the cluster. They forward the data to the appropriate shards and aggregate the search and aggregation results from all the other nodes.
Keep in mind that client nodes can have data as well, but in such a case they will run both the indexing requests and the search requests for the local data and will aggregate the data from the other nodes, which in large clusters may be too much work for a single node.
Configuring node roles
By default, Elasticsearch allows every node to be a master node, a data node, or a client node. However, as we already mentioned, in certain situations you may want to have nodes that only hold data, client nodes that are only used to process requests, and master hosts to manage the cluster. One such situation is when massive amounts of data need to be handled, where the data nodes should be as performant as possible. To tell Elasticsearch what role a node should take, we use three Boolean properties set in the elasticsearch.yml configuration file:
node.master: When set to true, we tell Elasticsearch that the node is master eligible, which means that it can take the role of a master. However, note that the node will be automatically marked as not master eligible as soon as it is assigned a client role.
node.data: When set to true, we tell Elasticsearch that the node can be used to hold data.
node.client: When set to true, we tell Elasticsearch that the node should be used as a client.
So, to set a node to only hold data, we should add the following properties to the elasticsearch.yml configuration file:
node.master: false
node.data: true
node.client: false
To set the node to only be a master node and not hold data, we need to instruct Elasticsearch that we don’t want the node to hold data. In order to do this, we add the following properties to the elasticsearch.yml configuration file:
node.master: true
node.data: false
node.client: false
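If we manage many nodes, the role settings can be generated instead of typed by hand. The following Python sketch is only an illustration: ROLE_SETTINGS and render_role are hypothetical names, the data and master layouts repeat the two examples above, and the client-only layout is our assumption derived from the property descriptions, not a configuration shown in this section.

```python
# The data-only and master-only layouts come from the examples in the text;
# the client-only layout is an assumption based on the property descriptions.
ROLE_SETTINGS = {
    "data":   {"node.master": False, "node.data": True,  "node.client": False},
    "master": {"node.master": True,  "node.data": False, "node.client": False},
    "client": {"node.master": False, "node.data": False, "node.client": True},
}


def render_role(role):
    """Return the elasticsearch.yml lines for a dedicated node role."""
    settings = ROLE_SETTINGS[role]
    return "\n".join(
        "%s: %s" % (key, str(value).lower()) for key, value in settings.items()
    )
```

For example, render_role("data") produces the three lines of the data-only configuration shown earlier.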
Setting the cluster’s name
If we don’t set the cluster.name property in our elasticsearch.yml file, Elasticsearch uses the default value, elasticsearch. This is not a good thing, because each new Elasticsearch node will have the same cluster name and you may want to have multiple clusters in the same network. In such a case, connecting the wrong nodes together is just a matter of time. Because of that, we suggest setting the cluster.name property to some other value of your choice. Usually, it is a good idea to choose cluster names based on cluster responsibilities.
Zen discovery
The default discovery method used by Elasticsearch, and one that is commonly used in the Elasticsearch world, is called Zen discovery. It supports unicast discovery and allows adjusting various parts of its configuration.
Note
Note that there are additional discovery types available as plugins, such as Amazon EC2 discovery, Microsoft Azure discovery, and Google Compute Engine discovery.
Master election configuration
Imagine that you have a cluster that is built of 10 nodes. Everything is working fine until one day your network fails and 3 of your nodes are disconnected from the cluster, but they still see each other. Because of the Zen discovery and master election process, the nodes that got disconnected elect a new master and you end up with two clusters with the same name, each with its own master node. Such a situation is called a split-brain and you must avoid it as much as possible. When a split-brain happens, you end up with two (or more) clusters that won’t join each other until the network (or any other) problems are fixed. The thing to remember is that a split-brain may result in unrecoverable errors, such as data conflicts in which you end up with data corruption or partial data loss. That’s why it is important to avoid such situations at all costs.
In order to prevent split-brain situations, Elasticsearch provides the discovery.zen.minimum_master_nodes property. This property defines the minimum number of master eligible nodes that should be connected to each other in order to form a cluster. So now let’s get back to our cluster; if we set the discovery.zen.minimum_master_nodes property to 50 percent of the total nodes available + 1 (which is 6 in our case), we will end up with a single cluster. Why is that? Before the network failure, we had 10 nodes, which is more than six nodes, and those nodes formed a cluster. After the disconnection of the three nodes, the remaining seven nodes still meet the minimum of six, so the first cluster stays up and running. However, because only three nodes got disconnected and three is less than six, these three nodes wouldn’t be allowed to elect a new master and they would wait for reconnection with the original cluster.
Of course, this is also not a perfect scenario. It is advised to have dedicated master eligible nodes that don’t work as data or client nodes. To have a quorum in such a case, we need at least three dedicated master eligible nodes, because that will allow us to have a single master offline and still keep the quorum. This is usually enough to keep the cluster in good shape when it comes to master related features and to be split-brain proof. Of course, in such a case, the discovery.zen.minimum_master_nodes property should be set to 2 and we should have the three master nodes up and running.
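The quorum arithmetic used in this section can be written down as a short Python sketch. The function names are hypothetical and the code only models the rule described above (half of the master eligible nodes plus one); it does not talk to a cluster.

```python
def minimum_master_nodes(master_eligible):
    """Quorum as described above: half of the master eligible nodes plus one."""
    return master_eligible // 2 + 1


def can_elect_master(visible_master_eligible, quorum):
    """A group of nodes may elect a master only if it still sees a quorum."""
    return visible_master_eligible >= quorum


quorum = minimum_master_nodes(10)             # the 10-node example
majority_side = can_elect_master(7, quorum)   # the seven nodes that stayed connected
minority_side = can_elect_master(3, quorum)   # the three disconnected nodes
dedicated = minimum_master_nodes(3)           # three dedicated master eligible nodes
```

With these values only the larger group of nodes is allowed to keep or elect a master, which is exactly what prevents the second cluster from forming.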
Furthermore, Elasticsearch allows us to specify two additional Boolean properties: discovery.zen.master_election.filter_client and discovery.zen.master_election.filter_data. They allow us to tell Elasticsearch to ignore ping requests from the client and data nodes during master election. By default, the first mentioned property is set to true and the second is set to false. This allows Elasticsearch to focus on the master election and not be overloaded with ping requests from the nodes that are not master eligible.
In addition to the mentioned properties, Elasticsearch allows configuring timeouts related to the master election process. The discovery.zen.ping_timeout property, which defaults to 3s (three seconds), allows configuring the timeout for slow networks: the higher the value, the lower the chance of failure, but the election process can take longer. The second property is called discovery.zen.join_timeout and specifies the timeout for the join request to the master. It defaults to 20 times the discovery.zen.ping_timeout property.
Configuring unicast
Because of the way unicast works, we need to specify at least one host that the unicast message should be sent to. To do this, we should add the discovery.zen.ping.unicast.hosts property to our elasticsearch.yml configuration file. Ideally, we list all the hosts that form the cluster in the discovery.zen.ping.unicast.hosts property, but this is not required; we just need to provide enough of them to be sure that at least one will be reachable. For example, if we want our node to contact the hosts 192.168.2.1, 192.168.2.2, and 192.168.2.3, we should specify the preceding property in the following way:
discovery.zen.ping.unicast.hosts: 192.168.2.1:9300,192.168.2.2:9300,192.168.2.3:9300
One can also define a range of the ports Elasticsearch can use. For example, to say that ports from 9300 to 9399 can be used, we specify the following:
discovery.zen.ping.unicast.hosts: 192.168.2.1:[9300-9399],192.168.2.2:[9300-9399],192.168.2.3:[9300-9399]
Note that the hosts are separated with a comma character and we’ve specified the port on which we expect unicast messages.
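When the list of hosts comes from an inventory or a deployment script, the property value can be assembled programmatically. The following Python sketch shows one way to do it; unicast_hosts is a hypothetical helper and it only builds the string in the two forms shown above.

```python
def unicast_hosts(hosts, first_port=9300, last_port=None):
    """Build the value of discovery.zen.ping.unicast.hosts.

    With only first_port, each entry is host:port; when last_port is given
    as well, each entry uses the host:[first-last] range form shown above.
    """
    if last_port is None:
        suffix = str(first_port)
    else:
        suffix = "[%d-%d]" % (first_port, last_port)
    return ",".join("%s:%s" % (host, suffix) for host in hosts)


hosts = ["192.168.2.1", "192.168.2.2", "192.168.2.3"]
single = unicast_hosts(hosts)
ranged = unicast_hosts(hosts, 9300, 9399)
```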
Fault detection ping settings
In addition to the settings discussed previously, we can also control or alter the default ping configuration. A ping is a signal sent between the nodes to check if they are running and responsive. The master node pings all the other nodes in the cluster and each of the other nodes in the cluster pings the master node. The following properties can be set:
discovery.zen.fd.ping_interval: This defaults to 1s (one second) and specifies how often the nodes ping each other
discovery.zen.fd.ping_timeout: This defaults to 30s (30 seconds) and defines how long a node will wait for the response to its ping message before considering a node as unresponsive
discovery.zen.fd.ping_retries: This defaults to 3 and specifies how many retries should be taken before considering a node as not working
If you experience some problems with your network, or you know that your nodes need more time to see the ping response, you can adjust the preceding values to the ones that are good for your deployment.
Cluster state updates control
As we have already discussed, the master node is the one responsible for handling the changes of the cluster state, and Elasticsearch allows us to control that process. For most use cases, the default settings are more than enough, but you may run into situations where changing the settings is required.
The master node processes a single cluster state command at a time. First, the master node propagates the changes to the other nodes and then it waits for their responses. A cluster state change is not considered finished until enough nodes respond to the master with an acknowledgment. The number of nodes that need to respond is specified by discovery.zen.minimum_master_nodes, which we are already aware of. The maximum time the master waits for the nodes to respond is 30s by default and is specified by the discovery.zen.commit_timeout property. If not enough nodes respond to the master, the cluster state change is rejected.
Once enough nodes respond to the master’s publish message, the cluster state change is accepted on the master and the cluster state is changed. Once that is done, the master sends a message to all the nodes saying that the change can be applied. The timeout of this message is again set to 30 seconds and is controlled using the discovery.zen.publish_timeout property.
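The decision the master makes for each cluster state change can be summarized in a small Python model. This is a deliberately simplified sketch, not the real protocol: publish_cluster_state is a hypothetical name, and we assume that the only input is the number of acknowledgments that arrived before the commit timeout expired.

```python
def publish_cluster_state(acks_received, minimum_master_nodes):
    """Model the outcome of publishing one cluster state change.

    `acks_received` is the number of nodes that acknowledged the change
    before the commit timeout expired (a simplification of the real protocol).
    """
    if acks_received < minimum_master_nodes:
        return "rejected"    # not enough acknowledgments in time
    # Enough nodes acknowledged: the change is accepted on the master, which
    # then tells all the nodes that the change can be applied.
    return "committed"
```

For example, with discovery.zen.minimum_master_nodes set to 2, a change acknowledged by a single node is rejected, while two acknowledgments are enough to commit it.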
Dealing with master unavailability
If a cluster has no master node, whatever the reason may be, it is not fully operational. By default, we can’t change the metadata, cluster wide commands will not work, and so on. Elasticsearch allows us to configure the behavior of the nodes when the master node is not elected. To do that, we can use the discovery.zen.no_master_block property, which accepts the values all and write. Setting this property to all means that all the operations on the node will be rejected, that is, the search operations, the write related operations, and the cluster wide operations such as health or mappings retrieval. Setting this property to write means that only the write operations will be rejected; this is the default behavior of Elasticsearch.
Adjusting HTTP transport settings
While discussing the node discovery module and process, we mentioned the HTTP module a few times. We would like to get back to that topic now and discuss a few properties that are useful when using Elasticsearch.
Disabling HTTP
The first thing is disabling HTTP completely. This is useful to ensure that the master and data nodes won’t accept any queries or requests in general from users. To disable the HTTP transport completely, we just need to add the http.enabled property and set it to false in our elasticsearch.yml file.
HTTP port
Elasticsearch allows us to define the port on which it will listen for HTTP requests. This is done by using the http.port property. It defaults to 9200-9300, which means that Elasticsearch will start from port 9200 and move on to the next port if that one is not available (so the next instance will use port 9201, and so on). There is also http.publish_port, which is very useful when running Elasticsearch behind a firewall and when the HTTP port is not directly accessible. It defines which port should be used by the clients connecting to Elasticsearch and defaults to the same value as the http.port property.
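The way a port range such as 9200-9300 is consumed can be illustrated with a few lines of Python. This is just a model of the behavior described above; pick_http_port is a hypothetical name and the set of busy ports is passed in explicitly instead of being probed on a real machine.

```python
def pick_http_port(ports_in_use, first=9200, last=9300):
    """Return the first free port in the range, or None if all are taken."""
    for port in range(first, last + 1):
        if port not in ports_in_use:
            return port
    return None


first_instance = pick_http_port(set())
second_instance = pick_http_port({9200})
```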
HTTP host
We can also define the host to which Elasticsearch will bind. To specify it, we need to define the http.host property. The default value is the one set by the network module. If needed, we can set the publish host and the bind host separately using the http.publish_host and http.bind_host properties. You usually don’t have to specify these properties unless your nodes have nonstandard host names or multiple names and you want Elasticsearch to bind to a single one only.
You can find the full list of properties allowed for the HTTP module in the official Elasticsearch documentation available at https://www.elastic.co/guide/en/elasticsearch/reference/2.2/modules-http.html.
The gateway and recovery modules
Apart from our indices and the data indexed inside them, Elasticsearch needs to hold the metadata, such as the type mappings, the index level settings, and so on. This information needs to be persisted somewhere so it can be read during cluster recovery. Of course, it could be stored in memory, but a full cluster restart or a fatal failure would result in this information being lost, which is not something that we want. This is why Elasticsearch introduced the gateway module. You can think about it as a safe haven for your cluster data and metadata. Each time you start your cluster, all the needed data is read from the gateway and, when you make a change to your cluster, it is persisted using the gateway module.
The gateway
In order to set the type of gateway we want to use, we need to add the gateway.type property to the elasticsearch.yml configuration file and set it to the local value. Currently, Elasticsearch recommends using the local gateway type (gateway.type set to local), which is the default one and the only one available without additional plugins.
The default local gateway type stores the indices and their metadata in the local file system. Compared to the other gateways, the write operation to this gateway is not performed in an asynchronous way, so whenever a write succeeds, you can be sure that the data was written into the gateway (so basically indexed or stored in the transaction log).
Recovery control
In addition to choosing the gateway type, Elasticsearch allows us to configure when to start the initial recovery process. The recovery is a process of initializing all the shards and replicas, reading all the data from the transaction log, and applying it to the shards. Basically, it’s a process needed to start Elasticsearch.
For example, let’s imagine that we have a cluster that consists of 10 Elasticsearch nodes. We should inform Elasticsearch about the number of nodes by setting gateway.expected_nodes to that value, so 10 in our case. In this way, we inform Elasticsearch about the number of expected nodes that are eligible to hold the data and eligible to be elected as a master. Elasticsearch will start the recovery process immediately if the number of nodes in the cluster is equal to that property.
We would also like to start the recovery after six nodes are together. To do this, we should set the gateway.recover_after_nodes property to 6. This property should be set to a value that ensures that the newest version of the cluster state snapshot will be available, which usually means that you should start recovery when most of your nodes are available.
There is also one more thing. We would like the gateway recovery process to start 5 minutes after the gateway.recover_after_nodes condition is met. To do this, we set the gateway.recover_after_time property to 5m. This property tells the gateway module how long to wait before starting the recovery process after the number of nodes reaches the minimum specified by the gateway.recover_after_nodes property. We may want to do this because we know that our network is quite slow and we want the communication between the nodes to be stable. Note that Elasticsearch won’t delay the recovery if the number of master and data eligible nodes that formed the cluster is equal to the value of the gateway.expected_nodes property.
The preceding property values should be set in the elasticsearch.yml configuration file. For example, if we would like to have the previously discussed values in the mentioned file, we would end up with the following section in the file:
gateway.recover_after_nodes: 6
gateway.recover_after_time: 5m
gateway.expected_nodes: 10
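How these three properties interact can be expressed as a small decision function. The following Python sketch is a model of the rules described above, not Elasticsearch code: recovery_decision is a hypothetical name, the defaults repeat the values from our example (with 5m written as 300 seconds), and waited_seconds is assumed to be the time elapsed since the gateway.recover_after_nodes condition was met.

```python
def recovery_decision(nodes_present, waited_seconds,
                      expected_nodes=10, recover_after_nodes=6,
                      recover_after_seconds=300):
    """Decide whether gateway recovery may start, following the rules above."""
    if nodes_present >= expected_nodes:
        return "start"               # all expected nodes joined: no delay
    if nodes_present < recover_after_nodes:
        return "wait for nodes"      # minimum node count not reached yet
    if waited_seconds < recover_after_seconds:
        return "wait for timeout"    # minimum reached, delay still running
    return "start"
```

With the example values, a cluster of six nodes starts recovering only after five minutes, while a complete cluster of ten nodes starts immediately.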
Additional gateway recovery options
In addition to the mentioned options, Elasticsearch allows us some additional degree of control. These additional options are:
gateway.recover_after_master_nodes: This is similar to the gateway.recover_after_nodes property, but instead of taking into consideration all the nodes, it allows us to specify how many master eligible nodes should be present in the cluster before recovery starts
gateway.recover_after_data_nodes: This is also similar to the gateway.recover_after_nodes property, but it allows specifying how many data nodes should be present in the cluster before recovery starts
gateway.expected_master_nodes: This is similar to the gateway.expected_nodes property, but instead of specifying the number of all the nodes that we expect in the cluster, it allows specifying how many master eligible nodes we expect to be present
gateway.expected_data_nodes: This is also similar to the gateway.expected_nodes property, but allows specifying how many data nodes we expect to be present
Indices recovery API
There is also one other thing when it comes to the recovery process: the indices recovery API. It allows us to see the progress of index or indices recovery. To use it, we just need to specify the indices and use the _recovery endpoint. For example, to check the recovery process of the library index, we will run the following command:
curl -XGET 'localhost:9200/library/_recovery?pretty'
The response for the preceding command can be large and depends on the number of shards in the index and, of course, the number of indices we want to get information for. In our case, the response looks as follows (we left information about a single shard to make it less extensive):
{
  "library": {
    "shards": [{
      "id": 0,
      "type": "STORE",
      "stage": "DONE",
      "primary": true,
      "start_time_in_millis": 1444030695956,
      "stop_time_in_millis": 1444030695962,
      "total_time_in_millis": 5,
      "source": {
        "id": "Brt5ejEVSVCkIfvY9iDMRQ",
        "host": "127.0.0.1",
        "transport_address": "127.0.0.1:9300",
        "ip": "127.0.0.1",
        "name": "PuffAdder"
      },
      "target": {
        "id": "Brt5ejEVSVCkIfvY9iDMRQ",
        "host": "127.0.0.1",
        "transport_address": "127.0.0.1:9300",
        "ip": "127.0.0.1",
        "name": "PuffAdder"
      },
      "index": {
        "size": {
          "total_in_bytes": 157,
          "reused_in_bytes": 157,
          "recovered_in_bytes": 0,
          "percent": "100.0%"
        },
        "files": {
          "total": 1,
          "reused": 1,
          "recovered": 0,
          "percent": "100.0%"
        },
        "total_time_in_millis": 1,
        "source_throttle_time_in_millis": 0,
        "target_throttle_time_in_millis": 0
      },
      "translog": {
        "recovered": 0,
        "total": -1,
        "percent": "-1.0%",
        "total_on_start": -1,
        "total_time_in_millis": 4
      },
      "verify_index": {
        "check_index_time_in_millis": 0,
        "total_time_in_millis": 0
      }
    },
    ...
    ]
  }
}
As you can see in the response, we get information about each shard. For each shard, we see the type of the operation (the type property), the stage (the stage property) describing what part of the recovery process is in progress, and whether it is a primary shard (the primary property). In addition to this, we see sections about the source shard, the target shard, the index the shard is part of, the transaction log, and finally the index verification. All of this allows us to see the status of the recovery of our indices.
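Because the response can be long, it is often handy to reduce it to one line per shard. The following Python sketch shows one possible way to do that on an already decoded response; summarize_recovery is a hypothetical helper and the sample is a trimmed copy of the response discussed above, containing only the fields the function reads.

```python
def summarize_recovery(response):
    """Return one (index, shard id, stage, percent of bytes) tuple per shard."""
    rows = []
    for index_name, index_info in sorted(response.items()):
        for shard in index_info["shards"]:
            rows.append((
                index_name,
                shard["id"],
                shard["stage"],
                shard["index"]["size"]["percent"],
            ))
    return rows


# Trimmed copy of the response discussed above, with a single shard.
sample = {
    "library": {
        "shards": [{
            "id": 0, "type": "STORE", "stage": "DONE", "primary": True,
            "index": {"size": {"total_in_bytes": 157, "percent": "100.0%"}},
        }]
    }
}
rows = summarize_recovery(sample)
```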
Delayed allocation
We already discussed that by default Elasticsearch tries to balance the shards in the cluster according to the number of nodes in that cluster. Because of that, when a node drops off the cluster (or multiple nodes do) or when nodes join the cluster, Elasticsearch starts rebalancing the cluster, moving the shards and the replicas around. This is usually very expensive: new primary shards may be promoted out of the available replicas, a large amount of data may be copied between the new primary and its replicas, and so on. And this may be happening because a single node was just restarted for 30 seconds of maintenance.
To avoid such situations, Elasticsearch provides us with the possibility to control how long to wait before beginning the allocation of shards that are in the unassigned state. We can control the delay by using the index.unassigned.node_left.delayed_timeout property and setting it on a per index basis. For example, to configure the allocation timeout for the library index to 10 minutes, we run the following command:
curl -XPUT 'localhost:9200/library/_settings' -d '{
  "settings": {
    "index.unassigned.node_left.delayed_timeout": "10m"
  }
}'
We can also configure the allocation timeout for all the indices by running the following command:
curl -XPUT 'localhost:9200/_all/_settings' -d '{
  "settings": {
    "index.unassigned.node_left.delayed_timeout": "10m"
  }
}'
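The effect of the timeout can be illustrated with a tiny Python model. It is only a sketch of the idea described above: reallocation_needed is a hypothetical name, and we assume that the shards of a departed node stay unassigned until the node has been gone for longer than the configured delay.

```python
def reallocation_needed(node_down_seconds, delayed_timeout_seconds):
    """Shards are reallocated only after the node is gone longer than the delay."""
    return node_down_seconds > delayed_timeout_seconds


TEN_MINUTES = 10 * 60
quick_restart = reallocation_needed(30, TEN_MINUTES)       # 30 seconds of maintenance
long_outage = reallocation_needed(15 * 60, TEN_MINUTES)    # node gone for 15 minutes
```

With the 10 minute timeout from our example, the short maintenance restart no longer triggers the expensive movement of shards, while a longer outage still does.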
Index recovery prioritization
Elasticsearch 2.2 exposes one more feature when it comes to the indices recovery process, which allows us to define which indices should be prioritized when it comes to recovery. By specifying the index.priority property in the index settings and assigning it a positive integer value, we define the order in which Elasticsearch should recover the indices; the ones with the higher index.priority value will be recovered first.
For example, let’s assume that we have two indices, library and map, and we want the library index to be recovered before the map index. To do this, we will run the following commands:
curl-XPUT'localhost:9200/library/_settings'-d'{
"settings":{
"index.priority":10
}
}'
curl-XPUT'localhost:9200/map/_settings'-d'{
"settings":{
"index.priority":1
}
}'
Weassignedhigherprioritytothelibraryindexand,becauseofthat,itwillberecoveredfaster.
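The ordering rule can be sketched in a few lines of Python. This is only an illustration of "higher index.priority first"; the function name is hypothetical and tie-breaking between indices with equal priority is deliberately not modelled.

```python
def recovery_order(priorities):
    """Return index names sorted so that a higher index.priority comes first.

    `priorities` maps an index name to its index.priority value.
    Tie-breaking between equal priorities is not modelled in this sketch.
    """
    return sorted(priorities, key=lambda name: priorities[name], reverse=True)


print(recovery_order({"map": 1, "library": 10}))  # ['library', 'map']
```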
Templates and dynamic templates

In the Mappings configuration section of Chapter 2, Indexing Your Data, we discussed mappings, how they are created, and how the type-determining mechanism works. Now we will get into more advanced topics. We will show you how to dynamically create mappings for new indices and how to apply some logic to the templates, so that new indices are already created with predefined mappings.
Templates

In various parts of the book, when discussing index configuration and its structure, we’ve seen that this can become complicated, especially when we have sophisticated data structures that we want to index, search, and aggregate. If you have a lot of similar indices, taking care of the mappings in each of them can be a very painful process: each new index has to be created with appropriate mappings. Elasticsearch creators predicted this and implemented a feature called index templates. Each template defines a pattern, which is compared to a newly created index name. When both of them match, the values defined in the template are copied to the index structure definition. When multiple templates match the name of the newly created index, all of them are applied and the values from the templates that are applied later override the values defined in the previously applied templates. This is very convenient because we can define a few common settings in the general templates and change them in the more specialized ones. In addition, there is an order parameter that lets us force the desired template ordering. You can think of templates as dynamic mappings that are applied not to the types in documents but to the indices.

An example of a template

Let’s see a real example of a template. Imagine that we want to create many indices in which we don’t want to store the source of the documents so that our indices are smaller. We also don’t need any replicas. We can create a template that matches our need by using the Elasticsearch REST API and the /_template endpoint, by sending the following command:

curl -XPUT 'http://localhost:9200/_template/main_template?pretty' -d '{
  "template": "*",
  "order": 1,
  "settings": {
    "index.number_of_replicas": 0
  },
  "mappings": {
    "_default_": {
      "_source": {
        "enabled": false
      }
    }
  }
}'
From now on, all the created indices will have no replicas and no source stored. This is because the template parameter value is set to *, which matches all the names of the indices. Note the _default_ type name in our example. This is a special type name which indicates that the current rule should be applied to every document type. The second interesting thing is the order parameter. Let’s define a second template by using the following command:

curl -XPUT 'http://localhost:9200/_template/ha_template?pretty' -d '{
  "template": "ha_*",
  "order": 10,
  "settings": {
    "index.number_of_replicas": 5
  }
}'

After running the preceding command, all the new indices will behave as earlier except the ones with names beginning with ha_. In case of these indices, both the templates are applied. First, the template with the lower order value is used and then the next template overwrites the replicas setting. So, the indices whose names start with ha_ will have five replicas and source storing disabled.
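The way the two templates combine can be simulated with a short Python sketch. It is a simplified model rather than the Elasticsearch implementation: the helper names are hypothetical, and Python's fnmatchcase is used as a stand-in for the index name pattern matching (it accepts a few extra glob characters that are not used here).

```python
from fnmatch import fnmatchcase

# The two templates from the text, reduced to the parts this sketch needs.
TEMPLATES = [
    {"template": "ha_*", "order": 10,
     "settings": {"index.number_of_replicas": 5}},
    {"template": "*", "order": 1,
     "settings": {"index.number_of_replicas": 0},
     "mappings": {"_default_": {"_source": {"enabled": False}}}},
]


def deep_merge(base, override):
    """Recursively merge `override` into a copy of `base`; override wins."""
    merged = dict(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = deep_merge(merged[key], value)
        else:
            merged[key] = value
    return merged


def resolve(index_name, templates=TEMPLATES):
    """Apply every matching template in ascending `order`."""
    result = {}
    matching = [t for t in templates if fnmatchcase(index_name, t["template"])]
    for template in sorted(matching, key=lambda t: t["order"]):
        body = {k: v for k, v in template.items() if k not in ("template", "order")}
        result = deep_merge(result, body)
    return result


print(resolve("ha_logs")["settings"])  # {'index.number_of_replicas': 5}
print(resolve("books")["settings"])    # {'index.number_of_replicas': 0}
```

For the hypothetical ha_logs index, the replica count comes from the template with order 10, while the disabled _source mapping is still inherited from the template with order 1.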
Note

Before version 2.0, Elasticsearch templates could also be stored in files. Starting with Elasticsearch 2.0, this feature is no longer available.
Dynamic templates

Sometimes we want to define a field mapping that depends on the field name and the detected type. This is where dynamic templates can help. Dynamic templates are similar to the usual mappings, but each template has its pattern defined, which is applied to a document’s field name. If a field name matches the pattern, the template is used.

Let’s have a look at the following example:

curl -XPOST 'localhost:9200/news' -d '{
  "mappings": {
    "article": {
      "dynamic_templates": [
        {
          "template_test": {
            "match": "*",
            "mapping": {
              "index": "analyzed",
              "fields": {
                "str": {
                  "type": "{dynamic_type}",
                  "index": "not_analyzed"
                }
              }
            }
          }
        }
      ]
    }
  }
}'
In the preceding example, we defined the mapping for the article type. In this mapping, we have only one dynamic template named template_test. This template is applied to every field in the input document because of the single asterisk pattern in the match property. Each field will be treated as a multifield, consisting of a field named as the original field (for example, title) and a second field with a name suffixed with str (for example, title.str). The first field will have its type determined by Elasticsearch and will be analyzed. The second field, as the example is written, takes the same detected type through the {dynamic_type} variable but is indexed as not_analyzed.
The matching pattern

We have two ways of defining the matching pattern. They are as follows:

match: This template is used if the name of the field matches the pattern (this pattern type was used in our example)
unmatch: This template is used if the name of the field doesn’t match the pattern

By default, the pattern is very simple and uses glob patterns. This can be changed by setting the match_pattern property to regex. After adding this property, we can use all the magic provided by regular expressions in the match and unmatch patterns. There are variations such as path_match and path_unmatch that can be used to match the names in nested documents (by providing the path, similar to queries).

Field definitions

When writing a target field definition, the following variables can be used:

{name}: The name of the original field found in the input document
{dynamic_type}: The type determined from the original document

Note

Note that Elasticsearch checks the templates in the order of their definitions and the first matching template is applied. This means that the most generic templates (for example, with "match": "*") must be defined at the end.
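The first-match rule and the variable substitution can be illustrated with the following Python sketch. It is a simplified model under stated assumptions: the helper names and the ids template are hypothetical, fnmatchcase stands in for the glob matching, and the regex mode requires a full match of the field name, which may differ from what Elasticsearch actually does.

```python
import re
from fnmatch import fnmatchcase


def _name_matches(pattern, field_name, use_regex):
    """Match a field name with a glob pattern or, optionally, a regex."""
    if use_regex:
        return re.fullmatch(pattern, field_name) is not None
    return fnmatchcase(field_name, pattern)


def rule_applies(rule, field_name):
    """A rule applies if `match` accepts the name and `unmatch` does not."""
    use_regex = rule.get("match_pattern") == "regex"
    if "match" in rule and not _name_matches(rule["match"], field_name, use_regex):
        return False
    if "unmatch" in rule and _name_matches(rule["unmatch"], field_name, use_regex):
        return False
    return True


def fill(value, name, dynamic_type):
    """Substitute {name} and {dynamic_type} in every string of a mapping."""
    if isinstance(value, dict):
        return {k: fill(v, name, dynamic_type) for k, v in value.items()}
    if isinstance(value, str):
        return value.replace("{name}", name).replace("{dynamic_type}", dynamic_type)
    return value


def pick_mapping(dynamic_templates, field_name, detected_type):
    """Return the mapping of the first template whose rule applies."""
    for entry in dynamic_templates:
        for rule in entry.values():  # each entry is {template_name: rule}
            if rule_applies(rule, field_name):
                return fill(rule["mapping"], field_name, detected_type)
    return None


# The specific (hypothetical) ids template comes first, the generic one last.
TEMPLATES = [
    {"ids": {"match": "*_id",
             "mapping": {"type": "string", "index": "not_analyzed"}}},
    {"template_test": {"match": "*",
                       "mapping": {"index": "analyzed",
                                   "fields": {"str": {"type": "{dynamic_type}",
                                                      "index": "not_analyzed"}}}}},
]

print(pick_mapping(TEMPLATES, "user_id", "string"))
print(pick_mapping(TEMPLATES, "title", "string")["fields"]["str"]["type"])  # string
```

If the two entries were swapped, the "*" template would match every field first and the ids template would never be used, which is why the most generic template belongs at the end.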
Elasticsearch plugins

At various places in this book, we have used different plugins that have been able to extend the core functionality of Elasticsearch. You probably remember the additional programming languages used in scripts described in the Scripting capabilities of Elasticsearch section of Chapter 6, Make Your Search Better. In this section, we will look at how the plugins work and how to install them.
The basics

By default, Elasticsearch plugins are located in their own subdirectory in the plugins subdirectory of the search engine home directory. If you have downloaded a new plugin manually, you can just create a new directory with the plugin name and unpack that plugin archive to this directory. There is also a more convenient way to install plugins: by using the plugin script. We have used it several times in this book without talking about it, so this time let’s take the time and describe this tool.

Elasticsearch has two main types of plugins. These two types can be categorized based on the content of the plugin-descriptor.properties file: Java plugins and site plugins. Let’s start with the site plugins. They usually contain sets of HTML, CSS, and JavaScript files and add additional UI components to Elasticsearch. Elasticsearch treats the site plugins as a file set that should be served by the built-in HTTP server under the /_plugin/plugin_name/ URL (for example, /_plugin/bigdesk/). This type of plugin doesn’t change anything in core Elasticsearch functionality.

The Java plugins are the ones that add or modify the core Elasticsearch features. They usually contain the JAR files. The plugin-descriptor.properties file contains information about the main class that should be used by Elasticsearch as an entry point to configure plugins and allow them to extend the Elasticsearch functionality. The nice thing about the Java plugins is that they can contain the site part as well. The site part of the plugin needs to be placed in the _site directory if we are unpacking the plugin manually.
Installing plugins

Plugins can be downloaded from three source types. The first is the official repository located at https://download.elastic.co. All plugins from this source can be installed by referring to the plugin name. For example:

bin/plugin install lang-javascript

The preceding command results in installation of a plugin that allows us to use an additional scripting language, JavaScript. Elasticsearch automatically tries to find a plugin version that is the same as the version of Elasticsearch we are using. Sometimes, like in the following example, a plugin may ask for additional permissions during installation.

Just so we know what to expect, this is an example result of running the preceding command:
-> Installing lang-javascript…
Trying https://download.elastic.co/elasticsearch/release/org/elasticsearch/plugin/lang-javascript/2.2.0/lang-javascript-2.2.0.zip …
Downloading…........................................................................DONE
Verifying https://download.elastic.co/elasticsearch/release/org/elasticsearch/plugin/lang-javascript/2.2.0/lang-javascript-2.2.0.zip checksums if available …
Downloading .DONE
@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
@     WARNING: plugin requires additional permissions     @
@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
* java.lang.RuntimePermission createClassLoader
* org.elasticsearch.script.ClassPermission <<STANDARD>>
* org.elasticsearch.script.ClassPermission org.mozilla.javascript.ContextFactory
* org.elasticsearch.script.ClassPermission org.mozilla.javascript.Callable
* org.elasticsearch.script.ClassPermission org.mozilla.javascript.NativeFunction
* org.elasticsearch.script.ClassPermission org.mozilla.javascript.Script
* org.elasticsearch.script.ClassPermission org.mozilla.javascript.ScriptRuntime
* org.elasticsearch.script.ClassPermission org.mozilla.javascript.Undefined
* org.elasticsearch.script.ClassPermission org.mozilla.javascript.optimizer.OptRuntime
See http://docs.oracle.com/javase/8/docs/technotes/guides/security/permissions.html
for descriptions of what these permissions allow and the associated risks.
Continue with installation? [y/N]y
Installed lang-javascript into /Users/someplace/elasticsearch-2.2.0/plugins/lang-javascript
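The download address in the output above follows a visible pattern built from the plugin name and the Elasticsearch version. The Python sketch below reproduces that pattern; it is inferred from this single log only, the function name is hypothetical, and the plugin tool itself may build the address differently.

```python
def official_plugin_url(name, version):
    """Build the official-repository URL as it appears in the log above."""
    base = "https://download.elastic.co/elasticsearch/release/org/elasticsearch/plugin"
    return "{0}/{1}/{2}/{1}-{2}.zip".format(base, name, version)


print(official_plugin_url("lang-javascript", "2.2.0"))
```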
If the plugin is not available at the first location, it can be placed in one of the Apache Maven repositories: Maven Central (https://search.maven.org/) or Maven Sonatype (https://oss.sonatype.org/). In this case, the plugin name for installation should be equal to groupId/artifactId/version, just as for every library for Maven (http://maven.apache.org/). For example:

bin/plugin install org.elasticsearch/elasticsearch-mapper-attachments/3.0.1

The third source is the GitHub (https://github.com/) repositories. The plugin tool assumes that the given plugin address contains the organization name followed by the plugin name and, optionally, the version number. Let’s look at the following command example:

bin/plugin install mobz/elasticsearch-head

If you write your own plugin and you have no access to the earlier-mentioned sites, there is no problem. Instead of the name of the plugin, the plugin tool accepts a URL from which the plugin should be downloaded. This option allows us to set any location for the plugins, including the local filesystem (using the file:// prefix) or a remote file (using the http:// prefix). For example, the following command will result in the installation of a plugin archive stored on the local filesystem as /tmp/elasticsearch-lang-javascript-3.0.0.RC1.zip:

bin/plugin install file:///tmp/elasticsearch-lang-javascript-3.0.0.RC1.zip
Removing plugins

Removing a plugin is as simple as removing its directory. You can also do this by using the plugin tool. For example, to remove the previously installed JavaScript plugin, we run a command as follows:

bin/plugin remove lang-javascript

The output from the command just confirms that the plugin was removed:

-> Removing lang-javascript…
Removed lang-javascript

Note

You need to restart the Elasticsearch node for the plugin installation or removal to take effect.
Elasticsearch caches

Until now we haven’t mentioned Elasticsearch caches much in the book. However, like most systems, Elasticsearch uses a variety of caches to perform more complicated operations or to speed up heavy data retrieval from disk-based Lucene indices. In this section, we will look at the most common caches of Elasticsearch, what they are used for, what the performance implications of using them are, and how to configure them.
Fielddata cache

In the beginning of the book, we discussed that Elasticsearch uses the so-called inverted index data structure to quickly and efficiently search through the documents. This is very good when searching and filtering the data, but for features such as aggregations, sorting, or script usage, Elasticsearch needs an un-inverted data structure, because these functions rely on per-document data.

Because of the need for uninverted data, Elasticsearch has contained, since its first release, an in-memory data structure called fielddata. Fielddata is used to store all the values of a given field in memory to provide very fast document-based lookup. However, the cost of using fielddata is memory and increased garbage collection. Because of the memory and performance cost, starting from Elasticsearch 2.0, each indexed, not analyzed field uses doc values by default. Other fields, such as analyzed text fields, still use fielddata and because of that it is good to know how to handle fielddata.

Fielddata size

Elasticsearch allows us to control how much memory the fielddata cache uses. By default, the cache is unbounded, which is very dangerous. If you have large indices, you may run into memory issues, where the fielddata cache will eat most of the memory given to Elasticsearch and will result in node failure. We are allowed to configure the size of the fielddata cache by using the static indices.fielddata.cache.size property set to an explicit value (like 10GB) or to a percentage of the whole memory given to Elasticsearch (like 20%).

Remember that the fielddata cache is very expensive to build as it needs to load all the values of a given field to memory. This can take a lot of time, resulting in degraded query performance. Because of this, it is advised to have enough memory to keep the needed cache permanently in Elasticsearch memory. However, we understand that this is not always possible because of hardware costs.

Circuit breakers

The nice thing about Elasticsearch is that it allows us to achieve a similar thing in multiple ways and we have the same situation when it comes to fielddata and limiting the memory usage. Elasticsearch allows us to use a functionality called circuit breakers, which can estimate how much memory a request or a query will use, and if it is above a defined threshold, it won’t be executed at all, resulting in no memory usage and an exception thrown. This is very nice when we don’t want to limit the size of the fielddata cache but we also don’t want a single query to cause memory issues and make the cluster unstable. There are two main circuit breakers: the fielddata circuit breaker and the request circuit breaker.
The first circuit breaker, the fielddata one, estimates the amount of memory that will need to be used to load data to the fielddata cache for a given query. We can configure the limit by using the indices.breaker.fielddata.limit property, which is by default set to 60%, which means that a fielddata cache for a single query can’t use more than 60 percent of the memory given to Elasticsearch.

The second circuit breaker, the request one, estimates the memory used by per-request data structures and prevents them from using more than the amount specified by the indices.breaker.request.limit property. By default, the mentioned property is set to 40%, which means that single request data structures, such as the ones used for aggregation calculation, can’t use more than 40% of the memory given to Elasticsearch.

Finally, there is one more circuit breaker that is defined by the indices.breaker.total.limit property (by default set to 70%). This circuit breaker defines the total amount of memory that can be used by both the per-request data structures and fielddata.

Remember that the settings for circuit breakers are dynamic and can be updated using the cluster update settings API.
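The interplay of the three limits can be sketched in Python. This is a simplified accounting model, not the Elasticsearch implementation: the names check and CircuitBreakingError, the way usage is tracked in a dictionary, and the 10 GB heap are assumptions chosen for illustration.

```python
GB = 1024 ** 3

# Default limits quoted in the text, as fractions of the heap.
DEFAULT_LIMITS = {"fielddata": 0.60, "request": 0.40, "total": 0.70}


class CircuitBreakingError(Exception):
    """Raised when an estimate would push memory usage past a limit."""


def check(heap_bytes, used, breaker, estimate, limits=DEFAULT_LIMITS):
    """Admit `estimate` bytes for `breaker` or raise CircuitBreakingError.

    `used` maps a breaker name to the bytes already accounted for it and
    is only updated when both the child and the total limit are respected.
    """
    new_child = used.get(breaker, 0) + estimate
    if new_child > limits[breaker] * heap_bytes:
        raise CircuitBreakingError("%s limit exceeded" % breaker)
    new_total = sum(used.values()) + estimate
    if new_total > limits["total"] * heap_bytes:
        raise CircuitBreakingError("total limit exceeded")
    used[breaker] = new_child
    return used


heap = 10 * GB
usage = {}
check(heap, usage, "fielddata", 5 * GB)      # 5 GB is below the 6 GB fielddata limit
try:
    check(heap, usage, "request", 3 * GB)    # below 4 GB, but 5 + 3 GB exceeds 7 GB
except CircuitBreakingError as error:
    print(error)                             # total limit exceeded
```

The second call shows why the total limit matters: each breaker stays within its own limit, yet together they would use more than 70 percent of the heap.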
Fielddata and doc values

As we already discussed, instead of the fielddata cache, doc values can be used. Of course, this is only true for not analyzed fields and ones using numeric data types and not multivalued ones. This will save memory and should be faster than the fielddata cache during query time, at the cost of slight indexing speed degradation (very small) and a slightly larger index. If you can use doc values, do that: it will help your Elasticsearch cluster to maintain stability and respond to queries quickly.
Shard request cache

The shard request cache is the first of the caches that operate on queries. It caches the aggregations and suggestions produced by the query but, at the time of writing this book, it was not caching query hits. When Elasticsearch executes the query, this cache can save the resource-consuming aggregations for the query and speed up the subsequent queries by retrieving the aggregations or suggestions from memory.

Note

During the writing of this book, the shard request cache was only used when the size=0 parameter was set for the query. This means that only the total number of hits, aggregation results, and suggestions will be cached. Remember that when running queries with dates and using the now constant, the shard request cache won’t be used either.

The shard request cache, as its name says, caches the results of the queries on each shard, before they are returned to the node that aggregates the results. This can be very good when your aggregations are heavy, like the ones that do a lot of computation on the data returned by the query. If you run a lot of aggregations with your queries and the queries can be repeated, think about using the shard request cache as it should help you with query latency.
Enabling and configuring the shard request cache

The shard request cache is disabled by default, but can be easily enabled. To enable it, we should set the index.requests.cache.enable property to true when creating the index. For example, to enable the shard request cache for an index called new_library, we use the following command:

curl -XPUT 'localhost:9200/new_library' -d '{
  "settings": {
    "index.requests.cache.enable": true
  }
}'

One thing to remember is that the mentioned setting is not dynamically updatable. We need to include it in the index creation command or we can update it when the index is closed.

The maximum size of the cache is specified using the indices.requests.cache.size property and is set to 1% by default (which means 1% of the total memory given to Elasticsearch). We can also specify how long each entry should be kept by using the indices.requests.cache.expire property, but it is not set by default. Also, the cache is invalidated once the index is refreshed (during index searcher reopening), which makes the expire setting useless most of the time.
Note

Note that in the earlier versions of Elasticsearch, for example in the 1.x branch, to enable or disable this cache, the index.cache.query.enable property was used. This may be important when migrating from older Elasticsearch versions.

Per request shard request cache disabling

Elasticsearch allows us to control the shard request cache usage on a per-request basis. If we have the mentioned cache enabled, we can still force the search engine to omit caching for selected requests. This is done by using the request_cache parameter. If set to true, the request will be cached and, if set to false, the request won’t be cached. This is especially useful when we want to cache our requests in general but omit caching for some queries that are rare. It is also wise not to cache requests that use non-deterministic scripts and time ranges.
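The conditions described in this section can be collected into one small decision function. The Python sketch below only restates the rules given in the text; the function and parameter names are hypothetical, and letting request_cache=true take effect on an index where the cache is disabled is an assumption rather than something stated above.

```python
def is_cached(index_cache_enabled, size, uses_now=False, request_cache=None):
    """Decide whether a search request would use the shard request cache.

    index_cache_enabled: value of index.requests.cache.enable for the index
    size:                the size parameter of the search request
    uses_now:            True if the query uses the now constant in a date
    request_cache:       per-request override (True, False, or None if unset)
    """
    # Only size=0 requests that do not depend on the current time qualify.
    if size != 0 or uses_now:
        return False
    # A per-request parameter, when given, takes precedence (assumption for
    # the case where the index-level cache is disabled).
    if request_cache is not None:
        return request_cache
    return index_cache_enabled


print(is_cached(True, size=0))                       # True
print(is_cached(True, size=10))                      # False
print(is_cached(True, size=0, request_cache=False))  # False
```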
Shard request cache usage monitoring

If we don’t use any monitoring software that allows monitoring the caches usage, we can use the Elasticsearch API to check the metrics around the shard request cache. This can be done both at the indices level and at the nodes level.

To check the metrics for the shard request cache for all the indices, we should use the indices stats API and run the following command:

curl 'localhost:9200/_stats/request_cache?pretty'

To check the request cache metrics, but in a per-node view, we run the following command:

curl 'localhost:9200/_nodes/stats/indices/request_cache?pretty'
Node query cache

The node query cache is responsible for holding the results of queries for the whole node. Its size is defined using indices.queries.cache.size, defaulting to 10%, and it is shared across all the shards present on the node. We can set it both to a percentage of the heap memory given to Elasticsearch, like the default one, or to an explicit value, like 1024mb. One thing to remember about the cache is that its configuration is static: it can’t be updated dynamically and should be set in the elasticsearch.yml file. The node query cache uses the least recently used (LRU) eviction policy, which means that, when full, it removes the data that was used least recently.

This cache is very useful when you run queries that are repetitive and heavy, such as the ones used to generate category pages or the main page in an e-commerce application.
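The least recently used policy itself is easy to demonstrate. The following Python sketch is a generic LRU cache, not the node query cache implementation: it is bounded by a number of entries instead of a share of heap memory, and the class name and keys are made up for the example.

```python
from collections import OrderedDict


class LRUCache:
    """A tiny cache that evicts the least recently used entry when full."""

    def __init__(self, capacity):
        self.capacity = capacity
        self._entries = OrderedDict()

    def get(self, key):
        if key not in self._entries:
            return None
        self._entries.move_to_end(key)  # mark as most recently used
        return self._entries[key]

    def put(self, key, value):
        self._entries[key] = value
        self._entries.move_to_end(key)
        if len(self._entries) > self.capacity:
            self._entries.popitem(last=False)  # drop the least recently used

    def keys(self):
        return list(self._entries)


cache = LRUCache(capacity=2)
cache.put("category:books", "result A")
cache.put("category:music", "result B")
cache.get("category:books")              # books becomes the most recent entry
cache.put("main_page", "result C")       # evicts category:music
print(cache.keys())                      # ['category:books', 'main_page']
```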
Indexing buffers

The last cache we want to discuss is the indexing buffer that allows us to improve indexing throughput. The indexing buffer is divided between all the shards on the node and is used to store newly indexed documents. Once the buffer fills up, Elasticsearch flushes the data from the buffer to disk, creating a new Lucene segment in the index.

There are four static properties that allow us to configure the indexing buffer size. They need to be set in the elasticsearch.yml file and can’t be changed dynamically using the Settings API. These properties are:

indices.memory.index_buffer_size: This property defines the amount of memory used by a node for the indexing buffer. It accepts both a percentage value as well as an explicit value in bytes. It defaults to 10%, which means that 10% of the heap memory given to a node will be used as the indexing buffer.
indices.memory.min_index_buffer_size: This property defaults to 48mb and specifies the minimum memory that will be used by the indexing buffer. It is useful when indices.memory.index_buffer_size is defined as a percentage value, so that the indexing buffer is never smaller than the value defined by this property.
indices.memory.max_index_buffer_size: This property specifies the maximum memory that will be used by the indexing buffer. It is useful when indices.memory.index_buffer_size is defined as a percentage value, so that the indexing buffer never crosses a certain amount of memory usage.
indices.memory.min_shard_index_buffer_size: This property defaults to 4mb and sets the hard minimum limit of the indexing buffer that is given to each shard on a node. The indexing buffer for each shard will not be lower than the value set by this property.

When it comes to indexing performance, if you need higher indexing throughput, consider setting the indexing buffer size to a value higher than the default size. It will allow Elasticsearch to flush the data to disk less often and create fewer segments. This will result in fewer merges, and thus fewer I/O and CPU intensive operations. Because of that, Elasticsearch will be able to use more resources for indexing purposes.
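How the four properties interact can be shown with a little arithmetic. The Python sketch below is an illustration under stated assumptions: the function names are hypothetical, the maximum is treated as unset unless given, and the node buffer is split evenly between shards, which is a simplification of what Elasticsearch does.

```python
MB = 1024 ** 2
GB = 1024 ** 3


def node_index_buffer(heap_bytes, percent=10, min_bytes=48 * MB, max_bytes=None):
    """Size of the node-wide indexing buffer for a percentage setting.

    The percentage of the heap is raised to the minimum and, if a maximum
    is given, lowered to that maximum.
    """
    size = heap_bytes * percent // 100
    size = max(size, min_bytes)
    if max_bytes is not None:
        size = min(size, max_bytes)
    return size


def shard_index_buffer(node_buffer, shard_count, min_shard_bytes=4 * MB):
    """Even split of the node buffer, never below the per-shard minimum."""
    return max(node_buffer // shard_count, min_shard_bytes)


small = node_index_buffer(256 * MB)                    # 10% is ~25 MB, raised to 48 MB
capped = node_index_buffer(4 * GB, max_bytes=256 * MB) # 10% is ~409 MB, capped at 256 MB
print(small // MB, capped // MB)                       # 48 256
print(shard_index_buffer(small, 20) // MB)             # 4
```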
When caches should be avoided

The usual question that may be asked by users is whether they should really cache all their requests. The answer is obvious: of course, caches are not the tool for everyone. Using caching is not free: it requires memory and additional operations to put the data into the cache or get the data out of there.

What’s more, you should remember that Elasticsearch round robins queries between primary shards and replicas, so, if you have replicas, not every request after the first one will use the cache. Imagine that you have an index which has a single primary shard and two replicas. When the first request comes, it will hit a random shard copy, but the next request, even with the same query, will hit another copy, not the same one (unless routing is used). You should take this into consideration when using caches, because if your queries are not repeated, you may have them running longer because of a cache being used.
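The effect of round robin on cache hits can be simulated in a few lines of Python. The sketch assumes a strict rotation over the shard copies and one independent cache per copy, which is a simplification; the function name is hypothetical.

```python
from itertools import cycle


def simulate(copies, requests):
    """Replay the same query `requests` times over `copies` shard copies.

    Each copy keeps its own cache, and requests rotate over the copies.
    Returns a list of "hit" or "miss", one entry per request.
    """
    caches = [set() for _ in range(copies)]
    order = cycle(range(copies))
    outcome = []
    for _ in range(requests):
        cache = caches[next(order)]
        if "q" in cache:
            outcome.append("hit")
        else:
            cache.add("q")
            outcome.append("miss")
    return outcome


# One primary and two replicas: the first three identical requests all miss.
print(simulate(copies=3, requests=5))  # ['miss', 'miss', 'miss', 'hit', 'hit']
```

With three copies, a query has to be repeated at least four times before the first cache hit occurs, so rarely repeated queries pay the caching cost without getting the benefit.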
So to answer the question of whether you should use caching or not, we would advise taking your data, taking your queries, and running performance tests using tools such as JMeter (http://jmeter.apache.org). This will let you see how your cluster behaves with real data under a test load and see if the queries are actually faster with or without the caches.
The update settings API

Elasticsearch lets us tune it by specifying the various parameters in the elasticsearch.yml file. But you should treat this file as the set of default values that can be changed at runtime using the Elasticsearch REST API. We can change both the per-index settings and the cluster-wide settings. However, you should remember that not all properties can be dynamically changed. If you try to alter such non-dynamic parameters, Elasticsearch will respond with a proper error.
The cluster settings API

In order to set one of the cluster properties, we need to use the HTTP PUT method and send a proper request to the _cluster/settings URI. However, we have two options: adding the changes as transient or persistent.

The first one, transient, will set the property only until the first restart. In order to do this, we send the following command:

curl -XPUT 'localhost:9200/_cluster/settings' -d '{
  "transient": {
    "PROPERTY_NAME": "PROPERTY_VALUE"
  }
}'

As you can see, in the preceding command, we used the object named transient and we added our property definition there. This means that the property will be valid only until the restart. If we want our property settings to persist between restarts, instead of using the object named transient, we use the one named persistent.

At any moment, you can fetch these settings using the following command:

curl -XGET localhost:9200/_cluster/settings
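The difference between the two objects can be modelled with a small Python class. It is a toy model, not Elasticsearch code: the class and method names are hypothetical, and the lookup order (transient first, then persistent, then the values from elasticsearch.yml) is stated here as an assumption.

```python
class ClusterSettings:
    """Toy model of transient and persistent cluster settings."""

    def __init__(self, file_defaults):
        self.file_defaults = dict(file_defaults)  # values from elasticsearch.yml
        self.persistent = {}
        self.transient = {}

    def put(self, transient=None, persistent=None):
        """Mimic a PUT to _cluster/settings with one or both objects."""
        self.transient.update(transient or {})
        self.persistent.update(persistent or {})

    def restart(self):
        """A restart of the cluster discards transient settings only."""
        self.transient.clear()

    def effective(self, name):
        """Assumed precedence: transient, then persistent, then file defaults."""
        for source in (self.transient, self.persistent, self.file_defaults):
            if name in source:
                return source[name]
        return None


settings = ClusterSettings({"indices.breaker.request.limit": "40%"})
settings.put(transient={"indices.breaker.request.limit": "30%"})
print(settings.effective("indices.breaker.request.limit"))  # 30%
settings.restart()
print(settings.effective("indices.breaker.request.limit"))  # 40%
settings.put(persistent={"indices.breaker.request.limit": "35%"})
settings.restart()
print(settings.effective("indices.breaker.request.limit"))  # 35%
```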
The indices settings API

To change the indices related settings, Elasticsearch provides the /_settings endpoint for changing the parameters for all the indices and the /index_name/_settings endpoint for modifying the settings of a single index. In contrast to the cluster-wide settings, all the changes done to indices using the API are always persistent and valid after Elasticsearch restarts. To change the settings for all the indices, we send the following command:

curl -XPUT 'localhost:9200/_settings' -d '{
  "index": {
    "PROPERTY_NAME": "PROPERTY_VALUE"
  }
}'

The current settings for all the indices can be listed using the following command:

curl -XGET localhost:9200/_settings

To set a property for a single index, we run the following command:

curl -XPUT 'localhost:9200/index_name/_settings' -d '{
  "index": {
    "PROPERTY_NAME": "PROPERTY_VALUE"
  }
}'

To get the settings for the library index, we run the following command:

curl -XGET localhost:9200/library/_settings
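For readers who script such changes instead of typing curl commands, the same request can be prepared with Python's standard library. The sketch below only builds the request and does not send it; the function name is hypothetical, and number_of_replicas is used merely as an example of a dynamically updatable index property.

```python
import json
from urllib.request import Request


def settings_request(properties, index=None, host="http://localhost:9200"):
    """Build a PUT request for /_settings or /<index>/_settings.

    `properties` is placed under the "index" object, as in the curl
    commands above. The request is returned without being sent.
    """
    path = "/_settings" if index is None else "/%s/_settings" % index
    body = json.dumps({"index": properties}).encode("utf-8")
    return Request(host + path, data=body, method="PUT")


request = settings_request({"number_of_replicas": 1}, index="library")
print(request.get_method(), request.full_url)
# PUT http://localhost:9200/library/_settings
```

Passing the returned object to urllib.request.urlopen would send it to a running node.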
Summary
In the chapter we just finished, we learned a few very important things about Elasticsearch. First of all, we learned how we can configure the node discovery mechanism. In addition to that, we learned to control what happens after the cluster is initially formed using the recovery and gateway modules. We used dynamic and non-dynamic templates to handle our indices more easily, and we learned what type of caches Elasticsearch has and how to control them. Finally, we used the update settings API to update the various Elasticsearch configuration variables on an already live cluster.
In the next chapter, we will focus on cluster administration. We will start with learning how to back up our data and how to monitor the key cluster metrics. We’ll see the way to control cluster rebalancing and shard allocation, and we will use a human-friendly Cat API that allows us to get varied information about the cluster. Finally, we will learn about warming up our indices and aliasing.
Chapter 10. Administrating Your Cluster
In the previous chapter, we focused on Elasticsearch nodes and cluster configuration. We started by discussing the node discovery process, what it is and how to configure it. We’ve discussed gateway and recovery modules and tuned them to match our needs. We’ve used templates and dynamic templates to manage data structure easily and learned how to install plugins to extend the functionalities of Elasticsearch. Finally, we’ve learned about the caches of Elasticsearch and how to update indices and cluster settings using a dedicated API. By the end of this chapter, you will have learned the following topics:
Backing up your indices in Elasticsearch
Monitoring your clusters
Controlling shards and rebalancing replicas
Controlling shards and allocating replicas
Using CAT API to learn about cluster state
Warming up
Aliasing
Elasticsearch time machine
A good piece of software is one that can manage exceptional situations such as hardware failure or human error. Even though a cluster of a few servers is less dependent on hardware problems, bad things can still happen. For example, let’s imagine that you need to restore your indices. One possible solution is to reindex all your data from a primary data store such as a SQL database. But what will you do if it takes too long or, even worse, the only data store is Elasticsearch? Before Elasticsearch 1.0, creating backups of indices was not easy. The procedure included stopping indexing, flushing the data to disk, shutting down the cluster, and, finally, copying the data to a backup device.
Fortunately, now we can take snapshots, and this section shows how this functionality works.
Creating a snapshot repository
A snapshot keeps all the data related to the cluster from the time the snapshot creation starts and it includes information about the cluster state and indices. Before we create snapshots, at least the first one, a snapshot repository must be created. Each repository is recognized by its name and should define the following aspects:
name: A unique name of the repository; we will need it later.
type: The type of the repository. The possible values are fs (a repository on a shared file system) and url (a read-only repository available via URL).
settings: Additional information needed depending on the repository type.
Now, let’s create a file system repository. Before this, we have to make sure that the directory for our backups fulfils two requirements. The first is related to security. Every repository has to be placed in a path defined in the Elasticsearch configuration file as path.repo. For example, our elasticsearch.yml includes a line similar to the following one:
path.repo: ["/tmp/es_backup_folder", "/tmp/backup/es"]
The second requirement says that every node in the cluster should be able to access the directory we set for the repository.
So now, let’s create a new file system repository by running the following command:
curl -XPUT localhost:9200/_snapshot/backup -d '{
  "type": "fs",
  "settings": {
    "location": "/tmp/es_backup_folder/cluster1"
  }
}'
The preceding command creates a repository named backup, which stores the backup files in the directory given by the location attribute. Elasticsearch responds with the following information:
{"acknowledged":true}
At the same time, es_backup_folder on the local file system is created, without any content yet.
Note
You can also set a relative path with the location parameter. In this case, Elasticsearch determines the absolute path by resolving it against the first directory defined in path.repo.
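The two rules above (the location must lie inside a path.repo directory, and a relative location is resolved against path.repo) can be illustrated with a short Python sketch. This is only an illustration of the rule as described here, not the validation code Elasticsearch itself runs; the function names are hypothetical, and the sketch assumes POSIX-style paths without ".." components.

```python
from pathlib import PurePosixPath

def resolve_location(location, path_repo):
    """Resolve a repository location; a relative path is taken
    relative to the first path.repo entry (assumption from the note)."""
    loc = PurePosixPath(location)
    if not loc.is_absolute():
        loc = PurePosixPath(path_repo[0]) / loc
    return loc

def is_allowed(location, path_repo):
    """Return True if the resolved location is a path.repo
    directory or lies somewhere below one."""
    loc = resolve_location(location, path_repo)
    for root in path_repo:
        root = PurePosixPath(root)
        if loc == root or root in loc.parents:
            return True
    return False

# The path.repo value from the elasticsearch.yml example.
PATH_REPO = ["/tmp/es_backup_folder", "/tmp/backup/es"]

print(is_allowed("/tmp/es_backup_folder/cluster1", PATH_REPO))
print(is_allowed("/var/backups", PATH_REPO))
print(resolve_location("cluster1", PATH_REPO))
```

A check like this can be handy in deployment scripts, to catch a misconfigured location before the repository creation request is sent.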
As we said, the second repository type is url. It requires a url parameter instead of the location, which points to the address where the repository resides, for example, the HTTP address. As in the previous case, the address should be defined in the repositories.url.allowed_urls parameter in the Elasticsearch configuration. The parameter allows the use of wildcards in the address.
Note
Note that file:// addresses are checked against the paths defined in the path.repo parameter.
You can also store snapshots in Amazon S3, HDFS, or Azure using the additional plugins available. To learn about these, please visit the following pages:
https://github.com/elastic/elasticsearch-cloud-aws#s3-repository
https://github.com/elastic/elasticsearch-hadoop/tree/master/repository-hdfs
https://github.com/elastic/elasticsearch-cloud-azure#azure-repository
Now that we have our first repository, we can see its definition using the following command:
curl -XGET 'localhost:9200/_snapshot/backup?pretty'
We can also check all the repositories by running a command like the following:
curl -XGET 'localhost:9200/_snapshot/_all?pretty'
Or simply, we can use this:
curl -XGET 'localhost:9200/_snapshot/?pretty'
If you want to delete a snapshot repository, the standard DELETE command helps:
curl -XDELETE 'localhost:9200/_snapshot/backup?pretty'
Creating snapshots
By default, Elasticsearch takes all the indices and cluster settings (except the transient ones) when creating snapshots. You can create any number of snapshots and each will hold information available right from the time when the snapshot was created. The snapshots are created in a smart way; only new information is copied. This means that Elasticsearch knows which segments are already stored in the repository and doesn’t have to save them again.
To create a new snapshot, we need to choose a unique name and use the following command:
curl -XPUT 'localhost:9200/_snapshot/backup/bckp1'
The preceding command defines a new snapshot named bckp1 (you can only have one snapshot with a given name; Elasticsearch will check its uniqueness) and data is stored in the previously defined backup repository. The command returns an immediate response, which looks as follows:
{"accepted":true}
The preceding response means that the process of snapshotting has started and continues in the background. If you would like the response to be returned only when the actual snapshot is created, you can add the wait_for_completion=true parameter as shown in the following example:
curl -XPUT 'localhost:9200/_snapshot/backup/bckp2?wait_for_completion=true&pretty'
The response to the preceding command shows the status of a created snapshot:
{
"snapshot":{
"snapshot":"bckp2",
"version_id":2000099,
"version":"2.2.0",
"indices":["news"],
"state":"SUCCESS",
"start_time":"2016-01-07T21:21:43.740Z",
"start_time_in_millis":1446931303740,
"end_time":"2016-01-07T21:21:44.750Z",
"end_time_in_millis":1446931304750,
"duration_in_millis":1010,
"failures":[],
"shards":{
"total":5,
"failed":0,
"successful":5
}
}
}
As you can see, Elasticsearch presents information about the time taken by the snapshotting process, its status, and the indices affected.
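A script that triggers snapshots will usually want to check this response rather than just print it. The following Python sketch works on an abbreviated copy of the response shown above; the function names are hypothetical, and the success criteria (state equal to SUCCESS, no failed shards, empty failures list) are an assumption about what a cautious backup script might verify.

```python
import json

# Abbreviated copy of the response shown above.
RESPONSE = """
{"snapshot": {
  "snapshot": "bckp2",
  "indices": ["news"],
  "state": "SUCCESS",
  "start_time_in_millis": 1446931303740,
  "end_time_in_millis": 1446931304750,
  "duration_in_millis": 1010,
  "failures": [],
  "shards": {"total": 5, "failed": 0, "successful": 5}
}}
"""

def snapshot_ok(response_text):
    """True if the snapshot finished and every shard was saved."""
    snap = json.loads(response_text)["snapshot"]
    shards = snap["shards"]
    return (snap["state"] == "SUCCESS"
            and shards["failed"] == 0
            and shards["successful"] == shards["total"]
            and not snap["failures"])

def snapshot_duration_ms(response_text):
    """Duration recomputed from the start and end timestamps."""
    snap = json.loads(response_text)["snapshot"]
    return snap["end_time_in_millis"] - snap["start_time_in_millis"]

print(snapshot_ok(RESPONSE), snapshot_duration_ms(RESPONSE))
```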
Additional parameters
The snapshot command also accepts the following additional parameters:
indices: The names of the indices of which we want to take snapshots.
ignore_unavailable: When this is set to false (the default), Elasticsearch will return an error if any index listed using the indices parameter is missing. When set to true, Elasticsearch will just ignore the missing indices during backup.
include_global_state: When this is set to true (the default), the cluster state is also written to the snapshot (except for the transient settings).
partial: By default, the success of the snapshot operation depends on the availability of all the shards; if any of the shards is not available, the snapshot operation will fail. Setting partial to true causes Elasticsearch to save only the available shards and omit the lost ones.
An example of using additional parameters can look as follows:
curl -XPUT 'localhost:9200/_snapshot/backup/bckp3?wait_for_completion=true&pretty' -d '{
  "indices": "b*",
  "include_global_state": false
}'
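The request body for such a snapshot can also be assembled programmatically. Below is a minimal Python sketch, assuming a hypothetical helper named build_snapshot_body that emits only the parameters that differ from the defaults listed above.

```python
import json

def build_snapshot_body(indices=None, ignore_unavailable=False,
                        include_global_state=True, partial=False):
    """Build the JSON body for a snapshot request.

    Hypothetical helper; the keyword defaults mirror the
    defaults described in the parameter list above.
    """
    body = {}
    if indices:
        # Several index names or patterns are sent comma-separated.
        body["indices"] = ",".join(indices)
    if ignore_unavailable:
        body["ignore_unavailable"] = True
    if not include_global_state:
        body["include_global_state"] = False
    if partial:
        body["partial"] = True
    return json.dumps(body)

# Equivalent to the body of the bckp3 example.
print(build_snapshot_body(["b*"], include_global_state=False))
```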
Restoring a snapshot
Now that we have our snapshots done, we will also learn how to restore data from a given snapshot. As we said earlier, a snapshot can be addressed by its name. We can list all the snapshots using the following command:
curl -XGET 'localhost:9200/_snapshot/backup/_all?pretty'
The response returned by Elasticsearch to the preceding command shows the list of all available backups. Every list item is similar to the following:
{
"snapshot":{
"snapshot":"bckp2",
"version_id":2000099,
"version":"2.2.0",
"indices":["news"],
"state":"SUCCESS",
"start_time":"2016-01-07T21:21:43.740Z",
"start_time_in_millis":1446931303740,
"end_time":"2016-01-07T21:21:44.750Z",
"end_time_in_millis":1446931304750,
"duration_in_millis":1010,
"failures":[],
"shards":{
"total":5,
"failed":0,
"successful":5
}
}
}
The repository we created earlier is called backup. To restore a snapshot named bckp1 from our snapshot repository, run the following command:
curl -XPOST 'localhost:9200/_snapshot/backup/bckp1/_restore'
During the execution of this command, Elasticsearch takes the indices defined in the snapshot and creates them with the data from the snapshot. However, if the index already exists and is not closed, the command will fail. In this case, you may find it convenient to only restore certain indices, for example:
curl -XPOST 'localhost:9200/_snapshot/backup/bckp1/_restore?pretty' -d '{
  "indices": "c*"
}'
The preceding command restores only the indices that begin with the letter c. The other available parameters are as follows:
ignore_unavailable: When this parameter is set to false (the default behavior), Elasticsearch fails the restore process if any of the expected indices is not available.
include_global_state: When this parameter is set to true, Elasticsearch restores the global state included in the snapshot, which is also the default behavior.
rename_pattern: This parameter allows the renaming of the index during a restore operation. Thanks to this, the restored index will have a different name. The value of this parameter is a regular expression that defines the source index name. If the pattern matches the name of the index, name substitution will occur. In the pattern, you should use groups delimited by parentheses, which can then be referenced in the rename_replacement parameter.
rename_replacement: This parameter along with rename_pattern defines the target index name. Using the dollar sign and a number, you can recall the appropriate group from rename_pattern.
For example, with rename_pattern=products_(.*), only the indices with names that begin with products_ will be renamed; the rest of the index name is captured by the group and used during replacement. rename_pattern=products_(.*) together with rename_replacement=items_$1 causes the products_cars index to be restored to an index called items_cars.
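The substitution itself is easy to try out before running a restore. The following Python sketch mimics it with the re module; it is an approximation, because Elasticsearch evaluates the pattern with Java regular expressions and Python's syntax differs in some corner cases. The function name restored_name is hypothetical, and the $1 style references are translated to Python's \g<1> form.

```python
import re

def restored_name(index, rename_pattern, rename_replacement):
    """Approximate the index name a restore would produce.

    Assumes simple patterns that behave the same way in Java
    and Python regular expressions.
    """
    # Translate Java-style group references ($1) to Python's \g<1>.
    py_replacement = re.sub(r"\$(\d+)", r"\\g<\1>", rename_replacement)
    return re.sub(rename_pattern, py_replacement, index)

# The example from the text.
print(restored_name("products_cars", r"products_(.*)", "items_$1"))
# A name that does not match the pattern is left unchanged.
print(restored_name("news", r"products_(.*)", "items_$1"))
```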
Cleaning up – deleting old snapshots
Elasticsearch leaves snapshot repository management up to you. Currently, there is no automatic clean-up process. But don’t worry; this is simple. For example, let’s remove our previously taken snapshot:
curl -XDELETE 'localhost:9200/_snapshot/backup/bckp1?pretty'
And that’s all. The command causes the snapshot named bckp1 from the backup repository to be deleted.
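Because clean-up is left to us, a small retention script is a common companion to scheduled snapshots. The Python sketch below only selects which snapshots to delete and builds the matching DELETE paths; it sends nothing. The keep-the-newest-N policy, the function names, and the timestamps in the sample list are all assumptions made for illustration; in practice the input would be the list returned by the _snapshot/backup/_all call shown earlier.

```python
def snapshots_to_delete(snapshots, keep_last=3):
    """Return the names of all but the keep_last newest snapshots."""
    ordered = sorted(snapshots, key=lambda s: s["start_time_in_millis"])
    if keep_last <= 0:
        return [s["snapshot"] for s in ordered]
    return [s["snapshot"] for s in ordered[:-keep_last]]

def delete_path(repository, snapshot):
    """REST path for deleting one snapshot from a repository."""
    return "/_snapshot/{}/{}".format(repository, snapshot)

# Hypothetical snapshot list with made-up timestamps.
SNAPSHOTS = [
    {"snapshot": "bckp2", "start_time_in_millis": 2000},
    {"snapshot": "bckp1", "start_time_in_millis": 1000},
    {"snapshot": "bckp3", "start_time_in_millis": 3000},
]

for name in snapshots_to_delete(SNAPSHOTS, keep_last=2):
    print("DELETE", delete_path("backup", name))
```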
Monitoring your cluster’s state and health
Monitoring is essential when it comes to handling your cluster and ensuring it is in a healthy state. It allows administrators and developers to detect possible problems and prevent them before they occur or to act as soon as they start showing. In the worst case, monitoring allows us to do a post mortem analysis of what happened to the application, in this case, our Elasticsearch cluster and each of the nodes.
Elasticsearch provides very detailed information that allows us to check and monitor our nodes or the cluster as a whole. This includes statistics and information about the servers, nodes, indices, and shards. Of course, we are also able to get information about the entire cluster state. Before we get into the details about the mentioned API, please remember that the API is complex and we are only describing the basics. We will try to show you where to start so you’ll be able to know what to look for when you need very detailed information.
Cluster health API
One of the most basic APIs is the cluster health API, which allows us to get information about the entire cluster state with a single HTTP command. For example, let’s run the following command:
curl -XGET 'localhost:9200/_cluster/health?pretty'
A sample response returned by Elasticsearch for the preceding command looks as follows:
{
"cluster_name":"elasticsearch",
"status":"yellow",
"timed_out":false,
"number_of_nodes":1,
"number_of_data_nodes":1,
"active_primary_shards":11,
"active_shards":11,
"relocating_shards":0,
"initializing_shards":0,
"unassigned_shards":11,
"delayed_unassigned_shards":0,
"number_of_pending_tasks":0,
"number_of_in_flight_fetch":0,
"task_max_waiting_in_queue_millis":0,
"active_shards_percent_as_number":50.0
}
The most important information is about the status of the cluster. In our example, we see that the cluster is in yellow status. This means that all the primary shards have been allocated properly, but the replicas were not (because of a single node in the cluster, but that doesn’t matter for now).
Of course, apart from the cluster name and status, we can see whether the request timed out, how many nodes there are, how many data nodes, primary shards, initializing shards, unassigned ones, and so on.
Let’s stop here and talk about the cluster and when the cluster, as a whole, is fully operational. A cluster is fully operational when Elasticsearch is able to allocate all the shards and replicas according to the configuration. This is when the cluster is in the green state. The yellow state means that we are ready to handle requests because the primary shards are allocated, but some (or all) replicas are not. The last state, the red one, means that at least one primary shard was not allocated and because of this, the cluster is not ready yet. That means that the queries may return errors or incomplete results.
The preceding command can also be executed to check the health state of certain indices. For example, if we would like to check the health of the library and map indices, we would run the following command:
curl -XGET 'localhost:9200/_cluster/health/library,map/?pretty'
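The green, yellow, and red states described above translate directly into a simple check that a monitoring script could run on the health response. The following Python sketch uses a subset of the sample response; the function names are hypothetical, and the percentage formula (active shards divided by active plus initializing plus unassigned shards) is an assumption that happens to reproduce the 50.0 reported in the sample.

```python
import json

# Subset of the sample health response shown earlier.
HEALTH = json.loads("""{
  "status": "yellow",
  "active_primary_shards": 11,
  "active_shards": 11,
  "initializing_shards": 0,
  "unassigned_shards": 11
}""")

MEANING = {
    "green": "all primary shards and replicas are allocated",
    "yellow": "all primary shards are allocated, some replicas are not",
    "red": "at least one primary shard is not allocated",
}

def all_primaries_allocated(health):
    """True for green and yellow, False for red."""
    return health["status"] in ("green", "yellow")

def active_percent(health):
    """Share of shard copies that are active (assumed formula)."""
    total = (health["active_shards"] + health["initializing_shards"]
             + health["unassigned_shards"])
    return 100.0 * health["active_shards"] / total if total else 100.0

print(MEANING[HEALTH["status"]])
print(all_primaries_allocated(HEALTH), active_percent(HEALTH))
```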
Controlling information details
Elasticsearch allows us to specify a special level parameter, which can take the value of cluster (default), indices, or shards. This allows us to control the details of information returned by the health API. We’ve already seen the default behavior. When setting the level parameter to indices, apart from the cluster information, we will also get per index health. Setting the mentioned parameter to shards tells Elasticsearch to return per shard information in addition to what we’ve seen in the example.
Additional parameters
In addition to the level parameter, we have a few additional parameters that can control the behavior of the health API.
The first of the mentioned parameters is timeout, which allows us to control how long, at most, the command execution will wait when one of the following parameters is used: wait_for_status, wait_for_nodes, wait_for_relocating_shards, and wait_for_active_shards. By default, it is set to 30s, which means that the health command will wait 30 seconds at most and return the response by then.
The wait_for_status parameter allows us to tell Elasticsearch which health status the cluster should be at to return the command. It can take the values of green, yellow, and red. For example, when set to green, the health API call will not return until the green status or the timeout is reached.
The wait_for_nodes parameter allows us to set the required number of nodes available to return the health command response (or until a defined timeout is reached). It can be set to an integer number like 3 or to a simple expression like >=3 (meaning greater than or equal to three nodes) or <=3 (meaning less than or equal to three nodes).
The wait_for_active_shards parameter means that Elasticsearch will wait for a specified number of active shards to be present before returning the response.
The last parameter is wait_for_relocating_shards, which is by default not specified. It allows us to tell Elasticsearch how many relocating shards it should wait for (or until the timeout is reached). Setting this parameter to 0 means that Elasticsearch should wait until all relocations have finished.
An example usage of the health command with some of the mentioned parameters is as follows:
curl -XGET 'localhost:9200/_cluster/health?wait_for_status=green&wait_for_nodes=>=3&timeout=100s'
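When such a request is built in code rather than typed into a shell, it is safer to URL-encode the parameter values, because characters such as > and = have a special meaning in a query string. A minimal Python sketch using only the standard library follows; the helper name health_url and its host default are assumptions made for illustration.

```python
from urllib.parse import urlencode

def health_url(host="localhost:9200", **params):
    """Build a cluster health URL with URL-encoded parameters."""
    query = urlencode(params)
    return "http://{}/_cluster/health{}".format(
        host, "?" + query if query else "")

# Same parameters as in the curl example above.
print(health_url(wait_for_status="green",
                 wait_for_nodes=">=3",
                 timeout="100s"))
```

In the printed URL, the >=3 value appears as %3E%3D3, which the server decodes back to the original expression.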
Indices stats API
An Elasticsearch index is the place where our data lives and it is a crucial part of most deployments. With the use of the indices stats API available using the _stats endpoint, we can get a lot of information about the indices living inside our cluster. Of course, as with most of the APIs in Elasticsearch, we can send a command to get the information about all the indices (using the pure _stats endpoint), about one particular index (for example library/_stats) or several indices at the same time (for example library,map/_stats). For example, to check the statistics for the map and library indices we’ve used in the book, we could run the following command:
curl -XGET 'localhost:9200/library,map/_stats?pretty'
The response to the preceding command has more than 700 lines, so we only describe its structure, omitting the response itself. Apart from the information about the response status and the response time, we can see three objects named primaries, total (in the _all object), and indices. The indices object contains information about the library and map indices. The primaries object contains information about the primary shards only, and the total object contains information about all the shards including replicas. All these objects can contain objects describing a particular statistic such as the following: docs, store, indexing, get, search, merges, refresh, flush, warmer, query_cache, fielddata, percolate, completion, segments, translog, suggest, request_cache, and recovery.
We can limit the amount of information that we get from the indices stats API by providing the type of data we are interested in using the names of the statistics mentioned previously. For example, if we want to get information about indexing and searching, we can run the following command:
curl -XGET 'localhost:9200/library,map/_stats/indexing,search?pretty'
Let’s discuss the information stored in those objects.
Docs
The docs section of the response shows information about indexed documents. For example, it could look as follows:
"docs":{
"count":4,
"deleted":0
}
The main information is the count, indicating the number of documents in the described index. When we delete documents from the index, Elasticsearch doesn’t remove these documents immediately and only marks them as deleted. Documents are physically deleted during the segment merge process. The number of documents marked as deleted is presented by the deleted attribute and should be 0 right after the merge.
Store
The next statistic, the store one, provides information regarding storage. For example, such a section could look as follows:
"store":{
"size_in_bytes":6003,
"throttle_time_in_millis":0
}
The main information is about the index (or indices) size. We can also look at throttling statistics. This information is useful when the system has problems with the I/O performance and has configured limits on an internal operation during segment merging.
Indexing, get, and search
The indexing, get, and search sections of the response provide information about data manipulation: indexing with delete operations, using real-time get, and searching. Let’s look at the following example returned by Elasticsearch:
"indexing":{
"index_total":0,
"index_time_in_millis":0,
"index_current":0,
"delete_total":0,
"delete_time_in_millis":0,
"delete_current":0,
"noop_update_total":0,
"is_throttled":false,
"throttle_time_in_millis":0
},
"get":{
"total":0,
"time_in_millis":0,
"exists_total":0,
"exists_time_in_millis":0,
"missing_total":0,
"missing_time_in_millis":0,
"current":0
},
"search":{
"open_contexts":0,
"query_total":0,
"query_time_in_millis":0,
"query_current":0,
"fetch_total":0,
"fetch_time_in_millis":0,
"fetch_current":0,
"scroll_total":0,
"scroll_time_in_millis":0,
"scroll_current":0
}
As you can see, all of these statistics have similar structures. We can read the total time spent in various request types (in milliseconds) and the number of requests (which, together with the total time, allows us to calculate the average time of a single query). In the case of get requests, valuable information is how many fetches were unsuccessful (missing documents); an indexing request has information about throttling, and search includes information regarding scrolling.
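The average times mentioned above are not returned by Elasticsearch directly, but they are simple to derive from the totals. The Python sketch below shows one way to do it; the function names are hypothetical, and the numbers in SAMPLE are made up for illustration (the response shown earlier contains only zeros, for which no average can be computed).

```python
def average_ms(total_time_in_millis, total_count):
    """Average time per request, or None if there were no requests."""
    if total_count == 0:
        return None
    return total_time_in_millis / total_count

def summarize(stats):
    """Derive a few averages from the search and get sections."""
    search, get = stats["search"], stats["get"]
    return {
        "avg_query_ms": average_ms(search["query_time_in_millis"],
                                   search["query_total"]),
        "avg_fetch_ms": average_ms(search["fetch_time_in_millis"],
                                   search["fetch_total"]),
        "get_miss_ratio": (get["missing_total"] / get["total"]
                           if get["total"] else None),
    }

# Hypothetical values, not taken from a real cluster.
SAMPLE = {
    "search": {"query_total": 40, "query_time_in_millis": 120,
               "fetch_total": 10, "fetch_time_in_millis": 5},
    "get": {"total": 8, "missing_total": 2},
}

print(summarize(SAMPLE))
```

Note that these counters are cumulative, so for trend monitoring it is usually the difference between two consecutive readings that matters.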
Additional information
In addition to the previously described sections, Elasticsearch provides the following information:
merges: This section contains information about Lucene segment merges
refresh: This section contains information about the refresh operation
flush: This section contains information about flushes
warmer: This section contains information about warmers and for how long they were executed
query_cache: This section contains the query cache statistics
fielddata: This section contains the field data cache statistics
percolate: This section contains information about the percolator usage
completion: This section contains information about the completion suggester
segments: This section contains information about Lucene segments
translog: This section contains information about the transaction logs count and size
suggest: This section contains suggesters-related statistics
request_cache: This section contains the shard request cache statistics
recovery: This section contains shards recovery information
Nodes info API
The nodes info API provides us with information about the nodes in the cluster. To get information from this API, we need to send the request to the _nodes REST endpoint. The simplest command to retrieve nodes related information from Elasticsearch would be as follows:
curl -XGET 'localhost:9200/_nodes?pretty'
This API can be used to fetch information about particular nodes or a single node using the following:
Node name: If we would like to get information about the node named Pulse, we could run a command to the following REST endpoint: _nodes/Pulse
Node identifier: If we would like to get information about the node with an identifier equal to ny4hftjNQtuKMyEvpUdQWg, we could run a command to the following REST endpoint: _nodes/ny4hftjNQtuKMyEvpUdQWg
IP address: We can use IP addresses to get information about the nodes. For example, if we would like to get information about the node with an IP address equal to 192.168.1.103, we could run a command to the following REST endpoint: _nodes/192.168.1.103
Parameters from the Elasticsearch configuration: If we would like to get information about all the nodes with the node.rack property set to 2, we could run a command to the following REST endpoint: /_nodes/rack:2
This API also allows us to get information about several nodes at once using these:
Patterns, for example: _nodes/192.168.1.* or _nodes/P*
Nodes enumeration, for example: _nodes/Pulse,Slab
Both patterns and enumerations, for example: /_nodes/P*,S*
Returned information
By default, the nodes API will return extensive information about each node along with the name, identifier, and addresses. This extensive information includes the following:
settings: The Elasticsearch configuration
os: Information about the server such as processor, RAM, and swap space
process: Process identifier and refresh interval
jvm: Information about Java Virtual Machine such as memory limits, memory pools, and garbage collectors
thread_pool: The configuration of thread pools for various operations
transport: Listening addresses for the transport protocol
http: Information about listening addresses for an HTTP-based API
plugins: Information about the plugins installed by the user
modules: Information about the built-in plugins
An example usage of this API can be illustrated by the following command:
curl 'localhost:9200/_nodes/Pulse/os,jvm,plugins'
The preceding command will return the basic information about the node named Pulse and, in addition to this, it will include the operating system information, Java virtual machine information, and plugins-related information.
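The endpoint in such a command is composed of the node selectors followed by the sections we are interested in. The following Python sketch puts the two parts together and rejects section names that are not in the list given above; the helper nodes_info_path is hypothetical and is meant only to illustrate the shape of the path.

```python
# Section names as listed in the text above.
INFO_SECTIONS = {"settings", "os", "process", "jvm", "thread_pool",
                 "transport", "http", "plugins", "modules"}

def nodes_info_path(nodes, sections=None):
    """Compose a nodes info path from node selectors and sections.

    nodes is a list of names, identifiers, addresses, or patterns.
    """
    parts = ["_nodes", ",".join(nodes)]
    if sections:
        unknown = set(sections) - INFO_SECTIONS
        if unknown:
            raise ValueError("unknown sections: " + ", ".join(sorted(unknown)))
        parts.append(",".join(sections))
    return "/" + "/".join(parts)

print(nodes_info_path(["Pulse"], ["os", "jvm", "plugins"]))
print(nodes_info_path(["P*", "S*"]))
```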
Nodes stats API
The nodes stats API is similar to the nodes info API described in the preceding section. The main difference is that the previous API provided information about the environment in which the node is running, while the one we are currently discussing tells us about what happened with the cluster during its work. To use the nodes stats API, you need to send a command to the /_nodes/stats REST endpoint. However, similar to the nodes info API, we can also retrieve information about specific nodes (for example: _nodes/Pulse/stats).
The simplest command to retrieve node statistics from Elasticsearch would be as follows:
curl -XGET 'localhost:9200/_nodes/stats?pretty'
By default, Elasticsearch returns all the available statistics but we can limit the ones we are interested in. The available options are as follows:
indices: Information about the indices including size, document count, indexing related statistics, search and get time, caches, segment merges, and so on
os: Operating system related information such as free disk space, memory, swap usage, and so on
process: Memory, CPU, and file handler usage related to the Elasticsearch process
jvm: Java virtual machine memory and garbage collector statistics
transport: Information about data sent and received by the transport module
http: Information about HTTP connections
fs: Information about available disk space and I/O operations statistics
thread_pool: Information about the state of the threads assigned to various operations
breaker: Information about circuit breakers
script: Scripting engine related information
An example usage of this API can be illustrated by the following command:
curl 'localhost:9200/_nodes/Pulse/stats/os,jvm,breaker'
Cluster state API
Another API provided by Elasticsearch is the cluster state API. As its name suggests, it allows us to get information about the entire cluster (we can also limit the returned information to a local node by adding the local=true parameter to the request). The basic command used to get all the information returned by this API looks as follows:
curl -XGET 'localhost:9200/_cluster/state?pretty'
We can also limit the provided information to the given metrics in comma-separated form, specified after the _cluster/state part of the REST call. For example:
curl -XGET 'localhost:9200/_cluster/state/version,nodes?pretty'
We can also limit the information to the given metrics and indices. For example, if we would like to get the metadata for the library index, we could run the following command:
curl -XGET 'localhost:9200/_cluster/state/metadata/library?pretty'
The following metrics are allowed to be used:
version: This returns information about the cluster state version.
master_node: This returns information about the elected master node.
nodes: This returns nodes information.
routing_table: This returns routing related information.
metadata: This returns metadata related information. When retrieving the metadata metric, we can also include an additional parameter such as index_templates=true, which will result in including the defined index templates.
blocks: This returns the blocks part of the response.
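The path layout described above (metrics first, then optionally the indices) can be captured in a few lines of Python. The sketch below is a hypothetical helper, not an Elasticsearch client function; it validates the metric names against the list above and, as a simplifying assumption, refuses to add indices unless at least one metric is given.

```python
# Metric names as listed in the text above.
STATE_METRICS = ("version", "master_node", "nodes",
                 "routing_table", "metadata", "blocks")

def cluster_state_path(metrics=None, indices=None):
    """Compose a cluster state path from metrics and index names."""
    path = "/_cluster/state"
    if metrics:
        bad = [m for m in metrics if m not in STATE_METRICS]
        if bad:
            raise ValueError("unknown metrics: " + ", ".join(bad))
        path += "/" + ",".join(metrics)
    if indices:
        if not metrics:
            # Simplification: this sketch requires explicit metrics.
            raise ValueError("indices require at least one metric")
        path += "/" + ",".join(indices)
    return path

print(cluster_state_path(["version", "nodes"]))
print(cluster_state_path(["metadata"], ["library"]))
```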
Cluster stats API
The cluster stats API allows us to get statistics about the indices and nodes from the cluster-wide perspective. To use this API, we need to run the GET request to the /_cluster/stats REST endpoint, for example:
curl -XGET 'localhost:9200/_cluster/stats?pretty'
The response size depends on the number of shards, indices, and nodes in the cluster. It will include basic indices information such as shards, their state, recovery information, caches information, and node-related information.
Pending tasks API
The pending tasks API is one of the APIs that help us see what Elasticsearch is doing; it allows us to check which tasks are waiting to be executed. To retrieve this information, we need to send a request to the /_cluster/pending_tasks REST endpoint. In the response, we will see an array of tasks with information about them, such as task priority and time in queue.
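To show how such a response might be processed, here is a minimal Python sketch that picks out the tasks that have waited longest. The sample body is hypothetical, and the field names used (tasks, priority, source, time_in_queue_millis) are assumptions about the response layout rather than something shown in this section, so check them against a real response from your cluster.

```python
def oldest_pending_tasks(response, limit=3):
    """Return (priority, source, millis in queue) for the longest-waiting tasks."""
    tasks = response.get("tasks", [])
    ranked = sorted(tasks, key=lambda t: t.get("time_in_queue_millis", 0), reverse=True)
    return [
        (t.get("priority"), t.get("source"), t.get("time_in_queue_millis"))
        for t in ranked[:limit]
    ]


# Hypothetical response body with an array of tasks (field names are assumed).
sample = {
    "tasks": [
        {"priority": "URGENT", "source": "create-index [logs_2015-12-11]", "time_in_queue_millis": 86},
        {"priority": "HIGH", "source": "shard-started", "time_in_queue_millis": 842},
    ]
}

print(oldest_pending_tasks(sample, limit=1))
```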
Indices recovery API
The recovery API gives us insight into the recovery status of the shards that are building indices in our cluster (learn more about recovery in The gateway and recovery modules section of Chapter 9, Elasticsearch Cluster in Detail).
The simplest command that would return the information about the recovery of all the shards in the cluster would look as follows:
curl -XGET 'http://localhost:9200/_recovery?pretty'
We can also get information about recovery for particular indices, such as the library index for example:
curl -XGET 'http://localhost:9200/library/_recovery?pretty'
The response returned by Elasticsearch is divided by indices and shards. A response for a single shard could look as follows:
{
"id":2,
"type":"STORE",
"stage":"DONE",
"primary":true,
"start_time_in_millis":1446132761730,
"stop_time_in_millis":1446132761734,
"total_time_in_millis":4,
"source":{
"id":"DboTibRlT1KJSQYnDPxwZQ",
"host":"127.0.0.1",
"transport_address":"127.0.0.1:9300",
"ip":"127.0.0.1",
"name":"Plague"
},
"target":{
"id":"DboTibRlT1KJSQYnDPxwZQ",
"host":"127.0.0.1",
"transport_address":"127.0.0.1:9300",
"ip":"127.0.0.1",
"name":"Plague"
},
"index":{
"size":{
"total_in_bytes":156,
"reused_in_bytes":156,
"recovered_in_bytes":0,
"percent":"100.0%"
},
"files":{
"total":1,
"reused":1,
"recovered":0,
"percent":"100.0%"
},
"total_time_in_millis":0,
"source_throttle_time_in_millis":0,
"target_throttle_time_in_millis":0
},
"translog":{
"recovered":0,
"total":-1,
"percent":"-1.0%",
"total_on_start":-1,
"total_time_in_millis":3
},
"verify_index":{
"check_index_time_in_millis":0,
"total_time_in_millis":0
}
}
In the preceding response, we can see information about the shard identifier, the stage of recovery, information whether the shard is a primary or a replica, the timestamps of the start and end of recovery, and the total time the recovery process took. We can see the source node, target node, and information about the shard’s physical statistics, such as size, number of files, transaction log-related statistics, and index verification time.
It is worth knowing the information about the stages of recovery and types. When it comes to the types of recovery (the type attribute in the response), we can expect the following: the STORE, SNAPSHOT, REPLICA, and RELOCATING values. When it comes to the stage of recovery (the stage attribute in the response), we can expect values such as INIT (recovery has not started), INDEX (Elasticsearch copies metadata information and data from source to destination), START (Elasticsearch is opening the shard for use), FINALIZE (final stage, which cleans up garbage), and DONE (recovery has ended).
We can limit the response returned by the indices recovery API to only the shards that are currently in active recovery by including the active_only=true parameter in the request. Finally, we can request more detailed information by adding the detailed=true parameter in the API call.
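The stage and type attributes described above lend themselves to a quick summary. The Python sketch below counts shards per stage and lists the ones that are not yet DONE. It assumes the full response is a mapping from index name to an object holding a shards array of entries like the one shown earlier; that outer layout and the sample data are assumptions made for illustration.

```python
from collections import Counter


def recovery_stage_counts(response):
    """Count shards per recovery stage across all indices in the response."""
    counts = Counter()
    for index_data in response.values():
        for shard in index_data.get("shards", []):
            counts[shard["stage"]] += 1
    return dict(counts)


def unfinished_shards(response):
    """List (index, shard id, type, stage) for shards whose stage is not DONE."""
    return [
        (index_name, shard["id"], shard["type"], shard["stage"])
        for index_name, index_data in sorted(response.items())
        for shard in index_data.get("shards", [])
        if shard["stage"] != "DONE"
    ]


# Hypothetical, heavily trimmed response for the library index.
sample = {
    "library": {
        "shards": [
            {"id": 2, "type": "STORE", "stage": "DONE", "primary": True},
            {"id": 0, "type": "REPLICA", "stage": "INDEX", "primary": False},
        ]
    }
}

print(recovery_stage_counts(sample))
print(unfinished_shards(sample))
```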
Indices shard stores API
The indices shard stores API gives us information about the store for the shards of our indices. We use this API by sending a request to the /_shard_stores REST endpoint, optionally preceded by a comma-separated list of index names.
For example, to get information about all the indices, we would run the following command:
curl -XGET 'http://localhost:9200/_shard_stores?pretty'
We can also get information about particular indices, such as the library and map ones:
curl -XGET 'http://localhost:9200/library,map/_shard_stores?pretty'
The response returned by Elasticsearch contains information about the store for each shard. For example, this is what Elasticsearch returned for one of the shards of the library index:
"0":{
"stores":[{
"DboTibRlT1KJSQYnDPxwZQ":{
"name":"Plague",
"transport_address":"127.0.0.1:9300",
"attributes":{}
},
"version":6,
"allocation":"primary"
}]
}
We can see information about the node in the stores arrays. Each entry contains node-related information (the node where the shard is physically located), the version of the store copy, and the allocation, which can take the values of primary (for primary shards), replica (for replicas), and unused (for unassigned shards).
Indices segments API
The last API we want to mention is the Lucene segments API, which is available through the /_segments endpoint. We can either run it for the entire cluster, for example like this:
curl -XGET 'localhost:9200/_segments?pretty'
Or we can run the command for individual indices. For example, if we would like to get segments-related information for the map and library indices, we would use the following command:
curl -XGET 'localhost:9200/library,map/_segments?pretty'
This API provides information about shards, their placements, and information about segments connected with the physical index managed by the Apache Lucene library.
Controlling the shard and replica allocation
The indices that live inside your Elasticsearch cluster can be built from many shards and each shard can have many replicas. The ability to divide a single index into multiple shards gives us the possibility of dividing the data into multiple physical instances. The reasons why we want to do this may be different. We may want to parallelize indexing to get more throughput, or we may want to have smaller shards so that our queries are faster. Of course, we may have too many documents to fit them on a single machine and we may want to shard the index because of this. With replicas, we can parallelize the query load by having multiple physical copies of each shard. We can say that, using shards and replicas, we can scale out Elasticsearch. However, Elasticsearch has to figure out where in the cluster it should place shards and replicas. It needs to figure out on which server or node each shard or replica should be placed.
Explicitly controlling allocation
One of the most common use cases for explicitly controlling shard and replica allocation in Elasticsearch is time-based data, such as logs. Each log event has a timestamp associated with it; however, the amount of logs in most organizations is just enormous. The thing is that you need a lot of processing power to index them, but you don’t usually search historical data. Of course, you may want to do that, but it will be done less frequently than the queries for the most recent data.
Because of this, we can divide the cluster into two so-called tiers: the cold and the hot tier. The hot tier contains more powerful nodes, ones that have very fast disks, lots of CPU processing power, and memory. These nodes will handle both a lot of indexing as well as queries for recent data. The cold tier, on the other hand, will contain nodes that have very large disks, but are not very fast. We won’t be indexing into the cold tier; we will only store our historical indices here and search them from time to time. With the default Elasticsearch behavior, we can’t be sure where the shards and replicas will be placed, but luckily Elasticsearch allows us to control this.
Note
The main assumption when it comes to time series data is that once they are indexed, they are not being updated. This is true for log indexing use cases and we assume we are creating an Elasticsearch deployment for such a use case.
The idea is to create the indices that index today’s data on the hot nodes and, when we stop using it (when another day starts), we update the index settings so that it is moved to the tier called cold. Let’s now see how we can do this.
Specifying node parameters
So let’s divide our cluster into two tiers. We say tiers, but you can use any name you want; we just like the term “tier” and it is commonly used. We assume that we have six nodes. We want our more powerful nodes numbered 1 and 2 to be placed in the tier called hot and the nodes numbered 3, 4, 5, and 6, which are smaller in terms of CPU and memory, but very large in terms of disk space, to be placed in a tier called cold.
Configuration
To configure this, we add the following property to the elasticsearch.yml configuration file on nodes 1 and 2 (the ones that are more powerful):
node.tier: hot
Of course, we will add a similar property to the elasticsearch.yml configuration file on nodes 3, 4, 5, and 6 (the less powerful ones):
node.tier: cold
Index creation
Now let’s create our daily index for today’s data, one called logs_2015-12-10. As we said earlier, we want this to be placed on the nodes in the hot tier. We do this by running the following command:
curl -XPUT 'http://localhost:9200/logs_2015-12-10' -d '{
"settings":{
"index":{
"routing.allocation.include.tier":"hot"
}
}
}'
The preceding command will result in the creation of the logs_2015-12-10 index and specification of the index.routing.allocation.include.tier property for it. We set this property to the hot value, which means that we want to place the logs_2015-12-10 index on the nodes that have the node.tier property set to hot.
Now, when the day ends and we need to create a new index, we again put it on the hot nodes. We do this by running the following command:
curl -XPUT 'http://localhost:9200/logs_2015-12-11' -d '{
"settings":{
"index":{
"routing.allocation.include.tier":"hot"
}
}
}'
Finally, we need to tell Elasticsearch to move the index holding the data for the previous day to the cold tier. We do this by updating the index settings and setting the index.routing.allocation.include.tier property to cold. This is done using the following command:
curl -XPUT 'http://localhost:9200/logs_2015-12-10/_settings' -d '{
"index.routing.allocation.include.tier":"cold"
}'
After running the preceding command, Elasticsearch will start relocating the index called logs_2015-12-10 to the nodes that have the node.tier property set to cold in the elasticsearch.yml file without any manual work needed from us.
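The daily routine described above (create today’s index on the hot tier, then move yesterday’s index to the cold tier) could be scripted. The following Python sketch only prepares the two requests and does not send them; the helper names index_name and daily_rotation are made up for this example, while the index naming scheme and settings bodies follow the commands shown above.

```python
import json
from datetime import date, timedelta


def index_name(day):
    """Daily index name in the logs_YYYY-MM-DD form used in this section."""
    return "logs_" + day.isoformat()


def daily_rotation(today, host="http://localhost:9200"):
    """Return the (method, url, body) requests for one day's rotation."""
    yesterday = today - timedelta(days=1)
    create_body = {"settings": {"index": {"routing.allocation.include.tier": "hot"}}}
    move_body = {"index.routing.allocation.include.tier": "cold"}
    return [
        # 1. Create today's index on the hot nodes.
        ("PUT", host + "/" + index_name(today), json.dumps(create_body)),
        # 2. Ask Elasticsearch to relocate yesterday's index to the cold nodes.
        ("PUT", host + "/" + index_name(yesterday) + "/_settings", json.dumps(move_body)),
    ]


for method, url, body in daily_rotation(date(2015, 12, 11)):
    print(method, url, body)
```

Keeping the request preparation separate from the HTTP call makes it easy to review what a scheduled job would send before pointing it at a real cluster.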
Excluding nodes from allocation
In the same manner as we specified on which nodes the index should be placed, we can also exclude nodes from index allocation. Referring to the previously shown example, if we want the index called logs_2015-12-10 to not be placed on the nodes with the node.tier property set to cold, we would run the following command:
curl -XPUT 'localhost:9200/logs_2015-12-10/_settings' -d '{
"index.routing.allocation.exclude.tier":"cold"
}'
Notice that instead of the index.routing.allocation.include.tier property, we’ve used the index.routing.allocation.exclude.tier property.
Requiring node attributes
In addition to inclusion and exclusion rules, we can also specify the rules that must match in order for a shard to be allocated to a given node. The difference is that when using the index.routing.allocation.include property, the index will be placed on any node that matches at least one of the provided property values. Using index.routing.allocation.require, Elasticsearch will place the index on a node that has all the defined values. For example, let’s assume that we’ve set the following settings for the logs_2015-12-10 index:
curl -XPUT 'localhost:9200/logs_2015-12-10/_settings' -d '{
"index.routing.allocation.require.tier":"hot",
"index.routing.allocation.require.disk_type":"ssd"
}'
After running the preceding command, Elasticsearch would only place the shards of the logs_2015-12-10 index on a node with the node.tier property set to hot and the node.disk_type property set to ssd.
Using the IP address for shard allocation
Instead of adding a special parameter to the nodes configuration, we are allowed to use IP addresses to specify which nodes we want to include or exclude from the shards and replicas allocation. In order to do this, instead of using the tier part of the index.routing.allocation.include.tier or index.routing.allocation.exclude.tier properties, we should use _ip. For example, if we would like our logs_2015-12-10 index to be placed only on the nodes with the 10.1.2.10 and 10.1.2.11 IP addresses, we would run the following command:
curl -XPUT 'localhost:9200/logs_2015-12-10/_settings' -d '{
"index.routing.allocation.include._ip":"10.1.2.10,10.1.2.11"
}'
Note
In addition to _ip, Elasticsearch also allows us to use _name to specify allocation rules using node names and _host to specify allocation rules using host names.
Disk-based shard allocation
In addition to the already described allocation filtering methods, Elasticsearch gives us disk-based shard allocation rules. It allows us to set allocation rules based on the nodes’ disk usage.
Configuring disk-based shard allocation
There are four properties that control the behavior of disk-based shard allocation. All of them can be updated dynamically or set in the elasticsearch.yml configuration file.
The first of these is cluster.info.update.interval, which is by default set to 30 seconds and defines how often Elasticsearch updates information about disk usage on nodes.
The second property is cluster.routing.allocation.disk.watermark.low, which is by default set to 0.85. This means that Elasticsearch will not allocate new shards to a node that uses more than 85% of its disk space.
The third property is cluster.routing.allocation.disk.watermark.high, which controls when Elasticsearch will start relocating shards from a given node. It defaults to 0.90 and means that Elasticsearch will start reallocating shards when the disk usage on a given node is equal to or more than 90%.
Both the cluster.routing.allocation.disk.watermark.low and cluster.routing.allocation.disk.watermark.high properties can be set to a percentage value (such as 0.60, meaning 60%) or to an absolute value (such as 600mb, meaning 600 megabytes).
Finally, the last property is cluster.routing.allocation.disk.include_relocations, which by default is set to true. It tells Elasticsearch to take into account the shards that are not yet copied to the node but that Elasticsearch is in the process of copying. Having this behavior turned on by default means that the disk-based allocation mechanism will be more pessimistic when it comes to available disk space (when shards are relocating), but we won’t run into situations where shards can’t be relocated because the assumptions about disk space were wrong.
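The interplay of the two watermarks and the include_relocations flag can be summarized in a few lines of Python. This is a simplified model of the rules as described above, not the actual Elasticsearch decider; the function name disk_decision, its return values, and the byte counts in the usage lines are all invented for the example, and only percentage watermarks are modeled.

```python
LOW_WATERMARK = 0.85   # default of cluster.routing.allocation.disk.watermark.low
HIGH_WATERMARK = 0.90  # default of cluster.routing.allocation.disk.watermark.high


def disk_decision(used_bytes, total_bytes, relocating_in_bytes=0,
                  include_relocations=True, low=LOW_WATERMARK, high=HIGH_WATERMARK):
    """Classify a node as 'ok', 'no_new_shards', or 'relocate' by disk usage."""
    projected = used_bytes
    if include_relocations:
        # Count shards that are still being copied to this node as already there.
        projected += relocating_in_bytes
    usage = projected / total_bytes
    if usage >= high:
        return "relocate"        # at or above the high watermark
    if usage > low:
        return "no_new_shards"   # above the low watermark
    return "ok"


print(disk_decision(50, 100))
print(disk_decision(86, 100))
print(disk_decision(80, 100, relocating_in_bytes=10))
```

The last call shows the pessimistic effect of include_relocations: a node at 80% usage is treated as if it were at 90% because of a shard that is still on its way.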
Disabling disk-based shard allocation
Disk-based shard allocation is enabled by default. We can disable it by specifying the cluster.routing.allocation.disk.threshold_enabled property and setting it to false. We can do this in the elasticsearch.yml file or dynamically using the cluster settings API:
curl -XPUT 'localhost:9200/_cluster/settings' -d '{
"transient":{
"cluster.routing.allocation.disk.threshold_enabled":false
}
}'
The number of shards and replicas per node
In addition to specifying shards and replicas allocation, we are also allowed to specify the maximum number of shards that can be placed on a single node for a single index. For example, if we would like our logs_2015-12-10 index to have only a single shard per node, we would run the following command:
curl -XPUT 'localhost:9200/logs_2015-12-10/_settings' -d '{
"index.routing.allocation.total_shards_per_node":1
}'
This property can be placed in the elasticsearch.yml file or can be updated on live indices using the preceding command. Please remember that your cluster can stay in the red state if Elasticsearch is unable to allocate all the primary shards.
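Whether a given limit leaves shards unassigned is simple arithmetic: an index has primaries × (1 + replicas) shard copies, and the cluster can hold at most nodes × total_shards_per_node of them. The Python sketch below illustrates this; the function names are hypothetical, and it ignores every other allocation rule except the one that copies of the same shard are never placed on the same node.

```python
import math


def shards_left_unassigned(nodes, primaries, replicas, total_shards_per_node):
    """Number of shard copies of one index that cannot be placed under the limit."""
    copies = primaries * (1 + replicas)
    capacity = nodes * total_shards_per_node
    return max(0, copies - capacity)


def min_nodes_needed(primaries, replicas, total_shards_per_node):
    """Smallest node count that can hold every copy of the index."""
    by_limit = math.ceil(primaries * (1 + replicas) / total_shards_per_node)
    # A primary and its replicas each need a node of their own.
    return max(by_limit, 1 + replicas)


# 5 primaries with 1 replica each and a limit of 1 shard per node on 4 nodes:
print(shards_left_unassigned(nodes=4, primaries=5, replicas=1, total_shards_per_node=1))
print(min_nodes_needed(primaries=5, replicas=1, total_shards_per_node=1))
```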
Allocation throttling
The Elasticsearch allocation mechanism can be throttled, which means that we can control how many resources Elasticsearch will use during the shard allocation and recovery process. We are given five properties to control, which are as follows:
cluster.routing.allocation.node_concurrent_recoveries: This property defines how many concurrent shard recoveries may be happening at the same time on a node. This defaults to 2 and should be increased if you would like more shards to be recovered at the same time on a single node. However, increasing this value will result in more resource consumption during recovery. Also, please remember that during the replica recovery process, data will be copied from the other nodes over the network, which can be slow.
cluster.routing.allocation.node_initial_primaries_recoveries: This property defaults to 4 and defines how many primary shards are recovered at the same time on a given node. Because primary shard recovery uses data from local disks, this process should be very fast.
cluster.routing.allocation.same_shard.host: A Boolean property that defaults to false and is applicable only when multiple Elasticsearch nodes are started on the same machine. When set to true, this will force Elasticsearch to check, based on host name and host address, whether physical copies of the same shard are present on a single physical machine, and to prevent such an allocation. The default false value means no check is done.
indices.recovery.concurrent_streams: This is the number of network streams used to copy data from other nodes that can be used concurrently on a single node. The more streams, the faster the data will be copied, but this will result in more resource consumption. This property defaults to 3.
indices.recovery.concurrent_small_file_streams: This is similar to the indices.recovery.concurrent_streams property, but defines how many concurrent data streams Elasticsearch will use to copy small files (ones that are under 5mb in size). This property defaults to 2.
Cluster-wide allocation
In addition to the per-index allocation settings, Elasticsearch also allows us to control shard and index allocation on a cluster-wide basis, with so-called shard allocation awareness. This is especially useful when we have nodes in different physical racks and we would like to place shards and replicas in different physical nodes.
Let’s start with a simple example. We assume that we have a cluster built of four nodes, each node in a different physical rack. The simple graphic that illustrates this is as follows:
As you can see, our cluster is built from four nodes. Each node was bound to a specific IP address and each node was given the tag property and a group property (added to elasticsearch.yml as the node.tag and node.group properties). This cluster will serve the purpose of showing how shard allocation filtering works. The group and tag properties can be given whatever names you want; you just need to add the node. prefix to your desired property name. For example, if you would like to use a party property name, you would just add node.party: party1 to your elasticsearch.yml.
Allocation awareness
Allocation awareness allows us to configure shards and their replicas allocation with the use of generic parameters. In order to illustrate how allocation awareness works, we will use our example cluster. For the example to work, we should add the following property to the elasticsearch.yml file:
cluster.routing.allocation.awareness.attributes: group
This will tell Elasticsearch to use the node.group property as the awareness parameter.
Note
You can specify multiple attributes when setting the cluster.routing.allocation.awareness.attributes property. For example:
cluster.routing.allocation.awareness.attributes: group, node
After this, let’s start the first two nodes, the ones with the node.group parameter equal to groupA, and let’s create an index by running the following command:
curl -XPOST 'localhost:9200/awarness' -d '{
"settings":{
"index":{
"number_of_shards":1,"number_of_replicas":1
}
}
}'
After this command, our two-node cluster will look more or less like this:
As you can see, the index was divided between the two nodes evenly. Now let’s see what happens when we launch the rest of the nodes (the ones with node.group set to groupB):
Notice the difference: the primary shards were not moved from their original allocation nodes, but the replica shards were moved to the nodes with a different node.group value. That’s exactly right; when using shard allocation awareness, Elasticsearch won’t allocate the primary shards and replicas of the same index to the nodes with the same value of the property used to determine the allocation awareness (which in our case is the node.group).
Note
Please remember that when using allocation awareness, shards will not be allocated to the node that doesn’t have the expected attributes set. So in our example, a node without the node.group property set will not be taken into consideration by the allocation mechanism.
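The behavior seen in the example (the replica stays in groupA while only groupA nodes exist, then moves once groupB nodes join) can be mimicked with a toy model. The Python sketch below is only an illustration of the rule as described here, not the real allocation algorithm; the node names node1 to node5 and the function replica_candidates are hypothetical.

```python
def replica_candidates(nodes, primary_node, attribute="group"):
    """Return the node names where a replica of the primary's shard may be placed."""
    primary_value = nodes[primary_node].get(attribute)
    # Nodes without the awareness attribute are not considered at all.
    tagged = [name for name, attrs in nodes.items()
              if attribute in attrs and name != primary_node]
    other_value = [name for name in tagged if nodes[name][attribute] != primary_value]
    # Prefer a different attribute value; without forced awareness, fall back
    # to the same value when no other value is present in the cluster.
    return sorted(other_value) if other_value else sorted(tagged)


two_nodes = {"node1": {"group": "groupA"}, "node2": {"group": "groupA"}}
four_nodes = dict(two_nodes, node3={"group": "groupB"}, node4={"group": "groupB"},
                  node5={})  # node5 has no node.group property

print(replica_candidates(two_nodes, "node1"))
print(replica_candidates(four_nodes, "node1"))
```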
Forcing allocation awareness
Forcing allocation awareness can come in handy when we know, in advance, how many values our awareness attributes can take and we don’t want more replicas than needed to be allocated in our cluster, for example, so as not to overload our cluster with too many replicas. For this, we can force the allocation awareness to be active only for certain attribute values. We can specify these values using the cluster.routing.allocation.awareness.force.group.values property (the segment after force is the name of the awareness attribute, which is group in our case) and providing a list of comma-separated values to it. For example, if we would like the allocation awareness to use only the groupA and groupB values of the node.group property, we would add the following to the elasticsearch.yml file:
cluster.routing.allocation.awareness.attributes: group
cluster.routing.allocation.awareness.force.group.values: groupA,groupB
Filtering
Elasticsearch allows us to configure allocation for the entire cluster or at the index level. In the case of cluster allocation, we can use the following property prefixes:
cluster.routing.allocation.include
cluster.routing.allocation.require
cluster.routing.allocation.exclude
When it comes to index-specific allocation, we can use the following property prefixes:
index.routing.allocation.include
index.routing.allocation.require
index.routing.allocation.exclude
The previously mentioned prefixes can be used with the properties that we’ve defined in the elasticsearch.yml file (our tag and group properties) and with a special property called _ip that allows us to match or exclude nodes by their IP addresses, for example, like this:
cluster.routing.allocation.include._ip: 192.168.2.1
If we would like to include nodes with a group property matching the groupA value, we would set the following property:
cluster.routing.allocation.include.group: groupA
Notice that we’ve used the cluster.routing.allocation.include prefix and we’ve concatenated it with the name of the property, which is group in our case.
What do include, exclude, and require mean
If you look closely at the preceding parameters, you will notice that there are three kinds:
include: This type will result in including all the nodes with this parameter defined. If multiple include conditions are specified, then all the nodes that match at least one of these conditions will be taken into consideration when allocating shards. For example, if we add two cluster.routing.allocation.include.tag parameters to our configuration, one with the value of node1 and the second with the value of node2, we would end up with indices (actually their shards) being allocated to the first and second node (counting from left to right). To sum up, the nodes that match the include allocation parameter type will be taken into consideration by Elasticsearch when choosing the nodes to place shards on, but this doesn’t mean that Elasticsearch will put shards on them.
require: This type of allocation filter, which was introduced in Elasticsearch 0.90, requires a node to match all the specified values. For example, if we add one cluster.routing.allocation.require.tag parameter to our configuration with the value of node1 and a cluster.routing.allocation.require.group parameter with the value of groupA, we would end up with shards allocated only to the first node (the one with an IP address of 192.168.2.1).
exclude: This parameter allows us to exclude nodes with given properties from the allocation process. For example, if we set cluster.routing.allocation.exclude.group to groupA, we would end up with indices being allocated only to the nodes with IP addresses 192.168.3.1 and 192.168.3.2 (the third and fourth nodes in our example).
Note
The property value can use simple wildcard characters. For example, if we want to include all the nodes that have the group parameter value beginning with group, we could set the cluster.routing.allocation.include.group property to group*. In the example cluster case, this would result in matching nodes with the groupA and groupB group parameter values.
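The three filter kinds can be read as a small decision function: exclude rules veto a node, require rules must all match, and include rules need at least one match. The Python sketch below models that reading on a four-node cluster like the one in the example. It is a simplified illustration rather than the Elasticsearch implementation: the tags node3 and node4 are assumed (only node1 and node2 are named in the text), and fnmatchcase accepts more pattern syntax than the simple wildcards mentioned in the note.

```python
from fnmatch import fnmatchcase


def _matches(node_value, patterns):
    """True if the node's value matches any comma-separated pattern."""
    if node_value is None:
        return False
    return any(fnmatchcase(node_value, p.strip()) for p in patterns.split(","))


def node_allowed(node_attrs, include=None, require=None, exclude=None):
    """Decide whether a node may receive shards under the given filter rules."""
    include, require, exclude = include or {}, require or {}, exclude or {}
    # exclude: a single matching rule rules the node out.
    if any(_matches(node_attrs.get(k), v) for k, v in exclude.items()):
        return False
    # require: every rule has to match.
    if not all(_matches(node_attrs.get(k), v) for k, v in require.items()):
        return False
    # include: if any rules are given, at least one has to match.
    if include and not any(_matches(node_attrs.get(k), v) for k, v in include.items()):
        return False
    return True


# Example cluster; the tag values for the third and fourth node are assumed.
cluster = {
    "node1": {"tag": "node1", "group": "groupA"},
    "node2": {"tag": "node2", "group": "groupA"},
    "node3": {"tag": "node3", "group": "groupB"},
    "node4": {"tag": "node4", "group": "groupB"},
}


def allowed(**rules):
    return [name for name, attrs in sorted(cluster.items()) if node_allowed(attrs, **rules)]


print(allowed(include={"tag": "node1,node2"}))
print(allowed(require={"tag": "node1", "group": "groupA"}))
print(allowed(exclude={"group": "groupA"}))
```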
Manually moving shards and replicas
The last thing we want to discuss is the ability to manually move shards between nodes. Elasticsearch exposes the _cluster/reroute REST endpoint, which allows us to control that. The following operations are available:
Moving a shard from node to node
Cancelling shard allocation
Forcing shard allocation
Now let’s look closely at all of the preceding operations.
Moving shards
Let’s say we have two nodes called es_node_one and es_node_two, and we have two shards of the shop index placed by Elasticsearch on the first node and we would like to move the second shard to the second node. In order to do this, we can run the following command:
curl -XPOST 'localhost:9200/_cluster/reroute' -d '{
"commands":[{
"move":{
"index":"shop",
"shard":1,
"from_node":"es_node_one",
"to_node":"es_node_two"
}
}]
}'
We’ve specified the move command, which allows us to move shards (and replicas) of the index specified by the index property. The shard property is the number of the shard we want to move. And, finally, the from_node property specifies the name of the node we want to move the shard from and the to_node property specifies the name of the node we want the shard to be placed on.
Canceling shard allocation
If we would like to cancel an ongoing allocation process, we can run the cancel command and specify the index, node, and shard we want to cancel the allocation for. For example:
curl -XPOST 'localhost:9200/_cluster/reroute' -d '{
"commands":[{
"cancel":{
"index":"shop",
"shard":0,
"node":"es_node_one"
}
}]
}'
The preceding command would cancel the allocation of shard 0 of the shop index on the es_node_one node.
Forcing shard allocation
In addition to cancelling and moving shards and replicas, we are also allowed to allocate an unallocated shard to a specific node. For example, if we have an unallocated shard numbered 0 for the users index and we would like it to be allocated to es_node_two by Elasticsearch, we would run the following command:
curl -XPOST 'localhost:9200/_cluster/reroute' -d '{
"commands":[{
"allocate":{
"index":"users",
"shard":0,
"node":"es_node_two"
}
}]
}'
Multiple commands per HTTP request
We can, of course, include multiple commands in a single HTTP request. For example:
curl -XPOST 'localhost:9200/_cluster/reroute' -d '{
"commands":[
{"move":{"index":"shop","shard":1,"from_node":"es_node_one",
"to_node":"es_node_two"}},
{"cancel":{"index":"shop","shard":0,"node":"es_node_one"}}
]
}'
Allowing operations on primary shards
The cancel and allocate commands accept an additional allow_primary parameter. If set to true, it tells Elasticsearch that the operation can be performed on the primary shard. Please be advised that operations with the allow_primary parameter set to true may result in data loss.
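When reroute requests are generated by a script, it can help to build the commands array programmatically. The following Python sketch assembles the same JSON bodies as the curl examples above; the helper functions move, cancel, allocate, and reroute_body are hypothetical names for this example, and nothing is sent to a cluster.

```python
import json


def move(index, shard, from_node, to_node):
    """Build a move command for one shard."""
    return {"move": {"index": index, "shard": shard,
                     "from_node": from_node, "to_node": to_node}}


def _on_node(name, index, shard, node, allow_primary):
    command = {"index": index, "shard": shard, "node": node}
    if allow_primary:
        # Only add the flag when explicitly requested; it may cause data loss.
        command["allow_primary"] = True
    return {name: command}


def cancel(index, shard, node, allow_primary=False):
    """Build a cancel command for one shard on one node."""
    return _on_node("cancel", index, shard, node, allow_primary)


def allocate(index, shard, node, allow_primary=False):
    """Build an allocate command for an unallocated shard."""
    return _on_node("allocate", index, shard, node, allow_primary)


def reroute_body(*commands):
    """Serialize any number of commands into a single _cluster/reroute body."""
    return json.dumps({"commands": list(commands)})


print(reroute_body(
    move("shop", 1, "es_node_one", "es_node_two"),
    cancel("shop", 0, "es_node_one"),
))
```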
Handling rolling restarts
There is one more thing that we would like to discuss when it comes to shard and replica allocation: handling rolling restarts. When Elasticsearch is restarted, it may take some time to get it back to the cluster. During this time, the rest of the cluster may decide to do rebalancing and move shards around. When we know we are doing rolling restarts, for example, to update Elasticsearch to a new version or install a plugin, we may want to tell this to Elasticsearch. The procedure for restarting each node should be as follows:
First, before you do any maintenance, you should stop the allocation by sending the following command:
curl -XPUT 'localhost:9200/_cluster/settings' -d '{
"transient":{
"cluster.routing.allocation.enable":"none"
}
}'
This will tell Elasticsearch to stop allocation. After this, we will stop the node we want to do maintenance on and start it again. After it joins the cluster, we can enable the allocation again by running the following:
curl -XPUT 'localhost:9200/_cluster/settings' -d '{
"transient":{
"cluster.routing.allocation.enable":"all"
}
}'
Thiswillenabletheallocationagain.Thisprocedureshouldberepeatedforeachnodewewanttoperformmaintenanceon.
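The per-node cycle above (disable allocation, restart, enable allocation) can be sketched as a small loop. This is only an illustration under stated assumptions: rolling_restart, put_settings, and restart_node are hypothetical names, and the two callables are supplied by the caller, so the example below performs a dry run that records the steps instead of contacting a cluster.

```python
import json


def allocation_settings(mode):
    """Body for PUT /_cluster/settings that switches shard allocation."""
    if mode not in ("all", "none"):
        raise ValueError("this sketch only toggles between 'all' and 'none'")
    return {"transient": {"cluster.routing.allocation.enable": mode}}


def rolling_restart(nodes, put_settings, restart_node):
    """Repeat the disable / restart / enable cycle for every node."""
    for node in nodes:
        put_settings(allocation_settings("none"))  # stop allocation
        restart_node(node)                         # maintenance, then wait for rejoin
        put_settings(allocation_settings("all"))   # enable allocation again


# Dry run: record the calls instead of talking to a cluster.
log = []
rolling_restart(
    ["es_node_one", "es_node_two"],
    put_settings=lambda body: log.append(json.dumps(body, sort_keys=True)),
    restart_node=lambda node: log.append("restart " + node),
)
print("\n".join(log))
```

In a real deployment, restart_node would also have to wait until the node has rejoined the cluster before allocation is enabled again, as the procedure requires.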

Controlling cluster rebalancing

By default, Elasticsearch tries to keep the shards and their replicas evenly balanced across the cluster. Such behavior is good in most cases, but there are times when we want to control it, for example, during rolling restarts: we don't want to rebalance the entire cluster when one or two nodes are restarted. In this section, we will look at how to avoid cluster rebalancing and how to control this process in depth.

Imagine a situation where you know that your network can handle very high amounts of traffic or, the opposite of this, your network is used extensively and you want to avoid putting too much load on it. Another example is that you may want to decrease the pressure put on your I/O subsystem after a full-cluster restart, so you want fewer shards and replicas to be initialized at the same time. These are only two examples where rebalance control may be handy.

Understanding rebalance

Rebalancing is the process of moving shards between different nodes in our cluster. As we have already mentioned, it is fine in most situations, but sometimes you may want to avoid it completely. For example, if we define how our shards are placed and we want to keep it this way, we may want to avoid rebalancing. However, by default, Elasticsearch will try to rebalance the cluster whenever the cluster state changes and Elasticsearch thinks a rebalance is needed (and the delayed timeout has passed, as discussed in The gateway and recovery modules section of Chapter 9, Elasticsearch Cluster in Detail).

Cluster being ready

We already know that our indices are built from shards and replicas. Primary shards, or just shards, are the ones that get the data first. The replicas are physical copies of the primaries and get the data from them. You can think of the cluster as being ready to be used when all the primary shards are assigned to nodes in your cluster, that is, as soon as the yellow health state is achieved. At that point, Elasticsearch may still be initializing other shards (the replicas). Nevertheless, you can already use your cluster and be sure that you can search your entire data set and send index change commands; they will be processed properly.

The cluster rebalance settings

Elasticsearch lets us control the rebalance process with the use of a few properties that can be set in the elasticsearch.yml file or by using the Elasticsearch REST API (as described in The update settings API section of Chapter 9, Elasticsearch Cluster in Detail).

Controlling when rebalancing will be allowed

The cluster.routing.allocation.allow_rebalance property allows us to specify when rebalancing is allowed. This property can take the following values:

always: Rebalancing will be allowed as soon as it's needed
indices_primaries_active: Rebalancing will be allowed when all the primary shards are initialized
indices_all_active: The default one, which means that rebalancing will be allowed when all the shards and replicas are initialized

The cluster.routing.allocation.allow_rebalance property can be set in the elasticsearch.yml configuration file and updated dynamically as well.

Controlling the number of shards being moved between nodes concurrently

The cluster.routing.allocation.cluster_concurrent_rebalance property allows us to specify how many shards can be moved between nodes at once in the entire cluster. This value defaults to 2. If you have a cluster that is built from many nodes, you can increase it so that rebalancing is performed faster, but this will put more pressure on your cluster resources and will affect indexing and querying. The cluster.routing.allocation.cluster_concurrent_rebalance property can be set in the elasticsearch.yml configuration file and updated dynamically as well.
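Because both properties described so far can be updated dynamically, they can be sent together in one cluster settings update. The following is a minimal sketch: the rebalance_settings helper is hypothetical, it only builds the body for a PUT request to _cluster/settings, and the restriction to positive integers is a simplification made for this example rather than a rule taken from Elasticsearch.

```python
import json

ALLOW_REBALANCE_VALUES = (
    "always",
    "indices_primaries_active",
    "indices_all_active",
)


def rebalance_settings(allow_rebalance="indices_all_active",
                       concurrent_rebalance=2):
    """Build a transient settings update for the two rebalance properties."""
    if allow_rebalance not in ALLOW_REBALANCE_VALUES:
        raise ValueError("unknown allow_rebalance value: %r" % allow_rebalance)
    # Simplification for this sketch: accept positive integers only.
    if type(concurrent_rebalance) is not int or concurrent_rebalance < 1:
        raise ValueError("concurrent_rebalance must be a positive integer")
    return {
        "transient": {
            "cluster.routing.allocation.allow_rebalance": allow_rebalance,
            "cluster.routing.allocation.cluster_concurrent_rebalance":
                concurrent_rebalance,
        }
    }


# Allow rebalancing once the primaries are active and move up to four shards at once.
print(json.dumps(rebalance_settings("indices_primaries_active", 4), indent=2))
```

Calling rebalance_settings() without arguments reproduces the default values named in the text.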

Controlling which shards may be rebalanced

The cluster.routing.rebalance.enable property allows us to specify which shards will be allowed to be rebalanced by Elasticsearch. This property can take the following values:

all: The default behavior, which tells Elasticsearch to rebalance all the shards in the cluster
primaries: This value allows the rebalancing of the primary shards only
replicas: This value allows the rebalancing of the replica shards only
none: This value disables the rebalancing of all types of shards for all indices in the cluster

The cluster.routing.rebalance.enable property can be set in the elasticsearch.yml configuration file and updated dynamically as well.

The Cat API

The Elasticsearch Admin API is quite extensive and covers almost every part of the Elasticsearch architecture: from low-level information about Lucene to high-level information about the cluster nodes and their health. All this information is available using the Elasticsearch Java API as well as the REST API. However, the returned data, even though it is a JSON document, is not very readable by a user, at least when it comes to the amount of information given.

Because of this, Elasticsearch provides us with a more human-friendly API: the Cat API. The special Cat API returns data in a simple text, tabular format and, what's more, it provides aggregated data that is usually usable without any further processing.

The basics

The base endpoint for the Cat API is quite obvious: it is /_cat. Without any parameters, it shows all the available endpoints for this API. We can check this by running the following command:

curl -XGET 'localhost:9200/_cat'

The response returned by Elasticsearch should be similar or identical (depending on your Elasticsearch version) to the following one:

=^.^=
/_cat/allocation
/_cat/shards
/_cat/shards/{index}
/_cat/master
/_cat/nodes
/_cat/indices
/_cat/indices/{index}
/_cat/segments
/_cat/segments/{index}
/_cat/count
/_cat/count/{index}
/_cat/recovery
/_cat/recovery/{index}
/_cat/health
/_cat/pending_tasks
/_cat/aliases
/_cat/aliases/{alias}
/_cat/thread_pool
/_cat/plugins
/_cat/fielddata
/_cat/fielddata/{fields}
/_cat/nodeattrs
/_cat/repositories
/_cat/snapshots/{repository}

So, looking from the top, Elasticsearch allows us to get the following information using the Cat API:

Shard allocation-related information
All shards-related information (also one limited to a given index)
Information about the master node
Nodes information
Indices statistics (also one limited to a given index)
Segments statistics (also one limited to a given index)
Documents count (also one limited to a given index)
Recovery information (also one limited to a given index)
Cluster health
Tasks pending for execution
Index aliases and indices for a given alias
Thread pool configuration
Plugins installed on each node
Field data cache size and field data cache sizes for individual fields
Node attributes information
Defined backup repositories
Snapshots created in the backup repository

Using Cat API

Using the Cat API is as simple as running a GET request to one of the previously mentioned REST endpoints. For example, to get information about the cluster state, we could run the following command:

curl -XGET 'localhost:9200/_cat/health'

The response returned by Elasticsearch for the preceding command should be similar to the following one, but, of course, will be dependent on your cluster:

1446292041 12:47:21 elasticsearch yellow 1 1 21 21 0 0 21 0 - 50.0%

This is clean and nice. Because it is in a tabular format, it is also easy to use the response in tools such as grep, awk, or sed, a standard set of tools for every administrator. It is also more readable once you know what it is all about.

To add a header describing the purpose of each column, we just need to add an additional v parameter, just like this:

curl -XGET 'localhost:9200/_cat/health?v'

Common arguments

Every Cat API endpoint has its own arguments, but there are a few common options that are shared among all of them:

v: This adds a header line to the response with the names of presented items.
h: This allows us to show only the chosen columns, for example h=status,node.total,shards,pri.
help: This lists all the possible columns that this particular endpoint is able to show. The command shows the name of the parameter, its abbreviation, and description.
bytes: This is the format for the information representing the values in bytes. As we said earlier, the Cat API is designed to be used by humans and because of this, by default, these values are represented in human-readable form, for example: 3.5kB or 40GB. The bytes option allows the setting of the same base for all the numbers, so sorting or numerical comparison will be easier. For example, bytes=b presents all values in bytes, bytes=k in kilobytes, and so on.

Note: For the full list of arguments for each Cat API endpoint, please refer to the official Elasticsearch documentation available at: https://www.elastic.co/guide/en/elasticsearch/reference/2.2/cat.html.
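The common arguments combine into an ordinary query string, which can be illustrated with a short Python sketch. The cat_url helper and its parameter names are hypothetical; it only composes the URL for the v, h, and bytes options described above and assumes the localhost:9200 address used throughout this chapter.

```python
from urllib.parse import quote


def cat_url(endpoint, columns=None, header=True, bytes_unit=None,
            host="localhost:9200"):
    """Compose a Cat API URL from the common v, h, and bytes arguments."""
    params = []
    if header:
        params.append("v")  # header line with column names
    if columns:
        params.append("h=" + quote(",".join(columns), safe=",."))
    if bytes_unit:
        params.append("bytes=" + quote(bytes_unit))
    url = "http://%s/_cat/%s" % (host, endpoint.strip("/"))
    return url + ("?" + "&".join(params) if params else "")


print(cat_url("nodes", columns=["name", "node.role", "load", "uptime"]))
print(cat_url("indices", bytes_unit="b"))
```

The resulting strings can be passed to curl -XGET or to any HTTP client.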

The examples

When we wrote this book, the Cat API had twenty-two endpoints. We don't want to describe them all; it would only repeat the information contained in the documentation. However, we didn't want to leave this section without an example of using the Cat API. Because of this, we decided to show how easily you can get information using the Cat API compared to the standard JSON API exposed by Elasticsearch.

Getting information about the master node

The first example shows how easy it is to get information about which node in our cluster is the master node. By calling the /_cat/master REST endpoint, we can find out which of the nodes is currently elected as the master. For example, let's run the following command:

curl -XGET 'localhost:9200/_cat/master?v'

The response returned by Elasticsearch for my local two-node cluster looks as follows:

id                     host      ip        node
Cfj3tzqpSNi5SZx4g8osAg 127.0.0.1 127.0.0.1 Skin

As you can see in the response, we've got the information about which node is currently elected as the master: we can see its identifier, IP address, and name.

Getting information about the nodes

The /_cat/nodes REST endpoint provides information about all the nodes in the cluster. Let's see what Elasticsearch will return after running the following command:

curl -XGET 'localhost:9200/_cat/nodes?v&h=name,node.role,load,uptime'

In the preceding example, we have used the possibility of choosing what information we want to get from the approximately seventy options of this endpoint. We have chosen to get only the node name, its role (whether the node is a data or client node), the node load, and its uptime.

The response returned by Elasticsearch looks as follows:

name node.role load uptime
Skin d         2.00 1.3h

As you can see, the /_cat/nodes REST endpoint provides all the requested information about the nodes in the cluster.

Retrieving recovery information for an index

Another nice example of using the Cat API is getting information about the recovery of a single index or all the indices. In our case, we will retrieve recovery information for a single library index by running the following command:

curl -XGET 'localhost:9200/_cat/recovery/library?v&h=index,shard,time,type,stage,files_percent'

The response for the preceding command looks as follows:

index   shard time type  stage files_percent
library 0     75   store done  100.0%
library 1     83   store done  100.0%
library 2     88   store done  100.0%
library 3     79   store done  100.0%
library 4     5    store done  100.0%
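Since the Cat API output is plain whitespace-separated text, it can also be post-processed with a few lines of code. The following is a minimal Python sketch, assuming the response was requested with the v parameter so that the first line holds the column names; parse_cat_table is a hypothetical helper, and the sample text simply repeats the recovery response shown above. Splitting on whitespace is a simplification that would misalign rows containing empty or multi-word values.

```python
def parse_cat_table(text):
    """Turn whitespace-separated Cat API output (with header) into dicts."""
    lines = [line.split() for line in text.strip().splitlines() if line.strip()]
    header, rows = lines[0], lines[1:]
    return [dict(zip(header, row)) for row in rows]


sample = """
index   shard time type  stage files_percent
library 0     75   store done  100.0%
library 1     83   store done  100.0%
library 2     88   store done  100.0%
library 3     79   store done  100.0%
library 4     5    store done  100.0%
"""

rows = parse_cat_table(sample)
unfinished = [row for row in rows if row["stage"] != "done"]
slowest = max(rows, key=lambda row: int(row["time"]))

print(len(rows), "shards,", len(unfinished), "still recovering")
print("shard with the largest time value:", slowest["shard"])
```

The same helper works for any other endpoint and column selection, because it takes the column names from the header line instead of hard-coding them.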

Warming up

Sometimes, there may be a need to prepare Elasticsearch to handle your queries. Maybe it's because you heavily rely on the field data cache and you want it to be loaded before your production queries arrive, or maybe you want to warm up your operating system's I/O cache so that the index data files are read from the cache. Whatever the reason, Elasticsearch allows us to use so-called warming queries for our types and indices.

Defining a new warming query

A warming query is nothing more than a usual query stored in a special type called _warmer in Elasticsearch. Let's assume that we have the following query that we want to use for warming up:

curl -XGET 'localhost:9200/library/_search?pretty' -d '{
  "query" : {
    "match_all" : {}
  },
  "aggs" : {
    "warming_aggs" : {
      "terms" : {
        "field" : "tags"
      }
    }
  }
}'

To store the preceding query as a warming query for our library index, we will run the following command:

curl -XPUT 'localhost:9200/library/_warmer/tags_warming_query' -d '{
  "query" : {
    "match_all" : {}
  },
  "aggs" : {
    "warming_aggs" : {
      "terms" : {
        "field" : "tags"
      }
    }
  }
}'

The preceding command will register our query as a warming query with the tags_warming_query name. You can have multiple warming queries for your index, but each of these queries needs to have a unique name.

We can define warming queries not only for the entire index, but also for a specific type in it. For example, to store our previously shown query as a warming query only for the book type in the library index, send the preceding command not to the /library/_warmer URI but to /library/book/_warmer. So, the entire command will be as follows:

curl -XPUT 'localhost:9200/library/book/_warmer/tags_warming_query' -d '{
  "query" : {
    "match_all" : {}
  },
  "aggs" : {
    "warming_aggs" : {
      "terms" : {
        "field" : "tags"
      }
    }
  }
}'
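The only difference between registering a warming query for the whole index and for a single type is the request path. A small Python sketch makes this explicit; warmer_path is a hypothetical helper that merely assembles the two URI forms used in the commands above, and the warming_query dictionary repeats the query body from this section.

```python
import json


def warmer_path(index, name, doc_type=None):
    """Return the REST path of a warming query, optionally limited to one type."""
    parts = [index]
    if doc_type:
        parts.append(doc_type)
    parts += ["_warmer", name]
    return "/" + "/".join(parts)


warming_query = {
    "query": {"match_all": {}},
    "aggs": {"warming_aggs": {"terms": {"field": "tags"}}},
}

for path in (
    warmer_path("library", "tags_warming_query"),
    warmer_path("library", "tags_warming_query", doc_type="book"),
):
    print("PUT", path)
    print(json.dumps(warming_query))
```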

After adding a warming query, before Elasticsearch allows a new segment to be searched on, it will be warmed up by running the defined warming queries on that segment. This allows Elasticsearch and the operating system to cache data and, thus, speed up searching.

Just as we read in the Full text searching section of Chapter 1, Getting Started with Elasticsearch Cluster, Lucene divides the index into parts called segments, which once written can't be changed. Every new commit operation creates a new segment (which is eventually merged if the number of segments is too high), which Lucene uses for searching.

Note: Please note that the Warmer API will be removed in future versions of Elasticsearch.

Retrieving the defined warming queries

In order to get a specific warming query for our index, we just need to know its name. For example, if we want to get the warming query named tags_warming_query for our library index, we will run the following command:

curl -XGET 'localhost:9200/library/_warmer/tags_warming_query?pretty'

The result returned by Elasticsearch will be as follows:

{
  "library" : {
    "warmers" : {
      "tags_warming_query" : {
        "types" : [ "book" ],
        "source" : {
          "query" : {
            "match_all" : {}
          },
          "aggs" : {
            "warming_aggs" : {
              "terms" : {
                "field" : "tags"
              }
            }
          }
        }
      }
    }
  }
}

We can also get all the warming queries for the index and type using the following command:

curl -XGET 'localhost:9200/library/_warmer?pretty'

And finally, we can also get all the warming queries that start with a given prefix. For example, if we want to get all the warming queries for the library index that start with the tags prefix, we will run the following command:

curl -XGET 'localhost:9200/library/_warmer/tags*?pretty'

Deleting a warming query

Deleting a warming query is very similar to getting one; we just need to use the DELETE HTTP method. To delete a specific warming query from our index, we just need to know its name. For example, if we want to delete the warming query named tags_warming_query for our library index, we will run the following command:

curl -XDELETE 'localhost:9200/library/_warmer/tags_warming_query'

We can also delete all the warming queries for the index using the following command:

curl -XDELETE 'localhost:9200/library/_warmer/_all'

And finally, we can also remove all the warming queries that start with a given prefix. For example, if we want to remove all the warming queries for the library index that start with the tags prefix, we will run the following command:

curl -XDELETE 'localhost:9200/library/_warmer/tags*'

Disabling the warming up functionality

To disable the warming queries completely but keep them stored under _warmer, set the index.warmer.enabled configuration property to false (setting this property to true enables the warming up functionality again). This setting can be either put in the elasticsearch.yml file or set using the REST API on a live cluster.

For example, if we want to disable the warming up functionality for the library index, we will run the following command:

curl -XPUT 'localhost:9200/library/_settings' -d '{
  "index.warmer.enabled" : false
}'

Choosing queries for warming

Finally, we should ask ourselves one question: which queries should be considered as candidates for warming? Typically, you'll want to choose ones that are expensive to execute and ones that require caches to be populated. So you'll probably want to choose queries that include aggregations and sorting based on the fields in your index. This will force the operating system to load the parts of the indices that hold the data related to such queries and improve the performance of the subsequent queries that are run. In addition to this, parent-child queries and nested queries are also potential candidates for warming. You may also choose other queries by looking at the logs and finding where your performance is not as good as you want it to be. Such queries may also be perfect candidates for warming up.

For example, let's say that we have the following logging configuration set in the elasticsearch.yml file:

index.search.slowlog.threshold.query.warn: 10s
index.search.slowlog.threshold.query.info: 5s
index.search.slowlog.threshold.query.debug: 2s
index.search.slowlog.threshold.query.trace: 1s

And we have the following logging level set in the logging.yml configuration file:

logger:
  index.search.slowlog: TRACE, index_search_slow_log_file

Notice that the index.search.slowlog.threshold.query.trace property is set to 1s and the index.search.slowlog logging level is set to TRACE. This means that whenever a query is executed for longer than one second (on a shard, not in total), it will be logged into the slow log file (the name of which is specified by the index_search_slow_log_file configuration section of the logging.yml configuration file). For example, the following can be found in a slow log file:

[2015-11-25 19:53:00,248][TRACE][index.search.slowlog.query]
took[340000.2ms], took_millis[3400], types[], stats[],
search_type[QUERY_THEN_FETCH], total_shards[5], source[{"query":
{"match_all":{}},"aggs":{"warming_aggs":{"terms":{"field":"tags"}}}}],
extra_source[],

As you can see, in the preceding log line, we have the query time, the search type, and the query source, which shows us the executed query.

Of course, the values can be different in your configuration, but the slow log can be a valuable source of queries that have been running too long and may need to have some warm-up defined; maybe these are parent-child queries and need some identifiers to be fetched to perform better, or maybe you are using a filter that is expensive when you execute it for the first time.

There is one thing you should remember: don't overload your Elasticsearch cluster with too many warming queries, because you may end up spending too much time warming up instead of processing your production queries.
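Slow log entries such as the one shown earlier can be mined for warming candidates with a short script. The following is a minimal Python sketch, assuming each entry has been joined into a single line and follows the took_millis[...] and source[...], extra_source[...] layout of the sample; the regular expression and the parse_slowlog_line helper are illustrative and not part of Elasticsearch.

```python
import json
import re

# Layout assumed from the sample entry: took_millis[N] ... source[JSON], extra_source[
SLOWLOG_RE = re.compile(
    r"took_millis\[(?P<millis>\d+)\].*?source\[(?P<source>.*)\],\s*extra_source\["
)


def parse_slowlog_line(line):
    """Return (took_millis, query_source) or None when the line does not match."""
    match = SLOWLOG_RE.search(line)
    if match is None:
        return None
    return int(match.group("millis")), json.loads(match.group("source"))


line = (
    "[2015-11-25 19:53:00,248][TRACE][index.search.slowlog.query] "
    "took[340000.2ms], took_millis[3400], types[], stats[], "
    "search_type[QUERY_THEN_FETCH], total_shards[5], "
    'source[{"query":{"match_all":{}},"aggs":{"warming_aggs":'
    '{"terms":{"field":"tags"}}}}], extra_source[],'
)

millis, source = parse_slowlog_line(line)
print(millis, "ms:", json.dumps(source))
```

The extracted source is already a complete query body, so it could be registered as a warming query as is, keeping in mind the warning about not defining too many of them.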

Index aliasing and using it to simplify your everyday work

When working with multiple indices in Elasticsearch, you can sometimes lose track of them. Imagine a situation where you store logs in your indices or time-based data in general. Usually, the amount of data in such cases is quite large and, therefore, it is a good solution to have the data divided somehow. A logical division of such data is obtained by creating a single index for a single day of logs (if you are interested in an open source solution used to manage logs, look at Logstash from the Elasticsearch suite at https://www.elastic.co/products/logstash).

However, after some time, if we keep all the indices, we will start having a problem in taking care of all that. An application needs to keep track of all the information, such as which index to send data to, which to query, and so on. With the help of aliases, we can change this to work with a single name, just as we would use a single index, while actually working with multiple indices.

An alias

What is an index alias? It's an additional name for one or more indices that allows us to use these indices by referring to them with that additional name. A single alias can point to multiple indices and, the other way round, a single index can be a part of multiple aliases.

However, please remember that you can't use an alias that points to multiple indices for indexing or for real-time GET operations. Elasticsearch will throw an exception if you do this, because it doesn't know in which index the data should be indexed or from which index the document should be fetched. We can still use an alias that links to only a single index for indexing, though.

Creating an alias

To create an index alias, we need to send an HTTP POST request to the _aliases REST endpoint with a defined action. For example, the following request will create a new alias called week12 that will include the indices named day10, day11, and day12 (we need to create those indices first):

curl -XPOST 'localhost:9200/_aliases' -d '{
  "actions" : [
    {"add" : {"index" : "day10", "alias" : "week12"}},
    {"add" : {"index" : "day11", "alias" : "week12"}},
    {"add" : {"index" : "day12", "alias" : "week12"}}
  ]
}'

If the week12 alias isn't present in our Elasticsearch cluster, the preceding command will create it. If it is present, the command will just add the specified indices to it.

We would run a search across the three indices as follows:

curl -XGET 'localhost:9200/day10,day11,day12/_search?q=test'

If everything goes well, we can instead run it as follows:

curl -XGET 'localhost:9200/week12/_search?q=test'

Isn't this better?

Sometimes we have a set of indices where every index serves independent information but some queries should go across all of them; for example, we have dedicated indices for countries (country_en, country_us, country_de, and so on). In this case, we would create the alias by grouping them all:

curl -XPOST 'localhost:9200/_aliases' -d '{
  "actions" : [
    {"add" : {"index" : "country_*", "alias" : "countries"}}
  ]
}'

The last command created only one alias. Elasticsearch allows you to rewrite this to something less verbose:

curl -XPUT 'localhost:9200/country_*/_alias/countries'

Modifying aliases

Of course, you can also remove indices from an alias. We can do this similarly to how we add indices to an alias, but instead of the add command, we use the remove one. For example, to remove the index named day9 from the week12 alias, we will run the following command:

curl -XPOST 'localhost:9200/_aliases' -d '{
  "actions" : [
    {"remove" : {"index" : "day9", "alias" : "week12"}}
  ]
}'

Combining commands

The add and remove commands can be sent as a single request. For example, if you would like to combine all the previously sent commands into a single request, you will have to send the following command:

curl -XPOST 'localhost:9200/_aliases' -d '{
  "actions" : [
    {"add" : {"index" : "day10", "alias" : "week12"}},
    {"add" : {"index" : "day11", "alias" : "week12"}},
    {"add" : {"index" : "day12", "alias" : "week12"}},
    {"remove" : {"index" : "day9", "alias" : "week12"}}
  ]
}'
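When the list of indices changes regularly, as with daily indices, the actions array can be generated instead of written by hand. The following is a minimal Python sketch; alias_actions is a hypothetical helper that only builds the body for a POST request to the _aliases endpoint, using the add and remove actions shown above.

```python
import json


def alias_actions(alias, add=(), remove=()):
    """Build an _aliases request body that adds and removes indices in one go."""
    actions = [{"add": {"index": index, "alias": alias}} for index in add]
    actions += [{"remove": {"index": index, "alias": alias}} for index in remove]
    return {"actions": actions}


body = alias_actions("week12", add=["day10", "day11", "day12"], remove=["day9"])
print(json.dumps(body, indent=2))
```

The output is equivalent to the combined request above and can be sent with curl -XPOST 'localhost:9200/_aliases' -d.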

Retrieving aliases

In addition to adding or removing indices to or from aliases, we and our applications that use Elasticsearch may need to retrieve all the aliases available in the cluster or all the aliases that an index is connected to. To retrieve these aliases, we send a request using the HTTP GET method. For example, the first of the following commands gets all the aliases for the day10 index and the second one gets all the available aliases:

curl -XGET 'localhost:9200/day10/_aliases'
curl -XGET 'localhost:9200/_aliases'

The response from the second command is as follows:

{
  "day12" : {
    "aliases" : {
      "week12" : {}
    }
  },
  "library" : {
    "aliases" : {}
  },
  "day11" : {
    "aliases" : {
      "week12" : {}
    }
  },
  "day9" : {
    "aliases" : {}
  },
  "day10" : {
    "aliases" : {
      "week12" : {}
    }
  }
}

You can also use the _alias endpoint to get all aliases from the given index:

curl -XGET 'localhost:9200/day10/_alias/*'

To get a particular alias definition, you can use the following:

curl -XGET 'localhost:9200/day10/_alias/week12'

Removing aliases

You can also remove an alias using the _alias endpoint. For example, sending the following command will remove the client alias from the data index:

curl -XDELETE 'localhost:9200/data/_alias/client'

Filtering aliases

Aliases can be used in a way similar to how views are used in SQL databases. You can use the full Query DSL (discussed in detail in Chapter 3, Searching Your Data) and have your filter applied to all count, search, and delete by query requests, and so on.

Let's look at an example. Imagine that we want to have aliases that return data for a certain client so we can use them in our application. Let's say that the client identifier we are interested in is stored in the clientId field and we are interested in the 12345 client. So, let's create the alias named client for our data index, which will apply a query for clientId automatically:

curl -XPOST 'localhost:9200/_aliases' -d '{
  "actions" : [
    {
      "add" : {
        "index" : "data",
        "alias" : "client",
        "filter" : {"term" : {"clientId" : 12345}}
      }
    }
  ]
}'

So when using the defined alias, you will always get your request filtered by a term query that ensures that all the documents have the 12345 value in the clientId field.
AliasesandroutingIntheIntroductiontoroutingsectionofChapter2,IndexingYourData,wetalkedaboutrouting.Similartoaliasesthatusefiltering,wecanaddroutingvaluestothealiases.Imaginethatweareusingroutingonthebasisofuseridentifierandwewanttousethesameroutingvalueswithouraliases.So,forthealiasnamedclient,wewillusetheroutingvaluesof12345,12346,and12347forquerying,andonly12345forindexing.Todothis,wewillcreateanaliasusingthefollowingcommand:
curl -XPOST 'localhost:9200/_aliases' -d '{
 "actions" : [
  {
   "add" : {
    "index" : "data",
    "alias" : "client",
    "search_routing" : "12345,12346,12347",
    "index_routing" : "12345"
   }
  }
 ]
}'
This way, when we index our data using the client alias, the value specified by the index_routing property will be used. At the time of querying, the values specified by the search_routing property will be used.
There is one more thing. Please look at the following query sent to the previously defined alias:
curl -XGET 'localhost:9200/client/_search?q=test&routing=99999,12345'
The value used as a routing value will be 12345. This is because Elasticsearch will take the common values of the search_routing attribute and the query routing parameter, which in our case is 12345.
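
The way the common values are chosen can be illustrated with a small Python sketch. It only models the intersection described above on plain strings; the function name is hypothetical and it does not call Elasticsearch.

```python
def effective_search_routing(alias_routing, request_routing):
    """Return the routing values present in both comma-separated lists.

    The order of the alias definition is kept. This is only a model of the
    behavior described in the text, not Elasticsearch code.
    """
    requested = {value.strip() for value in request_routing.split(",") if value.strip()}
    return [
        value.strip()
        for value in alias_routing.split(",")
        if value.strip() in requested
    ]


alias_values = "12345,12346,12347"  # search_routing from the alias definition
query_values = "99999,12345"        # routing parameter sent with the query
print(effective_search_routing(alias_values, query_values))
```

For the alias and query shown above, the only value left is 12345.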
Zero downtime reindexing and aliases
One of the greatest advantages of using aliases is the ability to re-index the data without any downtime of the system using Elasticsearch. To achieve this, you need to interact with your indices only through aliases, both for indexing and querying. In such a case, you can just create a new index, index the data there, and switch the aliases when needed. During indexing, the aliases would still point to the old index, so the application could work as usual.
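
A minimal sketch of the alias switch, assuming the old and new indices are called data_v1 and data_v2 (both names are hypothetical). It only builds the request body for the _aliases endpoint, with a remove and an add action in the same request so that the alias moves in a single step. Sending the request is left out.

```python
import json


def alias_swap_actions(alias, old_index, new_index):
    """Build the body for POST /_aliases that moves an alias between indices."""
    return {
        "actions": [
            {"remove": {"index": old_index, "alias": alias}},
            {"add": {"index": new_index, "alias": alias}},
        ]
    }


# data_v1 and data_v2 are illustrative index names, not taken from the book.
body = alias_swap_actions("client", "data_v1", "data_v2")
print(json.dumps(body, indent=1))
```

The printed JSON can be used as the -d payload of a curl -XPOST 'localhost:9200/_aliases' command like the ones shown earlier.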
Summary
In this chapter, we discussed Elasticsearch administration. We started by learning how to perform backups of our indices and how to monitor our cluster health and state using its API. We controlled cluster shard rebalancing and learned how to adjust shard allocation according to our needs. We’ve used the CAT API to get information about Elasticsearch in human-readable form and we’ve warmed up our queries to make them faster. Finally, we’ve used aliases to allow a better management of our indices and to have more flexibility.
In the next and final chapter of the book, we will focus on a hypothetical online library store to see how to make Elasticsearch work in practice. We will start with a brief introduction and hardware considerations. We will tune a single instance of Elasticsearch and properly configure our cluster by discussing each of its parts and providing a proper architecture. We will vertically expand the cluster and prepare it for both high querying and high indexing load. Finally, we will learn how to monitor such a prepared cluster.
Chapter 11. Scaling by Example
In the previous chapter, we discussed Elasticsearch administration. We started with a discussion about backups and how we can do them by using the available API. We monitored the health and state of our clusters and nodes and we learned how to control shard rebalancing. We controlled the shard and replica allocation and used the human-friendly Cat API to get information about the cluster and nodes. We saw how to use warmers to speed up potentially heavy queries and we used index aliasing to manage our indices more easily. By the end of this chapter, you will have learned the following topics:
Hardware preparations for running Elasticsearch
Tuning a single Elasticsearch node
Preparing highly available and fault tolerant clusters
Expanding Elasticsearch vertically
Preparing Elasticsearch for high query and indexing throughput
Monitoring Elasticsearch
Hardware
One of the first decisions that we need to make when starting every serious software project is a set of choices related to hardware. And believe us, this is not only a very important choice, but also one of the most difficult ones. Often the decisions are made at early project stages, when only the basic architecture is known and we don’t have precise information regarding the queries, data load, and so on. The project architect has to balance precaution against the projected cost of the whole solution. Too many times it is an intersection of experience and clairvoyance, which can lead to either great or terrible results.
Physical servers or a cloud
Let’s start with a decision: a cloud, virtual, or physical machines. Nowadays, these are all valid options, but it was not always the case. Some time ago the only option was to buy new servers for each environment part or share resources with the other applications on the same machine. The second option makes perfect sense as it is more cost-effective but introduces risk. Problems with one application, especially when they are hardware related, will result in problems for another application. You can imagine one of your applications using most of the I/O subsystem of the physical machine and all the other applications struggling with lots of I/O waits and performance problems because of that. Virtualization promises application separation and a more convenient way of managing resources, but you are still limited by the underlying hardware. Every unexpected traffic spike could be a problem and affect service availability. Imagine that your ecommerce site suddenly gains a massive number of customers. Instead of being glad that the spike appeared and you have more potential customers, you search for a place where you can buy additional hardware that will be supplied as soon as possible.
Cloud computing on the other hand means a more flexible cost model. We can easily add new machines whenever we need. We can add them temporarily when we expect a greater load (for example, before Christmas for an ecommerce site) and pay only for the actually used processing power. It is just a few clicks in the admin panel. Even more, we can also set up automatic scaling, that is, new virtual machines can appear automatically when we need them. Cloud-based software can also shut them down when we do not need them anymore. The cloud has many advantages, such as lower initial cost, the ability to easily grow your business, and insensitivity to temporal fluctuations of resource requirements, but it also has several flaws. The costs of cloud servers rise faster than those of physical machines. Also, mass storage, although practically unlimited, has worse characteristics (number of operations per second) than physical servers. This is sometimes a great problem for us, especially with disk-based storage such as Elasticsearch.
In practice, as usual, the choice can be hard but going through a few points can help you with your decision:
Business requirements may directly point to your own servers; for example, some procedures related to financial or medical data automatically exclude cloud solutions hosted by third-party vendors
For proof of concept and low/medium load services, the cloud can be a good choice because of simplicity, scalability, and low cost
Solutions with strong requirements connected with I/O subsystems will probably work better on bare metal machines where you have greater influence on what storage type is available to you
When the traffic can greatly change within a short time, the cloud is a perfect place for you
For the purpose of further discussion, let’s assume that we want to buy our own servers. We are in the computer store now, so let’s buy something!
CPU
In most cases, this is the least important part. You can choose any modern CPU model but you should know that a higher number of cores means a higher number of concurrent queries and indexing threads. That will lead to being able to index data faster, especially with complicated analysis and lots of merges.
RAM memory
More gigabytes of RAM is always better than fewer gigabytes of RAM. Memory is necessary, especially for aggregation and sorting. It is less of a problem now, with Elasticsearch 2.0 and doc values, but complicated queries with lots of aggregations still require memory to process the data. Memory is also used for indexing buffers and can lead to indexing speed improvements, because more data can be buffered in memory and thus disks will be used less frequently. If you try to use more memory than available, the operating system will use the hard disks as temporary space (it starts swapping) and you should avoid this at all cost. Note that you should never try to force Elasticsearch to use as much memory as possible. The first reason is the Java garbage collector: less memory is more GC friendly. The second reason is that the unused memory is actually used by the operating system for buffers and disk cache. In fact, when your index can fit in this space, all data is read from these caches and not from the disks directly. This can drastically improve the performance. By default, Elasticsearch and the I/O subsystem share the same I/O cache, which gives another reason to leave even more memory for the operating system itself.
In practice, 8 GB is the lowest requirement for memory. It does not mean that Elasticsearch will never work with less memory, but for most situations and data intensive applications, it is the reasonable minimum. On the other hand, more than 64 GB is rarely needed. Instead of assigning such amounts of memory to a single Elasticsearch node, think about scaling the system horizontally.
Mass storage
We said that we are in a good situation when the whole index fits into memory. In practice this can be difficult to achieve, so good and fast disks are very important. It is even more important if one of the requirements is high indexing throughput. In such a case, you may consider fast SSD disks. Unfortunately, these disks are expensive if your data volume is big. You can improve the situation by avoiding using RAID (see https://en.wikipedia.org/wiki/RAID), except RAID 0. In most cases, when you handle fault tolerance by having multiple servers, the additional level of security on the RAID level is unnecessary. The last thing is to avoid using external storage, such as network attached storage (NAS) or NFS volumes. The network latency in such cases always kills all the advantages of these solutions.
The network
When you use an Elasticsearch cluster, each node opens several connections to other nodes for various uses. When you index, the data is forwarded to different shards and replicas. When you query for data, the node used for querying can run multiple partial queries on the other nodes and compose the reply from the data fetched from them. This is why you should make sure that your network is not the bottleneck. In practice, use one network for all the servers in the cluster and avoid solutions in which the nodes in the cluster are spread between data centers.
How many servers
The answer is always the same: it depends. It depends on many factors: the number of requests per second, the data volume, the level of the query’s complexity, the aggregations and sorting usage, the number of new documents per unit of time, how fast new data should be available for searching (the refresh time), the average document size, and the analyzers used. In practice, the handiest answer is: test it and approximate.
The one thing that is often underestimated is data security. When you think about fault tolerance and availability, you should start from three servers. Why? We talked about the split brain situation in the Master election configuration section of Chapter 9, Elasticsearch Cluster in Detail. Starting from three servers we are able to handle a single Elasticsearch node failure without taking down the whole cluster.
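
The reasoning behind the three-server minimum can be checked with a few lines of arithmetic. The sketch below assumes the usual majority rule for master-eligible nodes (more than half of them must be visible to elect a master); the function names are hypothetical.

```python
def master_quorum(master_eligible_nodes):
    """Smallest number of nodes that forms a majority."""
    return master_eligible_nodes // 2 + 1


def tolerated_failures(master_eligible_nodes):
    """How many nodes can be lost while a majority is still available."""
    return master_eligible_nodes - master_quorum(master_eligible_nodes)


for nodes in (1, 2, 3, 5):
    print(nodes, master_quorum(nodes), tolerated_failures(nodes))
```

With one or two nodes no failure can be tolerated under this rule, while three nodes survive the loss of one, which matches the recommendation above.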
Cost cutting
You did some tests, considered carefully planned functionalities, estimated volumes and load, and went to the project owner with an architecture draft. “It’s too expensive”, he said and asked you to think about servers once again. What can we do?
Let’s think about server roles and try to introduce some differences between them. If one of the requirements is indexing massive amounts of data connected with time (maybe logs), the possible way is having two groups of servers: hot nodes, where new data arrives, and cold nodes, where old data is moved. Thanks to this approach, hot nodes may have faster but smaller disks (that is, solid state drives) as opposed to the cold nodes, where fast disks are not so important but space is. You can also divide your architecture into several groups such as master servers (less powerful, with relatively small disks), data nodes (bigger disks), and query aggregator nodes (more RAM). We will talk about this in the following sections.
Preparing a single Elasticsearch node
When we talk about vertical scaling, we often mean adding more resources to the server Elasticsearch is running on. We can add memory or we can switch to a machine with a better CPU or faster disk storage. Of course, with better machines we can expect an increase in performance; depending on our deployment and its bottlenecks, it can be a small or large improvement. However, there are limitations when it comes to vertical scaling. For example, one of the limitations is the maximum amount of physical memory available for your servers or the total memory required by the JVM to operate. When having large data and complicated queries, you can very soon run into memory issues and adding new memory may not help at all. In this section, we will try to give you general advice on where to look and what to tune when it comes to a single Elasticsearch node.
The thing to remember when tuning your system is performance tests, ones that can be repeated under the same circumstances. Once you make a change, you need to be able to see how it affects the overall performance. In addition to that, Elasticsearch scales well. Using that knowledge, we can run performance tests on a single machine (or a few of them) and extrapolate the results. Such observations may be a good starting point for further tuning.
Also keep in mind that this section doesn’t contain a deep dive into all performance related topics, but is dedicated to showing you the most common things.
The general preparations
Apart from all the things we will discuss in this section, there are three major, operating system related things you need to remember: the number of allowed file descriptors, the virtual memory, and avoiding swapping.
Note that the following section contains information for Linux operating systems, but you can also achieve similar options on Microsoft Windows.
Avoiding swapping
Let’s start with the third one. Elasticsearch and Java Virtual Machine based applications, in general, don’t like to be swapped. This means that these applications work best if the operating system doesn’t put the memory that they use in the swap space. The reason is very simple: to access the swapped memory, the operating system has to read it from the disk, which is slow and affects the performance in a very bad way.
If we have enough memory, and we should have if we want our Elasticsearch instance to perform well, we can configure Elasticsearch to avoid swapping. To do that, we just need to modify the elasticsearch.yml file and include the following property:
bootstrap.mlockall: true
This is one of the options. The second one is to set the vm.swappiness property in the /etc/sysctl.conf file to 0 (for complete swap disabling) or 1 for swapping only in emergency (for kernel versions 3.5 and above).
The third option is to disable swapping by editing /etc/fstab and removing the lines that contain the swap word. The following is an example /etc/fstab content:
LABEL=cloudimg-rootfs / ext4 defaults,discard 0 0
/dev/xvdb swap swap defaults 0 0
To disable swapping we would just remove the second line from the above contents. We could also run the following command to disable swapping:
sudo swapoff -a
However, remember that the effect of this command won’t persist after the system is rebooted, so this is only a temporary solution.
Also, remember that if you don’t have enough memory to run Elasticsearch, the operating system will just kill the process when swapping is disabled.
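
The /etc/fstab edit described above can be sketched in Python. The function below is a hypothetical helper that works on a string and treats any entry whose filesystem type (the third field) is swap as a swap entry. It does not touch the real /etc/fstab file.

```python
def without_swap_entries(fstab_text):
    """Return fstab content with the swap entries removed."""
    kept = []
    for line in fstab_text.splitlines():
        fields = line.split()
        is_comment = line.lstrip().startswith("#")
        # The third field of an fstab entry is the filesystem type.
        if not is_comment and len(fields) >= 3 and fields[2] == "swap":
            continue
        kept.append(line)
    return "\n".join(kept)


example = """LABEL=cloudimg-rootfs / ext4 defaults,discard 0 0
/dev/xvdb swap swap defaults 0 0"""
print(without_swap_entries(example))
```

For the example content, only the root filesystem line is left.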
File descriptors
Make sure you have high enough limits related to file descriptors for the user running Elasticsearch (when installing from official packages, that user will be called elasticsearch). If you don’t, you may end up with problems when Elasticsearch tries to flush the data and create new segments or merge segments together, which can result in index corruption.
To adjust the number of allowed file descriptors, you will need to adjust the /etc/security/limits.conf file (at least on most common Linux systems) and adjust or add an entry related to a given user (for both soft and hard limits). For example:
elasticsearch soft nofile 65536
elasticsearch hard nofile 65536
It is advised to set the number of allowed file descriptors to at least 65536, but even more can be needed, depending on your index size.
On some Linux systems, you may also need to load an appropriate limits module for the preceding setting to take effect. To load that module, you need to adjust the /etc/pam.d/login file and add or uncomment the following line:
session required pam_limits.so
There is also a possibility to display the number of file descriptors available for Elasticsearch by adding the -Des.max-open-files=true parameter to Elasticsearch startup parameters. For example, like this:
bin/elasticsearch -Des.max-open-files=true
When doing that, Elasticsearch will include information about the file descriptors in the logs:
[2015-12-20 00:22:19,869][INFO][bootstrap] max_open_files [10240]
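
The limit can also be checked from a script. The following sketch uses Python’s standard resource module (available on Linux and other Unix-like systems) and compares the soft limit with the 65536 value suggested above. It reports the limits of the process running the script, so it would have to be run as the same user as Elasticsearch to be meaningful; the helper names are hypothetical.

```python
import resource

REQUIRED_DESCRIPTORS = 65536  # minimum suggested in the text


def current_nofile_limits():
    """Return the (soft, hard) open file limits of the current process."""
    return resource.getrlimit(resource.RLIMIT_NOFILE)


def has_enough_descriptors(soft_limit, required=REQUIRED_DESCRIPTORS):
    """Check a soft limit against the required number of descriptors."""
    # RLIM_INFINITY means that no limit is enforced at all.
    return soft_limit == resource.RLIM_INFINITY or soft_limit >= required


soft, hard = current_nofile_limits()
print(soft, hard, has_enough_descriptors(soft))
```

A soft limit of 10240, as in the log line above, would be reported as too low by this check.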
Virtual memory
Elasticsearch 2.2 uses a hybrid directory implementation, which is a combination of mmapfs and niofs directories. Because of that, especially when your indices are large, you may need a lot of virtual memory on your system. By default, the operating system limits the amount of memory mapped files and that can cause errors when running Elasticsearch. Because of that, we recommend increasing the default values. To do that, you just need to edit the /etc/sysctl.conf file and set the vm.max_map_count property; for example, to a value equal to 262144.
You can also change the value temporarily by running the following command:
sysctl -w vm.max_map_count=262144
The memory
Before thinking about Elasticsearch configuration related things, we should remember to give enough memory to Elasticsearch. In general, we shouldn’t give more than 50-60 percent of the total available memory to the JVM process running Elasticsearch. We do that because we want to leave memory for the operating system and for the operating system I/O cache. However, we need to remember that the 50-60 percent figure is not always right. You can imagine having nodes with 256 GB of RAM and having indices of 30 GB in total on such a node. In such circumstances, even assigning more than 60 percent of physical RAM to Elasticsearch would leave plenty of RAM for the operating system. It is also a good idea to set the Xmx and Xms properties to the same values to avoid JVM heap resizing.
Another thing to remember are the so-called compressed oops (http://docs.oracle.com/javase/7/docs/technotes/guides/vm/performance-enhancements-7.html#compressedOop), the ordinary object pointers. The Java virtual machine can be told to use them by adding the -XX:+UseCompressedOops switch. This allows the Java virtual machine to use less memory to address the objects on the heap. However, this is only true for heap sizes less than or equal to 31 GB. Going for a larger heap means no compressed oops and higher memory usage for addressing the objects on the heap.
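
The two rules above can be combined into a rough starting point for the heap size. The sketch below assumes that we take 50 percent of the physical memory and never go above the 31 GB compressed oops boundary. Both the function and the decision to cap the heap at that boundary are illustrative assumptions, not a rule given by Elasticsearch.

```python
COMPRESSED_OOPS_LIMIT_GB = 31  # boundary mentioned in the text


def suggested_heap_gb(total_ram_gb, heap_fraction=0.5):
    """Return a starting heap size in GB for a node with the given RAM."""
    return min(int(total_ram_gb * heap_fraction), COMPRESSED_OOPS_LIMIT_GB)


for ram in (8, 16, 64, 128):
    heap = suggested_heap_gb(ram)
    # The same value is used for Xms and Xmx to avoid heap resizing.
    print(f"{ram} GB RAM -> -Xms{heap}g -Xmx{heap}g")
```

Such a value is only a first guess; as noted above, the right proportion depends on the size of the indices held by the node.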
Field data cache and breaking the circuit
As we know, by default the field data cache in Elasticsearch is unbounded. This can be very dangerous, especially when you are using aggregations and sorting on many fields that are analyzed, because they don’t use doc values by default. If those fields are high cardinality ones, then you can run into even more trouble. By trouble we mean running out of memory.
We have two different factors we can tune to be sure that we don’t run into out of memory errors. First of all, we can limit the size of the field data cache and we should do that. The second thing is the circuit breaker, which we can easily configure to just throw exceptions instead of loading too much data. Combining these two things together will ensure that we don’t run into memory issues.
However, we should also remember that Elasticsearch will evict data from the field data cache if its size is not enough to handle aggregation requests or sorting. This will affect the query performance because loading the field data information is not very efficient and is resource intensive. However, in our opinion, it is better to have our queries slower than having our cluster blown up because of out of memory errors.
The field data cache and caches in general were discussed in the Elasticsearch caches section of Chapter 9, Elasticsearch Cluster in Detail.
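
As a sketch of how the two limits relate, the snippet below pairs the indices.fielddata.cache.size and indices.breaker.fielddata.limit settings and rejects combinations in which the cache size is not lower than the breaker limit, because eviction only helps if the cache fills up before the breaker trips. The 40% and 60% values are purely illustrative and the helper names are hypothetical.

```python
def parse_percent(value):
    """Turn a value such as '40%' into the number 40.0."""
    return float(value.rstrip("%"))


def fielddata_settings(cache_size="40%", breaker_limit="60%"):
    """Return the two settings, checking that they are consistent."""
    if parse_percent(cache_size) >= parse_percent(breaker_limit):
        raise ValueError("cache size should stay below the circuit breaker limit")
    return {
        "indices.fielddata.cache.size": cache_size,
        "indices.breaker.fielddata.limit": breaker_limit,
    }


print(fielddata_settings())
```

The resulting values would still have to be placed in the node configuration or sent through the settings API, depending on the setting and the Elasticsearch version.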
Use doc values
If you plan to use sorting, aggregations, or scripting heavily, you should use doc values wherever you can. This will not only save you the memory needed for the field data cache; because fewer objects are produced, it will also make the Java virtual machine work better, with lower garbage collection time. Doc values were discussed in the Mappings Configuration section of Chapter 2, Indexing Your Data.
RAM buffer for indexing
In the Elasticsearch caches section of Chapter 9, Elasticsearch Cluster in Detail, we also discussed the RAM buffer used for indexing. There are a few things we would like to mention. First of all, the more RAM for the indexing buffer, the more documents Elasticsearch will be able to hold in memory. So the more memory we have for indexing, the less often the flush to disk will happen and fewer segments will be created. Because of that, your indexing will be faster. But of course, we don’t want Elasticsearch to occupy 100 percent of the available memory. Keep in mind that the RAM buffers are set per shard, so the amount of memory that will be used depends on the number of shards and replicas that are assigned on the given node and on the number of documents you index. You should set the upper limits so your node doesn’t blow up when it has multiple shards assigned.
Index refresh rate
Elasticsearch uses Lucene and we know it by now. The thing with Lucene is that the view of the index is not refreshed when new data is indexed or segments are created. To see the newly indexed data, we need to refresh the index. By default, Elasticsearch does that once every second and the period of refresh is controlled by using the index.refresh_interval property, specified per index. The shorter the refresh interval, the sooner the documents will be visible for search operations. However, that also means that Elasticsearch will need to put more resources into refreshing the index view, meaning that the indexing and searching operations will be slower. The longer the refresh interval, the more time you will have to wait before being able to see the data in the search results, but your indexing and querying will be faster.
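
To make the trade-off more concrete, the following sketch builds the settings body that changes index.refresh_interval and estimates how many refreshes per hour a given interval allows. The 30s value and the helper names are illustrative assumptions.

```python
import json


def refresh_interval_settings(interval):
    """Body for an index settings update that changes the refresh interval."""
    return {"index": {"refresh_interval": interval}}


def refreshes_per_hour(interval_seconds):
    """Upper bound on the number of refresh operations in one hour."""
    return 3600 // interval_seconds


print(json.dumps(refresh_interval_settings("30s")))
print(refreshes_per_hour(1), refreshes_per_hour(30))
```

Moving from the default of one second to 30 seconds reduces the number of refreshes from up to 3600 to up to 120 per hour, at the price of new documents becoming visible later.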
Thread pools
We haven’t talked about thread pools until now, but we would like to mention them here. Each Elasticsearch node holds several thread pools that control the execution queues for operations such as indexing or querying. Elasticsearch uses several pools to allow control over how the threads are handled and how much memory consumption is allowed for user requests.
Note
The Java virtual machine allows applications to use multiple threads, concurrently running multiple application tasks. For more information about Java threads, refer to http://docs.oracle.com/javase/7/docs/api/java/lang/Thread.html.
There are many thread pools (we can specify the type we are configuring by specifying the type property). However, for performance, the most important are:
generic: This is the thread pool for generic operations, such as node discovery. By default, the generic thread pool is of type cached.
index: This is the thread pool used for indexing and deleting operations. Its type defaults to fixed, its size to the number of available processors, and the size of the queue to 200.
search: This is the thread pool used for search and count requests. Its type defaults to fixed and its size to the number of available processors multiplied by 3 and divided by 2, with the size of the queue defaulting to 1000.
suggest: This is the thread pool used for suggest requests. Its type defaults to fixed, its size to the number of available processors, and the size of the queue to 1000.
get: This is the thread pool used for real time get requests. Its type defaults to fixed, its size to the number of available processors, and the size of the queue to 1000.
bulk: As you can guess, this is the thread pool used for bulk operations. Its type defaults to fixed, its size to the number of available processors, and the size of the queue to 50.
percolate: This is the thread pool for percolation requests. Its type defaults to fixed, its size to the number of available processors, and the size of the queue to 1000.
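
The defaults listed above can be written down as a small table that depends on the number of processors. The sketch follows the description in the list literally (for example, processors multiplied by 3 and divided by 2 for search, rounded down); the exact formula may differ between Elasticsearch versions, so treat the function as an illustration rather than a reference.

```python
def default_thread_pools(processors):
    """Default sizes and queue sizes of the fixed pools, as listed above."""
    return {
        "index": {"size": processors, "queue_size": 200},
        "search": {"size": processors * 3 // 2, "queue_size": 1000},
        "suggest": {"size": processors, "queue_size": 1000},
        "get": {"size": processors, "queue_size": 1000},
        "bulk": {"size": processors, "queue_size": 50},
        "percolate": {"size": processors, "queue_size": 1000},
    }


for name, pool in default_thread_pools(8).items():
    print(name, pool["size"], pool["queue_size"])
```

On a machine with eight processors this gives eight threads for most pools and twelve for search.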
Note
Before Elasticsearch 2.1, we could control the type of the thread pool. Starting with Elasticsearch 2.1 we can no longer do that. For more information please refer to the official documentation: https://www.elastic.co/guide/en/elasticsearch/reference/2.1/breaking_21_removed_features.html
For example, if we want to configure the thread pool for indexing operations to have a size of 100 and a queue of 500, we will set the following in the elasticsearch.yml configuration file:
threadpool.index.size: 100
threadpool.index.queue_size: 500
Also remember that the thread pool configuration can be updated using the cluster update API. For example, like this:
curl -XPUT 'localhost:9200/_cluster/settings' -d '{
 "transient" : {
  "threadpool.index.size" : 100,
  "threadpool.index.queue_size" : 500
 }
}'
In general, you don’t need to work with the thread pools and their configuration. However, when configuring your cluster, you may want to put more emphasis on indexing or querying and, in such cases, giving more threads or larger queues to the prioritized operation may result in more resources being used for such operations.
Horizontal expansion
Elasticsearch is a highly scalable search and analytics platform. We can scale it both horizontally and vertically. We discussed how to tune a single node in the Preparing a single Elasticsearch node section earlier in this chapter and we would like to focus on horizontal scaling now: how to handle multiple nodes in the same cluster, what roles they should have, and how to tune the configuration to have a highly reliable, available, and fault tolerant cluster.
You can imagine vertical scaling like building a skyscraper: we have limited space available and we need to go as high as we can. Of course, that is expensive and requires a lot of engineering done right. On the other hand, we have horizontal scaling, which is like having many houses in a residential area. Instead of investing into hardware and having powerful machines, we choose to have multiple machines and our data split between them. Horizontal scaling gives us virtually unlimited scaling possibilities. Even with the most powerful hardware, the time comes when a single machine is not enough to handle the data, the queries, or both of them. In such cases, spreading the data among multiple servers is what saves us and allows us to have terabytes of data in multiple indices spread across the whole cluster, just like the one in the following image:
We have our four-node cluster with the library index created and built of four shards.
If we want to increase the querying capabilities of our cluster, we can just add additional nodes, for example, four of them. After adding new nodes to the cluster, we can either create new indices that will be built of more shards to spread the load more evenly or add replicas to the already existing shards. Both options are viable, and they are the only ones we have, because we don’t have the possibility of splitting shards or adding more primary shards to an existing index. We should go for having more primary shards when our hardware is not enough to handle the amount of data it holds. In such cases, we usually run into out of memory situations, long shard query execution time, swapping, or high I/O waits. The second option, that is having replicas, is the way to go when our hardware is happily handling the data we have but the traffic is so high that the nodes just can’t keep up.
The first option is simple, but let's look at the second case: having more replicas. So with four additional nodes, our cluster would look as follows:
Now, let's run the following command to add a single replica:
curl -XPUT 'localhost:9200/library/_settings' -d '{
  "index": {
    "number_of_replicas": 1
  }
}'
Our cluster view would look more or less as follows:
As you can see, each of the initial shards building the library index has a single replica stored on another node. The nice thing about shards and their replicas is that Elasticsearch is smart enough to balance the shards of a single index and put them on separate nodes. For example, you won't ever end up in a situation where you have a shard and its replicas on the same node. Also, Elasticsearch is able to round-robin the queries between the shards and their replicas, which means that all the nodes will be hit by the queries and we don't have to care about that. Because of that, we are able to handle almost double the query load compared to our initial deployment.
Automatically creating the replicas
Let's stay a bit longer around replicas. Elasticsearch allows us to automatically expand replicas when the cluster is big enough. This means that the replicas can be created automatically when new nodes are added to the cluster. You may wonder where such functionality can be useful. Imagine a situation where you have a small index that you would like to be present on every node so that your plugins don't have to run distributed queries just to get the data from it. In addition to that, your cluster is dynamically changing, that is, you add and remove nodes from it. The simplest way to achieve such functionality is to allow Elasticsearch to automatically expand the replicas. To do that, we need to set index.auto_expand_replicas to 0-all, which means that the index can have 0 replicas or be present on all the nodes. So if our small index is called shops and we would like Elasticsearch to automatically expand its replicas to all the nodes in the cluster, we would use the following command to create the index:
curl -XPOST 'localhost:9200/shops/' -d '{
  "settings": {
    "index": {
      "auto_expand_replicas": "0-all"
    }
  }
}'
We can also update the settings of that index if it is already created by running the following command:
curl -XPUT 'localhost:9200/shops/_settings' -d '{
  "index": {
    "auto_expand_replicas": "0-all"
  }
}'
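To make the meaning of the 0-all range more concrete, the following Python sketch estimates how many replicas an index would end up with for a given number of data nodes. The function name and the clamping rule (replicas equal the number of data nodes minus one, limited to the configured range) are our simplification of the behavior described above, not code taken from Elasticsearch.

```python
def expanded_replicas(auto_expand: str, data_nodes: int) -> int:
    """Estimate the replica count for an auto_expand_replicas range.

    Assumption: at most one copy of each shard per data node, so the
    number of replicas is (data_nodes - 1) clamped to the given range.
    """
    low_text, high_text = auto_expand.split("-", 1)
    low = int(low_text)
    # "all" means there is no upper bound other than the cluster size.
    high = data_nodes - 1 if high_text == "all" else int(high_text)
    wanted = data_nodes - 1
    return max(low, min(wanted, high))


if __name__ == "__main__":
    for nodes in (1, 2, 4, 8):
        print(nodes, "data nodes ->", expanded_replicas("0-all", nodes), "replicas")
```

With 0-all the index follows the cluster size, while a bounded range such as 0-2 would stop growing once two replicas exist.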
Redundancy and high availability
The Elasticsearch replication mechanism not only gives us the ability to handle higher query throughput, but also gives us redundancy and high availability. Imagine an Elasticsearch cluster hosting a single index called library that is built of 2 shards and 0 replicas. Such a cluster would look as follows:
Now what happens when one of the nodes fails? The simplest answer is that we lose about 50 percent of the data and, if the failure is fatal, we lose that data forever. Even when having backups, we would need to spin up another node and restore the backup, and that takes time. During that time, your application, or parts of it that are based on Elasticsearch, can't work correctly. If your business relies on Elasticsearch, downtime means money loss. Of course, we can use replicas to create more reliable clusters that can handle hardware and software failures. And one thing to remember is that everything will fail eventually: if the software won't, the hardware will. For example, some time ago Google said that in each of their clusters, during the first year at least 1000 machines will fail (you can read more on that topic at http://www.cnet.com/news/google-spotlights-data-center-inner-workings/). Because of that, we need to be ready to handle such cases.
Let's look at the same cluster but with one replica:
Now losing a single Elasticsearch node means that we still have the whole data available and we can work on restoring the full cluster structure without downtime. Of course, this is only a very small cluster built of two Elasticsearch nodes. The larger the cluster and the more replicas, the more failures you will be able to handle without worrying about data loss. Of course you will have lower performance, depending on the percentage of nodes that fail, but the data will still be there and the cluster will be operational.
That's why, when designing your architecture and deciding on the number of nodes and indices and their architecture, you should take into consideration how many node failures you want to live with. Of course, you can't forget about the performance part of the equation, but redundancy and high availability should be one of the factors of the scaling equation.
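A rough way to reason about this trade-off is to count how many copies of each shard exist. The sketch below is a simplification under our own assumptions: it only looks at data loss, assumes every copy of a shard lives on a different node, and ignores performance and master election. The function names are hypothetical.

```python
def tolerated_node_failures(number_of_replicas: int) -> int:
    """Worst-case number of simultaneous node losses without losing data.

    Each shard exists in (1 primary + number_of_replicas) copies placed on
    different nodes, so the data survives as long as one copy is left.
    """
    if number_of_replicas < 0:
        raise ValueError("number_of_replicas must not be negative")
    return number_of_replicas


def shard_copies(number_of_shards: int, number_of_replicas: int) -> int:
    """Total number of physical shards (primaries plus replicas) in an index."""
    return number_of_shards * (1 + number_of_replicas)
```

For the library index from the example, 2 shards with 0 replicas tolerate no node loss, while 2 shards with 1 replica (4 physical shards) tolerate the loss of one node.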
Cost and performance flexibility
The default distributed nature of Elasticsearch and its ability to scale horizontally allow us to be flexible when it comes to the performance and costs of running our environment. First of all, high-end servers with high-performance disks, numerous CPU cores, and a lot of RAM are still expensive. In addition to that, cloud computing is getting more and more popular, and if you need a lot of flexibility and don't want to have your own hardware, you can choose solutions such as Amazon (http://aws.amazon.com/), Rackspace (http://www.rackspace.com/), DigitalOcean (https://www.digitalocean.com/), and so on. They not only allow us to run our software on rented machines, but also allow us to scale on demand. We just need to add more machines, which is a few clicks away or can even be automated with some degree of work.
Using a hosted solution with one-click machine renting allows us to have a truly horizontally scalable solution. Of course, that's not cheap: you pay for the flexibility. But we can easily sacrifice performance if costs are the most crucial factor in our business plan. Of course, we can also go the other way. If we can afford large bare-metal machines, Elasticsearch clusters can be pushed to hundreds of terabytes of data stored in the indices and still get decent performance (of course, with proper hardware and a properly distributed cluster).
Continuous upgrades
High availability, cost and performance flexibility, and virtually endless growth are not the only things worth talking about when discussing the scalability side of Elasticsearch. At some point in time, you will want to have your Elasticsearch cluster upgraded to a new version. It can be because of bug fixes, performance improvements, new features, or anything that you can think of. The thing is that when you have a single instance of each shard, without replicas, an upgrade means unavailability of Elasticsearch (or at least parts of it), and that may mean downtime of the applications that use Elasticsearch. This is another reason why horizontal scaling is so important: you can perform upgrades without downtime, at least to the extent that the software, such as Elasticsearch, supports it. For example, you can take Elasticsearch 2.0 and upgrade to Elasticsearch 2.1 with only rolling restarts (getting one node out of the cluster, upgrading it, bringing it back, and continuing with the next node until all the nodes are done), thus having all the data still available for searching and indexing happening at the same time.
Multiple Elasticsearch instances on a single physical machine
Having a large physical machine with a lot of memory and CPU cores has advantages and some challenges. First of all, if you decide to run a single Elasticsearch node on that machine, you will sooner or later run into garbage collection issues, you will have lots of shards on a single node, which will require a high number of I/O operations for the internal Elasticsearch communication (retrieving cluster statistics), and so on. What's more, you usually shouldn't go above 31 GB of heap memory for a single JVM process because beyond that you can't use compressed ordinary object pointers (https://docs.oracle.com/javase/7/docs/technotes/guides/vm/performance-enhancements-7.html).
In such cases, you can either run multiple Elasticsearch instances on the same bare-metal machine, run multiple virtual machines and a single Elasticsearch inside each one, or run Elasticsearch in a container, such as Docker (http://www.docker.com/). This is out of the scope of the book, but, because we are talking about scaling, we thought it may be a good thing to mention what can be done in such cases.
Note
There is also the possibility of running multiple Elasticsearch servers on a single physical machine without running multiple virtual machines. Which road to take, virtual machines or multiple instances, is really your choice. However, we like to keep things separate, and because of that we usually go for dividing any large server into multiple virtual machines. When dividing one large server into multiple smaller virtual machines, remember that the I/O subsystem will be shared across those smaller virtual machines. Because of that, it may be good to wisely divide the disks between the virtual machines.
Preventing a shard and its replicas from being on the same node
There is one additional thing worth mentioning. When you have multiple physical servers divided into virtual machines, it is crucial to ensure that the shard and its replica don't end up on the same physical machine. By default, Elasticsearch is smart enough to not put the shard and its replica on the same Elasticsearch instance, but it doesn't know anything about bare-metal machines, so we need to tell it. We can tell Elasticsearch to separate the shards and replicas by using cluster allocation awareness. In our previous case, we had three physical servers. Let's call them server1, server2, and server3.
Now for each Elasticsearch node on a physical server, we define the node.server_name property and set it to the identifier of the server (the name of the property can be anything we want). So, for example, for all Elasticsearch nodes on the first physical server, we would set the following property in the elasticsearch.yml configuration file:
node.server_name: server1
In addition to that, each Elasticsearch node (no matter on which physical server) needs to have the following property added to the elasticsearch.yml configuration file:
cluster.routing.allocation.awareness.attributes: server_name
It tells Elasticsearch not to put the primary shard and its replicas on the nodes with the same value in the node.server_name property. This is enough for us and Elasticsearch will take care of the rest.
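The effect of allocation awareness can be illustrated with a small check. The Python sketch below does not talk to Elasticsearch; the data structures, the node names (vm1, vm2, vm3), and the function name are hypothetical and only model which node holds which shard copy and which physical server each node runs on.

```python
from collections import defaultdict


def shards_sharing_a_server(node_to_server, shard_copies):
    """Return shard ids that have two or more copies on one physical server.

    node_to_server: dict mapping node name -> value of node.server_name
    shard_copies:   list of (shard_id, node_name) pairs, one per physical
                    copy (primary or replica)
    """
    servers_per_shard = defaultdict(list)
    for shard_id, node_name in shard_copies:
        servers_per_shard[shard_id].append(node_to_server[node_name])

    return sorted(
        shard_id
        for shard_id, servers in servers_per_shard.items()
        if len(servers) != len(set(servers))
    )


if __name__ == "__main__":
    nodes = {"vm1": "server1", "vm2": "server1", "vm3": "server2"}
    # Shard 0 has both copies on server1; shard 1 is spread over two servers.
    copies = [(0, "vm1"), (0, "vm2"), (1, "vm1"), (1, "vm3")]
    print(shards_sharing_a_server(nodes, copies))
```

An empty result corresponds to the layout that allocation awareness aims for: losing one physical server never removes every copy of a shard.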
Designated node roles for larger clusters
There is one more thing that we want to discuss and emphasize. When it comes to large clusters, it is important to assign roles to all the nodes in the cluster. This allows for a truly fault-tolerant and highly available Elasticsearch cluster. The roles we can assign to each Elasticsearch node are as follows:
Master eligible node
Data node
Query aggregator node
By default, each Elasticsearch node is master eligible (it can serve as a master node), can hold data, and can work as a query aggregator node. You may wonder why dedicated roles are needed. Let us give you a simple example: if the master node is under a lot of stress, it may not be able to handle the cluster-state-related commands fast enough and the cluster could become unstable. This is only a single, simple example and you can think of numerous others.
Because of that, most Elasticsearch clusters that are larger than a few nodes usually look like the one presented in the following picture:
As you can see, our hypothetical cluster contains three client nodes (because we know that there will be a lot of queries), a large number of data nodes because the amount of data will be large, and at least three master eligible nodes that shouldn't be doing anything else. Why three master nodes when Elasticsearch will only use a single one at any given time? Again, because of redundancy and to be able to prevent split-brain situations by setting discovery.zen.minimum_master_nodes to 2, which would allow us to easily handle the failure of a single master eligible node in the cluster.
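The value of 2 follows from the usual majority rule for master eligible nodes, which can be written as a one-line calculation. The helper names below are ours; the formula (half of the master eligible nodes, rounded down, plus one) is the commonly recommended quorum for discovery.zen.minimum_master_nodes.

```python
def minimum_master_nodes(master_eligible_nodes: int) -> int:
    """Majority (quorum) of master eligible nodes: floor(n / 2) + 1."""
    if master_eligible_nodes < 1:
        raise ValueError("at least one master eligible node is required")
    return master_eligible_nodes // 2 + 1


def tolerated_master_failures(master_eligible_nodes: int) -> int:
    """How many master eligible nodes can fail while a quorum remains."""
    return master_eligible_nodes - minimum_master_nodes(master_eligible_nodes)
```

With three master eligible nodes the quorum is 2 and one failure is tolerated; with only two such nodes the quorum is also 2, so no failure is tolerated, which is why three is the practical minimum.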
Let us now give you snippets of the configuration for each type of node in our cluster. We already talked about that in the Understanding node discovery section in Chapter 9, Elasticsearch Cluster in Detail, but we would like to mention that once again.
Query aggregator nodes
The query aggregator nodes configuration is quite simple. To configure those, we just need to tell Elasticsearch that we don't want those nodes to be master eligible or to hold data. This corresponds to the following configuration snippet in the elasticsearch.yml file:
node.master: false
node.data: false
Data nodes
Data nodes are also very simple to configure. We just need to tell Elasticsearch that they should not be master eligible. However, we are not big fans of default configurations (because they tend to change) and thus our Elasticsearch data nodes configuration looks as follows:
node.master: false
node.data: true
Master eligible nodes
We've left the master eligible nodes to the end of the general scaling section. Of course, such Elasticsearch nodes shouldn't be allowed to hold data, but, in addition to that, it is a good practice to disable the HTTP protocol on such nodes. This is done to avoid accidentally querying those nodes. Master eligible nodes can use fewer resources than data and query aggregator nodes, and because of that we should ensure that they are only used for master-related purposes. So our configuration for master eligible nodes looks more or less as follows:
node.master: true
node.data: false
http.enabled: false
Preparing the cluster for high indexing and querying throughput
Until this chapter, we mostly talked about different functionalities of Elasticsearch, in terms of handling queries, indexing data, and tuning. However, running a cluster in production is not only about using this great search engine, but also about preparing the cluster to handle both the indexing and querying load. Let's now summarize the knowledge we have and see what we need to care about when it comes to preparing the cluster for high indexing and querying throughput.
Indexing related advice
In this section, we will look at the indexing-related advice around tuning Elasticsearch. In each production environment the data is different, the index rate is different, and the users' behavior is different. Take that into consideration and run performance tests on your environment. This will give you the best idea about what to expect and what works best for your system.
Index refresh rate
One of the general things you should pay attention to is the index refresh rate. We know that the refresh rate specifies how fast the documents will be visible for search operations. The equation is quite simple: the faster the refresh rate, the slower the queries will be and the lower the indexing throughput. If we can allow ourselves to have a slower refresh rate, such as 10s or 30s, go for it. It will put less pressure on Elasticsearch, Lucene, and hardware in general. Remember that by default the refresh rate is set to 1s, which basically means that the index searcher object is reopened every second.
To give you a bit of insight into what performance gains we are talking about, we did some performance tests including Elasticsearch and different refresh rates. With the refresh rate of 1s we were able to index about 1000 documents per second using a single Elasticsearch node. Increasing the refresh rate to 5s gave us an increase in indexing throughput of more than 25 percent and we were able to index about 1250 documents per second. Setting the refresh rate to 25s gave us about 70 percent more throughput compared to the 1s refresh rate, which was about 1700 documents per second on the same infrastructure. It is also worth remembering that increasing the time indefinitely doesn't make much sense, because after a certain point (depending on your data load and the amount of data you have) the increase in performance is negligible.
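The percentages quoted above can be verified with a short calculation, and the refresh rate itself is changed through the index.refresh_interval setting. The following Python sketch only does the arithmetic and builds the JSON body for an index settings update; the function names are ours, and sending the body (for example, with curl to the _settings endpoint of an index) is left out.

```python
import json


def throughput_gain_percent(baseline_docs_per_s: float, new_docs_per_s: float) -> float:
    """Relative indexing throughput gain over the baseline, in percent."""
    return (new_docs_per_s - baseline_docs_per_s) / baseline_docs_per_s * 100.0


def refresh_interval_settings(interval: str) -> str:
    """JSON body for an index settings update, e.g. interval='30s'."""
    return json.dumps({"index": {"refresh_interval": interval}})


if __name__ == "__main__":
    # Numbers reported in the text: 1s -> ~1000, 5s -> ~1250, 25s -> ~1700.
    print(throughput_gain_percent(1000, 1250))
    print(throughput_gain_percent(1000, 1700))
    print(refresh_interval_settings("30s"))
```

The same body shape works for any interval; whether 5s, 25s, or 30s is appropriate depends on how quickly new documents have to become searchable.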
Some performance comparisons related to indexing throughput and index refresh rate can be found in the blog post at http://blog.sematext.com/2013/07/08/elasticsearch-refresh-interval-vs-indexing-performance/.
Thread pools tuning
By default, Elasticsearch comes with very good defaults when it comes to all thread pools configuration. You should remember that tuning the default thread pools configuration should be done only when you really see that your nodes are filling up the queues and they still have processing power left that could be designated to the processing of the waiting operations, or when you want to increase the priority of one or more operations.
For example, if you did your performance tests and you saw your Elasticsearch instances not being saturated 100 percent, but on the other hand you experienced rejected execution errors, then that is the point when you should start adjusting the thread pools. You can either increase the number of threads that are allowed to be executed at the same time or increase the queue. Of course, you should also remember that increasing the number of concurrently running threads to very high numbers will lead to many CPU context switches (http://en.wikipedia.org/wiki/Context_switch), which will result in a performance drop.
Automatic store throttling
Before Elasticsearch 2.0, we had to care about how our segment merge process was configured and how much disk I/O merging could use in general, but that changed. Right now Elasticsearch looks at how the I/O subsystem behaves and adjusts the throttling and merging process if the merges are falling behind the indexing. So, we no longer need to manually adjust throttling for disk-based operations. You can read more about the related changes on GitHub at https://github.com/elastic/elasticsearch/pull/9243.
Handling time-based data
When you have time-based data, such as logs, the architecture of your indices plays a very important role. Let's assume that we have logs indexed into Elasticsearch. These usually come in large numbers, are constantly indexed, and are time related (an event that is logged happened at a certain point in time). The assumption is that you have a certain retention for your data, a period of time during which you would like the data to be present and searchable in Elasticsearch. After that time, you just delete the data and forget about it.
With such assumptions in mind, you could just create a single index with a lot of shards and try to index large amounts of logs there. However, that's not the perfect solution. First of all, because of merges: the larger the index gets, the more expensive the merges are. Elasticsearch needs to merge larger and larger segments, and more I/O and CPU is required to handle them. This means slowdowns. In addition to that, deletes will be expensive because you will have to delete the data either by using TTL or by using the delete by query plugin; both are expensive in terms of performance and will cause even more merging. And this is not everything: during querying you will have to run through the whole index to get even the smallest slice of the data. So, are there better index architectures for time-based data?
Yes, one of the most common and best solutions is to use time-based indices. Depending on the data volume, you can have daily, weekly, monthly, or even hourly indices. The downside is the number of shards you will have when the number of indices grows, but apart from that there are only pros: you can control each index, change the number of shards for newly created indices if that is needed, and have faster merging because the indices will be smaller compared to only one big index. What's more, deleting data won't be painful at all: the idea is to delete whole indices, for example, a day's worth of data in the case of daily indices. Queries will also benefit: you can just run the query on a single time-based index to narrow down the search results. Finally, Elasticsearch, by default, will create the indices for us. For example, when using daily indices, we can have names such as logs_2016-01-01, logs_2016-01-02, and so on.
The only thing we need to care about is providing the index name on the basis of the date and creating templates to configure each newly created index; Elasticsearch will do the rest.
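As a small illustration of date-based naming, the Python sketch below derives daily index names in the logs_YYYY-MM-DD form used above and lists the indices that fall outside a retention window. The logs prefix comes from the example; the function names and the retention logic are assumptions made for this sketch, and the actual deletion of each index is not shown.

```python
from datetime import date, timedelta


def daily_index_name(day: date, prefix: str = "logs") -> str:
    """Name of the daily index for a given day, e.g. logs_2016-01-01."""
    return f"{prefix}_{day.isoformat()}"


def expired_indices(today: date, retention_days: int, oldest: date,
                    prefix: str = "logs") -> list:
    """Daily indices older than the retention period, oldest first.

    Indices for the last `retention_days` days (including today) are kept;
    everything from `oldest` up to the cut-off can be deleted as a whole.
    """
    first_kept = today - timedelta(days=retention_days - 1)
    names = []
    day = oldest
    while day < first_kept:
        names.append(daily_index_name(day, prefix))
        day += timedelta(days=1)
    return names
```

Because each returned name is a complete index, removing old data becomes an index deletion instead of a costly per-document delete.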
Multiple data paths
With the release of Elasticsearch 2.0, we were given the ability to specify multiple paths in the path.data property in our elasticsearch.yml, pointing to different directories on different physical devices. Elasticsearch can now leverage that by putting different shards on different devices and using the multiple paths in the most efficient way. Because of that, we can parallelize writing to disks if we have more than a single disk. This is especially useful for high indexing use cases where you index a lot of data.
Data distribution
As we know, each index in the Elasticsearch world can be divided into multiple shards and each shard can have multiple replicas. In cases when you have multiple Elasticsearch nodes (and you probably will have in production), you should think about the number of shards and replicas and how that will affect your nodes. Data distribution may be crucial to even out the load on the cluster and avoid having some nodes doing more work than the others.
Let's take the following example. Imagine we have a cluster that is built of 4 nodes and it has a single index called book built of 3 shards and one replica. Such a deployment will look as follows:
As you can see, the first two nodes have two physical shards allocated to them, while the last two nodes have only one shard allocated each. The actual data allocation is not even. When sending the queries and indexing data, we will have the first two nodes do more work than the other two; this is what we want to avoid. One option is to have the book index have two shards and one replica, so it looks as follows:
This architecture will work and it is perfectly fine. We don't have to have primary shards on all our nodes; we can have replicas, depending on what bottleneck we expect. For querying we may want to have more replicas, for indexing more primaries.
We can also have our primary shards split evenly, like in the following image:
The thing to remember though is that in both cases we will end up with an even distribution of shards and replicas, and Elasticsearch will do a similar amount of work on all the nodes. Of course, with more indices (like having daily indices) it may be trickier to get the data evenly distributed and it may not be possible to have evenly distributed shards, but we should try to get to such a point.
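The shard counts from these examples can be reproduced with a few lines of arithmetic. The Python sketch below is only a counting model under our own assumptions (a single index, copies spread as evenly as possible); the function names are hypothetical and it does not reflect how Elasticsearch decides the concrete placement.

```python
def shards_per_node(nodes: int, number_of_shards: int, number_of_replicas: int) -> list:
    """Spread all physical shards over the nodes as evenly as possible.

    Returns the number of physical shards (primaries and replicas) per node,
    largest first. Only counts are modeled here, not which copy goes where.
    """
    total = number_of_shards * (1 + number_of_replicas)
    base, extra = divmod(total, nodes)
    return [base + 1] * extra + [base] * (nodes - extra)


def is_evenly_distributed(nodes: int, number_of_shards: int, number_of_replicas: int) -> bool:
    """True when every node ends up with the same number of physical shards."""
    counts = shards_per_node(nodes, number_of_shards, number_of_replicas)
    return len(set(counts)) == 1
```

For the book index on 4 nodes, 3 shards with one replica give an uneven 2, 2, 1, 1 split, while 2 shards with one replica give one physical shard per node.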
One more thing to remember when it comes to data distribution and shards and replicas is that when designing your index architecture, you should remember what you want to achieve. If you are going for a very high indexing use case, you may want to spread the index into multiple shards to lower the pressure that is put on the CPU and the I/O subsystem of the server. This is also true for running expensive queries, because with more shards you can lower the load on a single server. However, with the queries there is one more thing: if your nodes can't keep up with the load caused by queries, you can add more Elasticsearch nodes and increase the number of replicas so that the physical copies of the primary shards are placed on those nodes. That will make indexing a bit slower but will give you the capacity to handle more queries at the same time.
Bulk indexing
This is very obvious advice, but you would be surprised how many Elasticsearch users forget about indexing data in bulks instead of sending the documents one by one. So the advice here is to use bulks instead of one-by-one indexing whenever possible. The thing to remember though is not to overload Elasticsearch with too many bulk requests and to keep them under a reasonable size (do not push millions of documents in a single request). Remember about the bulk thread pool and its size and try to adjust your indexers not to go beyond it, or you will first start to queue these requests and, if Elasticsearch is not able to process them, you will quickly start seeing rejected execution exceptions and your data won't be indexed.
Just as an example, we would like to show the results of tests we did some time ago for the two types of indexing: one by one and bulks. In the following image, we have the indexing throughput when indexing documents one by one:
In this next image, we do the same, but instead of indexing documents one by one, we index them in batches of 10 documents (which is still a relatively low number of documents in a bulk):
As you can see, when indexing documents one by one, we were able to index about 30 documents per second and it was stable. The situation changed with bulk indexing and batches of 10 documents; we were able to index slightly more than 200 documents per second. So the difference can be clearly seen.
Of course this is a very basic comparison of indexing speed. To show the real difference, we should use dozens of threads and push Elasticsearch to its limits. However, the preceding comparison should give you a basic view of the indexing throughput gains when using bulk indexing.
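To show what batching looks like on the client side, here is a minimal Python sketch that splits documents into batches and builds the newline-delimited body expected by the _bulk endpoint (one action line followed by one document line). The index name library, the type name book, and the function names are placeholders chosen for this sketch; sending the body over HTTP and handling rejections are intentionally left out.

```python
import json


def chunked(documents, batch_size):
    """Yield consecutive batches of at most batch_size documents."""
    for start in range(0, len(documents), batch_size):
        yield documents[start:start + batch_size]


def bulk_body(index, doc_type, documents):
    """Build a newline-delimited bulk request body.

    Every document is preceded by an action line; the body has to end
    with a newline character.
    """
    lines = []
    for doc in documents:
        lines.append(json.dumps({"index": {"_index": index, "_type": doc_type}}))
        lines.append(json.dumps(doc))
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    docs = [{"title": "Book %d" % i} for i in range(25)]
    for batch in chunked(docs, 10):
        body = bulk_body("library", "book", batch)
        print(len(batch), "documents,", len(body), "bytes")
```

The batch size of 10 mirrors the test above; in practice it is worth measuring several sizes, since the best value depends on document size and cluster capacity.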
RAM buffer for indexing
Remember, the more RAM available for the indexing buffer (the indices.memory.index_buffer_size property), the more documents Elasticsearch can hold in memory. However, we don't want to have Elasticsearch occupy 100 percent of the available memory. The indexing buffer can help us with delaying the flush to disk, which will mean less I/O pressure and fewer merges. You can read more about indexing buffer configuration in Chapter 9, Elasticsearch Cluster in Detail.
Advice for high query rate scenarios
One of the great features of Elasticsearch is its ability to search and analyze the data that was indexed. However, sometimes it is necessary to adjust Elasticsearch and our queries to not only get the results of the query, but also get them fast (or in a reasonable amount of time). In this section, we will look at the possibilities of preparing Elasticsearch for high query throughput use cases, but not just that. We will also look at general performance tips when it comes to querying.
Shard request cache
The purpose of the shard request cache is to cache aggregations, suggester results, and numbers of hits (it will not cache the returned documents and thus only works with size=0). When your queries use aggregations or suggestions, it may be a good idea to enable this cache (it is disabled by default) so that Elasticsearch can re-use the data stored there. The best thing about the cache is that it promises the same near real-time search as a search that is not cached. You can read more about caches and the shard request cache in particular in Chapter 9, Elasticsearch Cluster in Detail.
Think about the queries
This is the most general advice we can actually give: you should always think about optimal query structure, filter usage, and so on. For example, let's look at the following query:
{
  "query": {
    "bool": {
      "must": [
        {
          "query_string": {
            "query": "mastering AND department:it AND category:book",
            "default_field": "name"
          }
        },
        {
          "term": {
            "tag": "popular"
          }
        },
        {
          "term": {
            "tag": "2014"
          }
        }
      ]
    }
  }
}
It returns the book matching a few conditions. However, there are a few things we can improve in the preceding query. For example, we can move the static things such as the tag, department, and category field related conditions to the filter section of the Boolean query, so that the next time we use some parts of the query we save CPU cycles and re-use the information stored in cache. That static filtering information is also not relevant when it comes to scoring. Because of that we can move those static elements to the filter section and omit scoring calculation for them. For example, this is what the optimized query will look like:
{
  "query": {
    "bool": {
      "must": [
        {
          "query_string": {
            "query": "mastering",
            "default_field": "name"
          }
        }
      ],
      "filter": [
        {
          "term": {
            "tag": "popular"
          }
        },
        {
          "term": {
            "tag": "2014"
          }
        },
        {
          "term": {
            "department": "it"
          }
        },
        {
          "term": {
            "category": "book"
          }
        }
      ]
    }
  }
}
As you can see, there are a few things that we did. We still used the bool query, but we introduced the use of the filter section. We used filtering for the static, non-analyzed fields. This allows us to easily re-use the filters in the next queries that we execute. Because of such query restructuring, we were able to simplify the main query. This is exactly what you should be doing when optimizing your queries or designing them: have optimization and performance in mind and try to keep them as optimal as they can be. This will result in faster execution of the queries, lower resource consumption, and better health of the whole Elasticsearch cluster.
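When queries are generated by application code, the same separation can be kept in one place. The Python sketch below builds the optimized request body from a full-text part and a list of static conditions; the function name and its parameters are ours, while the resulting structure matches the bool query with must and filter sections shown above.

```python
def filtered_bool_query(text, default_field, term_filters):
    """Build a bool query: scored full-text part plus non-scored term filters.

    term_filters is a list of (field, value) pairs; each one becomes a
    term query in the filter section, where no score is calculated.
    """
    return {
        "query": {
            "bool": {
                "must": [
                    {"query_string": {"query": text, "default_field": default_field}}
                ],
                "filter": [{"term": {field: value}} for field, value in term_filters],
            }
        }
    }


if __name__ == "__main__":
    import json

    body = filtered_bool_query(
        "mastering",
        "name",
        [("tag", "popular"), ("tag", "2014"), ("department", "it"), ("category", "book")],
    )
    print(json.dumps(body, indent=2))
```

Keeping the static conditions in a list makes it harder to slip them back into the scored part of the query by accident.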
Parallelize your queries
One thing that is usually forgotten is the need to parallelize queries. Imagine that you have a dozen nodes in your cluster but your index is built of a single shard. If the index is large, your queries will perform worse than you expect. Of course you can increase the number of replicas, but that won't help. A single query will still go to a single shard in that index, because replicas are nothing more than copies of the primary shard and they contain the same data (or at least they should). This is true not only for indices having one shard: if you have more than one shard but they are very large, you can still have performance-related problems. It is said that the query is only as fast as the slowest partial query response.
Of course, the parallelization also depends on the use case. If you run a lot of queries against Elasticsearch, you may not need to parallelize the queries, especially when the shards are small enough and you don't see problems at the shard level. In general, look at your Elasticsearch nodes and see if they have unused CPU cores and, if that's the case, you may have room for improvement and parallelization.
Field data cache and breaking the circuit
We have two different factors we can tune to be sure that we don't run into out-of-memory errors. First of all, we can limit the size of the field data cache. The second thing is the circuit breaker, which we can easily configure to just throw an exception instead of loading too much data. Combining these two things will ensure that we don't run into memory issues. Even if you are using doc values a lot, you may still run into out-of-memory issues, for example, with analyzed fields, which can't use doc values and will use the field data cache. So configure the field data cache and circuit breakers correctly. You can read more about how to configure them in Chapter 9, Elasticsearch Cluster in Detail.
Keep size and shard size under control
When dealing with some of the queries that use aggregations, we have the possibility of using two properties: size and shard_size. The size parameter defines how many buckets should be returned by the final aggregation results; the node that aggregates the final results will get the top buckets from each shard that returns the result and will only return the top size of them to the client. The shard_size parameter tells Elasticsearch about the same but at the shard level. Increasing the value of the shard_size parameter will lead to more accurate aggregations (like in the case of significant terms aggregation) at the cost of network traffic and memory usage. Lowering that parameter will cause aggregation results to be less precise, but we will benefit from lower memory consumption and lower network traffic. If we see that the memory usage is too large, we can lower the size and shard_size properties for problematic queries and see if the quality of the results is still acceptable.
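To make the two properties concrete, here is a minimal Python sketch that builds the request body of a terms aggregation with both size and shard_size. The field name `tags`, the aggregation name `top_values`, and the helper itself are hypothetical. Rejecting a shard_size smaller than size is a choice made for this sketch; it is not a description of what Elasticsearch itself does with such a request.

```python
import json


def terms_aggregation(field, size, shard_size=None):
    """Build a terms aggregation body; shard_size is optional."""
    if shard_size is not None and shard_size < size:
        # Sketch-level guard: asking each shard for fewer buckets than the
        # final result needs makes little sense.
        raise ValueError("shard_size should not be smaller than size")
    terms = {"field": field, "size": size}
    if shard_size is not None:
        terms["shard_size"] = shard_size
    return {"aggs": {"top_values": {"terms": terms}}}


# Ask every shard for its top 50 buckets, return the top 10 to the client.
body = terms_aggregation("tags", size=10, shard_size=50)
print(json.dumps(body, indent=2))
```

Lowering the second argument towards the first trades accuracy for lower memory usage and network traffic, as described above.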
Monitoring
Elasticsearch monitoring APIs expose a lot of information, both about the search engine itself and about the environment, such as the operating system. We saw that in Chapter 10, Administrating Your Cluster. Because of this and the ease of retrieving this information, numerous applications were built that allow us to do monitoring and beyond. Some of these applications are simple and just read the data in real time without any persistent storage, while others allow us to read historical data about our cluster behavior. In this chapter, we will only scratch the surface of what is available about such applications, but we strongly advise you to get familiar with some of them as they can make your everyday work with Elasticsearch easier.
We chose three examples of monitoring solutions, each of which integrates with Elasticsearch differently. The first two tools are available as Elasticsearch plugins and the third takes a different approach to integration.
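All of these tools ultimately read the same monitoring APIs. As a hedged illustration of the simplest possible check, the following Python sketch inspects a response shaped like the cluster health API output and reports anything that deserves attention. The `health_warnings` helper is hypothetical and the sample response is made up for this example; only the `status` and `unassigned_shards` field names are taken from the cluster health API.

```python
def health_warnings(health):
    """Return human-readable warnings for a cluster health response."""
    warnings = []
    status = health.get("status")
    if status != "green":
        warnings.append("cluster status is %s" % status)
    unassigned = health.get("unassigned_shards", 0)
    if unassigned > 0:
        warnings.append("%d unassigned shards" % unassigned)
    return warnings


# Made-up response, shaped like the cluster health API output.
sample = {
    "cluster_name": "example",
    "status": "yellow",
    "number_of_nodes": 1,
    "unassigned_shards": 5,
}
print(health_warnings(sample))
```

A check like this only sees the present moment, which is exactly the limitation of tools that do not store the fetched data.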
Elasticsearch HQ
This tool is available as an Elasticsearch plugin but can also be downloaded separately as a JavaScript application run in a browser.
Elasticsearch HQ uses JavaScript and AJAX techniques where data is fetched periodically from the cluster, prepared for visualization on the browser side, and shown to the user.
The tool allows us to track statistics on a particular node. The browser can present vital information about the cluster and particular nodes. The following screenshot shows the graphical user interface from Elasticsearch HQ:
We have the basic information about the cluster, the number of nodes, and Elasticsearch health. We can also see which node we are looking at and some statistics about the node, which include the memory usage (both heap and non-heap), the number of threads, Java virtual machine garbage collector work, and so on. The plugin also presents simplified information about schema and shards and allows execution of simple queries.
In order to install Elasticsearch HQ, one should just run the following command:
bin/plugin install royrusso/elasticsearch-HQ
After that, the GUI will be available at http://localhost:9200/_plugin/hq/.
One thing to remember is that Elasticsearch HQ doesn't persist the fetched data anywhere, so the data is only fetched when your browser is running and has Elasticsearch HQ opened. If something has happened in the past, you won't be able to diagnose it.
Marvel
Marvel is the tool created by the Elasticsearch team. In the current version, it is built as a plugin for a visualization platform called Kibana (https://www.elastic.co/products/kibana).
Note
Kibana is out of the scope of this book. You can find more about Kibana on the official product page available at https://www.elastic.co/.
Marvel also visualizes basic information about clusters and nodes by drawing nice graphs that are dynamically updated over time. The main difference from Elasticsearch HQ is that the performance data is stored on the server side (in the same or external Elasticsearch cluster), so historical data is available. The example screenshot is presented next:
The installation procedure for Marvel contains three steps. The first two install the license and marvel-agent plugins in Elasticsearch:
bin/plugin install license
bin/plugin install marvel-agent
And finally, the third step is to install the Marvel plugin in Kibana by running the following command:
bin/kibana plugin --install elasticsearch/marvel/latest
SPM for Elasticsearch
This tool presents a different approach than the previously mentioned tools. SPM is a Software as a Service (SaaS) solution created for monitoring Elasticsearch installations of any size and allows monitoring several clusters and different technologies. Though its roots are SaaS-based, it is also available on premises, which means that you can run SPM on your own machines without the need for sending your metrics to the cloud.
Information is sent by simple client software installed on the Elasticsearch machine to the SPM servers. The main advantage is the possibility of storing information for a wider range of time and seeing what was happening in the past. You can create your own dashboards and correlate metrics with logs between multiple applications (SPM allows you to monitor a wide variety of applications).
The following screenshot shows the dashboard of SPM for Elasticsearch:
The overview dashboard shown in the preceding screenshot provides information about the cluster nodes, the request rate and latency, the number of documents in the indices, CPU usage, load, memory details, Java virtual machine memory, the disk space usage, and finally network traffic. You can get detailed information about each of these elements by going into the tab dedicated to it.
You can find additional information about SPM installation and available options at http://sematext.com/spm/index.html.
Summary
In this chapter, we focused on scaling and tuning Elasticsearch. We started with the hardware preparations and decisions we need to make. Next, we tuned a single Elasticsearch node as much as we could and after that we configured the whole cluster to work as well as it could. We discussed vertical expansion possibilities and we learned how to monitor our cluster once it hits the production environment.
So now we have reached the end of the book. We hope that it was a nice reading experience and that you found the book interesting. Since the previous edition of the book, Elasticsearch has changed a lot, not only when it comes to versions, but also when it comes to functionalities. Some of the features are no longer there, some of them were moved to plugins, and of course new features were added. We really hope that you have learned something from this book and now you will find it easier to use Elasticsearch every day, no matter if you are a beginner in this world or a semi-experienced Elasticsearch user. As the authors of this book, but also as Elasticsearch users ourselves, we tried to bring you, our readers, the best reading experience we could. Of course Elasticsearch is more than we described in the book, especially when it comes to monitoring and administration capabilities and API. However, the number of pages is limited and if we were to describe everything in great detail we would have ended up with a book one thousand pages long. We need to remember that Elasticsearch is not only user friendly but also provides a large number of configuration options, querying possibilities, and so on. Due to that, we had to choose which functionalities to describe in greater detail, which had to be only mentioned, and which had to be totally skipped. As with the two previous editions of the book you are holding, we hope that we made the right choice and that you are happy about what you've read.
We would also like to say that it is worth remembering that Elasticsearch is constantly evolving. When writing this book, we went through a few stable versions, finally making it to the release of Elasticsearch 2.2. Even back then we knew that new features and improvements were coming, like some of the changes mentioned in the book that will be part of the next release, or at least they are planned to be. Be sure to check the official documentation of Elasticsearch periodically for the release notes for new versions of Elasticsearch, if you want to be up to date with the new features being added. We will also be writing about new features that we think are worth mentioning on www.elasticsearchserverbook.com. So if you are interested, visit the site from time to time.
Once again, thank you for the time you've spent with the book.
Index
A
advices, for high query rate scenarios
    about / Advice for high query rate scenarios
    shard request cache / Shard request cache
    queries / Think about the queries
    queries, parallelizing / Parallelize your queries
    field data cache / Field data cache and breaking the circuit
    circuit, breaking / Field data cache and breaking the circuit
    size, controlling / Keep size and shard size under control
    shard size, controlling / Keep size and shard size under control
aggregation engine
    working / Inside the aggregations engine
aggregations
    about / Aggregations
    general query structure / General query structure
    types / Aggregation types
    date_histogram / Date histogram aggregation
    geo distance aggregations / Geo distance aggregations
    geohash grid aggregation / Geohash grid aggregation
    global aggregation / Global aggregation
    significant_terms aggregation / Significant terms aggregation
    sampler aggregation / Sampler aggregation
    children aggregation / Children aggregation
    nested aggregation / Nested aggregation
    reverse_nested aggregation / Reverse nested aggregation
    nesting aggregations / Nesting aggregations and ordering buckets
aggregations, types
    metrics / Aggregation types, Metrics aggregations
    buckets / Aggregation types, Buckets aggregations
    pipeline / Aggregation types
Amazon
    URL / Cost and performance flexibility
Amazon S3
    URL / Creating a snapshot repository
Analyze API
    URL / Defining your own analyzers
analyzers
    using / Using analyzers
    out-of-the-box analyzers / Out-of-the-box analyzers
    defining / Defining your own analyzers
    default analyzers / Default analyzers
Apache Lucene / Getting back to Apache Lucene
    URL / Full text searching
    glossary / The Lucene glossary and architecture
    architecture / The Lucene glossary and architecture
    document / The Lucene glossary and architecture
    field / The Lucene glossary and architecture
    term / The Lucene glossary and architecture
    token / The Lucene glossary and architecture
    tokenizer / Input data analysis
    scoring / Introduction to Apache Lucene scoring
Apache Lucene Javadocs for the TFIDF
    URL / Scoring and query relevance
Apache Lucene scoring
    about / Introduction to Apache Lucene scoring
    document matching, factors / When a document is matched
    default scoring formula / Default scoring formula
    relevant documents / Relevancy matters
Apache Solr
    URL / Using Apache Solr synonyms
Apache Solr synonyms
    using / Using Apache Solr synonyms
    explicit synonyms / Explicit synonyms
    equivalent synonyms / Equivalent synonyms
    expand property / Expanding synonyms
Apache Tika
    URL / Detecting the language of the document
arbitrary geo shapes
    about / Arbitrary geo shapes
    point / Point
    envelope / Envelope
    polygon / Polygon
    multipolygon / Multipolygon
    example usage / An example usage
    storing, in index / Storing shapes in the index
arguments, Cat API
    URL / Common arguments
attributes, index structure mapping
    index_name / Common attributes
    index / Common attributes
    store / Common attributes
    doc_values / Common attributes
    boost / Common attributes
    null_value / Common attributes
    copy_to / Common attributes
    include_in_all / Common attributes
    precision_step / Number, Date
    coerce / Number
    ignore_malformed / Number, Date
    format / Date
    format, reference link / Date
    numeric_resolution / Date
available objects, script execution
    _doc / Objects available during script execution
    _source / Objects available during script execution
    _fields / Objects available during script execution
Azure
    URL / Creating a snapshot repository
B
basic queries
    about / Basic queries
    term query / The term query
    terms query / The terms query
    match all query / The match all query
    type query / The type query
    exists query / The exists query
    missing query / The missing query
    common terms query / The common terms query
    match query / The match query
    multi match query / The multi match query
    query string query / The query string query
    simple query string query / The simple query string query
    identifiers query / The identifiers query
    prefix query / The prefix query
    fuzzy query / The fuzzy query
    wildcard query / The wildcard query
    range query / The range query
    regular expression query / Regular expression query
    more like this query / The more like this query
batch indexing
    used, for speeding up indexing process / Batch indexing to speed up your indexing process
Boolean properties set
    node.master / Configuring node roles
    node.data / Configuring node roles
    node.client / Configuring node roles
bool query
    about / The bool query
    should section / The bool query
    must section / The bool query
    must_not section / The bool query
    filter parameter / The bool query
    boost parameter / The bool query
    minimum_should_match parameter / The bool query
    disable_coord parameter / The bool query
    used, for explicit filtering / Explicit filtering with bool query
boosting query / The boosting query
boost_mode parameter
    multiply value / Structure of the function query
    replace value / Structure of the function query
    sum value / Structure of the function query
    avg value / Structure of the function query
    max value / Structure of the function query
    min value / Structure of the function query
bucket aggregations
    ordering / Nesting aggregations and ordering buckets, Buckets ordering
buckets / General query structure
buckets aggregations
    about / Buckets aggregations
    filter aggregation / Filter aggregation
    filters aggregation / Filters aggregation
    terms aggregation / Terms aggregation
    range aggregation / Range aggregation
    date_range aggregation / Date range aggregation
    ip_range aggregation / IPv4 range aggregation
    missing aggregation / Missing aggregation
    histogram aggregation / Histogram aggregation
bulk indexing
    data, preparing / Preparing data for bulk indexing
C
caches
    about / Elasticsearch caches
    field data cache / Field data cache
    field data, using with doc values / Field data and doc values
    shard request cache / Shard request cache
    node query cache / Node query cache
    indexing buffers / Indexing buffers
    avoiding, scenarios / When caches should be avoided
Cat API
    about / The Cat API
    defining / The basics
    using / Using Cat API
    common arguments / Common arguments
    examples / The examples, Getting information about the nodes
children aggregation
    about / Children aggregation
CIDR notation
    URL / IPv4 range aggregation
Class DateTimeFormat
    URL / Tuning the type determining mechanism for dates
client node
    about / Node roles, Client node
cluster
    about / Nodes and clusters
    installing / Installing and configuring your cluster
    configuring / Installing and configuring your cluster
    directory layout / The directory layout
    system-specific installation and configuration / The system-specific installation and configuration
cluster health API
    about / Cluster health API
    information details, controlling / Controlling information details
    additional parameters / Additional parameters
cluster rebalancing
    controlling / Controlling cluster rebalancing
    defining / Understanding rebalance
    implementing / Cluster being ready
    settings / The cluster rebalance settings, Controlling the number of shards being moved between nodes concurrently
cluster settings API / The cluster settings API
cluster wide allocation
    about / Cluster-wide allocation
    allocation awareness / Allocation awareness
    allocation awareness, forcing / Forcing allocation awareness
    filtering / Filtering
CMS system
    URL / Creating a new document
common terms query / The common terms query
completion suggester
    about / Completion suggester
    in Elasticsearch 2.2 / Completion suggester
completion suggester, Elasticsearch 2.1
    data, indexing / Indexing data
    indexed data, querying / Querying indexed completion suggester data
    custom weights / Custom weights
completion suggester, Elasticsearch 2.2
    about / Completion suggester
compound queries
    about / Compound queries
    bool query / The bool query
    dis_max query / The dis_max query
    boosting query / The boosting query
    constant_score query / The constant_score query
    indices query / The indices query
compressed oops
    URL / The memory
compressed ordinary object pointers
    reference link / Multiple Elasticsearch instances on a single physical machine
configuration options, phrase suggester
    max_errors / Configuration
    separator / Configuration
configuration options, term suggester
    text / Term suggester configuration options
    field / Term suggester configuration options
    analyzer / Term suggester configuration options
    size / Term suggester configuration options
    suggest_mode / Term suggester configuration options
    sort / Term suggester configuration options
constant_score query / The constant_score query
content
    searching, in different languages / Searching content in different languages
content, searching in different languages
    about / Searching content in different languages
    languages, handling / Handling languages differently
    multiple languages, handling / Handling multiple languages
    document language, detecting / Detecting the language of the document
    sample document / Sample document
    mappings / The mappings
    data, querying / Querying
    queries, combining / Combining queries
context suggester
    about / Context suggester
    types / Context types
    using / Using context
    geo location context, using / Using the geo location context
context switches
    reference link / Thread pools tuning
core types, index structure mapping
    about / Core types
    common attributes / Common attributes
    string / String
    number / Number
    boolean / Boolean
    binary / Binary
    date / Date
counttoit field / Adding partial documents
create, retrieve, update, delete (CRUD)
    URL / Manipulating data with the REST API
cURL command
    URL / Installing Elasticsearch
D
data
    manipulating, with REST API / Manipulating data with the REST API
    storing, in Elasticsearch / Storing data in Elasticsearch
    preparing, for bulk indexing / Preparing data for bulk indexing
    indexing / Indexing the data
    _all field / The _all field
    _source field / The _source field
    internal fields / Additional internal fields
    sorting / Sorting data
    default sorting / Default sorting
    querying, in child documents / Querying data in the child documents
    querying, in parent documents / Querying data in the parent documents
data, that is not flat
    indexing / Indexing data that is not flat
    data / Data
    objects / Objects
    arrays / Arrays
    mappings / Mappings
    dynamic behavior / To be or not to be dynamic
    object indexing, disabling / Disabling object indexing
data node
    about / Node roles
data querying, cases
    identified language, using / Queries with an identified language
    unknown language, using / Queries with an unknown language
data sets
    foreground sets / Choosing significant terms
    background sets / Choosing significant terms
data sorting
    about / Sorting data
    default sorting / Default sorting
    fields, selecting / Selecting fields used for sorting
    mode / Sorting mode
    behavior for missing fields, specifying / Specifying behavior for missing fields
    dynamic criteria / Dynamic criteria
    scoring, calculating / Calculate scoring when sorting
date_histogram aggregations
    about / Date histogram aggregation
    time zones / Time zones
DEB package
    used, for installing Elasticsearch / Installing Elasticsearch using the DEB package
default indexing / Default indexing
derivative aggregation
    URL / Derivative aggregation
designated nodes roles for larger clusters
    about / Designated node roles for larger clusters
    query aggregator nodes / Query aggregator nodes
    data nodes / Data nodes
    master eligible nodes / Master eligible nodes
DigitalOcean
    URL / Cost and performance flexibility
directory layout, cluster
    bin / The directory layout
    config / The directory layout
    lib / The directory layout
    modules / The directory layout
    data / The directory layout
    logs / The directory layout
    plugins / The directory layout
    work / The directory layout
disk-based shard allocation
    about / Disk-based shard allocation
    configuring / Configuring disk based shard allocation
    disabling / Disabling disk based shard allocation
dis_max query / The dis_max query
Docker
    reference link / Multiple Elasticsearch instances on a single physical machine
document
    about / Document
    creating / Creating a new document
    automatic identifier creation, creating / Automatic identifier creation
    retrieving / Retrieving documents
    updating / Updating documents
    non-existing documents, dealing with / Dealing with non-existing documents
    partial documents, adding / Adding partial documents
    deleting / Deleting documents
document type / Document type
double type
    URL / Number
dynamic templates
    about / Templates and dynamic templates, Dynamic templates
    matching pattern / The matching pattern
    target field definition, writing / Field definitions
E
Elasticsearch
    about / The basics of Elasticsearch
    key concepts / Key concepts of Elasticsearch
    index / Index
    document / Document
    document type / Document type
    mapping / Mapping
    indexing / Indexing and searching, Elasticsearch indexing
    searching / Indexing and searching
    URL / Installing Elasticsearch, Available similarity models
    installing / Installing Elasticsearch
    running / Running Elasticsearch
    shutting down / Shutting down Elasticsearch
    configuring / Configuring Elasticsearch
    installing, with RPM package / Installing Elasticsearch using RPM packages
    installing, with DEB package / Installing Elasticsearch using the DEB package
    configuration files, localization / Elasticsearch configuration file localization
    querying / Querying Elasticsearch, A simple query
    example data / The example data
    paging / Paging and result size
    result size, controlling / Paging and result size
    version value, returning / Returning the version value
    score, limiting / Limiting the score
    return fields, selecting / Choosing the fields that we want to return
    source filtering / Source filtering
    script fields, using / Using the script fields
    parameters, passing to script fields / Passing parameters to the script fields
    scripting capabilities / Scripting capabilities of Elasticsearch
    spatial capabilities / Elasticsearch spatial capabilities
    reference documentation, URL / Configuration
    plugins / Elasticsearch plugins
    caches / Elasticsearch caches
    hardware preparations / Hardware
    monitoring / Monitoring
    Kibana, URL / Marvel
Elasticsearch 2.1
    URL / Thread pools
Elasticsearch 2.2
    completion suggester / Completion suggester
Elasticsearch cluster
    preparing, for high indexing / Preparing the cluster for high indexing and querying throughput
    preparing, for high querying / Preparing the cluster for high indexing and querying throughput
Elasticsearch HQ tool
    using / Elasticsearch HQ
Elasticsearch indexing
    about / Elasticsearch indexing
    shards / Shards and replicas
    replicas / Shards and replicas
    indices, creating / Creating indices
Elasticsearch infrastructure
    key concepts / Key concepts of the Elasticsearch infrastructure
    node / Nodes and clusters
    cluster / Nodes and clusters
    shard / Shards
    replica / Replicas
    gateway / Gateway
Elasticsearch monitoring
    about / Monitoring
    Elasticsearch HQ tool, using / Elasticsearch HQ
    Marvel tool, using / Marvel
    SPM tool, using / SPM for Elasticsearch
Elasticsearch time machine
    about / Elasticsearch time machine
    snapshot repository, creating / Creating a snapshot repository
    snapshots, creating / Creating snapshots
    snapshot, restoring / Restoring a snapshot
    parameters / Restoring a snapshot
    old snapshots, deleting / Cleaning up – deleting old snapshots
exists query / The exists query
Explain API
    URL / Explaining the query
explain information
    about / Understanding the explain information
    field analysis / Understanding field analysis
    query, explaining / Explaining the query
F
factors, for score property calculation
    document boost / When a document is matched
    field boost / When a document is matched
    coord / When a document is matched
    inverse document frequency / When a document is matched
    length norm / When a document is matched
    term frequency / When a document is matched
    query norm / When a document is matched
Fast Vector Highlighter
    URL / Under the hood
Fedora Linux
    URL / Installing Elasticsearch using RPM packages
field data cache
    about / Field data cache
    size, controlling / Field data size
    circuit breakers / Circuit breakers
field definition variables, dynamic templates
    {name} / Field definitions
    {dynamic_type} / Field definitions
filtering
    about / Filtering
    include / What do include, exclude, and require mean
    require / What do include, exclude, and require mean
    exclude / What do include, exclude, and require mean
filters
    lowercase filter / Input data analysis
    synonyms filter / Input data analysis
    language stemming filters / Input data analysis
filters and tokenizers
    URL / Defining your own analyzers
filter types
    URL / Defining your own analyzers
full text searching
    about / Full text searching
    Apache Lucene, glossary / The Lucene glossary and architecture
    Apache Lucene, architecture / The Lucene glossary and architecture
    input data analysis / Input data analysis
    indexing / Indexing and querying
    querying / Indexing and querying
    scoring / Scoring and query relevance
    query relevance / Scoring and query relevance
function score query
    about / The function score query
    structure / Structure of the function query
    weight factor function / The weight factor function
    field_value_factor function / Field value factor function
    script_score function / The script score function
    random_score function / The random score function
    decay functions / Decay functions
function_score query
    URL / Decay functions
fuzzy query
    about / The fuzzy query
G
gateway / Gateway
gateway module
    about / The gateway and recovery modules, The gateway
gateway recovery options
    gateway.recover_after_master_nodes / Additional gateway recovery options
    gateway.recover_after_data_nodes / Additional gateway recovery options
    gateway.expected_master_nodes / Additional gateway recovery options
    gateway.expected_data_nodes / Additional gateway recovery options
general preparations, single Elasticsearch node
    about / The general preparations
    swapping, avoiding / Avoiding swapping
    file descriptors / File descriptors
    virtual memory / Virtual memory, The memory
Geo / Geo bounds aggregation
geo distance aggregations
    about / Geo distance aggregations
Geohash
    URL / Geohash grid aggregation
geohash grid aggregation
    about / Geohash grid aggregation
    URL / Geohash grid aggregation
Geohash value
    URL / Example data
GeoJSON
    URL / Arbitrary geo shapes
geospatial queries
    URL / Sample queries
geo_field properties
    geohash / Additional geo_field properties
    geohash_precision / Additional geo_field properties
    geohash_prefix / Additional geo_field properties
    ignore_malformed / Additional geo_field properties
    lat_lon / Additional geo_field properties
    precision_step / Additional geo_field properties
GitHub
    URL / Installing plugins
    automatic store throttling, URL / Automatic store throttling
Github
    issue, URL / String
global aggregation
    about / Global aggregation
Groovy
    URL / Scripting capabilities of Elasticsearch
Hhardwarepreparations,forrunningElasticsearch
about/Hardwarephysicalservers/Physicalserversoracloudcloud/PhysicalserversoracloudCPU/CPURAMmemory/RAMmemorymassstorage/Massstoragenetwork/Thenetworkserverscounting/Howmanyserverscostcutting/Costcutting
HDFSURL/Creatingasnapshotrepository
highlightedfragmentscontrolling/Controllinghighlightedfragments
highlightertypeselecting/Forcinghighlightertype
highlightingabout/Highlightingusing/Gettingstartedwithhighlightingfieldconfiguration/FieldconfigurationApacheLucene,using/Underthehoodhighlightertype,selecting/ForcinghighlightertypeHTMLtags,,configuring/ConfiguringHTMLtagsglobalsettings/Globalandlocalsettingslocalsettings/Globalandlocalsettingsmatchingneed/Requirematchingcustomquery/CustomhighlightingqueryPostingshighlighter/ThePostingshighlighter,Validatingyourqueries
horizontal expansion
    about / Horizontal expansion
    replicas, automatic creation / Automatically creating the replicas
    redundancy / Redundancy and high availability
    high availability / Redundancy and high availability
    reference links / Redundancy and high availability
    cost and performance flexibility / Cost and performance flexibility
    continuous upgrades / Continuous upgrades
    multiple Elasticsearch instances, on single physical machine / Multiple Elasticsearch instances on a single physical machine
    designated node roles for larger clusters / Designated node roles for larger clusters
how similar phrase / Understanding the explain information
HTTP module
    properties, URL / HTTP host
HTTP protocol
    URL / Understanding the REST API
HTTP transport settings, adjusting
    node / Adjusting HTTP transport settings
    HTTP, disabling / Disabling HTTP
    HTTP port / HTTP port
    HTTP host / HTTP host
HyperLogLog++algorithmURL/Fieldcardinality
I
identifiers query
    about / The identifiers query
index
    segments / The Lucene glossary and architecture
    about / Index
index-timeboostingusing/Whendoesindex-timeboostingmakesense?defining,inmappings/Definingboostinginthemappings
indexaliasabout/Indexaliasingandusingittosimplifyyoureverydayworkdefining/Analiascreating/Creatinganaliasmodifying/Modifyingaliasescommands,combining/Combiningcommandsretrieving/Retrievingaliasesremoving/Removingaliasesfiltering/Filteringaliasesandrouting/Aliasesandroutingandzerodowntimereindexing/Zerodowntimereindexingandaliases
indexation/Inputdataanalysisindexingprocess
speedingup,batchindexingused/Batchindexingtospeedupyourindexingprocess
indexingrelatedadvicesabout/Indexingrelatedadviceindexrefreshrate/Indexrefreshratethreadpools,tuning/Threadpoolstuningautomaticstorethrottling/Automaticstorethrottlingtime-baseddata,handling/Handlingtime-baseddatamultipledatapaths/Multipledatapathsdatadistribution/Datadistributionbulkindexing/BulkindexingRAMbuffer,usedforindexing/RAMbufferforindexing
indexrefreshratereferencelink/Indexrefreshrate
indexstructuremodifying,withupdateAPI/ModifyingyourindexstructurewiththeupdateAPI
indexstructure,modifyingmappings/Themappingsnewfield,adding/Addinganewfieldtotheexistingindexexistingindexfields,modifying/Modifyingfieldsofanexistingindex
indexstructure,parent-childrelationshipabout/Indexstructureanddataindexingchildmappings/Childmappingsparentmappings/Parentmappingsparentdocument/Theparentdocumentchildrendocuments/Childdocuments
indexstructuremappingabout/Indexstructuremappingtypes/Typeandtypesdefinitiontypesdefinition/Typeandtypesdefinitionfields/Fieldscoretypes/Coretypesmultifields/MultifieldsIPaddresstype/TheIPaddresstypetokencounttype/Tokencounttype
indices,Elasticsearchindexingcreating/Creatingindicesautomaticcreation,altering/Alteringautomaticindexcreationnewlycreatedindex,settings/Settingsforanewlycreatedindexdeleting/Indexdeletion
indicesanalyzeAPIURL/Queryanalysis
indicesquery/TheindicesqueryindicessettingsAPI/TheindicessettingsAPIindicesstatsAPI
about/IndicesstatsAPIdocs/Docsstore/Storeindexing/Indexing,get,andsearchget/Indexing,get,andsearchsearch/Indexing,get,andsearchdefining/Additionalinformation
internalfields_id/Additionalinternalfields_uid/Additionalinternalfields_type/Additionalinternalfields_field_names/Additionalinternalfields
invertedindexabout/TheLuceneglossaryandarchitectureURL/Index
J
Java
    URL / Full text searching
    installing / Installing Java
JavaScript Object Notation (JSON)
    URL / Running Elasticsearch
Java threads
    URL / Thread pools
Java types
    URL / Number
Java Version 7
    URL / Installing Java
Java Virtual Machine (JVM) / Configuring Elasticsearch
JMeter
    URL / When caches should be avoided
Joda Time library
    URL / Date range aggregation
JSON
    URL / Document
K
Kibana
    URL / Marvel
L
language analyzer
    URL / Out-of-the-box analyzers
language analyzers
    URL / Sample document
language detection
    URL / Detecting the language of the document
Levenshtein algorithm
    URL / The Boolean match query
Linux
    Elasticsearch, installing / Installing Elasticsearch on Linux
    Elasticsearch, configuring as system service / Configuring Elasticsearch as a system service on Linux
Logstash
    URL / Index aliasing and using it to simplify your everyday work
Lucene Javadocs
    URL / Default scoring formula
Lucene query syntax
    about / Lucene query syntax
M
mapping / Mapping
mappings
    configuration / Mappings configuration
    type determining mechanism / Type determining mechanism
    index structure mapping / Index structure mapping
    analyzers, using / Using analyzers
    similarity models / Different similarity models
    about / Mappings
    final mappings / Final mappings
    sending, to Elasticsearch / Sending the mappings to Elasticsearch
    new field, adding to existing index / Adding a new field to the existing index
    field of existing index, modifying / Modifying fields of an existing index
Marveltoolusing/Marvel
masternodeabout/Noderoles,Masternode
matchallquery/Thematchallquerymatchingpattern,dynamictemplates
match/Thematchingpatternunmatch/Thematchingpattern
matchqueryabout/ThematchqueryBooleanmatchquery/TheBooleanmatchqueryphrasematchquery/Thephrasematchquerymatchphraseprefixquery/Thematchphraseprefixquery
MavenURL/Installingplugins
MavenCentralURL/Installingplugins
MavenSonatypeURL/Installingplugins
mergepolicyabout/Themergepolicyproperties/Themergepolicy
mergescheduler/Themergeschedulermetricsaggregations
about/Metricsaggregationsmin/Minimum,maximum,average,andsummax/Minimum,maximum,average,andsumavg/Minimum,maximum,average,andsumsum/Minimum,maximum,average,andsummissingvalues/Missingvalues
scripts,using/Usingscriptsfieldvaluestatistics/Fieldvaluestatisticsandextendedstatisticsextended_statistics/Fieldvaluestatisticsandextendedstatisticsvalue_countaggregation/Valuecountfieldcardinalityaggregation/Fieldcardinalitypercentilesaggregation/Percentilespercentile_ranksaggregation/Percentilerankstop_hitsaggregation/Tophitsaggregationtop_hitsaggregation,additionalparameters/Additionalparametersgeo_boundsaggregation/Geoboundsaggregationscriptedmetricsaggregation/Scriptedmetricsaggregation
MicrosoftWindowsplatformfilehandles,URL/ConfiguringElasticsearch
minimum_should_matchparameterURL/Theboolquery
missingquery/Themissingquerymorelikethisquery
about/Themorelikethisquerymovingaveragescalculation
URL/Pipelineaggregationsmoving_avgaggregation
URL/Movingavgaggregationabout/Movingavgaggregationfuturebuckets,predicting/Predictingfuturebucketsmodels/Themodelsmodels,URL/Themodels
multimatchquery/ThemultimatchquerymultipleElasticsearchinstances,onsinglephysicalmachine
about/MultipleElasticsearchinstancesonasinglephysicalmachineshard,preventingonsamenode/Preventingashardanditsreplicasfrombeingonthesamenodereplicas,preventingonsamenode/Preventingashardanditsreplicasfrombeingonthesamenode
multipleindicesURL/URIsearch
multiterm/Queryrewritemultivaluedfield/DocumentMustache
URL/ScriptingcapabilitiesofElasticsearch
N
native code, using
    factory implementation / The factory implementation
    native script implementation / Implementing the native script
    plugin definition / The plugin definition
    plugin, installing / Installing the plugin
    script, running / Running the script
nestedaggregationabout/Nestedaggregation
nestedobjectsusing/UsingnestedobjectsURL/Usingnestedobjectsnestedqueries/Scoringandnestedqueriesscore_modeproperty,setting/Scoringandnestedqueries
nestingaggregationsabout/Nestingaggregationsandorderingbuckets
network attached storage (NAS) / Mass storage
node / Nodes and clusters
    discovery, about / Understanding node discovery
    discovery types / Understanding node discovery, Discovery types
    roles / Node roles
    cluster name, setting / Setting the cluster's name
    Zen discovery / Zen discovery
    HTTP transport settings, adjusting / Adjusting HTTP transport settings
noderolesmasternode/Noderoles,Masternodedatanode/Noderoles,Datanodeclientnode/Noderoles,Clientnodeconfiguring/Configuringnoderoles
nodesinfoAPIabout/NodesinfoAPIrequisites/NodesinfoAPIextensiveinformation,returning/Returnedinformation
NoSQLURL/ManipulatingdatawiththeRESTAPI
number, index structure mapping
    byte / Number
    short / Number
    integer / Number
    long / Number
    float, URL / Number
    float / Number
    double / Number
    double, URL / Number
O
object indexing
    disabling / Disabling object indexing
official repository
    URL / Installing plugins
OpenJDK
    URL / Installing Java
optimistic locking
    URL / Versioning
options, term suggester
    lowercase_terms / Additional term suggester options
    max_edits / Additional term suggester options
    prefix_len / Additional term suggester options
    min_word_len / Additional term suggester options
    shard_size / Additional term suggester options
out-of-the-boxanalyzersstandard/Out-of-the-boxanalyzerssimple/Out-of-the-boxanalyzerswhitespace/Out-of-the-boxanalyzersstop/Out-of-the-boxanalyzerskeyword/Out-of-the-boxanalyzerspattern/Out-of-the-boxanalyzerslanguage/Out-of-the-boxanalyzerssnowball/Out-of-the-boxanalyzers
P
parameters, Boolean match query
    operator / The Boolean match query
    analyzer / The Boolean match query
    fuzziness / The Boolean match query
    prefix_length / The Boolean match query
    max_expansions / The Boolean match query
    zero_terms_query / The Boolean match query
    cero_terms_query / The Boolean match query
    lenient / The Boolean match query
parameters,fuzzyqueryvalue/Thefuzzyqueryboost/Thefuzzyqueryfuzziness/Thefuzzyqueryprefix_length/Thefuzzyquerymax_expansions/Thefuzzyquery
parameters, more like this query
    fields / The more like this query
    like / The more like this query
    unlike / The more like this query
    min_term_freq / The more like this query
    max_query_terms / The more like this query
    stop_words / The more like this query
    min_doc_freq / The more like this query
    min_word_len / The more like this query
    max_word_len / The more like this query
    boost_terms / The more like this query
    boost / The more like this query
    include / The more like this query
    minimum_should_match / The more like this query
    analyzer / The more like this query
parameters,querystringqueryquery/Thequerystringquerydefault_field/Thequerystringquerydefault_operator/Thequerystringqueryanalyzer/Thequerystringqueryallow_leading_wildcard/Thequerystringquerylowercase_expand_terms/Thequerystringqueryenable_position_increments/Thequerystringqueryfuzzy_max_expansions/Thequerystringqueryfuzzy_prefix_length/Thequerystringqueryphrase_slop/Thequerystringqueryboost/Thequerystringquery
analyze_wildcard/Thequerystringqueryauto_generate_phrase_queries/Thequerystringqueryminimum_should_match/Thequerystringqueryfuzziness/Thequerystringquerymax_determined_states/Thequerystringquerylocale/Thequerystringquerytime_zone/Thequerystringquerylenient/Thequerystringquery
parameters,rangequerygte/Therangequerygt/Therangequerylte/Therangequerylt/Therangequery
parent-childrelationshipusing/Usingtheparent-childrelationshipindexstructure/Indexstructureanddataindexingdataindexing/Indexstructureanddataindexingquerying/Queryingperformanceconsiderations/Performanceconsiderations
parentaggregations/Availabletypespatternanalyzer
URL/Out-of-the-boxanalyzerspercolator
about/Percolatorindex/Theindexpreparing/Percolatorpreparationexploring/Gettingdeeperreturnedresultssize,controlling/Controllingthesizeofreturnedresultsusing,forandscorecalculation/Percolatorandscorecalculationcombining,withotherfunctionalities/Combiningpercolatorswithotherfunctionalitiesmatchingqueriescount,obtaining/Gettingthenumberofmatchingqueriesindexeddocumentspercolation/Indexeddocumentpercolation
phrasematchqueryslop/Thephrasematchqueryanalyzer/Thephrasematchquery
phrasesuggesterabout/Phrasesuggesterconfiguration/Configuration
pipelineaggregationsabout/PipelineaggregationsURL/Pipelineaggregationsparentaggregationfamily/Availabletypessiblingaggregationfamily/Availabletypes
types/Availabletypes,Pipelineaggregationtypesotheraggregations,referencing/Referencingotheraggregationsdata,gaps/Gapsinthedata
pipelineaggregations,typessum_bucket/Min,max,sum,andaveragebucketaggregationsmin_bucket/Min,max,sum,andaveragebucketaggregationsmax_bucket/Min,max,sum,andaveragebucketaggregationsavg_bucket/Min,max,sum,andaveragebucketaggregationscumulative_sumaggregation/Cumulativesumaggregationbucket_selectoraggregation/Bucketselectoraggregationbucket_scriptaggregation/Bucketscriptaggregationserial_diffaggregation/Serialdifferencingaggregationderivativeaggregation/Derivativeaggregationmoving_avgaggregation/Movingavgaggregation
pluginsabout/Elasticsearchpluginsbasics/Thebasicsinstalling/Installingpluginsremoving/Removingplugins
PostingsHighlighterURL/Underthehoodabout/ThePostingshighlighter
prefixquery/Theprefixqueryproperties,faultdetectionpingsettings
discovery.zen.fd.ping_interval/Faultdetectionpingsettingsdiscovery.zen.fd.ping_timeout/Faultdetectionpingsettingsdiscovery.zen.fd.ping_retries/Faultdetectionpingsettings
properties,mergepolicyindex.merge.policy.expunge_deletes_allowed/Themergepolicyindex.merge.policy.max_merge_at_once/Themergepolicyindex.merge.policy.max_merge_at_once_explicit/Themergepolicyindex.merge.policy.max_merged_segment/Themergepolicyindex.merge.policy.segments_per_tier/Themergepolicyindex.merge.policy.reclaim_deletes_weight/Themergepolicy
Q
queries
    selecting, for warming / Choosing queries for warming
query boost
    applying, to document / The boost
query boosts
    used, for influencing scores / Influencing scores with query boosts
    about / The boost
    adding, to queries / The boost, Adding the boost to queries
    score, modifying / Modifying the score
queryingdata,inchilddocuments/Queryingdatainthechilddocumentsdata,inparentdocuments/Queryingdataintheparentdocuments
queryingprocessabout/Understandingthequeryingprocessquerylogic/Querylogicsearchtype,specifying/Searchtypesearchexecutionpreference,specifying/SearchexecutionpreferencesearchshardsAPI,specifying/SearchshardsAPI
queryparserURL/Lucenequerysyntax
queryrewriteabout/Queryrewriteprefixquery,example/PrefixqueryasanexampleApacheLucene,using/GettingbacktoApacheLuceneproperties/Queryrewriteproperties
querystringqueryabout/Thequerystringqueryrunning,againstmultiplefields/Runningthequerystringqueryagainstmultiplefields
R
Rackspace
    URL / Cost and performance flexibility
RAID
    URL / Mass storage
range aggregation
    about / Range aggregation
    keyed buckets / Keyed buckets
range query / The range query
recovery modules
    about / The gateway and recovery modules
recovery process
    about / Recovery control
    gateway recovery options / Additional gateway recovery options
    indices recovery API / Indices recovery API
    delayed allocation / Delayed allocation
    index recovery prioritization / Index recovery prioritization
regularexpressionqueryabout/RegularexpressionqueryURL/Regularexpressionquery
replica/Replicasreplicas,Elasticsearchindexing
about/Shardsandreplicaswriteconsistency,controlling/Writeconsistency
RESTAPIused,fordatamanipulation/ManipulatingdatawiththeRESTAPIabout/UnderstandingtheRESTAPIURL/UnderstandingtheRESTAPIdata,storinginElasticsearch/StoringdatainElasticsearchdocuments,retrieving/Retrievingdocumentsdocuments,updating/Updatingdocumentsdocuments,deleting/Deletingdocumentsversioning/Versioning
resultsfiltering/Filteringyourresultsquerycontext/Thecontextisthekeyexplicitfiltering,boolqueryused/Explicitfilteringwithboolquery
reverse_nestedaggregationabout/Reversenestedaggregation
rewriteproperty,valuesscoring_boolean/Queryrewritepropertiesconstant_score/Queryrewritepropertiesconstant_score_boolean/Queryrewriteproperties
top_terms/Queryrewritepropertiestop_terms_blendedfreqs/Queryrewritepropertiestop_terms_boost_N/Queryrewriteproperties
rightqueryselecting/Choosingtherightqueryusecases/Theusecasesresults,limitingtogiventags/Limitingresultstogiventagsvaluesinrange,searching/Searchingforvaluesinarange
routingabout/Introductiontorouting,Routingdefaultindexing/Defaultindexingdefaultsearching/Defaultsearchingparameters/Theroutingparametersfields/Routingfields
RPMpackageused,forinstallingElasticsearch/InstallingElasticsearchusingRPMpackages
S
sample
    distance-based sorting / Distance-based sorting
    bounding box filtering / Bounding box filtering
    distance, limiting / Limiting the distance
samplequeriesabout/Samplequeries
sampleraggregationabout/Sampleraggregation
scoreabout/IntroductiontoApacheLucenescoringinfluencing,withqueryboosts/Influencingscoreswithqueryboostsmodifying/Modifyingthescore
score,modifyingabout/Modifyingthescoreconstant_scorequery/Constantscorequeryboostingquery/Boostingqueryfunctionscorequery/Thefunctionscorequery
score_modeparameterabout/Structureofthefunctionquerymultiplevalue/Structureofthefunctionquerysumvalue/Structureofthefunctionqueryavgvalue/Structureofthefunctionqueryfirstvalue/Structureofthefunctionquerymaxvalue/Structureofthefunctionqueryminvalue/Structureofthefunctionquery
scriptfieldsselecting/Usingthescriptfieldsparameters,passingto/Passingparameterstothescriptfields
scriptingcapabilitiesabout/ScriptingcapabilitiesofElasticsearchscriptexecution,availableobjects/Objectsavailableduringscriptexecutionscript,types/Scripttypesquerying,scriptsused/Queryingwithscriptsparameters,using/Scriptingwithparameterslanguages,Groovy/Scriptlanguagesotherthanembeddedlanguages,using/Usingotherthanembeddedlanguagesnativecode,using/Usingnativecode
scriptpropertiesscript/Queryingwithscriptsinline/Queryingwithscriptsid/Queryingwithscriptsfile/Queryingwithscripts
lang/Queryingwithscriptsparams/Queryingwithscripts
scripts,scripted_metricaggregationinit_script/Scriptedmetricsaggregationmap_script/Scriptedmetricsaggregationcombine_script/Scriptedmetricsaggregationreduce_script/Scriptedmetricsaggregation
scripttypesabout/Scripttypesinlinescripts/Scripttypes,Inlinescriptsinfilescripts/Infilescriptsindexedscripts/Indexedscripts
ScrollAPIabout/TheScrollAPIproblemdefinition/Problemdefinitionproblemdefinition,solution/Scrollingtotherescue
searching/Defaultsearchingsearchingrequestexecution/Indexingandsearchingsegmentmerging
about/Introductiontosegmentmerging,Segmentmergingneedfor/Theneedforsegmentmergingmergepolicy/Themergepolicymergepolicy,basicproperties/Themergepolicymergescheduler/Themergeschedulerthrottling/Throttling
shardallocationIPaddress,usingfor/UsingtheIPaddressforshardallocationcancelling/Cancelingshardallocationforcing/ForcingshardallocationmultiplecommandsperHTTPrequest/MultiplecommandsperHTTPrequestoperations,allowingonprimaryshards/Allowingoperationsonprimaryshards
shardandreplicaallocationcontrolling/Controllingtheshardandreplicaallocationcontrolling,explicitly/Explicitlycontrollingallocationnodeparameters,specifying/Specifyingnodeparametersconfiguration/Configurationindex,creating/Indexcreationnodes,excluding/Excludingnodesfromallocationnodeattributes,requiring/Requiringnodeattributesnumberofshardsandreplicaspernode/Thenumberofshardsandreplicaspernodeallocationthrottling/Allocationthrottlingclusterwideallocation/Cluster-wideallocationshardsandreplicas,movingmanually/Manuallymovingshardsandreplicas
rollingrestarts,handling/Handlingrollingrestartsshardrequestcache
about/Shardrequestcacheenabling/Enablingandconfiguringtheshardrequestcacheconfiguring/Enablingandconfiguringtheshardrequestcacheperrequestshardrequestcache,disabling/Perrequestshardrequestcachedisablingusagemonitoring/Shardrequestcacheusagemonitoring
shards/Index,Shardsmoving/Movingshards
shards,Elasticsearchindexingabout/Shardsandreplicaswriteconsistency,controlling/Writeconsistency
siblingaggregations/Availabletypessignificant_termsaggregation
about/Significanttermsaggregationsignificantterms,selecting/Choosingsignificanttermsmultiplevalue,analyzing/Multiplevalueanalysis
similaritymodelsabout/Differentsimilaritymodelsper-fieldsimilarity,setting/Settingper-fieldsimilarityOkapiBM25model/Availablesimilaritymodelsrandomnessmodel,divergence/Availablesimilaritymodelsinformation-basedmodel/Availablesimilaritymodelsdefaultsimilarity,configuring/ConfiguringdefaultsimilarityBM25similarity,configuring/ConfiguringBM25similarityDFRsimilarity,configuring/ConfiguringDFRsimilarityIBsimilarity,configuring/ConfiguringIBsimilarity
simplequerystringqueryabout/ThesimplequerystringqueryURL/Thesimplequerystringquery
singleElasticsearchnodetuning/PreparingasingleElasticsearchnodegeneralpreparations/Thegeneralpreparationsfielddatacache/Fielddatacacheandbreakingthecircuitcircuit,breaking/Fielddatacacheandbreakingthecircuitdocvalues,using/UsedocvaluesRAMbuffer,usedforindexing/RAMbufferforindexingindexrefreshrate/Indexrefreshratethreadpools/Threadpools
snapshotscreating/Creatingsnapshotsadditionalparameters/Additionalparameters
snowball analyzer
    URL / Out-of-the-box analyzers
Software as a Service (SaaS) / SPM for Elasticsearch
source filtering / Source filtering
span / A span
span first query / Span first query
span near query / Span near query
span not query / Span not query
span or query / Span or query
span queries
    using / Using span queries
    span / A span
    span_term query / Span term query
    span first query / Span first query
    span near query / Span near query
    span or query / Span or query
    span not query / Span not query
    span_within query / Span within query
    span_containing query / Span containing query
    span_multi query / Span multi query
    performance considerations / Performance considerations
span_containing query / Span containing query
span_multi query / Span multi query
span_term query / Span term query
span_within query / Span within query
spatial capabilities
    about / Elasticsearch spatial capabilities
    mappings preparation / Mapping preparation for spatial searches
    example data / Example data
    geo_field properties / Additional geo_field properties
SPMtoolURL/SPMforElasticsearch
standardanalyzerURL/Out-of-the-boxanalyzers
stateandhealth,clustermonitoring/Monitoringyourcluster’sstateandhealthclusterhealthAPI/ClusterhealthAPIindicesstatsAPI/IndicesstatsAPInodesinfoAPI/NodesinfoAPInodesstatsAPI/NodesstatsAPIclusterstateAPI/ClusterstateAPIclusterstatsAPI/ClusterstatsAPIpendingtasksAPI/PendingtasksAPIindicesrecoveryAPI/IndicesrecoveryAPIindicesshardstoresAPI/IndicesshardstoresAPI
indicessegmentsAPI/IndicessegmentsAPIstaticproperties,forindexingbuffersizeconfiguration
indices.memory.index_buffer_size/Indexingbuffersindices.memory.min_index_buffer_size/Indexingbuffersindices.memory.max_index_buffer_size/Indexingbuffersindices.memory.min_shard_index_buffer_size/Indexingbuffers
statuscodedefinitionURL/Indexingthedata
stemmingURL/Out-of-the-boxanalyzers
stopanalyzerURL/Out-of-the-boxanalyzers
stopwordsURL/Thecommontermsquery
string,indexstructuremappingterm_vector/Stringanalyzer/Stringsearch_analyzer/Stringnorms.enabled/Stringnorms.loading/Stringposition_offset_gap/Stringindex_options/Stringignore_above/String
suggestersusing/UsingsuggestersURL/Usingsuggesters,Additionaltermsuggesteroptionstypes/Availablesuggestertypessuggestions,including/Includingsuggestionsresponse/Suggesterresponsetextproperty/Suggesterresponsescoreproperty/Suggesterresponsefreqproperty/Suggesterresponse
synonymrulesdefining/DefiningsynonymrulesApacheSolrsynonyms,using/UsingApacheSolrsynonymsWordNetsynonyms,using/UsingWordNetsynonyms
synonymsabout/Wordswiththesamemeaningfiltering/Synonymfilterinmappings/Synonymsinthemappingsstoring,infilesystem/Synonymsstoredonthefilesystemrules,defining/Definingsynonymrulesindex-timesynonymsexpansion/Queryorindex-timesynonymexpansionquery-timesynonymexpansion/Queryorindex-timesynonymexpansion
synonymsfilterusing/Synonymfilter
system-specificinstallationandconfigurationabout/Thesystem-specificinstallationandconfigurationElasticsearch,installingonLinux/InstallingElasticsearchonLinuxElasticsearch,configuringassystemserviceonLinux/ConfiguringElasticsearchasasystemserviceonLinuxElasticsearch,usingassystemserviceonWindows/ElasticsearchasasystemserviceonWindows
T
T-Digest algorithm
    URL / Percentiles
templates
    about / Templates
    example / An example of a template
term query / The term query
terms aggregation
    about / Terms aggregation
    approximate counts / Counts are approximate
    minimum document count / Minimum document count
terms query / The terms query
term suggester
    about / Term suggester
    configuration options / Term suggester configuration options
    options / Additional term suggester options
threadpoolsabout/Threadpoolsgeneric/Threadpoolsindex/Threadpoolssearch/Threadpoolssuggest/Threadpoolsget/Threadpoolsbulk/Threadpoolspercolate/Threadpools
throttling,adjustingtypesetting/Throttlingvalue/Throttlingnonevalue/Throttlingmergevalue/Throttlingallvalue/Throttling
timezonesURL/Timezones
tree-likestructuresindexing/Indexingtree-likestructuresdatastructure/Datastructureanalysis/Analysis
typedeterminingmechanismabout/Typedeterminingmechanismdisabling/Disablingthetypedeterminingmechanismtuning,fornumerictypes/Tuningthetypedeterminingmechanismfornumerictypestuning,fordates/Tuningthetypedeterminingmechanismfordates
type property, values
    plain / Forcing highlighter type
    fvh / Forcing highlighter type
    postings / Forcing highlighter type
type query / The type query
types, suggesters
    term / Available suggester types, Term suggester
    phrase / Available suggester types, Phrase suggester
    completion / Available suggester types, Completion suggester
    context / Available suggester types, Context suggester
U
Unicast
  URL / Discovery types
update API
  used, for modifying index structure / Modifying your index structure with the update API
Update API
  URL / Adding partial documents
update settings API
  about / The update settings API
  cluster settings API / The cluster settings API
  indices settings API / The indices settings API
URI query string parameters
  about / URI query string parameters
  query / The query
  default search field / The default search field
  analyzer property / Analyzer
  default operator / The default operator property
  explain parameter / Query explanation
  fields returned / The fields returned
  results, sorting / Sorting the results
  search timeout / The search timeout
  results window / The results window
  per shard results, limiting / Limiting per-shard results
  unavailable indices, ignoring / Ignoring unavailable indices
  search type / The search type
  lowercasing terms expansion / Lowercasing term expansion
  wildcard queries analysis / Wildcard and prefix analysis
  analyze_wildcard property / Wildcard and prefix analysis
  prefix queries analysis / Wildcard and prefix analysis
URI request query
  used, for searching / Searching with the URI request query
  sample data / Sample data
  URI search / URI search
  analyzing / Query analysis
  parameters / URI query string parameters
  Lucene query syntax / Lucene query syntax
URI search
  about / URI search
  Elasticsearch query response / Elasticsearch query response
  URL / Wildcard and prefix analysis
V
Validate API
  using / Using the Validate API
values, has_child query parameter
  none / Querying data in the child documents
  min / Querying data in the child documents
  max / Querying data in the child documents
  sum / Querying data in the child documents
  avg / Querying data in the child documents
values, in range
  searching / Searching for values in a range
  matched documents, boosting / Boosting some of the matched documents
  lower scoring partial queries, ignoring / Ignoring lower scoring partial queries
  Lucene query syntax, using in queries / Using Lucene query syntax in queries
  user queries without errors, handling / Handling user queries without errors
  prefixes, used for providing autocomplete functionality / Autocomplete using prefixes
  similar terms, finding / Finding terms similar to a given one
  spans / Spans, spans everywhere
values, score_mode property
  avg / Scoring and nested queries
  sum / Scoring and nested queries
  min / Scoring and nested queries
  max / Scoring and nested queries
  none / Scoring and nested queries
versioning
  about / Versioning
  usage example / Usage example
  from external system / Versioning from external systems
vertical scaling / Preparing a single Elasticsearch node
W
warming query
  about / Warming up
  defining / Defining a new warming query
  defined warming queries, retrieving / Retrieving the defined warming queries
  deleting / Deleting a warming query
  warming up functionality, disabling / Disabling the warming up functionality
wildcard query / The wildcard query
Windows
  Elasticsearch, configuring as system service / Elasticsearch as a system service on Windows
WordNet
  URL / Using WordNet synonyms
Z
Zen discovery
  about / Zen discovery
  master election configuration / Master election configuration
  unicast, configuring / Configuring unicast
  fault detection ping settings / Fault detection ping settings
  cluster state updates control / Cluster state updates control
  master unavailability, dealing with / Dealing with master unavailability