Open Source Search Evolution
DESCRIPTION
From Gopher, WAIS, and Harvest to Lucene, Solr, SolrCloud, and Elasticsearch.
TRANSCRIPT
![Page 2: Open Source Search Evolution](https://reader035.vdocuments.mx/reader035/viewer/2022062617/54c69d1b4a7959d9668b4569/html5/thumbnails/2.jpg)
Today
![Page 3: Open Source Search Evolution](https://reader035.vdocuments.mx/reader035/viewer/2022062617/54c69d1b4a7959d9668b4569/html5/thumbnails/3.jpg)
The Early Days
![Page 4: Open Source Search Evolution](https://reader035.vdocuments.mx/reader035/viewer/2022062617/54c69d1b4a7959d9668b4569/html5/thumbnails/4.jpg)
Even Earlier Days
![Page 5: Open Source Search Evolution](https://reader035.vdocuments.mx/reader035/viewer/2022062617/54c69d1b4a7959d9668b4569/html5/thumbnails/5.jpg)
Foci
Timeline: 1974, 1995, now()
SEARCH
![Page 6: Open Source Search Evolution](https://reader035.vdocuments.mx/reader035/viewer/2022062617/54c69d1b4a7959d9668b4569/html5/thumbnails/6.jpg)
Otis Who?
SEARCH
![Page 7: Open Source Search Evolution](https://reader035.vdocuments.mx/reader035/viewer/2022062617/54c69d1b4a7959d9668b4569/html5/thumbnails/7.jpg)
Then & Now
1990s: WebGlimpse, Swish, Harvest, Ht://Dig, freeWAIS
2014: elasticsearch.
![Page 8: Open Source Search Evolution](https://reader035.vdocuments.mx/reader035/viewer/2022062617/54c69d1b4a7959d9668b4569/html5/thumbnails/8.jpg)
Still New?
elasticsearch.
Timeline: 2000, 2004, 2010
![Page 9: Open Source Search Evolution](https://reader035.vdocuments.mx/reader035/viewer/2022062617/54c69d1b4a7959d9668b4569/html5/thumbnails/9.jpg)
Dominance
[Open Source] Search Evolution
![Page 10: Open Source Search Evolution](https://reader035.vdocuments.mx/reader035/viewer/2022062617/54c69d1b4a7959d9668b4569/html5/thumbnails/10.jpg)
Big Cake
Big Data
Beyond Text
Memory Footprint
Distributed Model
Language Support
Indexing Speed, NRT
Relevance Algorithms
![Page 11: Open Source Search Evolution](https://reader035.vdocuments.mx/reader035/viewer/2022062617/54c69d1b4a7959d9668b4569/html5/thumbnails/11.jpg)
Language Support: Stemming
![Page 12: Open Source Search Evolution](https://reader035.vdocuments.mx/reader035/viewer/2022062617/54c69d1b4a7959d9668b4569/html5/thumbnails/12.jpg)
Language Support: Lemmatization
![Page 13: Open Source Search Evolution](https://reader035.vdocuments.mx/reader035/viewer/2022062617/54c69d1b4a7959d9668b4569/html5/thumbnails/13.jpg)
Language Support: Morphology
![Page 14: Open Source Search Evolution](https://reader035.vdocuments.mx/reader035/viewer/2022062617/54c69d1b4a7959d9668b4569/html5/thumbnails/14.jpg)
Language Support
Lucene 2004: ~20 languages
Lucene 2014: ~40 languages
most are stemmers
![Page 15: Open Source Search Evolution](https://reader035.vdocuments.mx/reader035/viewer/2022062617/54c69d1b4a7959d9668b4569/html5/thumbnails/15.jpg)
Relevance Models: VSM
TF-IDF
For term $i$ in document $j$:
$w_{i,j} = tf_{i,j} \times \log(N / df_i)$
$tf_{i,j}$ = number of occurrences of $i$ in $j$
$df_i$ = number of documents containing $i$
$N$ = total number of documents
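A minimal sketch of the weighting on this slide, assuming whitespace tokenization and lowercasing (neither is specified in the talk). It implements the textbook formula as written, not Lucene's actual scoring, which adds its own normalizations. The function name `tf_idf` and the sample documents are made up for illustration.

```python
import math
from collections import Counter


def tf_idf(docs):
    """Return one {term: weight} dict per document, with w = tf * log(N / df)."""
    n_docs = len(docs)
    tokenized = [doc.lower().split() for doc in docs]

    # df[t] = number of documents that contain term t at least once
    df = Counter()
    for tokens in tokenized:
        df.update(set(tokens))

    weights = []
    for tokens in tokenized:
        tf = Counter(tokens)  # tf[t] = occurrences of t in this document
        weights.append({t: tf[t] * math.log(n_docs / df[t]) for t in tf})
    return weights


# Hypothetical three-document corpus
docs = ["open source search", "search engine", "open data"]
weights = tf_idf(docs)
```

A term that appears in every document gets weight 0, since log(N/N) = 0; rare terms get the largest weights.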
![Page 16: Open Source Search Evolution](https://reader035.vdocuments.mx/reader035/viewer/2022062617/54c69d1b4a7959d9668b4569/html5/thumbnails/16.jpg)
Relevance Models: Pluggable
Lucene until 2011: 1 relevance model
Lucene 2014: 6 relevance models
got more?
![Page 17: Open Source Search Evolution](https://reader035.vdocuments.mx/reader035/viewer/2022062617/54c69d1b4a7959d9668b4569/html5/thumbnails/17.jpg)
Distributed Architecture
1 Master - N Slaves
good for scaling queries
not good for scaling data

Sharded index with replication
good for scaling queries
good for scaling data
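One way to picture the sharded-with-replication model is the toy in-memory sketch below. It is an illustration under stated assumptions, not how SolrCloud or Elasticsearch are implemented: the class `ShardedIndex`, the CRC32-modulo routing, and the round-robin replica choice are all hypothetical. Sharding splits the data so each node holds only part of it; replication gives every shard several copies so queries can be spread across them.

```python
import zlib


class ShardedIndex:
    """Toy sharded index: each shard holds a slice of the data, in several copies."""

    def __init__(self, num_shards, num_replicas):
        self.num_shards = num_shards
        # shards[s] is a list of copies (1 primary + replicas); a copy maps id -> text
        self.shards = [
            [{} for _ in range(1 + num_replicas)] for _ in range(num_shards)
        ]
        self._next_copy = 0

    def shard_for(self, doc_id):
        # Deterministic routing: the same id always lands on the same shard
        return zlib.crc32(doc_id.encode("utf-8")) % self.num_shards

    def index(self, doc_id, text):
        # A write goes to every copy of exactly one shard
        for copy in self.shards[self.shard_for(doc_id)]:
            copy[doc_id] = text

    def search(self, term):
        # A query fans out to one copy of every shard, then merges the hits
        hits = []
        for copies in self.shards:
            copy = copies[self._next_copy % len(copies)]
            hits.extend(d for d, text in copy.items() if term in text.split())
        self._next_copy += 1  # rotate copies so query load is spread out
        return sorted(hits)
```

Adding shards lets the total data grow beyond one node; adding replicas lets more queries run in parallel. A single master with N slaves only offers the second of these.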
![Page 18: Open Source Search Evolution](https://reader035.vdocuments.mx/reader035/viewer/2022062617/54c69d1b4a7959d9668b4569/html5/thumbnails/18.jpg)
Indexing Speed & NRT Search
![Page 19: Open Source Search Evolution](https://reader035.vdocuments.mx/reader035/viewer/2022062617/54c69d1b4a7959d9668b4569/html5/thumbnails/19.jpg)
Memory Footprint
![Page 20: Open Source Search Evolution](https://reader035.vdocuments.mx/reader035/viewer/2022062617/54c69d1b4a7959d9668b4569/html5/thumbnails/20.jpg)
Beyond Text
Geospatial Search
Classifier
Recommendation Engine
Key Value Store
NoSQL DB
Analytical DB
![Page 21: Open Source Search Evolution](https://reader035.vdocuments.mx/reader035/viewer/2022062617/54c69d1b4a7959d9668b4569/html5/thumbnails/21.jpg)
Geospatial Search
![Page 22: Open Source Search Evolution](https://reader035.vdocuments.mx/reader035/viewer/2022062617/54c69d1b4a7959d9668b4569/html5/thumbnails/22.jpg)
Classifier
![Page 23: Open Source Search Evolution](https://reader035.vdocuments.mx/reader035/viewer/2022062617/54c69d1b4a7959d9668b4569/html5/thumbnails/23.jpg)
Recommender
Content Similarity
Collaborative Filtering
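As a rough illustration of the content-similarity flavor of recommendation, the sketch below ranks catalog items by cosine similarity between bag-of-words term counts. This is one common approach, assumed here for illustration; the talk does not say which method it has in mind, and the function names and catalog entries are hypothetical.

```python
import math
from collections import Counter


def cosine_similarity(a, b):
    """Cosine of the angle between the term-count vectors of two texts."""
    va, vb = Counter(a.lower().split()), Counter(b.lower().split())
    dot = sum(va[t] * vb[t] for t in va.keys() & vb.keys())
    norm_a = math.sqrt(sum(c * c for c in va.values()))
    norm_b = math.sqrt(sum(c * c for c in vb.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def recommend(target, catalog, k=2):
    """Return up to k catalog keys whose text is most similar to target."""
    scored = sorted(
        ((cosine_similarity(target, text), item) for item, text in catalog.items()),
        reverse=True,
    )
    # Drop items that share no terms with the target at all
    return [item for score, item in scored[:k] if score > 0]


# Hypothetical catalog: item key -> description text
catalog = {
    "a": "lucene search library",
    "b": "solr search server",
    "c": "cooking pasta recipes",
}
similar = recommend("open source search library", catalog)
```

A search engine can do the same job at scale by using the target item's terms as a query and letting relevance ranking return the nearest items.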
![Page 24: Open Source Search Evolution](https://reader035.vdocuments.mx/reader035/viewer/2022062617/54c69d1b4a7959d9668b4569/html5/thumbnails/24.jpg)
Key Value Store
id123 ⇒ manu:Apple desc:foo bar price:$111
id234 ⇒ manu:Sony desc:baz bam price:$222
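The two records on this slide can be modeled as a key-value store whose values are also indexed by field. The sketch below is a toy under stated assumptions: the class `SearchableKV` and its methods are hypothetical, and real engines store and index fields quite differently. It only shows why a search index can double as a key-value store: lookup by id is a stored-field fetch, and the inverted index comes along for free.

```python
from collections import defaultdict


class SearchableKV:
    """Toy store: get documents by id, or find ids by a field's token."""

    def __init__(self):
        self.docs = {}                    # id -> {field: value}
        self.inverted = defaultdict(set)  # (field, token) -> set of ids

    def _tokens(self, fields):
        for field, value in fields.items():
            for token in str(value).lower().split():
                yield field, token

    def put(self, doc_id, fields):
        # On overwrite, remove the old version's postings first
        for key in self._tokens(self.docs.get(doc_id, {})):
            self.inverted[key].discard(doc_id)
        self.docs[doc_id] = dict(fields)
        for key in self._tokens(fields):
            self.inverted[key].add(doc_id)

    def get(self, doc_id):
        return self.docs.get(doc_id)

    def search(self, field, token):
        return sorted(self.inverted.get((field, token.lower()), set()))


store = SearchableKV()
store.put("id123", {"manu": "Apple", "desc": "foo bar", "price": "$111"})
store.put("id234", {"manu": "Sony", "desc": "baz bam", "price": "$222"})
```

Here `store.get("id123")` behaves like a plain key-value lookup, while `store.search("manu", "sony")` answers a question an ordinary key-value store cannot.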
![Page 25: Open Source Search Evolution](https://reader035.vdocuments.mx/reader035/viewer/2022062617/54c69d1b4a7959d9668b4569/html5/thumbnails/25.jpg)
NoSQL DB
Distributed
Replicated
Horizontally Scalable
Fast Retrieval
Searchable?
![Page 26: Open Source Search Evolution](https://reader035.vdocuments.mx/reader035/viewer/2022062617/54c69d1b4a7959d9668b4569/html5/thumbnails/26.jpg)
Slicing & Dicing
![Page 27: Open Source Search Evolution](https://reader035.vdocuments.mx/reader035/viewer/2022062617/54c69d1b4a7959d9668b4569/html5/thumbnails/27.jpg)
Analytical Queries
![Page 28: Open Source Search Evolution](https://reader035.vdocuments.mx/reader035/viewer/2022062617/54c69d1b4a7959d9668b4569/html5/thumbnails/28.jpg)
Gobble Gobble
If software is eating the world, then [open source] search is gobbling it.
And has been for years.
![Page 29: Open Source Search Evolution](https://reader035.vdocuments.mx/reader035/viewer/2022062617/54c69d1b4a7959d9668b4569/html5/thumbnails/29.jpg)
FIN. Questions