solr @ ebay kleinanzeigen

Solr @ eBay Kleinanzeigen Olaf Zschiedrich, eBay Classifieds Group [email protected], 5/25/2011

Upload: lucidworks-archived

Post on 15-Jan-2015


DESCRIPTION

Attendees will learn how eBay Germany has implemented Solr, why Solr was selected, which Solr features are used, and how Solr is configured and run in production. Recommended best practices will be presented along with eBay Kleinanzeigen's plans for future deployment of Solr.

TRANSCRIPT

Page 1: Solr @ eBay Kleinanzeigen

Solr @ eBay Kleinanzeigen

Olaf Zschiedrich, eBay Classifieds Group [email protected], 5/25/2011

Page 2: Solr @ eBay Kleinanzeigen

Who am I?
• Olaf Zschiedrich
• eBay Classifieds Group
• Head of Technology @ eBay Kleinanzeigen
• Areas of expertise/interest:
  • High-traffic web applications
  • Agile development
  • Java/JEE
  • Search technologies

Page 3: Solr @ eBay Kleinanzeigen

Agenda
• About eBay Classifieds Group / eBay Kleinanzeigen
• Metrics & Traffic Numbers
• Why Solr?
• Solr Features in Action
• Data Indexing
• Solr in Production
• Best Practices
• Problems
• Outlook
• Questions

Page 4: Solr @ eBay Kleinanzeigen

About eBay Classifieds Group


Page 5: Solr @ eBay Kleinanzeigen

About eBay Classifieds Group

online classifieds company in the world


Page 6: Solr @ eBay Kleinanzeigen

About eBay Kleinanzeigen
• Typical classifieds ad platform (horizontal, local trading)
• Launched in 2009 after 4 months of development
• Small agile team (using Scrum)
  • 12-15 people total
  • 5-7 developers
• Leverages open source (Spring, Solr, MySQL, ActiveMQ)
• Applications:
  • Public website
  • Customer support tool
  • API (REST, supporting JSON and XML)
  • iPhone app (~250,000 installations)
  • Facebook app

Page 7: Solr @ eBay Kleinanzeigen

Metrics & Traffic Numbers
• Site metrics:
  • ~3.2 M active ads
  • 16-24 M page views (PVs) per day
  • Peak hours = 1.8 M PVs (~500 PVs per second)
• Solr request metrics:
  • ~60 M requests per day
  • Peak hours = ~1,500 requests per second
• Avg. response time:
  • 20 ms for search and 3 ms for auto-suggest

The site is growing rapidly!
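The per-second figures above follow from simple division. The sketch below only re-derives them from the totals quoted on the slide; the class and method names are illustrative and not part of the original deck.

```java
/** Back-of-the-envelope check of the traffic figures quoted on the slide. */
class TrafficMath {
    static final long SECONDS_PER_HOUR = 3_600;
    static final long SECONDS_PER_DAY = 86_400;

    /** Average number of events per second for a total spread over a period. */
    static double perSecond(long total, long periodSeconds) {
        return (double) total / periodSeconds;
    }

    public static void main(String[] args) {
        // 1.8 M page views in a peak hour
        double peakPvPerSecond = perSecond(1_800_000, SECONDS_PER_HOUR);
        // 60 M Solr requests per day, averaged over the whole day
        double avgSolrPerSecond = perSecond(60_000_000, SECONDS_PER_DAY);

        System.out.println("Peak PVs per second: " + peakPvPerSecond);
        System.out.println("Average Solr requests per second: " + avgSolrPerSecond);
        // Peak Solr requests (1,500/s, from the slide) per peak page view
        System.out.println("Solr requests per PV at peak: " + 1_500 / peakPvPerSecond);
    }
}
```

By this arithmetic the quoted peak of about 1,500 Solr requests per second is roughly twice the daily average of about 694, and corresponds to about three Solr requests per page view at peak.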

Page 8: Solr @ eBay Kleinanzeigen

Why Solr?
• Open source
• Good documentation / big community
• Java-based (the language we know and use)
• Widely used (especially Lucene)
• Based on Lucene (de facto standard for full-text search in Java)
• Feature-rich (including enterprise features)
• Extensible (e.g. easy implementation of custom tokenizers)
• Easy to integrate (HTTP, SolrJ client)
• Easy to set up (Java web application)

Most promising option we looked at. Because of very aggressive timelines, no time-consuming research was possible!

Page 9: Solr @ eBay Kleinanzeigen

Solr Features in Action
• Faceting
• Language-specific stemming
• More Like This
• Auto-suggest based on the TermsComponent
• Spellchecking
• Synonyms
• Stopwords
• Dynamic fields
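As a rough illustration of how two of these features are requested over HTTP, the sketch below builds the query strings for a faceted search and for a prefix lookup against the TermsComponent. The parameter names (`facet`, `facet.field`, `facet.mincount`, `terms`, `terms.fl`, `terms.prefix`, `terms.limit`) are standard Solr request parameters; the field names `category` and `title` and the helper class are assumptions made for this example, not details from the talk.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

/** Minimal helper that assembles URL-encoded Solr request parameters. */
class SolrRequestParams {
    private final List<String> pairs = new ArrayList<>();

    SolrRequestParams add(String name, String value) {
        pairs.add(encode(name) + "=" + encode(value));
        return this;
    }

    String toQueryString() {
        return String.join("&", pairs);
    }

    private static String encode(String s) {
        return URLEncoder.encode(s, StandardCharsets.UTF_8);
    }

    /** Keyword search that also returns facet counts for one field. */
    static String facetedSearch(String keywords, String facetField) {
        return new SolrRequestParams()
                .add("q", keywords)
                .add("facet", "true")
                .add("facet.field", facetField)
                .add("facet.mincount", "1") // hide facet values with no hits
                .toQueryString();
    }

    /** Prefix lookup of indexed terms, usable as a simple auto-suggest. */
    static String autoSuggest(String field, String prefix) {
        return new SolrRequestParams()
                .add("terms", "true")
                .add("terms.fl", field)
                .add("terms.prefix", prefix)
                .add("terms.limit", "10")
                .toQueryString();
    }

    public static void main(String[] args) {
        // "category" and "title" are hypothetical field names.
        System.out.println("/select?" + facetedSearch("BMW 323", "category"));
        System.out.println("/terms?" + autoSuggest("title", "bm"));
    }
}
```

The `/terms` path assumes a request handler configured with the TermsComponent under that name, as in Solr's example configuration; an actual deployment may map it differently.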

Page 10: Solr @ eBay Kleinanzeigen

Data Indexing
• Use of Delta Import Handler
• Delta import runs every 10 minutes
• Full import only when a schema change requires a full index rebuild
• Index optimized once a day

[Diagram: MySQL slave connected via JDBC and the Delta Import Handler to the Solr master; Solr master connected to three Solr slaves; labels: HTTP / REST API, Replication Handler]
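In stock Solr, a delta import is triggered by calling the DataImportHandler with `command=delta-import`. The following is a minimal sketch of a 10-minute trigger, assuming the handler is mapped to `/dataimport` and using a made-up host name; the talk does not say how the import is actually scheduled (cron, a job in the application, or something else).

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.ScheduledFuture;
import java.util.concurrent.TimeUnit;

/** Sketch of a periodic trigger for Solr's DataImportHandler delta import. */
class DeltaImportScheduler {

    /** Builds the trigger URL; the "/dataimport" handler path is an assumption. */
    static String deltaImportUrl(String solrBaseUrl) {
        return solrBaseUrl + "/dataimport?command=delta-import";
    }

    /** Runs the trigger once immediately and then every 10 minutes. */
    static ScheduledFuture<?> scheduleEveryTenMinutes(
            ScheduledExecutorService executor, Runnable trigger) {
        return executor.scheduleAtFixedRate(trigger, 0, 10, TimeUnit.MINUTES);
    }

    public static void main(String[] args) {
        // Hypothetical master URL, for illustration only.
        String url = deltaImportUrl("http://solr-master.example:8983/solr");
        ScheduledExecutorService executor = Executors.newSingleThreadScheduledExecutor();
        // A real trigger would issue an HTTP GET against the URL;
        // this sketch only logs it. The process runs until it is stopped.
        scheduleEveryTenMinutes(executor, () -> System.out.println("GET " + url));
    }
}
```

Only the master is targeted, since the slaves receive the updated index through replication.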

Page 11: Solr @ eBay Kleinanzeigen

Solr in Production
• 2 datacenters
• 1 master + 6 slaves per datacenter
  Slaves show very low resource consumption. We could go down to 4 slaves per datacenter and still have 50% overcapacity.
• Master only used for indexing
• Load balancer in front of slaves
• Varnish in front of slaves (for dedicated use cases)
• Working closely with the SITE-OPS team
• DEV-OPS are part of the development process

Page 12: Solr @ eBay Kleinanzeigen

Solr 3.1 in Production
• Solr 3.1 in production since mid-May
• Not plug and play. Needs a migration path because:
  • Index format has changed
  • Javabin format has changed
• Two major problems:
  • Bug in spellchecker (SOLR-2462)
    Leads to infinite GC loops
  • Bug in replication handler (SOLR-2469)
    Leads to growing disk usage because old index files are not removed when “replicateAfter=startup” is used.

Page 13: Solr @ eBay Kleinanzeigen

Best Practices
• Use Solr cores right from the beginning
  Allows you to run multiple indexes on one box in dev and distribute indexes to multiple boxes in production
• Use filter queries
• Use caching (FieldCache, QueryCache, web proxy cache, e.g. Varnish or Squid)
• Tune the JVM properly
• Build a search layer that hides the use of Solr

    SearchCommand cmd = new SearchCommand();
    cmd.setKeywords("BMW 323");
    ...
    SearchResult result = searchService.searchActiveAds(cmd);
    List<Ad> ads = result.getAds();

• Create a QueryBuilder to ease query building

    SolrQueryBuilder sqb = new SolrQueryBuilder();
    sqb = sqb.freetext("freetext", "BMW").and().in("color", "RED", "BLACK");
    sqb = sqb.and().not().eq("fuel_type", "GAS").and().lt("price", "10000");
    ...
    String query = sqb.build();

  (Just an example. Normally filter queries should be used for a query like this!)
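The slide shows only how the query builder is called, not how it is written. The class below is a hypothetical reconstruction that supports exactly the calls used in the example and emits standard Lucene/Solr query syntax. Its escaping is deliberately naive, and it should not be read as the team's actual implementation.

```java
import java.util.StringJoiner;

/** Hypothetical minimal builder matching the calls shown on the slide. */
class SolrQueryBuilder {
    private final StringBuilder query = new StringBuilder();

    /** field:value match on a free-text field. */
    SolrQueryBuilder freetext(String field, String text) {
        return term(field, text);
    }

    /** field:value exact match. */
    SolrQueryBuilder eq(String field, String value) {
        return term(field, value);
    }

    /** field:(a OR b OR ...) */
    SolrQueryBuilder in(String field, String... values) {
        StringJoiner alternatives = new StringJoiner(" OR ", "(", ")");
        for (String value : values) {
            alternatives.add(quoteIfNeeded(value));
        }
        query.append(field).append(':').append(alternatives);
        return this;
    }

    /** field:{* TO bound}, an exclusive upper bound. */
    SolrQueryBuilder lt(String field, String upperBound) {
        query.append(field).append(":{* TO ").append(upperBound).append('}');
        return this;
    }

    SolrQueryBuilder and() {
        query.append(" AND ");
        return this;
    }

    SolrQueryBuilder not() {
        query.append("NOT ");
        return this;
    }

    String build() {
        return query.toString();
    }

    private SolrQueryBuilder term(String field, String value) {
        query.append(field).append(':').append(quoteIfNeeded(value));
        return this;
    }

    /** Wraps anything that is not a single plain word in double quotes. */
    private static String quoteIfNeeded(String value) {
        return value.matches("\\w+") ? value : '"' + value.replace("\"", "\\\"") + '"';
    }
}
```

With this sketch, the chain from the slide builds the string `freetext:BMW AND color:(RED OR BLACK) AND NOT fuel_type:GAS AND price:{* TO 10000}`. As the slide notes, restrictions such as color, fuel type and price would normally be sent as separate filter queries (`fq` parameters) so that Solr can cache them independently of the keyword query.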


Page 14: Solr @ eBay Kleinanzeigen

Problems
• Distance search including sorting
  • Not supported in previous Solr versions
  • LocalSolr: not working with Solr 1.4 final, GC issues, performance issues
  • Solution: Got rid of sort by distance. Implemented our own distance search based on bounding boxes and simple range queries.
  • Solved in 3.1
• Real-time updates
• Deep paging of large result sets (SOLR-1726)
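The bounding-box workaround can be sketched as follows: convert a search radius around a point into latitude and longitude ranges and send them as two range filter queries. The field names `lat` and `lon`, the spherical-earth approximation and the class itself are assumptions for illustration. The talk gives no implementation details, and this version ignores the poles and the antimeridian.

```java
import java.util.Locale;

/** Sketch: approximate a radius search with a lat/lon bounding box. */
class BoundingBoxFilter {
    static final double EARTH_RADIUS_KM = 6371.0;

    /** Returns {minLat, maxLat, minLon, maxLon} in degrees. */
    static double[] boundingBox(double lat, double lon, double radiusKm) {
        // Angular distance covered by the radius along a meridian
        double latDelta = Math.toDegrees(radiusKm / EARTH_RADIUS_KM);
        // Circles of latitude shrink by cos(lat), so the longitude span grows
        double lonDelta = Math.toDegrees(
                radiusKm / (EARTH_RADIUS_KM * Math.cos(Math.toRadians(lat))));
        return new double[] {lat - latDelta, lat + latDelta, lon - lonDelta, lon + lonDelta};
    }

    /** Builds two range filter queries, one per coordinate field. */
    static String[] filterQueries(String latField, String lonField,
                                  double lat, double lon, double radiusKm) {
        double[] box = boundingBox(lat, lon, radiusKm);
        return new String[] {
            String.format(Locale.ROOT, "%s:[%.5f TO %.5f]", latField, box[0], box[1]),
            String.format(Locale.ROOT, "%s:[%.5f TO %.5f]", lonField, box[2], box[3])
        };
    }

    public static void main(String[] args) {
        // 10 km around central Berlin; each string would be sent as an fq parameter.
        for (String fq : filterQueries("lat", "lon", 52.52, 13.405, 10.0)) {
            System.out.println("fq=" + fq);
        }
    }
}
```

A box matches some ads slightly outside the circle (near the corners) and gives no distance to sort by, which is consistent with the slide's remark that sorting by distance was dropped.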

Page 15: Solr @ eBay Kleinanzeigen

Outlook / Future Plans
• Migrate further applications to Solr
  Most batch jobs and the customer support tool search against the database, which is getting slower as the data grows.
• Evaluate new features of Solr 3.1
  • Spatial/distance search
  • New auto-suggest component
  • Extended dismax query parser

Page 16: Solr @ eBay Kleinanzeigen

Questions?

Page 17: Solr @ eBay Kleinanzeigen

Contact
• Olaf Zschiedrich
  • [email protected]
  • [email protected]
  • www.ebay-kleinanzeigen.de