Finding Cars and Hunting Down Logs: Elasticsearch @ AutoScout24
AutoScout24
24 Nov 2016
Philipp Garbe, Lead developer
Juri Smarschevski, Team lead
Search
AutoScout24 search journey in a nutshell
Who we are?
Unique monthly visitors in Europe
… 10 more
Some numbers
Search index contains ~2.6M classifieds
Unique visitors (monthly): ~10M
Search requests per day: ~36M
Index update rate per day: ~400,000 classifieds
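To put the daily figures above into perspective, they can be converted into average per-second rates. This is a rough sketch that assumes an even spread over 24 hours; real traffic has peaks, which the slides do not quantify.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def per_second(events_per_day: float) -> float:
    """Convert a daily count into an average per-second rate."""
    return events_per_day / SECONDS_PER_DAY


# Daily figures taken from the slide above.
search_rps = per_second(36_000_000)
update_rps = per_second(400_000)

print(f"average search requests per second: {search_rps:.0f}")
print(f"average index updates per second:   {update_rps:.1f}")
```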
Status quo. March 2013.
Endeca used as a search engine
Use case: providing search results and facets for the entire AS24 platform
Problems:
• With new product requirements, Endeca performance keeps getting slower
• Time to market for the features we need is not sufficient
• Maintenance is complex / expensive
Possible candidates
Solr?
• <feeling> too complex installation / configuration </feeling>
Sphinx?
• Support situation is unclear
Elasticsearch?
• Fresh buzzword
• Built for distributed systems from the beginning (rumors)
• Easy installation / configuration (fact)
POC
Goals
• Performance should be comparable with Endeca
• The solution should be scalable
Rollout plan. 03.2013 - 11.2013
02.2013   03.2013   05.2013   07.2013   11.2013
POC
Implementation & migration
Training
Go live phase
#real_project_picture_squeezed
Results after 8 months of work.

                                              Endeca         Elasticsearch (0.9.x)   Improvement
Amount of machines                            60             20                      300%
[Re]index time                                ~180 min       ~45 min                 400%
Deploy to live                                up to 2 days   < 3 hours               1000%
Effort for testing an issue on local machine  4 h            1 h                     400%
Performance                                   =              =
Product / dev guys satisfaction               :(             :)                      % ?
No problems after migration?
Cluster split brain
In fact this had nothing to do with Elasticsearch itself; it was more related to the learning phase at AS24
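The slides do not say how the split brain was resolved. In Elasticsearch versions before 7.0, the usual guard was to set `discovery.zen.minimum_master_nodes` to a quorum of the master-eligible nodes. A minimal sketch of that quorum calculation, with node counts chosen purely for illustration:

```python
def minimum_master_nodes(master_eligible_nodes: int) -> int:
    """Quorum size: more than half of the master-eligible nodes."""
    if master_eligible_nodes < 1:
        raise ValueError("need at least one master-eligible node")
    return master_eligible_nodes // 2 + 1


# Illustrative cluster sizes (assumptions, not AS24's actual topology).
for nodes in (3, 5, 20):
    print(f"{nodes} master-eligible nodes -> quorum of {minimum_master_nodes(nodes)}")
```

With a quorum in place, a partition holding fewer than half of the master-eligible nodes cannot elect its own master, so two masters cannot coexist.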
Deep pagination
Elasticsearch 5.x release notes: “Deep pagination of search results is now possible with the search_after feature, which efficiently skips over previously returned results to return just the next page.”
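As a sketch of what `search_after` looks like in practice: each request carries the sort values of the last hit from the previous page instead of a growing `from` offset. The field names `price` and `classified_id` and the sample values are hypothetical; the sort needs a unique tiebreaker field so that pages do not overlap.

```python
from typing import Any, Dict, List, Optional


def search_page_body(
    query: Dict[str, Any],
    sort: List[Dict[str, str]],
    page_size: int,
    last_sort_values: Optional[List[Any]] = None,
) -> Dict[str, Any]:
    """Build a search request body for one page.

    For the first page, leave last_sort_values as None. For later pages,
    pass the "sort" array of the last hit of the previous page.
    """
    body: Dict[str, Any] = {"query": query, "sort": sort, "size": page_size}
    if last_sort_values is not None:
        if len(last_sort_values) != len(sort):
            raise ValueError("need one search_after value per sort clause")
        body["search_after"] = list(last_sort_values)
    return body


# Hypothetical fields: sort by price, with a unique id as tiebreaker.
sort = [{"price": "asc"}, {"classified_id": "asc"}]
query = {"match_all": {}}

first_page = search_page_body(query, sort, page_size=20)
# Suppose the last hit of the first page came back with "sort": [18500, "c-1042"].
next_page = search_page_body(query, sort, page_size=20, last_sort_values=[18500, "c-1042"])
```

The request never uses `from`, so the cost of fetching a page does not grow with how deep into the result set it is.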
Status quo. November 2014.
Project “Tatsu” has started
.NET => JVM
C# => Scala
IIS / Windows => Play / Linux
Local data center => AWS
Monolith => Micro services
Windows workstations => Mac notebooks
... => ...
? => ?
=> 2015
Elasticsearch clusters: “lift & shift” to AWS?
AWS Elasticsearch Service?
Elasticsearch as a service (SaaS)?
Own hosting in AWS?
Rolling update in detail (possible scenario).
Time 1: Initial state
Time 2: Node has been replaced (~60 sec)
Time 3: Master has been killed
Time 4: Master election
Time 5: Last node has been replaced
Rolling update findings
Master has been killed = outage?
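The slides leave this question open. As a toy model of why the position of the master in the replacement order matters, the sketch below counts master elections during a rolling replacement. The election rule (the lexicographically smallest remaining node wins) and the node names are simplifying assumptions, not how Elasticsearch discovery actually picks a master.

```python
from typing import List


def count_elections(nodes: List[str], master: str, order: List[str]) -> int:
    """Replace every node once, in the given order, and count master elections."""
    if len(nodes) < 2:
        raise ValueError("need at least two nodes for a rolling replacement")
    if master not in nodes or sorted(order) != sorted(nodes):
        raise ValueError("order must contain every node exactly once")

    cluster = set(nodes)
    current_master = master
    elections = 0
    for node in order:
        cluster.remove(node)  # the old node is terminated
        if node == current_master:
            elections += 1
            # Toy election rule (assumption): smallest remaining name wins.
            current_master = min(cluster)
        cluster.add(node + "-new")  # its replacement joins the cluster
    return elections


nodes = ["node-a", "node-b", "node-c"]
master_first = count_elections(nodes, "node-a", ["node-a", "node-b", "node-c"])
master_last = count_elections(nodes, "node-a", ["node-b", "node-c", "node-a"])
print(f"master replaced first: {master_first} elections")
print(f"master replaced last:  {master_last} election")
```

In this model, killing the master early can hand the role to another node that is itself about to be replaced, so elections repeat; replacing the master last limits the update to a single election.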
Logging
Continuously deployed, immutable and stateful
Unified Logs
Some numbers
● 7.4 billion documents
● 36 TB EBS
● 18 nodes, each an m4.4xlarge (64GB / 53.5 CPU units)
Challenge: Deployment time
Rolling updates
Challenge: Costs
First setup
● 18x m4.4xlarge
● 18x 2TB gp2
● 3TB/day cross-zone traffic
Cost/Usage Optimized Setup
● 15x m4.2xlarge
● 15x 384GB gp2
● 6x SpotFleet
● 6x 4TB st1
● 9TB/day cross-zone traffic
Savings: ~40%
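The volume counts above can be summed to compare provisioned storage between the two setups. This sketch only does that arithmetic, assuming 1 TB = 1000 GB; the slides give no prices, so it does not attempt to reproduce the ~40% savings figure.

```python
from typing import Dict, List, Tuple

GB_PER_TB = 1000  # assumption: decimal units

# (volume count, size per volume in GB, EBS volume type), as listed on the slide.
Volumes = List[Tuple[int, int, str]]
first_setup: Volumes = [(18, 2 * GB_PER_TB, "gp2")]
optimized_setup: Volumes = [(15, 384, "gp2"), (6, 4 * GB_PER_TB, "st1")]


def total_gb(volumes: Volumes) -> int:
    """Total provisioned storage in GB."""
    return sum(count * size for count, size, _ in volumes)


def gb_by_type(volumes: Volumes) -> Dict[str, int]:
    """Provisioned storage in GB, grouped by volume type."""
    totals: Dict[str, int] = {}
    for count, size, volume_type in volumes:
        totals[volume_type] = totals.get(volume_type, 0) + count * size
    return totals


print("first setup:    ", total_gb(first_setup), "GB", gb_by_type(first_setup))
print("optimized setup:", total_gb(optimized_setup), "GB", gb_by_type(optimized_setup))
```

The first setup comes to 36 TB of gp2, matching the "36 TB EBS" figure. In the optimized setup most of the capacity sits on st1 volumes, with a much smaller gp2 tier.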
Future. What next?
Percolator (saved search)
Elastic Graph (recommendations)
Freetext search
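For the percolator item, a saved search is stored as a query, and each new classified is matched against all stored queries. The sketch below builds the request bodies only. The field names `make` and `price` are hypothetical, and it assumes a 7.x-style typeless mapping; 5.x releases additionally expect a `document_type` entry in the `percolate` query.

```python
import json
from typing import Any, Dict

# Index definition: a "percolator" field holds the saved query; the other
# fields describe the documents that will be matched (hypothetical names).
index_definition: Dict[str, Any] = {
    "mappings": {
        "properties": {
            "query": {"type": "percolator"},
            "make": {"type": "keyword"},
            "price": {"type": "integer"},
        }
    }
}


def saved_search(make: str, max_price: int) -> Dict[str, Any]:
    """Document to index: one user's saved search, stored as a query."""
    return {
        "query": {
            "bool": {
                "filter": [
                    {"term": {"make": make}},
                    {"range": {"price": {"lte": max_price}}},
                ]
            }
        }
    }


def percolate_request(classified: Dict[str, Any]) -> Dict[str, Any]:
    """Search body: find every saved search that matches a new classified."""
    return {"query": {"percolate": {"field": "query", "document": classified}}}


alert = saved_search("BMW", 20000)
request = percolate_request({"make": "BMW", "price": 18500})
print(json.dumps(request, indent=2))
```

The hits returned for such a request are the saved searches themselves, which is what makes notifications like "a new car matches your search" possible.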
Conclusion
Here is a simple question: if we could go back in time and start the same journey with Elasticsearch,
would we do it the same way?
Q & A