building a vibrant search ecosystem @ bloomberg: presented by steven bower & ken laporte,...

OCTOBER 11-14, 2016 BOSTON, MA

Upload: lucidworks

Post on 16-Apr-2017


Page 1: Building a Vibrant Search Ecosystem @ Bloomberg: Presented by Steven Bower & Ken LaPorte, Bloomberg

October 11-14, 2016 • Boston, MA


Building a Vibrant Search Ecosystem @ Bloomberg

Steven Bower & Ken LaPorte

Copyright 2016 Bloomberg Finance L.P. All rights reserved.


Bloomberg
•  Largest provider of financial news and information
•  Our strength is quickly and accurately delivering data, news and analytics
•  Creating high-performance and accurate information retrieval systems is core to our strength


Why are we giving this talk?


What came before…
•  Search has been around for a long time at Bloomberg
   -  Rapid delivery of product to clients
   -  Proprietary, commercial and open-source search technologies
•  Fragmented solutions
   -  Disparate search technologies
   -  Custom code
   -  Deployment patterns
   -  Lack of standards
•  Costly to maintain & evolve


How We Got Started
•  Created a team to specialize in search
•  Reviewed existing applications reliant upon search
•  Selected a set of representative applications
   -  Various scales
   -  Data types
   -  Distinct requirements


Why Solr?
•  Evaluated other open source search engines
   -  Already used at Bloomberg
•  Large community & widely used
•  Established & growing feature set
•  Scalable
•  Committed to open source
   -  Ability to contribute to core engine
   -  Ability to fix bugs ourselves
   -  Contributions in almost every Solr release since 4.5.0
   -  3 Solr committers at the company


Search as a service
•  Designed platform with application teams
•  Middleware service to wrap Solr
   -  Familiar & lightweight interface
   -  Simplified APIs
   -  Insulate clients from changes in Solr
•  Pass-thru capability
•  Basic monitoring/metrics
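The middleware idea on this slide can be illustrated with a minimal sketch. Everything named here (`SearchService`, `Transport`, the `search` and `raw` methods) is hypothetical and not from the talk; the only Solr details assumed are the standard `/select` handler with its `q` and `rows` parameters and the `response.docs` layout of a JSON query response. The transport is injected, so the sketch makes no network calls.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, List

# Hypothetical transport: takes a Solr path and request parameters and
# returns the decoded JSON response. A real one would issue an HTTP request.
Transport = Callable[[str, Dict[str, Any]], Dict[str, Any]]


@dataclass
class SearchService:
    """Illustrative facade over one Solr collection (names are assumptions)."""
    collection: str
    transport: Transport
    query_count: int = 0  # basic metric: number of requests served

    def search(self, text: str, limit: int = 10) -> List[Dict[str, Any]]:
        # Simplified API: callers never see Solr parameter names, so a
        # change in Solr only has to be absorbed here.
        params = {"q": text, "rows": limit, "wt": "json"}
        return self._call("/select", params)["response"]["docs"]

    def raw(self, handler: str, params: Dict[str, Any]) -> Dict[str, Any]:
        # Pass-through: advanced callers can reach any handler directly.
        return self._call(handler, params)

    def _call(self, handler: str, params: Dict[str, Any]) -> Dict[str, Any]:
        self.query_count += 1
        return self.transport(f"/solr/{self.collection}{handler}", params)
```

Injecting the transport keeps the facade testable with a stub and leaves a single place to change if the underlying engine or its API moves.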


Open for business!
•  Hundreds of search applications
   -  Diverse use cases and scale
   -  Displaced other technologies
•  >10 billion documents
•  >10 million new documents daily
•  >4,000 Solr instances
•  100s of servers
•  >2,000 queries per second
•  Mission critical to Bloomberg and the financial markets

[Chart: Number of Collections over Time; x-axis: Time (2012 to 2016), y-axis: Number of Collections (0 to 300)]


What have we done?!
•  Human scaling
•  Ineffective alarming
•  Manual build process
   -  Limited automated testing
•  Configuration management
•  Lots of known unknowns


Challenge: Ecosystem
•  Ownership
   -  Where’s the line?
•  Planning for scale
•  Education
   -  Search != Database
   -  Data types (text parsing)
   -  Relevance
   -  Features


Solution: Ecosystem
•  Survey
   -  Understand business requirements
   -  Identify scale and complexity
   -  Assist with schema and query design
   -  Concerns
•  Develop & Test
   -  Best practices
   -  Documentation & code samples
   -  Office hours & support chat
   -  Community development


Solution: Ecosystem
•  Validate & Deploy
   -  Hardware provisioning
   -  Automated deployments
   -  Hot & cold collections
   -  Load testing
•  Maintain and Grow
   -  Applications change & grow
   -  Solr & platform upgrades
   -  Monitoring


Challenge: Monitoring Solr
•  Very large monitoring footprint
•  What should we monitor?
   -  Ping
   -  Cluster state
   -  Process state
   -  Server health
•  False alarms
   -  Flutter
   -  Solr can lie to you! (SOLR-8599)
•  Many different ways to view system health
   -  Different people care about different things
   -  Active vs Forensic
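As an illustration of the "cluster state" signal, here is a minimal sketch that scans an already decoded response from Solr's Collections API `CLUSTERSTATUS` action and lists replicas that look unhealthy. The function name and the `"active-but-node-not-live"` label are made up for this example. It also cross-checks each replica's `node_name` against `live_nodes`, on the assumption that a replica can still be recorded as `active` after its node has gone away, which is one reason a single signal is not enough.

```python
from typing import Any, Dict, List, Tuple


def unhealthy_replicas(cluster_status: Dict[str, Any]) -> List[Tuple[str, str, str, str]]:
    """Return (collection, shard, replica, reason) for each suspect replica.

    `cluster_status` is the decoded JSON of a CLUSTERSTATUS response.
    """
    cluster = cluster_status.get("cluster", {})
    live = set(cluster.get("live_nodes", []))
    problems = []
    for coll_name, coll in cluster.get("collections", {}).items():
        for shard_name, shard in coll.get("shards", {}).items():
            for rep_name, rep in shard.get("replicas", {}).items():
                state = rep.get("state", "unknown")
                if state != "active":
                    problems.append((coll_name, shard_name, rep_name, state))
                elif rep.get("node_name") not in live:
                    # Recorded as active, but its node is not in live_nodes.
                    problems.append(
                        (coll_name, shard_name, rep_name, "active-but-node-not-live")
                    )
    return problems
```

A check like this only reports what the cluster state says; as the slide warns, that view can itself be wrong, so it is best treated as one input among several.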


Solution: Monitoring Solr
•  Monitor via multiple mechanisms
•  Aggregate events
   -  Alarm on multiple signals
   -  Delay alarms
•  Niteowl
   -  Solr / ZooKeeper / Generic
   -  Distributed / Scalable
   -  Events indexed into Solr
•  Led to massive stability improvements
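The "alarm on multiple signals" and "delay alarms" bullets can be sketched as a small aggregator. This is not Niteowl, whose design the slides do not describe; the class name, thresholds and check names below are all assumptions chosen for illustration. An alarm fires only when at least `min_signals` distinct checks are failing for the same target and that condition has persisted for `delay_seconds`.

```python
from dataclasses import dataclass, field
from typing import Dict, Set


@dataclass
class AlarmAggregator:
    """Hypothetical aggregator: multiple failing signals plus a delay."""
    min_signals: int = 2         # distinct failing checks required
    delay_seconds: float = 60.0  # how long the condition must persist
    _failing: Dict[str, Set[str]] = field(default_factory=dict)
    _since: Dict[str, float] = field(default_factory=dict)

    def report(self, target: str, check: str, ok: bool, now: float) -> bool:
        """Record one check result; return True if an alarm should fire."""
        failing = self._failing.setdefault(target, set())
        if ok:
            failing.discard(check)
        else:
            failing.add(check)
        if len(failing) < self.min_signals:
            # Condition cleared: reset the timer so brief flutter never alarms.
            self._since.pop(target, None)
            return False
        start = self._since.setdefault(target, now)
        return now - start >= self.delay_seconds
```

For example, a node whose ping check fails once and then recovers never raises an alarm, while a node failing both ping and cluster-state checks for a full minute does.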


What We Found
•  Long Garbage Collections
   -  Profiler interactions with Mmap
   -  Young generation pressure during ingest
   -  Use G1GC / Keep heap small
•  Long Recovery Times
   -  Transaction logs don’t hold enough
   -  Always doing full replications when under ingest load
•  Solr Bugs
   -  SOLR-9310, SOLR-9207, SOLR-9506: Long recovery times
   -  SOLR-6931: Random connection reset issues
   -  SOLR-8085: Replicas get out of sync
   -  SOLR-8599: ZooKeeper client in inconsistent state
•  Out of Memory Exceptions
   -  One off OOMs are not uncommon
   -  Use DocValues!
   -  OOM Killer
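The "Use DocValues!" advice lends itself to an automated check. The sketch below is only an illustration: the function name is hypothetical, and it assumes a classic `schema.xml` layout in which `docValues` is set either on a `<field>` or inherited from its `<fieldType>`, treating a missing attribute as `false`. That assumption ignores schema-version defaults and dynamic fields, so a real check would need to be stricter about both.

```python
import xml.etree.ElementTree as ET
from typing import Iterable, List


def fields_missing_docvalues(schema_xml: str, fields: Iterable[str]) -> List[str]:
    """Return the given field names that do not resolve to docValues="true"."""
    root = ET.fromstring(schema_xml)
    # docValues declared on a fieldType acts as the default for its fields.
    type_defaults = {
        ft.get("name"): ft.get("docValues", "false") for ft in root.iter("fieldType")
    }
    declared = {f.get("name"): f for f in root.iter("field")}
    missing = []
    for name in fields:
        f = declared.get(name)
        if f is None:
            missing.append(name)  # undeclared or dynamic field: flag for review
            continue
        value = f.get("docValues", type_defaults.get(f.get("type"), "false"))
        if value.lower() != "true":
            missing.append(name)
    return missing
```

Passing in the fields an application sorts or facets on gives a quick list of candidates to review before they cause heap pressure in production.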


Challenge: Configuration Management
•  Deployment process
•  Requires versioning / rollback
   -  Some changes cannot be rolled back
•  Template driven configuration
   -  Good for simple things
   -  Doesn’t scale for complex collections
•  Lack of provenance


Solution: Configuration Management
•  Convert to SDLC process
   -  Configurations live in Git repository
   -  Solr extensions linked as dependencies
   -  Built with Maven / Jenkins
   -  Published to artifact repository
•  Validation of configurations during build
   -  Static analysis
      •  Allowed schema changes
      •  Access control of Solr configuration
   -  Integration testing
•  Deployed to ZooKeeper / Solr
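A build-time "allowed schema changes" check might look like the sketch below. The policy shown is an assumption for illustration, not the rules used at Bloomberg: adding a field is accepted, while removing a field or changing the type of an existing field is rejected, since changes of that kind typically cannot be applied to an existing index without reindexing. Function names are hypothetical.

```python
import xml.etree.ElementTree as ET
from typing import Dict, List


def _fields(schema_xml: str) -> Dict[str, Dict[str, str]]:
    """Map each declared field name to its XML attributes."""
    root = ET.fromstring(schema_xml)
    return {f.get("name"): dict(f.attrib) for f in root.iter("field")}


def disallowed_changes(old_xml: str, new_xml: str) -> List[str]:
    """List changes between two schema versions that the assumed policy rejects."""
    old, new = _fields(old_xml), _fields(new_xml)
    errors = []
    for name, attrs in sorted(old.items()):
        if name not in new:
            errors.append(f"field removed: {name}")
        elif new[name].get("type") != attrs.get("type"):
            errors.append(
                f"type changed: {name} ({attrs.get('type')} -> {new[name].get('type')})"
            )
    return errors
```

Run as a build step against the previously published configuration, a non-empty result would fail the build before anything reaches ZooKeeper.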


Challenge: Infrastructure
•  Substantial demand
•  Large lead times
•  Differing requirements
   -  Security
   -  Scale
   -  Control
•  Too many pets!


Solution: Infrastructure
•  Streamlined process
•  Shared and dedicated resources
•  Built from the ground up
   -  Well defined layers of abstraction
   -  Cattle not pets
   -  Infrastructure-as-code
   -  SDLC / provenance
•  Better hardware == better experience
   -  SSDs
   -  More RAM
   -  Faster network

[Diagram: layers of abstraction, labeled Hardware / OS, Control Plane, Applications, APIs]


What’s next?
•  Containerization
   -  Simplify / decentralize operational procedures
   -  Local testing and development
   -  Security / Metrics / QoS
•  Delegation of control
   -  Mute / Direct alarms to tenants
   -  Tenant managed
•  Detect failures before they happen
   -  Heuristics / ML models
•  Solr
   -  More work on streaming
   -  Analytics
      •  Distributed analytics
      •  Pivot faceting


Building a Vibrant Search Ecosystem @ Bloomberg

QUESTIONS?

Steven Bower

Ken LaPorte