Why Is My Solr Slow? Presented by Mike Drob, Cloudera
OCTOBER 11-14, 2016 • BOSTON, MA
Why is my Solr slow?
Mike Drob
Software Engineer, Cloudera
Who Am I?
● Apache HTrace (Incubating) Committer
● Software Engineer @ Cloudera
● Middle School LEGO Robotics Mentor
Outline
● The need for tracing
● HTrace introduction
● Examples with Solr
● Finding the slow server!
Scenario
● User complains that queries are “slow”
  ● Caches?
  ● Or faceting?
  ● Or sorting?
  ● Or...?
The State of the Art
Run a query with debug=timing:
"timing":{
  "time":178.0,
  "prepare":{
    "time":7.0,
    "query":{"time":4.0}},
  "process":{
    "time":160.0,
    "query":{"time":118.0},
    "facet":{"time":33.0}}}
What is Apache HTrace
● Distributed Tracing Library
● Primitive Type is the “Span”
  ● Unique ID
  ● Source + Description
  ● Start/Stop Time
  ● Additional Metadata
● See Also: Dapper, Zipkin, Wingtips
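As a rough illustration of the four span attributes listed above, here is a minimal data-structure sketch. It is a hypothetical model written for this transcript, not the actual org.apache.htrace.core.Span interface; the field names, the use of milliseconds, and the string-to-string metadata map are all assumptions.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical span model for illustration; NOT the HTrace Span interface. */
final class SpanSketch {
    final long spanId;         // unique ID
    final String source;       // which process produced the span
    final String description;  // which unit of work the span covers
    final long startMillis;    // start time
    private long stopMillis = -1;  // stop time; -1 while the span is still running
    final Map<String, String> metadata = new LinkedHashMap<>();  // additional metadata

    SpanSketch(long spanId, String source, String description, long startMillis) {
        this.spanId = spanId;
        this.source = source;
        this.description = description;
        this.startMillis = startMillis;
    }

    void stop(long stopMillis) {
        this.stopMillis = stopMillis;
    }

    boolean isRunning() {
        return stopMillis < 0;
    }

    long durationMillis() {
        if (isRunning()) {
            throw new IllegalStateException("span is still running");
        }
        return stopMillis - startMillis;
    }

    public static void main(String[] args) {
        long start = System.currentTimeMillis();
        SpanSketch span = new SpanSketch(42L, "solr-node-1", "query", start);
        span.metadata.put("q", "*:*");
        span.stop(System.currentTimeMillis());
        System.out.println(span.description + " on " + span.source
                + " took " + span.durationMillis() + " ms " + span.metadata);
    }
}
```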
Reference Architecture
[Diagram: four applications, each with an embedded trace generator (TraceGen), all reporting to one Collector]
Example Architecture
[Diagram: four Solr nodes, each emitting trace data, all reporting to one htraced instance]
How Do I Trace?
● Can Trace Any “Unit Of Work”
  ● Method Calls
  ● Threads
  ● RPCs

Untraced:
  doQuery();

Traced:
  try (TraceScope scope = tracer.newScope("work")) {
      doQuery();
  }
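The try-with-resources pattern works because the scope is closed automatically when the block exits, even if the work throws. The sketch below imitates that behavior with a stand-in class so it runs without any library on the classpath. ScopeSketch, its Scope class, and the "work/query" label are hypothetical; the real TraceScope and Tracer come from HTrace and do considerably more.

```java
import java.util.ArrayList;
import java.util.List;

/** Stand-in tracer that illustrates the try-with-resources pattern only; not HTrace. */
final class ScopeSketch {

    /** Descriptions of closed scopes with elapsed nanoseconds, in the order they closed. */
    static final List<String> finished = new ArrayList<>();

    static final class Scope implements AutoCloseable {
        private final String description;
        private final long startNanos = System.nanoTime();

        Scope(String description) {
            this.description = description;
        }

        @Override
        public void close() {
            // Runs when the try block exits, normally or by exception.
            long elapsed = System.nanoTime() - startNanos;
            finished.add(description + ":" + elapsed);
        }
    }

    static Scope newScope(String description) {
        return new Scope(description);
    }

    static void doQuery() {
        // Placeholder for the real unit of work.
    }

    public static void main(String[] args) {
        try (Scope outer = newScope("work")) {
            try (Scope inner = newScope("work/query")) {
                doQuery();
            }
        }
        // The inner scope closes first, so it is recorded first.
        System.out.println(finished);
    }
}
```

Nesting scopes this way is what gives a trace its parent/child shape: the outer unit of work stays open while the inner one starts and finishes.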
What Does It Look Like?
● Trace of q=*:* and resulting sub-queries
● Data modeled on debug=timing
Aside: Benefits over debug=timing
● Separate time-lines for each server
● Ability to mark annotations
● Data stored for later analysis
● Trace non-query requests
● Trace calls to other systems
Something More Complex
techproducts/select?q=*:*&facet.field=popularity
And Other Commands
Configuration
● New section in solr.xml
<trace>
  <str name="span.receiver.classes">org.apache.htrace.impl.HTracedSpanReceiver</str>
  <str name="htraced.receiver.address">trace-server:9075</str>
  <str name="sampler.classes">org.apache.htrace.core.ProbabilitySampler</str>
</trace>
● Sampler Options
● Advanced: Force Trace
  … traceid=<64-bit> …
● Advanced: Buffer Options
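The configuration above names a probability sampler, which decides per request whether to trace at all. The sketch below shows the general idea under stated assumptions: it is a hypothetical class written for this transcript, not the HTrace ProbabilitySampler source, and the sampling fraction and the injected Random are illustrative choices.

```java
import java.util.Random;

/** Hypothetical probability sampler for illustration; not the HTrace implementation. */
final class ProbabilitySamplerSketch {
    private final double fraction;  // share of requests to trace, between 0.0 and 1.0
    private final Random random;

    ProbabilitySamplerSketch(double fraction, Random random) {
        if (fraction < 0.0 || fraction > 1.0) {
            throw new IllegalArgumentException("fraction must be in [0, 1]: " + fraction);
        }
        this.fraction = fraction;
        this.random = random;
    }

    /** Returns true if the next request should be traced. */
    boolean next() {
        // nextDouble() is in [0, 1), so 0.0 never samples and 1.0 always samples.
        return random.nextDouble() < fraction;
    }

    public static void main(String[] args) {
        ProbabilitySamplerSketch sampler = new ProbabilitySamplerSketch(0.01, new Random());
        int traced = 0;
        int requests = 100_000;
        for (int i = 0; i < requests; i++) {
            if (sampler.next()) {
                traced++;
            }
        }
        System.out.println("Traced " + traced + " of " + requests + " requests");
    }
}
```

Sampling keeps tracing cost and storage proportional to the fraction chosen, at the price of missing most individual requests; that trade-off is presumably why the slide also lists a way to force a trace for a specific request.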
Performance
● Each trace adds ~25 ns of overhead
● Search handler can generate ~20 traces
● Traces are buffered in memory
● May consider multiple receivers
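Taking the two approximate figures from this slide at face value, the per-request cost can be estimated with simple multiplication. The helper below is only a back-of-the-envelope sketch; its names are made up, and it ignores anything the slide does not quantify, such as the cost of buffering or shipping spans to a receiver.

```java
/** Back-of-the-envelope estimate using the slide's approximate figures. */
final class TracingOverhead {

    /** Total tracing overhead for one request, in nanoseconds. */
    static long perRequestNanos(long nanosPerTrace, int tracesPerRequest) {
        return nanosPerTrace * tracesPerRequest;
    }

    public static void main(String[] args) {
        long nanos = perRequestNanos(25, 20);  // ~25 ns per trace, ~20 traces per search
        System.out.println("Estimated overhead: " + nanos + " ns per request ("
                + (nanos / 1_000_000.0) + " ms)");
    }
}
```

With these inputs the estimate is about 500 ns per search request, which is far below a millisecond.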
Future Work
● Add tracing to clients (SolrJ)
● Leverage MDC logging
● Async Requests
Demo
Demo (Backup)
"responseHeader":{
  "zkConnected":true,
  "status":0,
  "QTime":5251,
  "params":{
    "q":"*:*",
    "indent":"on",
    "wt":"json",
    "_":"1476334957442"}},
Thank you
[email protected] @mikhaildrob