Solr for Indexing and Searching Logs
DESCRIPTION
How to index logs from Logstash, Rsyslog, Flume, and Fluentd (via Morphlines, etc.) into Solr and make them searchable.
TRANSCRIPT
![Page 1: Solr for Indexing and Searching Logs](https://reader034.vdocuments.mx/reader034/viewer/2022042518/54c64ae44a7959d95b8b45b8/html5/thumbnails/1.jpg)
Using Solr to Search and Analyze Logs
Radu Gheorghe
@radu0gheorghe @sematext
![Page 2: Solr for Indexing and Searching Logs](https://reader034.vdocuments.mx/reader034/viewer/2022042518/54c64ae44a7959d95b8b45b8/html5/thumbnails/2.jpg)
Elasticsearch API
syslogreceiver
Logsene
Kibana
syslogd
Logstash
![Page 4: Solr for Indexing and Searching Logs](https://reader034.vdocuments.mx/reader034/viewer/2022042518/54c64ae44a7959d95b8b45b8/html5/thumbnails/4.jpg)
What about Solr?
![Page 5: Solr for Indexing and Searching Logs](https://reader034.vdocuments.mx/reader034/viewer/2022042518/54c64ae44a7959d95b8b45b8/html5/thumbnails/5.jpg)
defining and handling logs in general
4 sets of tools to send logs to Solr
Performance tuning and SolrCloud
![Page 6: Solr for Indexing and Searching Logs](https://reader034.vdocuments.mx/reader034/viewer/2022042518/54c64ae44a7959d95b8b45b8/html5/thumbnails/6.jpg)
syslog
Defining and Handling Logs (story time!)
syslog
syslog
syslog
?
![Page 7: Solr for Indexing and Searching Logs](https://reader034.vdocuments.mx/reader034/viewer/2022042518/54c64ae44a7959d95b8b45b8/html5/thumbnails/7.jpg)
Requirements
1) What’s wrong?
http://eddysuaib.com/wp-content/uploads/2012/12/Keyword-icon.png
( for debugging)
![Page 8: Solr for Indexing and Searching Logs](https://reader034.vdocuments.mx/reader034/viewer/2022042518/54c64ae44a7959d95b8b45b8/html5/thumbnails/8.jpg)
Problem
looooots of messages coming in
http://www.sciencesurvivalblog.com/getting-published/unfinished-manuscripts_2346
![Page 9: Solr for Indexing and Searching Logs](https://reader034.vdocuments.mx/reader034/viewer/2022042518/54c64ae44a7959d95b8b45b8/html5/thumbnails/9.jpg)
Solved with no indexing
BUT
![Page 10: Solr for Indexing and Searching Logs](https://reader034.vdocuments.mx/reader034/viewer/2022042518/54c64ae44a7959d95b8b45b8/html5/thumbnails/10.jpg)
Elasticsearch
![Page 11: Solr for Indexing and Searching Logs](https://reader034.vdocuments.mx/reader034/viewer/2022042518/54c64ae44a7959d95b8b45b8/html5/thumbnails/11.jpg)
Requirements
1) What’s wrong? ✓
2) What will go wrong?
(stats)
![Page 12: Solr for Indexing and Searching Logs](https://reader034.vdocuments.mx/reader034/viewer/2022042518/54c64ae44a7959d95b8b45b8/html5/thumbnails/12.jpg)
Parsing Raw Logs
BUT
mickey mouse 10
user item time
still slow
format changes
![Page 13: Solr for Indexing and Searching Logs](https://reader034.vdocuments.mx/reader034/viewer/2022042518/54c64ae44a7959d95b8b45b8/html5/thumbnails/13.jpg)
Parsing Raw Logs
BUT
mickey mouse 0 10
add error code
still slow
format changes
![Page 14: Solr for Indexing and Searching Logs](https://reader034.vdocuments.mx/reader034/viewer/2022042518/54c64ae44a7959d95b8b45b8/html5/thumbnails/14.jpg)
Facets. Logging in JSON
2013-11-06… mickey mouse
{ "date": "2013-11-06", "message": "mickey mouse"}
![Page 15: Solr for Indexing and Searching Logs](https://reader034.vdocuments.mx/reader034/viewer/2022042518/54c64ae44a7959d95b8b45b8/html5/thumbnails/15.jpg)
Facets. Logging in JSON
2013-11-06… @cee:{"user": "mickey"}
{ "date": "2013-11-06", "user": "mickey"}
2013-11-06… mickey mouse
{ "date": "2013-11-06", "message": "mickey mouse"}
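The two cases on this slide can be sketched in a few lines of Python. This is only an illustration of the idea, not the code rsyslog runs: the function name `syslog_to_doc` and the fallback behaviour for malformed JSON are assumptions made for the example. A message that starts with the `@cee:` cookie is parsed as JSON and its fields become document fields; anything else is kept whole in a `message` field.

```python
import json

CEE_COOKIE = "@cee:"


def syslog_to_doc(date, message):
    """Turn one syslog message into a flat dict ready for indexing."""
    text = message.lstrip()
    if text.startswith(CEE_COOKIE):
        try:
            fields = json.loads(text[len(CEE_COOKIE):])
        except ValueError:
            fields = None  # cookie present but payload is not valid JSON
        if isinstance(fields, dict):
            doc = {"date": date}
            doc.update(fields)
            return doc
    # No cookie (or unusable payload): keep the raw text as one field.
    return {"date": date, "message": message}


print(syslog_to_doc("2013-11-06", '@cee:{"user": "mickey"}'))
print(syslog_to_doc("2013-11-06", "mickey mouse"))
```

Structured fields such as `user` are what make faceting and stats cheap later on, because nothing has to be re-parsed at query time.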
![Page 16: Solr for Indexing and Searching Logs](https://reader034.vdocuments.mx/reader034/viewer/2022042518/54c64ae44a7959d95b8b45b8/html5/thumbnails/16.jpg)
Requirements
1) What’s wrong? ✓
2) What will go wrong? ✓
3) Handle logs like production data ✓
![Page 17: Solr for Indexing and Searching Logs](https://reader034.vdocuments.mx/reader034/viewer/2022042518/54c64ae44a7959d95b8b45b8/html5/thumbnails/17.jpg)
Requirements
1) What’s wrong? ✓
2) What will go wrong? ✓
3) Handle logs like production data ✓
What is a log?
How to handle logs?
![Page 18: Solr for Indexing and Searching Logs](https://reader034.vdocuments.mx/reader034/viewer/2022042518/54c64ae44a7959d95b8b45b8/html5/thumbnails/18.jpg)
4 Ways of Sending Logs to Solr
logger
Logstash
files
![Page 19: Solr for Indexing and Searching Logs](https://reader034.vdocuments.mx/reader034/viewer/2022042518/54c64ae44a7959d95b8b45b8/html5/thumbnails/19.jpg)
Schemaless
% cd solr-4.5.1/example/
% mv solr solr.bak
% cp -R example-schemaless/solr/ .
![Page 20: Solr for Indexing and Searching Logs](https://reader034.vdocuments.mx/reader034/viewer/2022042518/54c64ae44a7959d95b8b45b8/html5/thumbnails/20.jpg)
Automatic ID generation
solrconfig.xml
<updateRequestProcessorChain name="add-unknown-fields-to-the-schema">
  ……..
  <processor class="solr.UUIDUpdateProcessorFactory">
    <str name="fieldName">id</str>
  </processor>
  <processor class="solr.LogUpdateProcessorFactory"/>
  <processor class="solr.RunUpdateProcessorFactory"/>
</updateRequestProcessorChain>
http://solr.pl/en/2013/07/08/automatically-generate-document-identifiers-solr-4-x/
![Page 21: Solr for Indexing and Searching Logs](https://reader034.vdocuments.mx/reader034/viewer/2022042518/54c64ae44a7959d95b8b45b8/html5/thumbnails/21.jpg)
logger
/dev/log
mmjsonparse
omprog + script
![Page 22: Solr for Indexing and Searching Logs](https://reader034.vdocuments.mx/reader034/viewer/2022042518/54c64ae44a7959d95b8b45b8/html5/thumbnails/22.jpg)
/dev/log -> parse -> format -> send to Solr
% logger '@cee: {"hello": "world"}'
rsyslog.conf
module(load="imuxsock") # version 7+
![Page 23: Solr for Indexing and Searching Logs](https://reader034.vdocuments.mx/reader034/viewer/2022042518/54c64ae44a7959d95b8b45b8/html5/thumbnails/23.jpg)
/dev/log -> parse -> format -> send to Solr
...
module(load="mmjsonparse")
action(type="mmjsonparse")
![Page 24: Solr for Indexing and Searching Logs](https://reader034.vdocuments.mx/reader034/viewer/2022042518/54c64ae44a7959d95b8b45b8/html5/thumbnails/24.jpg)
/dev/log -> parse -> format -> send to Solr
...
template(name="CEE" type="list") {
  property(name="$!all-json")
  constant(value="\n")
}
![Page 25: Solr for Indexing and Searching Logs](https://reader034.vdocuments.mx/reader034/viewer/2022042518/54c64ae44a7959d95b8b45b8/html5/thumbnails/25.jpg)
/dev/log -> parse -> format -> send to Solr
...
action(type="mmjsonparse")
template(name="CEE"…
module(load="omprog")
if $parsesuccess == "OK" then action(type="omprog"
    binary="/opt/json-to-solr.py"
    template="CEE")
![Page 26: Solr for Indexing and Searching Logs](https://reader034.vdocuments.mx/reader034/viewer/2022042518/54c64ae44a7959d95b8b45b8/html5/thumbnails/26.jpg)
/dev/log -> parse -> format -> send to Solr
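import json, pysolr, sys

solr = pysolr.Solr('http://localhost:8983/solr/')
# omprog writes one JSON document per line to this script's stdin
while True:
    line = sys.stdin.readline()
    if not line:  # empty string means EOF: rsyslog closed the pipe
        break
    doc = json.loads(line)
    solr.add([doc])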
![Page 27: Solr for Indexing and Searching Logs](https://reader034.vdocuments.mx/reader034/viewer/2022042518/54c64ae44a7959d95b8b45b8/html5/thumbnails/27.jpg)
Avro
MorphlineSolr Sink
![Page 28: Solr for Indexing and Searching Logs](https://reader034.vdocuments.mx/reader034/viewer/2022042518/54c64ae44a7959d95b8b45b8/html5/thumbnails/28.jpg)
Avro -> buffer -> parse -> send to Solr
https://github.com/mpercy/flume-log4j-example
flume.conf
agent.sources = avroSrc
agent.sources.avroSrc.type = avro
agent.sources.avroSrc.bind = 0.0.0.0
agent.sources.avroSrc.port = 41414
![Page 29: Solr for Indexing and Searching Logs](https://reader034.vdocuments.mx/reader034/viewer/2022042518/54c64ae44a7959d95b8b45b8/html5/thumbnails/29.jpg)
Avro -> buffer -> parse -> send to Solr
flume.conf
agent.channels = solrMemoryChannel
agent.channels.solrMemoryChannel.type = memory
agent.sources.avroSrc.channels = solrMemoryChannel
![Page 30: Solr for Indexing and Searching Logs](https://reader034.vdocuments.mx/reader034/viewer/2022042518/54c64ae44a7959d95b8b45b8/html5/thumbnails/30.jpg)
Avro -> buffer -> parse -> send to Solr
flume.conf
agent.sinks = solrSink
agent.sinks.solrSink.type = org.apache.flume.sink.solr.morphline.MorphlineSolrSink
agent.sinks.solrSink.morphlineFile = conf/morphline.conf
agent.sinks.solrSink.channel = solrMemoryChannel
![Page 31: Solr for Indexing and Searching Logs](https://reader034.vdocuments.mx/reader034/viewer/2022042518/54c64ae44a7959d95b8b45b8/html5/thumbnails/31.jpg)
Avro -> buffer -> parse -> send to Solr
morphline.conf
...
commands : [
  { readLine { charset : UTF-8 }}
  { grok {
      dictionaryFiles : [conf/grok-patterns]
      expressions : {
        message : """%{INT:pid} %{DATA:message}"""
...
https://github.com/cloudera/search/tree/master/samples/solr-nrt/grok-dictionaries
![Page 32: Solr for Indexing and Searching Logs](https://reader034.vdocuments.mx/reader034/viewer/2022042518/54c64ae44a7959d95b8b45b8/html5/thumbnails/32.jpg)
Avro -> buffer -> parse -> send to Solr
morphline.conf
SOLR_LOCATOR : {
  collection : collection1
  #zkHost : "127.0.0.1:2181"
  solrUrl : "http://localhost:8983/solr/"
}
...
commands : [
...
  { loadSolr {
      solrLocator : ${SOLR_LOCATOR}
...
![Page 33: Solr for Indexing and Searching Logs](https://reader034.vdocuments.mx/reader034/viewer/2022042518/54c64ae44a7959d95b8b45b8/html5/thumbnails/33.jpg)
fluent-logger fluent-plugin-solr
![Page 34: Solr for Indexing and Searching Logs](https://reader034.vdocuments.mx/reader034/viewer/2022042518/54c64ae44a7959d95b8b45b8/html5/thumbnails/34.jpg)
fluent-logger -> fluentd -> fluent-plugin-solr
% pip install fluent-logger
from fluent import sender, event
sender.setup('solr.test')  # tag prefix, matched by <match solr.**> in fluentd
event.Event('forward', {'hello': 'world'})  # sent with tag solr.test.forward
![Page 35: Solr for Indexing and Searching Logs](https://reader034.vdocuments.mx/reader034/viewer/2022042518/54c64ae44a7959d95b8b45b8/html5/thumbnails/35.jpg)
fluent-logger -> fluentd -> fluent-plugin-solr
<source>
type forward
</source>
<match solr.**>
type solr
host localhost
port 8983
core collection1
</match>
![Page 36: Solr for Indexing and Searching Logs](https://reader034.vdocuments.mx/reader034/viewer/2022042518/54c64ae44a7959d95b8b45b8/html5/thumbnails/36.jpg)
fluent-logger -> fluentd -> fluent-plugin-solr
% gem install fluent-plugin-solr
doc = Solr::Document.new(:hello => record["hello"])
https://github.com/btigit/fluent-plugin-solr
out_solr.rb
![Page 37: Solr for Indexing and Searching Logs](https://reader034.vdocuments.mx/reader034/viewer/2022042518/54c64ae44a7959d95b8b45b8/html5/thumbnails/37.jpg)
file
Logstash
file input -> grok filter -> solr_http output
![Page 38: Solr for Indexing and Searching Logs](https://reader034.vdocuments.mx/reader034/viewer/2022042518/54c64ae44a7959d95b8b45b8/html5/thumbnails/38.jpg)
logstash.conf:
input {
  file {
    path => "/tmp/testlog"
  }
}
file input -> grok filter -> solr_http output
% echo '2 world' >> /tmp/testlog
![Page 39: Solr for Indexing and Searching Logs](https://reader034.vdocuments.mx/reader034/viewer/2022042518/54c64ae44a7959d95b8b45b8/html5/thumbnails/39.jpg)
logstash.conf:
filter {
  grok {
    match => ["message", "%{NUMBER:pid} %{GREEDYDATA:hello}"]
  }
}
file input -> grok filter -> solr_http output
{"pid": "2", "hello":"world"}
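To make the grok step concrete, here is roughly what the pattern above does, written as a plain Python regular expression. This is a sketch under assumptions: the regex is a simplified stand-in for grok's `NUMBER` and `GREEDYDATA` patterns (the real `NUMBER` definition is more permissive), and `parse_line` is a name invented for the example.

```python
import re

# Simplified equivalents of %{NUMBER:pid} %{GREEDYDATA:hello}
LINE = re.compile(r"^(?P<pid>[+-]?\d+(?:\.\d+)?) (?P<hello>.*)$")


def parse_line(line):
    """Return the named captures as a dict, or None if the line does not match."""
    match = LINE.match(line.rstrip("\n"))
    return match.groupdict() if match else None


print(parse_line("2 world"))
```

As with grok, every captured value is a string, which is why `pid` shows up as `"2"` rather than `2`.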
![Page 40: Solr for Indexing and Searching Logs](https://reader034.vdocuments.mx/reader034/viewer/2022042518/54c64ae44a7959d95b8b45b8/html5/thumbnails/40.jpg)
logstash.conf:
output {
  solr_http { # master or v1.2.3+
    solr_url => "http://localhost:8983/solr"
  }
}
file input -> grok filter -> solr_http output
![Page 41: Solr for Indexing and Searching Logs](https://reader034.vdocuments.mx/reader034/viewer/2022042518/54c64ae44a7959d95b8b45b8/html5/thumbnails/41.jpg)
Fast and Cloud
![Page 42: Solr for Indexing and Searching Logs](https://reader034.vdocuments.mx/reader034/viewer/2022042518/54c64ae44a7959d95b8b45b8/html5/thumbnails/42.jpg)
“It Depends”
http://www.bigskytech.com/wp-content/uploads/2011/02/guage.png
load test monitor: SPM
20% off: LR2013SPM20
![Page 43: Solr for Indexing and Searching Logs](https://reader034.vdocuments.mx/reader034/viewer/2022042518/54c64ae44a7959d95b8b45b8/html5/thumbnails/43.jpg)
|>>>>|Single Core: # of docs/update
http://static.memrise.com.s3.amazonaws.com/uploads/blog-pictures/Simpsons_Updates.bmp
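One way to raise the number of docs per update is to buffer documents on the client and send them in groups. The sketch below assumes a hypothetical helper, `send_in_batches`, and an arbitrary default of 1000 docs per batch; the right number "depends" and should come from load testing. `send` is any callable that takes a list of documents. With pysolr that could be `solr.add`, which is called with a list in the script shown earlier.

```python
def send_in_batches(docs, send, batch_size=1000):
    """Group docs into lists of up to batch_size and pass each list to send().

    Returns the number of update calls made.
    """
    batch = []
    calls = 0
    for doc in docs:
        batch.append(doc)
        if len(batch) >= batch_size:
            send(batch)
            calls += 1
            batch = []  # start a fresh list; the sent one is left untouched
    if batch:  # flush whatever is left at the end of the stream
        send(batch)
        calls += 1
    return calls


# Stand-in for a real Solr client: just remember what would have been sent.
sent = []
calls = send_in_batches(({"id": i} for i in range(5)), sent.append, batch_size=2)
print(calls, [len(b) for b in sent])  # 3 [2, 2, 1]
```

A real log shipper would also flush on a timer so that a slow trickle of messages does not sit in the buffer indefinitely; that part is left out here.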
![Page 44: Solr for Indexing and Searching Logs](https://reader034.vdocuments.mx/reader034/viewer/2022042518/54c64ae44a7959d95b8b45b8/html5/thumbnails/44.jpg)
|>>>>|Single Core: Commits
http://cache.desktopnexus.com/thumbnails/1306-bigthumbnail.jpg
http://www.musicfestivaljunkies.com/wp-content/uploads/2012/01/HardLogo.png
<autoSoftCommit> <maxTime>...
<autoCommit> <openSearcher>false <maxTime>???
<ramBufferSizeMB>???
![Page 45: Solr for Indexing and Searching Logs](https://reader034.vdocuments.mx/reader034/viewer/2022042518/54c64ae44a7959d95b8b45b8/html5/thumbnails/45.jpg)
|>>>>|Single Core: Size and Merges
http://sweetclipart.com/multisite/sweetclipart/files/scissors_blue_silver.png
http://mergewords.com/gfx/logo-big.png
omitNorms="true"
omitTermFreqAndPositions="true"
<mergeFactor>??
![Page 46: Solr for Indexing and Searching Logs](https://reader034.vdocuments.mx/reader034/viewer/2022042518/54c64ae44a7959d95b8b45b8/html5/thumbnails/46.jpg)
|>>>>|Single Core: Caches
http://vector-magz.com/wp-content/uploads/2013/06/diamond-clip-art4.png
http://www.clker.com/cliparts/1/f/6/3/11971228961330048838SaraSara_Ice_cube_2.svg.med.png
http://clipartist.info/RSS/openclipart.org/2011/May/02-Monday/migrating_penguin_penguinmigrating-555px.png
<fieldValueCache ... size="???" autowarmCount="0"
docValues="true"
facets
changing data
to sort & facet
![Page 47: Solr for Indexing and Searching Logs](https://reader034.vdocuments.mx/reader034/viewer/2022042518/54c64ae44a7959d95b8b45b8/html5/thumbnails/47.jpg)
SolrCloud: ZooKeeper
bin/zkServer.sh start
OR
java -DzkRun … -jar start.jar
http://www.clker.com/cliparts/c/a/8/d/1331060720387485902Roaring%20Tiger.svg.hi.png
http://fc03.deviantart.net/fs71/f/2012/196/6/a/piggy_back_rides_are_the_best_rides__by_yipped-d57b3sh.png
![Page 48: Solr for Indexing and Searching Logs](https://reader034.vdocuments.mx/reader034/viewer/2022042518/54c64ae44a7959d95b8b45b8/html5/thumbnails/48.jpg)
SolrCloud: ZooKeeper
zkcli.sh -cmd upconfig \
  -zkhost SERVER:2181 \
  -confdir solr/collection1/conf/ \
  -confname start
-Dbootstrap_confdir=solr/collection1/conf -Dcollection.configName=start
http://www.clker.com/cliparts/c/a/8/d/1331060720387485902Roaring%20Tiger.svg.hi.png
http://fc03.deviantart.net/fs71/f/2012/196/6/a/piggy_back_rides_are_the_best_rides__by_yipped-d57b3sh.png
![Page 49: Solr for Indexing and Searching Logs](https://reader034.vdocuments.mx/reader034/viewer/2022042518/54c64ae44a7959d95b8b45b8/html5/thumbnails/49.jpg)
SolrCloud: Start Nodes
java -DzkHost=SERVER:2181 -jar start.jar
![Page 50: Solr for Indexing and Searching Logs](https://reader034.vdocuments.mx/reader034/viewer/2022042518/54c64ae44a7959d95b8b45b8/html5/thumbnails/50.jpg)
Timed Collections
04Nov
05Nov
06Nov
07Nov
search latest
search all
index
optimize
![Page 51: Solr for Indexing and Searching Logs](https://reader034.vdocuments.mx/reader034/viewer/2022042518/54c64ae44a7959d95b8b45b8/html5/thumbnails/51.jpg)
Collections API
05Nov
06Nov
07Nov
08Nov
action=CREATE&name=08Nov&numShards=4
action=DELETE&name=05Nov
![Page 52: Solr for Indexing and Searching Logs](https://reader034.vdocuments.mx/reader034/viewer/2022042518/54c64ae44a7959d95b8b45b8/html5/thumbnails/52.jpg)
Aliases. Optimize
05Nov
06Nov
07Nov
08Nov
action=CREATEALIAS&name=ALL&collection=06Nov,07Nov,08Nov
action=CREATEALIAS&name=LATEST&collection=08Nov
07Nov/update?optimize=true
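The daily rotation from the last three slides can be scripted. The sketch below only builds the Collections API URLs and does not send them, so it can be tried without a running cluster. The helper names (`collections_url`, `rotate`), the base URL, and the choice to keep three days are assumptions for illustration. The ordering is also a choice made here: aliases are repointed before the oldest collection is deleted, so `ALL` never references a collection that is already gone.

```python
from urllib.parse import urlencode


def collections_url(base, action, **params):
    """Build a Collections API URL such as .../admin/collections?action=CREATE&..."""
    query = {"action": action}
    query.update(params)
    # safe="," keeps comma-separated collection lists readable in the URL
    return base.rstrip("/") + "/admin/collections?" + urlencode(query, safe=",")


def rotate(base, days, new_day, keep=3, num_shards=4):
    """Return the URLs that add new_day, move the aliases and drop old days.

    days is the list of existing collection names, oldest first.
    """
    all_days = list(days) + [new_day]
    kept = all_days[-keep:]
    dropped = all_days[:-keep]
    urls = [collections_url(base, "CREATE", name=new_day, numShards=num_shards)]
    urls.append(collections_url(base, "CREATEALIAS", name="ALL",
                                collection=",".join(kept)))
    urls.append(collections_url(base, "CREATEALIAS", name="LATEST",
                                collection=new_day))
    urls += [collections_url(base, "DELETE", name=day) for day in dropped]
    return urls


for url in rotate("http://localhost:8983/solr", ["05Nov", "06Nov", "07Nov"], "08Nov"):
    print(url)
```

Indexing then always goes to `LATEST` and broad searches go to `ALL`, so clients never need to know the name of the current day's collection.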
![Page 53: Solr for Indexing and Searching Logs](https://reader034.vdocuments.mx/reader034/viewer/2022042518/54c64ae44a7959d95b8b45b8/html5/thumbnails/53.jpg)
![Page 54: Solr for Indexing and Searching Logs](https://reader034.vdocuments.mx/reader034/viewer/2022042518/54c64ae44a7959d95b8b45b8/html5/thumbnails/54.jpg)
logs = production data
![Page 55: Solr for Indexing and Searching Logs](https://reader034.vdocuments.mx/reader034/viewer/2022042518/54c64ae44a7959d95b8b45b8/html5/thumbnails/55.jpg)
logs = production data
Logstash
![Page 56: Solr for Indexing and Searching Logs](https://reader034.vdocuments.mx/reader034/viewer/2022042518/54c64ae44a7959d95b8b45b8/html5/thumbnails/56.jpg)
logs = production data
Logstash
docs/update
commits
mergeFactor
omit*
docValues
caches
![Page 57: Solr for Indexing and Searching Logs](https://reader034.vdocuments.mx/reader034/viewer/2022042518/54c64ae44a7959d95b8b45b8/html5/thumbnails/57.jpg)
logs = production data
Logstash
docs/update
commits
mergeFactor
omit*
docValues
caches
![Page 58: Solr for Indexing and Searching Logs](https://reader034.vdocuments.mx/reader034/viewer/2022042518/54c64ae44a7959d95b8b45b8/html5/thumbnails/58.jpg)
logs = production data
Logstash
docs/update
commits
mergeFactor
omit*
docValues
caches
time
Collections API
aliases
optimize
![Page 59: Solr for Indexing and Searching Logs](https://reader034.vdocuments.mx/reader034/viewer/2022042518/54c64ae44a7959d95b8b45b8/html5/thumbnails/59.jpg)
We’re hiring!
sematext.com/about/jobs