splunksummit 2015 - update on splunk enterprise 6.3 & hunk 6.3
TRANSCRIPT
Update on Splunk 6.3 & HUNK 6.3
Jag Dhillon Senior Sales Engineer ANZ
Safe Harbor Statement
During the course of this presentation, we may make forward-looking statements regarding future events or the expected performance of the company. We caution you that such statements reflect our current expectations and estimates based on factors currently known to us and that actual events or results could differ materially. For important factors that may cause actual results to differ from those contained in our forward-looking statements, please review our filings with the SEC. The forward-looking statements made in this presentation are being made as of the time and date of its live presentation. If reviewed after its live presentation, this presentation may not contain current or accurate information. We do not assume any obligation to update any forward-looking statements we may make. In addition, any information about our roadmap outlines our general product direction and is subject to change at any time without notice. It is for informational purposes only and shall not be incorporated into any contract or other commitment. Splunk undertakes no obligation either to develop the features or functionality described or to include any such feature or functionality in a future release.
Splunk Enterprise 6.3
Meeting the needs of the most demanding organizations
Breakthrough Performance & Scale: Doubles performance and lowers TCO
• 2X Search & Indexing Speed
• 20-50% Increased Capacity
• 20%+ Reduced TCO
Advanced Analysis & Visualization: Simplifies analysis of large datasets
• Anomaly Detection
• Geospatial Mapping
• Single-Value Display
High Volume Event Collection: Supports DevOps and IoT data analysis at scale
• HTTP Event Collector
• Developer API & SDKs
• 3rd Party Integrations
Enterprise-Scale Platform: Delivers Enterprise platform requirements
• Expanded Management
• Custom Alert Actions
• Data Integrity Control
Breakthrough Performance, Scale, TCO
Vertical scaling maximizes use of CPU power
• Search Performance: 2X Execution Speed
• Indexing Speed: 2-4X Data Rate
• Intelligent Scheduling: 25%+ Capacity Gain
• Total System Capacity: 20-50% Increase
• Improve speed of searches & reports
• Onboard & analyze larger datasets
• Optimize resource utilization
• Reduce TCO by 20% or more
Comparisons are to Splunk Enterprise 6.2. Customer performance and TCO will vary according to workload, configuration and available processing capacity.
3 Tier Architecture
[Diagram: Forwarders, Indexers, and Search Heads, with flows labeled Raw Data, Searches, and Search Results]
Insight into the Indexer
[Diagram: a traditional indexer host. The Splunkd Server Daemon takes in raw data and the buckets (B) sit on disk; multiple Splunk Search Processes (SP) return search results.]
Splunkd Server Daemon / Pipelineset
[Diagram: Ingestion Pipeline Set]
• Input pipelines: TCP/UDP pipeline, Tailing, FIFO pipeline, FSChange, Exec pipeline
• Parsing Queue → Parsing Pipeline: utf8, linebreaker, header
• Agg Queue → Merging Pipeline: aggregator
• Typing Queue → Typing Pipeline: regex replacement, annotator
• Index Queue → Index Pipeline: tcp out, syslog out, indexer
Indexer Core Utilization
• Rule of Thumb:
Process | Cores (approx.)
Splunkd Server Daemon | 4 to 6 cores
Splunk Search Process | 1 core / search process
• Example core utilization of an Indexer Host:
– 4 to 6 cores for the Splunkd Server Daemon
– 10 × 1 cores for Splunk Search Processes
– Total cores used: 14 to 16 cores
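The rule of thumb above is simple arithmetic, and a small sketch makes it easy to try other search loads. The function name and its default range are illustrative, not part of Splunk.

```python
def indexer_cores_used(search_processes: int,
                       daemon_cores: tuple[int, int] = (4, 6)) -> tuple[int, int]:
    """Return the (low, high) estimate of cores in use on an indexer host.

    Applies the slide's rule of thumb: 4 to 6 cores for the splunkd
    server daemon plus 1 core per concurrent search process.
    """
    low, high = daemon_cores
    return (low + search_processes, high + search_processes)


# The slide's example: 10 concurrent search processes.
print(indexer_cores_used(10))  # (14, 16)
```

On a host with more cores than this estimate, the remainder is the idle capacity that the 6.3 parallelization features aim to use.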
Under-Utilized Indexer
[Diagram: an indexer host running the Splunkd Server Daemon and Splunk Search Processes (SP) over buckets (B) on disk, with the remainder marked as Unutilized Resources: CPU/Memory/Network/Disk. Chart axis: Core Utilization %, 0 to 3200.]
Performance Enhancements in 6.3
• Multiple Pipeline Sets
– Parallel ingesting pipeline sets
– Improves resource utilization of the host machine
• Search Improvements
– Faster batch searches using parallel search pipelines
Multiple Ingestion Pipeline Sets
Splunkd with Multiple Ingestion Pipeline Sets
[Diagram: Indexer with 3 Pipeline Sets. Raw data enters the Splunkd Server Daemon, and each of the three pipeline sets writes buckets (B) to disk.]
Configuring Multiple Ingestion Pipeline Sets
• $SPLUNK_HOME/etc/system/local/server.conf
[general]
parallelIngestionPipelines = 3
Multiple Ingestion Pipeline Sets – Details
• Each Pipeline Set has its own set of Queues, Pipelines and Processors
– Exceptions are Input Pipelines, which are usually singletons
• No state is shared across Pipeline Sets
• Data from a unique source is handled by only one Pipeline Set at a time
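To picture what "no shared state" and "one source, one pipeline set" mean in practice, here is a minimal sketch. The hash-based assignment is an assumption made for illustration; the slides do not say how Splunk actually picks a pipeline set for a source, and all names below are hypothetical.

```python
import zlib
from collections import defaultdict

NUM_PIPELINE_SETS = 3  # mirrors parallelIngestionPipelines = 3


def pick_pipeline_set(source: str, n: int = NUM_PIPELINE_SETS) -> int:
    """Map a source to one pipeline set (hypothetical policy: stable hash)."""
    return zlib.crc32(source.encode("utf-8")) % n


# Each pipeline set owns its own queue; nothing is shared between them.
queues: dict[int, list[tuple[str, str]]] = defaultdict(list)

incoming = [
    ("/var/log/app.log", "event 1"),
    ("/var/log/db.log", "event 2"),
    ("/var/log/app.log", "event 3"),
    ("/var/log/web.log", "event 4"),
]
for source, event in incoming:
    queues[pick_pipeline_set(source)].append((source, event))

for pipeline_set, items in sorted(queues.items()):
    print(pipeline_set, items)
```

Because every event from a given source lands in the same queue, ordering within that source is preserved without any coordination across pipeline sets.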
Multiple Ingestion Pipeline Sets over Network
[Diagram: a Forwarder with 3 Pipeline Sets (Splunkd Forwarder) reads File, File, and Script inputs and sends data over TCP to an Indexer with 3 Pipeline Sets, where the Splunkd Server Daemon writes buckets (B) to disk.]
Multiple Ingestion Pipeline Sets – Monitor Input
• Each pipeline set has its own set of TailReader, BatchReader and Archive Processor
• Enables parallel reading of files and archives on Forwarders
• Each file/archive is assigned to one pipeline set
Multiple Ingestion Pipeline Sets – Forwarding
• Forwarder:
– One tcp output processor per pipeline set
– Multiple tcp connections from the forwarder to different indexers at the same time
– Load balancing rules applied to each pipeline set independently
• Indexer:
– Every incoming tcp forwarder connection is bound to one pipeline set on the Indexer
Multiple Ingestion Pipeline Sets – Indexing
• Every pipeline set will independently write new data to indexes
• Data is written in parallel to better utilize resources
• Buckets produced by different pipeline sets could have overlapping time ranges
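The last point is easy to miss: because pipeline sets write independently, two buckets can cover the same span of time. A minimal sketch of the overlap check, using a hypothetical `Bucket` record with epoch-second bounds:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Bucket:
    pipeline_set: int  # which pipeline set wrote the bucket
    earliest: int      # earliest event time in the bucket (epoch seconds)
    latest: int        # latest event time in the bucket (epoch seconds)


def overlaps(a: Bucket, b: Bucket) -> bool:
    """True when the two buckets' time ranges intersect."""
    return a.earliest <= b.latest and b.earliest <= a.latest


# Two pipeline sets indexing live data during the same hour.
b1 = Bucket(pipeline_set=0, earliest=1_000, latest=4_600)
b2 = Bucket(pipeline_set=1, earliest=2_000, latest=5_600)
b3 = Bucket(pipeline_set=0, earliest=6_000, latest=9_000)

print(overlaps(b1, b2))  # True
print(overlaps(b1, b3))  # False
```

A search over a given time range therefore may need to open more than one bucket for the same period.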
Search: Parallelization Efforts and Performance Improvements
Search Parallelization: Performance Improvement
Splunk searches are faster in 6.3.
• Parallelizing the Search Pipeline
• Improving the Search Scheduler
• Summary building is parallelized and faster
Search Pipeline
Search Pipeline at the Peer
• Cursored Search: Time ordered data retrieval.
– Reading order: …B6 B5 B4 B3 B2 B1
– Iterates over time, hence needs to read buckets based on the time ordering.
• Batch Search: Bucket ordered data retrieval.
– Reading order: Option 1: …B3 B5 B1 B2 B1 B6; Option 2: …B6 B5 B4 B3 B2 B1; Option 3: …B6 B5 B4 B7 B4 B9
– Iterates over buckets; time ordering is not needed.
– Facilitates parallel processing of buckets independently across multiple pipelines.
[Diagram: target search bucket ids (B1 to B9) on the indexer disk feed the Search Processors, followed by Search Post Processing and Serialize & Transmit.]
Batch Search: Pipeline Parallelization
[Diagram: target search buckets (B1 to B9) on the indexer disk are read by four search pipelines (Search Pipeline 1 to 4), each with its own Search Processors and Search Post Processing running on threads (T = Thread); an Aggregator & Serializer feeds Transmit (I/O).]
Batch Search: Pipeline Parallelization
• Under-utilized indexers provide an opportunity to execute multiple search pipelines
• Batch Search's time-unordered data access mode is ideal for multiple search pipelines
• No state is shared, i.e. no dependency exists across Search Pipelines
• Peer/Indexer side optimizations
• Take-away:
– Under-utilized indexers are candidates for search pipeline parallelization
– Do NOT enable if indexers are loaded
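The idea can be sketched with a thread pool: since batch mode does not need time ordering and the pipelines share no state, each bucket can be searched on its own and the results merged afterwards. This is only a toy model of the concept, not Splunk's implementation; the function names and the in-memory "buckets" are made up for illustration.

```python
from concurrent.futures import ThreadPoolExecutor


def search_bucket(bucket: list[str], term: str) -> list[str]:
    """Search one bucket independently; no state is shared with other buckets."""
    return [event for event in bucket if term in event]


def batch_search(buckets: list[list[str]], term: str, max_pipelines: int = 2) -> list[str]:
    """Search buckets on up to `max_pipelines` worker threads, then aggregate."""
    with ThreadPoolExecutor(max_workers=max_pipelines) as pool:
        per_bucket = pool.map(lambda bucket: search_bucket(bucket, term), buckets)
        # Aggregation step: flatten the per-bucket hits into one result list.
        return [event for hits in per_bucket for event in hits]


buckets = [
    ["error: disk full", "info: started"],
    ["info: heartbeat", "error: timeout"],
    ["warn: slow query"],
]
print(batch_search(buckets, "error"))
```

The `max_pipelines` argument plays the role that `batch_search_max_pipeline` plays in limits.conf: it caps how many buckets are worked on at once, which only helps when the host has spare cores.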
Configuring the Batch Search in Parallel mode
• How to enable?
$SPLUNK_HOME/etc/system/local/limits.conf
[search]
batch_search_max_pipeline = 2
• What to expect?
– Improved search performance in terms of retrieving search results
– Increase in number of threads
Search Scheduler Improvements
• Scheduler improvements in Splunk Enterprise 6.3:
– Priority Scoring
– Schedule Windows
• Performance improvements over previous schedulers
– Lower lag
– Fewer skipped searches
Search Scheduler Improvements: Priority Score
Problem in 6.2: Simple single-term priority scoring could result in saved search lag, skipping, and starvation (under CPU constraint).
Solution in 6.3: Better multi-term priority scoring mitigates these problems and improves performance by 25%.
score(j) = next_runtime(j) + average_runtime(j) × priority_runtime_factor – skipped_count(j) × period(j) × priority_skipped_factor + schedule_window_adjustment(j)
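The multi-term score can be written out directly as a function. The sketch below follows the formula term by term; the factor values and the sample numbers are illustrative assumptions, and the reading that a lower score is scheduled earlier is also an assumption, since the slide does not state it.

```python
def priority_score(next_runtime: float,
                   average_runtime: float,
                   skipped_count: int,
                   period: float,
                   schedule_window_adjustment: float = 0.0,
                   priority_runtime_factor: float = 10.0,
                   priority_skipped_factor: float = 1.0) -> float:
    """Multi-term priority score for one saved search j, per the slide's formula.

    All times are in seconds. The default factor values are placeholders
    chosen for this example.
    """
    return (next_runtime
            + average_runtime * priority_runtime_factor
            - skipped_count * period * priority_skipped_factor
            + schedule_window_adjustment)


# A search due at t=1000 that usually runs for 30 s, has been skipped twice,
# and is scheduled every 60 s:
print(priority_score(next_runtime=1000, average_runtime=30,
                     skipped_count=2, period=60))  # 1180.0
```

Note how the skipped-count term pulls the score down: a search that keeps getting skipped gains priority over time, which is what counters starvation.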
Search Scheduler Improvements: Schedule Windows
Problem in 6.2: The scheduler cannot distinguish searches that (A) really should run at a specific time (just like cron) from those that (B) don't have to. This can cause lag or skipping.
Solution in 6.3: Give a schedule window to searches that don't have to run at specific times.
Example: For a given search, it's OK if it starts running sometime between midnight and 6am, but you don't really care when specifically.
• A search with a window helps other searches
• Search windows should not be used for searches that run every minute
• Search windows must be less than a search's period
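The two constraints on schedule windows can be checked mechanically before a window is configured. This is a small sketch with a made-up helper name; it only encodes the rules stated above.

```python
def check_schedule_window(period_minutes: int, window_minutes: int) -> list[str]:
    """Return a list of problems with a proposed schedule window (empty if fine)."""
    problems = []
    if window_minutes >= period_minutes:
        problems.append("window must be less than the search's period")
    if period_minutes <= 1 and window_minutes > 0:
        problems.append("windows should not be used for searches that run every minute")
    return problems


# Daily search that may start any time between midnight and 6am:
print(check_schedule_window(period_minutes=1440, window_minutes=360))  # []

# Hourly search with a one-hour window: the window is not less than the period.
print(check_schedule_window(period_minutes=60, window_minutes=60))
```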
Configuring Search Scheduler
$SPLUNK_HOME/etc/system/local/limits.conf
[scheduler]
max_searches_perc = 50
# Allow value to be 75 anytime on weekends.
max_searches_perc.1 = 75
max_searches_perc.1.when = * * * * 0,6
# Allow value to be 90 between midnight and 5am.
max_searches_perc.2 = 90
max_searches_perc.2.when = * 0-5 * * *
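To see which percentage applies at a given moment, the rules above can be evaluated with a small cron matcher. This is a simplified sketch: it supports only `*`, comma lists, and ranges, and it assumes that when several rules match, the highest-numbered one wins. That precedence is an assumption; check the limits.conf specification for the actual behavior.

```python
from datetime import datetime


def _field_matches(spec: str, value: int) -> bool:
    """Match one cron field supporting '*', 'a,b,c' and 'a-b' only."""
    if spec == "*":
        return True
    for part in spec.split(","):
        if "-" in part:
            low, high = part.split("-")
            if int(low) <= value <= int(high):
                return True
        elif int(part) == value:
            return True
    return False


def cron_matches(expr: str, when: datetime) -> bool:
    """Fields: minute hour day-of-month month day-of-week (0 = Sunday)."""
    minute, hour, dom, month, dow = expr.split()
    cron_dow = (when.weekday() + 1) % 7  # Python: Monday = 0; cron: Sunday = 0
    return (_field_matches(minute, when.minute)
            and _field_matches(hour, when.hour)
            and _field_matches(dom, when.day)
            and _field_matches(month, when.month)
            and _field_matches(dow, cron_dow))


def effective_max_searches_perc(when: datetime, base: int = 50,
                                rules=((75, "* * * * 0,6"), (90, "* 0-5 * * *"))) -> int:
    value = base
    for perc, expr in rules:  # assumed precedence: later rules override earlier ones
        if cron_matches(expr, when):
            value = perc
    return value


print(effective_max_searches_perc(datetime(2015, 10, 7, 12, 0)))   # Wednesday noon
print(effective_max_searches_perc(datetime(2015, 10, 10, 12, 0)))  # Saturday noon
print(effective_max_searches_perc(datetime(2015, 10, 7, 3, 0)))    # Wednesday 3am
```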
Search: Parallel Summarization
• The sequential nature of building summary data for data models and saved reports is slow
• The summary building process has been parallelized in 6.3
Summary Building Parallelization
[Diagram: Sequential Summary Building (the scheduler runs one auto summary search every N minutes) compared with Parallelized Summary Building (the scheduler runs several auto summary searches at once).]
Configuring Summary Building for Parallelization
• $SPLUNK_HOME/etc/system/local/savedsearches.conf
[default]
auto_summarize.max_concurrent = 2
• $SPLUNK_HOME/etc/system/local/datamodels.conf
[default]
acceleration.max_concurrent = 2
So What Does Breakthrough Mean?
Release 6.3 vs. Release 6.2
● Critical reports can be available in ¼ the time
● It takes 20% less indexing HW to expand or deploy Splunk
● New data is ready for analysis in ½ the time
Release 6.3 vs. Release 6.0
● Splunk expansion costs have dropped over 50% since 2013
● A new customer can deploy Splunk using 1/3 the HW vs. 2013
● Splunk deployment is now ½ the cost vs. 2013
Analysis & Visualization
● Anomaly Detection – Incorporates Z-Score, IQR & histogram methodologies in a single command
● Geospatial Visualization – Visualizes metric variance across a customizable geographic area
● Single Value Display – At-a-glance, single-value indicators with useful context
Geospatial Visualization
Visualizes metric variance across a customizable geographic area
• Choropleth maps help users to easily spot spatial patterns
• Color scales can be configured per use case
• Users can upload their own geographical polygon definitions
Single Value Display
At-a-glance, single-value indicators with useful context
• Large type and prominent colors make values or changes visible, even from a distance
• Sparkline shows trends in the recent history
• Delta indicator shows changes since a previous time
Anomaly Detection
New SPL command provides histogram-based anomaly detection
● Net new histogram-based approach offers a more accurate detection method
● Single command offers 3 options: zscore, IQR & histogram
● Replaces existing Outlier and AnomalousValue commands
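For readers unfamiliar with two of the three methods, here is a plain-Python sketch of z-score and IQR outlier detection. It illustrates the statistical ideas only and is not how the SPL command is implemented; the thresholds (2.5 standard deviations, 1.5 × IQR) and the sample data are conventional choices made for this example.

```python
import statistics


def zscore_outliers(values: list[float], threshold: float = 3.0) -> list[float]:
    """Values whose distance from the mean exceeds `threshold` standard deviations."""
    mean = statistics.fmean(values)
    stdev = statistics.pstdev(values)
    if stdev == 0:
        return []
    return [v for v in values if abs(v - mean) / stdev > threshold]


def iqr_outliers(values: list[float], k: float = 1.5) -> list[float]:
    """Values outside [Q1 - k*IQR, Q3 + k*IQR]."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    return [v for v in values if v < q1 - k * iqr or v > q3 + k * iqr]


response_times = [10, 11, 9, 10, 12, 10, 11, 9, 10, 100]
print(zscore_outliers(response_times, threshold=2.5))
print(iqr_outliers(response_times))
```

On small samples a single extreme value inflates the standard deviation, which is why the z-score call above uses a lower threshold; the IQR method is less sensitive to that effect.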
HTTP Event Collector
Supports DevOps and IoT data analysis needs at scale
Sources: DevOps & Developers; IoT Devices & Applications
1. Standard API and logging libraries send events directly to Splunk
2. Libraries integrated into popular platforms and services
Scales to Millions of Events/Second
Demo
http://splunk.com/shake
Distributed Management Console - II
New topology views, status, and alerting for Splunk deployments
● Visualizes Search Head/Indexer matrix with KPI and performance overlays
● Search Head clustering replication and scheduler views
● Forwarder views with status and performance data
● Index and metadata storage utilization
● System health alerting
Indexer Auto-Discovery
Simplifies Forwarder management in a dynamic environment
● Cluster master maintains a dynamic Indexer list accessed by Forwarders
● Indexers can be added/removed without affecting Forwarder configuration or operation
…
Data Integrity Control
Helps ensure data fidelity; meets GPG13 compliance requirements
● Hash signatures of selected index data are saved at regular intervals
● Intervals can be validated by the admin
● Meets security and compliance requirements by verifying that data has not been tampered with
● Hashes can be exported to further ensure security
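The general technique behind this kind of integrity check is to hash the data in slices, store the hashes, and recompute them later. The sketch below shows that idea only; the choice of SHA-256 and the slice size are assumptions for the example, not details taken from the slides.

```python
import hashlib


def hash_slices(data: bytes, slice_size: int = 128 * 1024) -> list[str]:
    """Hash `data` in fixed-size slices and return one hex digest per slice."""
    return [
        hashlib.sha256(data[offset:offset + slice_size]).hexdigest()
        for offset in range(0, len(data), slice_size)
    ]


def verify(data: bytes, expected: list[str], slice_size: int = 128 * 1024) -> bool:
    """Recompute slice hashes and compare them with the stored ones."""
    return hash_slices(data, slice_size) == expected


original = b"2015-10-07 12:00:00 user=alice action=login\n" * 10
stored_hashes = hash_slices(original, slice_size=64)

tampered = original.replace(b"alice", b"mallo", 1)
print(verify(original, stored_hashes, slice_size=64))  # True
print(verify(tampered, stored_hashes, slice_size=64))  # False
```

Exporting the stored hashes to a separate system, as the last bullet suggests, means an attacker would have to alter both the data and the exported hashes to go unnoticed.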
Custom Alert Actions
Use Splunk Alerts to trigger & automate workflows
● Allows packaged integration with third-party applications
● Simple admin/user configuration
● Developers can build, package, and publish alert actions within an app
● Growing list of integrations available
Alert Action Examples
● Notification Services
‣ Send message to IM clients (HipChat, Slack)
‣ Send SMS
● Incident Remediation / Ticketing
‣ Automate the creation of tickets (ServiceNow, Jira)
● IT Monitoring
‣ Send incident/alert into monitoring tools (xMatters, BigPanda)
● Security
‣ Take action or send events to firewalls, devices, management consoles
● Internet-of-Things
‣ Trigger device-level actions (change lights, sound an alarm, send action to device)
● Custom Action
‣ Trigger any organization-specific action (restart application, integrate with homegrown service, and more)
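A notification-style action like the IM example above usually boils down to turning an alert payload into a message. The sketch below assumes the alert arrives as a JSON document with `search_name` and `configuration` fields; that payload shape and the `channel` setting are assumptions for illustration, so consult the custom alert action documentation for the real contract.

```python
import json


def format_alert_message(payload_text: str) -> str:
    """Turn an alert payload (assumed JSON shape) into a chat message string."""
    payload = json.loads(payload_text)
    config = payload.get("configuration", {})          # per-action settings (assumed key)
    search = payload.get("search_name", "unknown search")
    channel = config.get("channel", "#alerts")         # hypothetical setting name
    return f"[{channel}] Alert fired: {search}"


sample_payload = json.dumps({
    "search_name": "Failed logins over threshold",
    "configuration": {"channel": "#security"},
})
print(format_alert_message(sample_payload))
# [#security] Alert fired: Failed logins over threshold
```

In a real action the returned string would be posted to the chat service's API; that call is left out here because it depends on the service.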
Ecosystem Partners
Splunk Mobile Access
Splunk dashboards, alerts and more for iOS and Android devices
Formerly called "Splunk Mobile App"
● Monitor dashboards, KPIs, reports
● Receive real-time business and operational alerts
● Annotate and share data
● Supports MDM and single sign-on
● No longer requires separate Mobile Access Server
What’s New in Hunk 6.3
Introducing Hunk 6.3
Drive Down TCO
• Archive to Hadoop
• Single Splunk Interface to Search Real-Time & Historical Data
Open Access for 3rd-Party Hadoop Tools
• Access Data Using Hive or Pig
• Query Without Moving or Replicating Data
Advanced Analytics & Visualizations
• Anomaly Detection
• Geospatial Visualization
• Contextual Display
Archive Splunk Data to HDFS or AWS S3
Drive Down TCO by Archiving Historical Data to Commodity Hardware
[Diagram: WARM, COLD, and FROZEN buckets alongside Hadoop Clusters]
Unified Search
Intelligently Search Across Real-Time and Historical Data Using the Same Splunk Interface
[Diagram: Real-Time Data; Historical Data in Hadoop]
Open Access to Historical Data Using 3rd-Party Hadoop Tools
[Diagram: Historical Data in HDFS on Hadoop Clusters, read through the Splunk Archive Reader for Hadoop by 3rd-Party Hadoop Tools used by a Data Scientist]
• Use 3rd-party Hadoop tools (e.g., Hive, Pig) to perform additional analysis
• Broaden data access to a wider set of audiences, e.g. data scientists and analysts
• Run queries without moving or replicating data
Advanced Analytics and Visualization Capabilities
● Anomaly Detection – Incorporates Z-Score, IQR & histogram methodologies in a single command
● Geospatial Visualization – Visualizes metric variance across a customizable geographic area
● Single Value Display – Derive more context by layering on visual cues and more flexible formatting
Release 6.3 – Value Across Products
• Splunk Enterprise: All 6.3 features & performance
• Splunk Cloud: Most features, scalability
• Hunk: Visualization & analysis of large datasets
• Splunk Light: Visualization, HTTP events, data integrity
Feature | Enterprise | Cloud | Hunk | Light
Performance & Scale | Yes | Scale | Search only | No
HTTP Events | Yes | Yes | No | Yes
Data Visualization | Yes | Yes | Yes | Yes
Alert Action Integration | Yes | Yes | Yes | No
Data Integrity Check | Yes | Yes | No | Yes
Distributed Mgt Console | Yes | No | Yes | No
Other Management | Yes | Yes | Partial | Partial
Q & A ?
THANK YOU!