splunksummit 2015 - update on splunk enterprise 6.3 & hunk 6.3

Update on Splunk 6.3 & HUNK 6.3

Jag Dhillon Senior Sales Engineer ANZ

Safe Harbor Statement During the course of this presentaCon, we may make forward looking statements regarding future events or the expected performance of the company. We cauCon you that such statements reflect our current expectaCons and esCmates based on factors currently known to us and that actual events or results could differ materially. For important factors that may cause actual results to differ from those contained in our forward-‐looking statements, please review our filings with the SEC. The forward-‐looking statements made in this presentaCon are being made as of the Cme and date of its live presentaCon. If reviewed aRer its live presentaCon, this presentaCon may not contain current or accurate informaCon. We do not assume any obligaCon to update any forward looking statements we may make. In addiCon, any informaCon about our roadmap outlines our general product direcCon and is subject to change at any Cme without noCce. It is for informaConal purposes only and shall not be incorporated into any contract or other commitment. Splunk undertakes no obligaCon either to develop the features or funcConality described or to include any such feature or funcConality in a future release.

Splunk Enterprise 6.3

3

Advanced Analysis & Visualiza1on

Breakthrough Performance & Scale

High Volume Event Collec1on

Enterprise-‐Scale PlaBorm

Supports DevOps and IoT data analysis at scale

Simplifies analysis of large datasets

Delivers Enterprise pla;orm requirements

Doubles performance and lowers TCO

• 2X Search & Indexing Speed • 20-‐50% Increased Capacity • 20%+ Reduced TCO

• Anomaly DetecCon • GeospaCal Mapping • Single-‐Value Display

• HTTP Event Collector • Developer API & SDKs • 3rd Party IntegraCons

• Expanded Management • Custom Alert AcCons • Data Integrity Control

Mee#ng the needs of the most demanding organiza#ons


4














Breakthrough Performance, Scale, TCO

5

Search Performance

Indexing Speed

Intelligent Scheduling 25%+ Capacity Gain

2X ExecuCon Speed

2-‐4X Data Rate

Ver#cal scaling maximizes use of CPU power

Total System Capacity 20-‐50% Increase

Improve speed of searches & reports Onboard & analyze larger datasets OpCmize resource uClizaCon Reduce TCO by 20% or more Comparisons are to Splunk Enterprise 6.2. Customer performance and TCO will vary according to workload, configuraCon and available processing capacity.

3 Tier Architecture

6

Forwarders Indexers

Raw Data Searches

Search Heads

Search Results

Insight into the Indexer

7

Splunkd Server Daemon

Splunk Search Process

.

.

. Raw Data

TradiConal Indexer Hosts

Disk

Buckets

B

B

B

.

.

Search Results

Search Results

SP SP SP

Splunk Search Process SP SP SP

Splunkd Server Daemon / Pipelineset

8

Parsing Queue

Agg Queue

Typing Queue

Index Queue

TCP/UDP pipeline

Tailing

FIFO pipeline

FSChange

Exec pipeline

ug8

header

Parsing Pipeline

linebreaker aggregator

Merging Pipeline

regex replacement

annotator

Typing Pipeline

tcp out

syslog out

indexer

Index Pipeline

IngesCon Pipeline Set

Indexer Core UClizaCon

9

Process Cores (approx.)

Splunkd Server Daemon 4 to 6 cores

Splunk Search Process 1 core / search process

•  Rule of Thumb:

•  Example core uClizaCon of a Indexer Host: –  4 To 6 cores for Splunkd Server daemon –  10 X 1 Cores for Splunk Search Processes –  Total cores used: 14 to 16 cores

Under-‐UClized Indexer

10


Splunk Search Process Disk

Buckets

B

B

B

UnuClized Resources CPU/Memory/Network/Disk

SP SP SP

Splunk Search Process SP SP SP

0

400

800

1200

1600

2000

2400

2800

3200

Core U1liza1on %

Performance Enhancements in 6.3

•  MulCple Pipeline Sets –  Parallel ingesCng pipeline sets –  Improves resource uClizaCon of the host machine

•  Search Improvements –  Faster batch searches using parallel search pipelines

11

MulCple IngesCon Pipeline Sets

Splunkd with MulCple IngesCon Pipeline Sets

13


Raw Data

Disk

Buckets B

B

B

B

B

B

B

B

B

Indexer with 3 Pipeline Sets

Configuring MulCple IngesCon Pipeline Sets •  $SPLUNK_HOME/etc/system/local/server.conf

14

[general]parallelIngestionPipelines = 3

MulCple IngesCon Pipeline Sets – Details

•  Each Pipeline Set has its own set of Queues, Pipelines and Processors –  ExcepCons are Input Pipelines which are usually singleton

•  No state is shared across Pipeline Sets •  Data from a unique source is handled by only one Pipeline Set at a Cme

15

MulCple IngesCon Pipeline Sets over Network

16

Forwarder with 3 Pipeline Sets

Splunkd Forwarder

Indexer with 3 Pipeline Sets

File

File

Script


Disk

Buckets B

B

B

B

B

B

B

B

B

TCP

MulCple IngesCon Pipeline Sets – Monitor Input

•  Each Pipelineset has its own set of TailReader, BatchReader and Archive Processor

•  Enables parallel reading of files and archives on Forwarders •  Each file/archive is assigned to one pipeline set

17

MulCple IngesCon Pipeline Sets -‐ Forwarding

•  Forwarder: –  One tcp output processor per pipeline set –  MulCple tcp connecCons from the forwarder to different indexers at the

same Cme –  Load balancing rules applied to each pipeline set independently

•  Indexer: –  Every incoming tcp forwarder connecCon is bound to one pipeline set

on the Indexer

18

MulCple IngesCon Pipeline Sets -‐ Indexing

•  Every pipeline set will independently write new data to indexes •  Data is wripen in parallel to beper uClize resources •  Buckets produced by different pipeline sets could have overlapping Cme ranges

19

Search : ParallelizaCon Efforts Performance Improvements

Search ParallelizaCon: Performance Improvement

Splunk Searches are faster in 6.3.

21

•  Parallelizing the Search Pipeline •  Improving the Search Scheduler •  The Summary Building is parallelized and faster

Search Pipeline

22

Cursored Search

…B6 B5 B4 B3 B2 B1

Reading Order Iterates over Cme hence needs to read bucket based on the Cme ordering.

Batch Search

OpCon1:…B3 B5 B1 B2 B1 B6 OpCon2:…B6 B5 B4 B3 B2 B1 OpCon 3…B6 B5 B4 B7 B4 B9

Reading Order

Iterates over buckets, Cme ordering is not needed

Target search bucket ids

B1 B2 B3

B4 B5 B6

B7 B8 B9

b11 b11 b11 Search Post Processing

Search Processor

Search Processor

Serialize &

Transmit

Indexer (Disk)

Search Pipeline at the Peer

Facilitates parallel processing of buckets independently across mulCple pipeline

•  Cursored Search: Time ordered data retrieval. •  Batch Search: Bucket ordered data retrieval.

Batch Search: Pipeline ParallelizaCon

23

Target search buckets

B1 B2 B3

b11 b11 b11

B7 B8 B9

B4 B5 B6

Indexer (Disk)

Search Processor

Search Processor

Search Processor

Search Processor

Search Processor

Search Processor

Search Processor

Search Processor

Search Post Processing

Aggregator

& Serializer

Transmit (I/O)

Search Pipeline 1

Search Pipeline 4

Search Pipeline 3

Search Pipeline 2

T

T

T

T

T

T

T= Thread

Batch Search: Pipeline ParallelizaCon

•  Under-‐uClized indexers provide us opportunity to execute mulCple search pipelines

•  Batch Search Cme-‐unordered data access mode is ideal for mulCple search pipelines

•  No state is shared i.e. no dependency exists across Search Pipelines •  Peer/Indexer side opCmizaCons •  Take-‐away :

–  Under uClized indexers are candidates for search pipeline parallelizaCon –  Do NOT enable if indexers are loaded

24

Configuring the Batch Search in Parallel mode

•  How to enable?

25

•  What to expect? Search performance in terms of retrieving search results improved. Increase in number of threads

$SPLUNK_HOME/etc/system/local/limits.conf

[search] batch_search_max_pipeline = 2

Search Scheduler Improvements

•  Scheduler improvements in Splunk Enterprise 6.3: –  Priority Scoring –  Schedule Windows

•  Performance improvements over previous schedulers –  Lower Lag –  Fewer skipped searches

26

Search Scheduler Improvements Priority Score

27

Problem in 6.2: Simple single-‐term priority scoring could result in saved search lag, skipping, and starvaCon (under CPU constraint)

score(j) = next_runCme(j) + average_runCme(j) × priority_runCme_factor – skipped_count(j) × period(j) ×

priority_skipped_factor + schedule_window_adjustment(j)

Solu1on in 6.3: Beper mulC-‐term priority scoring miCgates problems and improves performance by 25%.

Search Scheduler Improvements

28

Problem in 6.2 Scheduler can not disCnguish between searches that (A) really should run at a specific Cme (just like cron) from those that (B) don't have to. This can cause lag or skipping. Solu1on in 6.3: Give a schedule window to searches that don’t have to run at specific Cmes.

Example: For a given search, it’s OK if it starts running someCme between midnight and 6am, but you don't really care when specifically

•  A search with a window helps other searches

•  Search windows should not be used for searches that run every minute

•  Search windows must be less than a search’s period

Configuring Search Scheduler

29

[scheduler] max_searches_perc = 50 # Allow value to be 75 anyCme on weekends. max_searches_perc.1 = 75 max_searches_perc.1.when = * * * * 0,6 # Allow value to be 90 between midnight and 5am. max_searches_perc.2 = 90 max_searches_perc.2.when = * 0-‐5 * * *

$SPLUNK_HOME/etc/system/local/limits.conf

Search: Parallel SummarizaCon

•  SequenCal nature of building summary data for data model and saved reports is slow

•  Summary Building process has been parallelized in 6.3

30

Summary Building ParallelizaCon

31

auto summary search

every N minutes

SCHEDULER SCHEDULER

auto summary search

auto summary search

auto summary search

SequenCal Summary Building Parallelized Summary Building

Configuring Summary Building for ParallelizaCon

32

•  $SPLUNK_HOME/etc/system/local/savedsearches.conf

[default] auto_summarize.max_concurrent = 2

$SPLUNK_HOME/etc/system/local/datamodels.conf

[default] acceleraCon.max_concurrent = 2

So What Does Breakthrough Mean?

●  CriCcal reports can be available in ¼ the 1me

●  It takes 20% less indexing HW to expand or deploy Splunk

●  New data is ready for analysis in ½ the 1me

33

  Splunk expansion costs have dropped over 50% since 2013   A new customer can deploy Splunk using 1/3 the HW vs. 2013   Splunk deployment is now ½ the cost vs. 2013

Release 6.3 vs.

Release 6.2

Release 6.3 vs.

Release 6.0


34














Analysis & VisualizaCon ●  Anomaly DetecCon –  Incorporates Z-‐Score, IQR & histogram

methodologies in a single command

●  GeospaCal VisualizaCon –  Visualizes metric variance across a

customizable geographic area

●  Single Value Display –  At-‐a-‐glance, single-‐value indicators

with useful context

35

36

GeospaCal VisualizaCon

•  Choropleth maps help users to easily spot spaCal paperns

•  Color scales can be configured per use case

•  Users can upload their own geographical polygon definiCons

Visualizes metric variance across a customizable geographic area

37

Single Value Display

•  Large type and prominent colors make values or changes visible, even from a distance

•  Sparkline shows trends in the recent history

•  Delta indicator shows changes since a previous Cme

At-‐a-‐glance, single-‐value indicators with useful context

Anomaly DetecCon New SPL command provides histogram-‐based anomaly detec#on

●  Net new histogram-‐based approach offers a more accurate detecCon method

●  Single command offers 3 opCons: zscore, IQR & histogram

●  Replaces exisCng Outlier and AnomalousValue commands

38

Splunk Enterprise

39














HTTP Event Collector Supports DevOps and IoT data analysis needs at scale

40

DevOps & Developers

IoT Devices & Applica1ons

1. Standard API and logging libraries send events directly to Splunk 2. Libraries integrated into popular plagorms and services

Scales to Millions of Events/Second

Demo

hpp://splunk.com/shake

41


42














Distributed Management Console -‐ II New topology views, status, and aler#ng for Splunk deployments

●  Visualizes Search Head/Indexer matrix with KPI and performance overlays

●  Search Head clustering replicaCon and scheduler views

●  Forwarder views with status and performance data

●  Index and metadata storage uClizaCon ●  System health alerCng

43

Indexer Auto-‐Discovery Simplifies Forwarders management in a dynamic environment

●  Cluster master maintains dynamic Indexer list accessed by Forwarders

●  Indexers can be added/removed without affecCng Forwarder configuraCon or operaCon

44

…

Data Integrity Control Helps Ensure data fidelity; Meets GPG13 compliance requirements

●  Hash signatures of selected index data are saved at regular intervals

●  Intervals can be validated by the admin ●  Meets security and compliance requirements by verifying that data has not been tampered with

●  Hashes can be exported to further ensure security

45

Custom Alert AcCons Use Splunk Alerts to trigger & automate workflows

●  Allows packaged integraCon with third-‐party applicaCons

●  Simple admin/user configuraCon ●  Developers can build, package, and publish alert acCons within an app

●  Growing list of integraCons available

46

Alert AcCon Examples ●  NoCficaCon Services

‣  Send message to IM clients (HipChat, Slack) ‣  Send SMS

●  Incident RemediaCon / TickeCng ‣  Automate the creaCon of Cckets (ServiceNow, Jira)

●  IT Monitoring ‣  Send incident/alert into monitoring tools (xMapers, BigPanda)

●  Security ‣  Take acCon or send events to firewalls, devices, management

consoles

●  Internet-‐of-‐Things ‣  Trigger device-‐level acCons (change lights, sounds an alarm, send

acCon to device)

●  Custom AcCon ‣  Trigger any organizaCon-‐specific acCon (restart applicaCon,

integrate with homegrown service, and more)

47

Eco-‐system Partners

Splunk Mobile Access Splunk dashboards, alerts and more for iOS and Android devices

●  Monitor dashboards, KPIs, reports ●  Receive real-‐Cme business and

operaConal alerts ●  Annotate and share data ●  Supports MDM and single sign-‐on ●  No longer requires separate Mobile

Access Server

48

Formally called “Splunk Mobile App”

What’s New in Hunk 6.3

Introducing Hunk 6.3

50

  Archive to Hadoop   Single Splunk Interface to Search Real-‐Time & Historical Data

Drive Down TCO

  Access Data Using Hive or Pig   Query Without Moving or ReplicaCng Data

Open Access for 3rd-‐Party Hadoop Tools

  Anomaly DetecCon   GeospaCal VisualizaCon   Contextual Display

Advanced Analy1cs & Visualiza1ons

Archive Splunk Data to HDFS or AWS S3

Hadoop Clusters WARM

COLD

FROZEN

Drive Down TCO by Archiving Historical Data to Commodity Hardware

Unified Search Intelligently Search Across Real-‐Time and Historical Data Using the Same Splunk Interface

Real-‐Time Data Historical Data in Hadoop

53

Open Access to Historical Data Using 3rd-‐party Hadoop tools

Hadoop Clusters

Historical Data in HDFS 3rd-‐Party Hadoop Tools

Data Scien1st

Splunk Archive Reader for Hadoop

•  Use 3rd-‐party Hadoop tools (e.g., Hive, Pig) to perform addiConal analysis •  Broaden data access to wider set of audiences, e.g. data scienCsts and analysts •  Run queries without moving or replicaCng data

Advanced AnalyCcs and VisualizaCon CapabiliCes

●  Anomaly DetecCon –  Incorporates Z-‐Score, IQR & histogram

methodologies in a single command

●  GeospaCal VisualizaCon –  Visualizes metric variance across a

customizable geographic area

●  Single Value Display –  Derive more context by layering on

visual cues and more flexible formaYng

54

Release 6.3 – Value Across Products

Splunk Enterprise All 6.3 features & performance

Splunk Cloud Most features, scalability

Hunk VisualizaCon & analysis of large datasets

Splunk Light VisualizaCon, HTTP events, data integrity

55

Enterprise Cloud Hunk Light

Performance & Scale

Yes Scale Search only

No

HTTP Events Yes Yes No Yes

Data VisualizaCon Yes Yes Yes Yes

Alert AcCon IntegraCon

Yes Yes Yes No

Data Integrity Check

Yes Yes No Yes

Distributed Mgt Console

Yes No Yes No

Other Management Yes Yes ParCal ParCal


56














Q & A ?

THANK YOU!

splunksummit 2015 - update on splunk enterprise 6.3 & hunk 6.3

Data & Analytics