monitorama 2013 keynote
More Than Monitoring
#monitoringr++
Neil Gunther
Performance Dynamics
Monitorama Keynote
Boston, March 28 2013
© 2013 Performance Dynamics, More Than Monitoring, March 30, 2013
Outline
1 Let’s Get Calibrated about Data
2 Potted History of Monitoring
3 Performance Visualization Basics
4 Monitored Data are Time Series
5 Performance Visualization in R
6 Possible Hacks
Let’s Get Calibrated about Data
Guerrilla Mantra: All data is wrong by definition
Measurement is a process, not math.
All data contains measurement errors.
How big are they and can you tolerate them?
Treating data as divine is a sin.
Guerrilla Mantra: VAMOOS your data doubts
Visualize
Analyze
Modelize
Over and Over until
Satisfied
Guerrilla Mantra: There are only 3 performance metrics
1 Time, e.g., cpu_ticks
2 Rate (inverse time), e.g., httpGets/s
3 Number or count, e.g., RSS
Watch Out for Patterns
I mean that in a bad way. Your brain can’t help itself.
Potted History of Monitoring
Old Adage: “Nothing New in Computer Science”
Mainframes didn’t need real-time monitoring. Batch processing.
How You Programmed It
Later ... the interface improved
CTSS (Compatible Time-Sharing System) was developed in 1961 at MIT on an IBM 7094.
"Compatible" meant compatibility with the standard IBM batch-processing O/S.
Multics Instrumentation c.1965
Multics was a multiuser O/S following CTSS time-share.
The Implementation
“a rough measure of response time for a time-sharing console user, an exponential average of the number of users in the highest priority scheduling queue is continuously maintained. An integrator, L, initially zero, is updated periodically by the formula
L ← L × m + Nq
where Nq is the measured length of the scheduling queue at the instant of update, and m is an exponential damping constant”
This equation is an iterative form of an exponentially damped moving average. In modern terminology, it’s a data smoother.
The Lesson
“experience with Multics, and earlier with CTSS, shows that building permanent instrumentation into key supervisor modules is well worth the effort, since the cost of maintaining well-organized instrumentation is low, and the payoff is very high.”
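To see the Multics integrator behave as a data smoother, here is a minimal R sketch of the update rule quoted above. The function name `multics_integrate`, the damping constant m = 0.9, and the step-shaped queue lengths are all assumptions made for illustration; none of them come from the Multics paper.

```r
# Multics-style integrator: L <- L * m + Nq, applied once per sample.
# `multics_integrate` is a hypothetical helper name, not Multics code.
multics_integrate <- function(nq, m, L0 = 0) {
  L <- numeric(length(nq))
  prev <- L0                       # the integrator starts at zero by default
  for (i in seq_along(nq)) {
    prev <- prev * m + nq[i]       # the quoted update rule
    L[i] <- prev
  }
  L
}

m  <- 0.9                          # assumed damping constant, 0 < m < 1
nq <- c(rep(0, 5), rep(4, 200))    # synthetic queue length: a step from 0 to 4
L  <- multics_integrate(nq, m)

# For a constant queue length N the integrator settles at N / (1 - m),
# so scaling by (1 - m) turns it into a damped average of Nq.
smoothed <- (1 - m) * L
tail(smoothed, 1)
```

Plotting `smoothed` against `nq` shows the familiar lagged rise toward the new level, which is the same shape a load average traces after a burst of work arrives.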
You know this better as ...
Linux load average
    58 extern unsigned long avenrun[ ]; /* Load averages */
    59
    60 #define FSHIFT 11 /* nr of bits of precision */
    61 #define FIXED_1 (1<<FSHIFT) /* 1.0 as fixed-point */
    62 #define LOAD_FREQ (5*HZ) /* 5 sec intervals */
    63 #define EXP_1 1884 /* 1/exp(5sec/1min) as fixed-pt */
    64 #define EXP_5 2014 /* 1/exp(5sec/5min) */
    65 #define EXP_15 2037 /* 1/exp(5sec/15min) */
    66
    67 #define CALC_LOAD(load,exp,n) \
    68 load *= exp; \
    69 load += n*(FIXED_1-exp); \
    70 load >>= FSHIFT;
Lines 67–70 are identical to the 1965 Multics formula.
See Chap. 4 of my Perl::PDQ book for the details.
UNIX load average
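The magic numbers in that listing can be checked from the R console. The sketch below recomputes the three damping constants from the 5-second sampling interval and then emulates the CALC_LOAD arithmetic. The helper `calc_load` is a hypothetical R stand-in for the C macro, and it assumes the run-queue count n is passed in already scaled by FIXED_1.

```r
FSHIFT  <- 11
FIXED_1 <- 2^FSHIFT                      # 2048 represents 1.0 in fixed point

sample_sec  <- 5                         # LOAD_FREQ: one sample every 5 seconds
windows_sec <- c(EXP_1 = 60, EXP_5 = 300, EXP_15 = 900)

# 1/exp(5sec/window), expressed as fixed-point integers
exp_consts <- round(FIXED_1 * exp(-sample_sec / windows_sec))
print(exp_consts)

# Hypothetical R stand-in for the CALC_LOAD macro; integer division by
# FIXED_1 plays the role of the right shift by FSHIFT bits.
calc_load <- function(load, exp_c, n) {
  (load * exp_c + n * (FIXED_1 - exp_c)) %/% FIXED_1
}

# One minute (12 samples) of a constant run queue of 2, starting from idle.
load <- 0
n    <- 2 * FIXED_1                      # assumption: n is already fixed-point
for (i in 1:12) load <- calc_load(load, exp_consts[["EXP_1"]], n)
load / FIXED_1                           # 1-minute load average after one minute
```

After one minute the 1-minute average has covered only about 63% of the distance to 2, because the weight (FIXED_1-exp) on each new sample plays the role of (1 - m) in a damped average.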
Unix at Bell Labs c.1970
CTSS begat Multics begat Unics begat Unix
Get it?
Then Came Screens 9:40
Note the mouse in her right hand.
Unix top: A Legacy App
Green ASCII characters on black background
Desktop GUI c.1995
Lots of colored spaghetti
Static Charts on the Web c.2000
Load average over a 24 hr period, with the 1, 5, and 15 min load averages as green, blue, and red time series (which is completely redundant, BTW).
As informative as watching a ticker chart on Wall Street
Browser-based Dashboards
Interminable strip charts are not good for your brain.
Performance Visualization Basics
The Central Challenge
Find the best cognitive impedance match
between the digital computer and the neural computer
Cognitive Circuitry is Largely Unknown
PerfViz is an N-dimensional problem
Brain is trapped in (3 + 1)-dimensions
No 5-fold rotational symmetry
Physicists have all the fun with SciViz
Time dimension becomes animation sequence
Your Brain is Easily Fooled
All cognition is computation
Your brain is a differential analyzer
Difference errors produce perceptual illusions
Monitored Data are Time Series
Gothic graphs can hurt your brain (Bad Z value)
There’s a Whole Science of Color
Pastel Colors on White
[Figure: Sandy Bridge 16 VPU Throughput; LIO/s vs. t-Index for test1.HTT.Turb, test2.Turbo, test3.HTT, and test4.AllOff]
Pastel Colors on Black
[Figure: Sandy Bridge 16 VPU Throughput; LIO/s vs. t-Index for test1.HTT.Turb, test2.Turbo, test3.HTT, and test4.AllOff]
Pastel Colors on Neutral Gray
[Figure: Sandy Bridge 16 VPU Throughput; LIO/s vs. t-Index for test1.HTT.Turb, test2.Turbo, test3.HTT, and test4.AllOff]
Coordinated Colors on Neutral Gray
[Figure: Sandy Bridge 16 VPU Throughput; LIO/s vs. t-Index for test1.HTT.Turb, test2.Turbo, test3.HTT, and test4.AllOff]
Time Series Can Reveal Data Correlations 9:50
[Figure: server.p.65, 2012-05-03 to 2012-05-04; four stacked time-series panels (CPU%, Mem%, ioWait%, LdAvg-1) against Time]
But Data Doesn’t Tell All: Monitored Server Consumption
[Figure: Monitored Server Consumption; Capacity (U%) vs. Time (m:s), showing Uavg data and Umax data against the server saturation level]
Beyond Data: Effective Server Consumption
[Figure: Lookahead Server Consumption; Capacity (U%) vs. Time (m:s), showing Uavg data, Umax data, and predicted Ueff against the server saturation and effective max consumption levels]
Performance Visualization in R
Choose Your Cognitive Z in R
[Figure: scatterplot matrix of mpg, disp, drat, and wt, beside a 3D Scatterplot of wt, disp, and mpg]
Enhanced Plots in R
[Figure: four panels of Xp vs. p: Raw bench data, Data smoother, USL fit, USL fit + CI bands]
Chernoff Faces in R
Example (using R)
    library(TeachingDemos)
    faces2(matrix(runif(18*10), nrow=12), main='Random Faces')
Kiviat and Radar Charts in R
[Figure: Correlation Radar; radial plot with one labeled spoke per factor (Alp12Mn, AvrROE, DivToP, ...) and radial ticks at -0.5, 0, 0.5, 1]
Example (using R)
    require(plotrix)
    corelations <- c(1:97)
    corelation.names <- names(corelations) <- c("Alp12Mn","AvrROE", "DivToP", "GrowAPS", "GrowAsst", "GrowBPS", "GrowCFPS",...
    corelations <- c(0.223, 0.1884, -0.131, 0.1287, 0.0307,...
    par(ps=6)
    radial.plot(corelations, labels=corelation.names, rp.type="p", main="Correlation Radar", radial.lim=c(-1,1), line.col="blue")
Treemaps in R
[Figure: GDAT: Top 100 Websites; two treemaps of the same data, one labeled by category (Search/portal, Retail, Software, Media/news, ...) and one labeled by site (Yahoo!, Microsoft, Facebook, YouTube, ...), each with a color scale from -8e+09 to 8e+09]
Example (using R)
    library(portfolio)
    bbc <- read.csv("nielsen100-2010.csv")
    map.market(id=seq(1:100), area=bbc$uniqueAudience, group=bbc$categoryBBC, color=bbc$totalVisits, main="GDAT: Top 100 Websites")
There is another treemap pkg on CRAN
Heatmap of Multiple Servers in Time
Barry in 2D
[Figure: barycentric diagrams. Three components: p1=p2=p3=1/3, and p1=0.6, p2=0.1, p3=0.3. Four components: p1=.4, p2=.25, p3=.1, p4=.25, and p1=.8, p2=.1, p3=.05, p4=.05]
Barycentric coordinate system for %CPU = %user + %sys + %idle
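A barycentric point is just a weighted average of the triangle's corners, so the mapping takes a few lines of R. This is a minimal sketch, not the code behind the slides: the vertex layout, the helper name `bary_to_xy`, and the sample percentages are assumptions chosen for illustration.

```r
# Corners of an equilateral triangle, one per CPU component.
# Which corner carries which component is an arbitrary, assumed layout.
vertices <- rbind(user = c(0,   0),
                  sys  = c(1,   0),
                  idle = c(0.5, sqrt(3) / 2))

# Hypothetical helper: map (%user, %sys, %idle) to an (x, y) position.
bary_to_xy <- function(p, v = vertices) {
  stopifnot(length(p) == nrow(v), all(p >= 0), sum(p) > 0)
  w <- p / sum(p)        # normalize so the weights sum to 1
  drop(w %*% v)          # weighted average of the corner positions
}

xy <- bary_to_xy(c(user = 60, sys = 10, idle = 30))

plot(vertices, type = "n", asp = 1, axes = FALSE, xlab = "", ylab = "")
polygon(vertices)
text(vertices, labels = rownames(vertices), pos = c(1, 1, 3))
points(xy[1], xy[2], pch = 19)
```

A server pinned in user time sits on the user corner, and an evenly split one sits at the centroid, so a cloud of samples drifting across the triangle shows how the mix changes without needing three separate strip charts.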
Barry in 3D: Tukey-like Rotations
Tukey trumps Tufte
Barycentric coordinate system for %BW = %unicast + %multicast + %broadcast + %idle
Possible Hacks
Interactive and Streaming in R
R derives from S at Bell Labs (home of Unix) c.1975, 1980, 1988
R scripting language
console interface > (x^(k-1)*exp(-x/s))/(gamma(k)*s^k)
cf. Mathematica document paradigm: $\frac{x^{k-1}\, e^{-x/\theta}}{\Gamma(k)\, \theta^{k}}$
No fonts, no symbolic computation
More recent focus is on enabling:
Better IDE integration, e.g., RStudio
Browser-based interaction, e.g., Shiny
Streaming data acquisition, e.g., R plus Hadoop, but ...
R interpreter is single-threaded
Needs a full app stack between data and R engine
Revolution Analytics is in this space
Plenty of room for innovative development
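The console expression quoted on this slide is the gamma probability density with shape k and scale s (θ in the Mathematica rendering). As a quick sanity check, the sketch below wraps it in a function and compares it with R's built-in `dgamma`. The wrapper name `gamma_pdf` and the parameter values are assumptions for illustration.

```r
# The console expression from the slide, wrapped as a function.
# `gamma_pdf` is a hypothetical name; k is the shape, s is the scale.
gamma_pdf <- function(x, k, s) {
  (x^(k - 1) * exp(-x / s)) / (gamma(k) * s^k)
}

x <- seq(0.1, 10, by = 0.1)
k <- 2      # assumed shape
s <- 1.5    # assumed scale

# Compare with the density that ships in the stats package.
same <- all.equal(gamma_pdf(x, k, s), dgamma(x, shape = k, scale = s))
print(same)

plot(x, gamma_pdf(x, k, s), type = "l", ylab = "density")
```

Typing the formula at the prompt and seeing a curve a moment later is the interactive side of R; what it lacks, compared with the Mathematica document, is the typeset form of the same expression.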
Some Ideas for Tomorrow
1 Lots of opportunities
2 Couple simple statistical analysis to monitored data
3 Display the errors in monitored data
4 Replace the black background in Graphite
5 Apply ColorBrewer to Graphite
6 Apply effective capacity consumption to your monitored data
7 Replace strip charts with animation
WARNING: Common sense is the pitfall of all performance analysis
Modelizing GitHub Growth
Since I didn’t discuss the modeling part of VAMOOS ...
Donnie Berkholz of redmonk.com wrote in his Jan 21, 2013 blog post that GitHub will reach:
4 million users near Aug 2013
5 million users near Dec 2013
That’s based on a log-linear model.
I claim it’s a log-log model and therefore:
4 million users around Oct 2013
5 million users around Apr 2014
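The choice between the two models can be tried out in R with two calls to `lm`. The sketch below uses synthetic data only: the GitHub user counts are not reproduced here, and the power-law generator, the noise level, and every variable name are assumptions for illustration.

```r
set.seed(1)

# Synthetic stand-in for a growth series (NOT the GitHub numbers):
# power-law growth with a little multiplicative noise.
t     <- 1:48                                       # months since launch
users <- 1000 * t^2 * exp(rnorm(length(t), sd = 0.05))

fit_loglin <- lm(log(users) ~ t)        # log-linear: exponential growth
fit_loglog <- lm(log(users) ~ log(t))   # log-log: power-law growth

r2 <- c(loglin = summary(fit_loglin)$r.squared,
        loglog = summary(fit_loglog)$r.squared)
print(r2)

# Month at which each fitted model reaches a target user count.
target   <- 4e6
b_lin    <- unname(coef(fit_loglin))
b_log    <- unname(coef(fit_loglog))
t_loglin <- (log(target) - b_lin[1]) / b_lin[2]
t_loglog <- exp((log(target) - b_log[1]) / b_log[2])
c(loglin = t_loglin, loglog = t_loglog)
```

On power-law data the log-linear fit reaches the target sooner than the log-log fit, which is the same direction as the gap between the two sets of dates above. Plotting the residuals of both fits is the quickest way to see which model the data actually support.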
Performance Dynamics Company
Castro Valley, California
www.perfdynamics.com
perfdynamics.blogspot.com
twitter.com/[email protected]: +1-510-537-5758