di&a slides: data insights and analytics frameworks
TRANSCRIPT
The First Step in Information Management
www.firstsanfranciscopartners.com
Produced by:
MONTHLY SERIES
Brought to you in partnership with:
January 5, 2017
Data Insights and Analytics Frameworks
Welcome to the new series
§ The purpose of our new series is to:
  − Grow understanding on Data Insights and Analytics
  − Cover the ins and outs of the Big Data, Analytics, Business Intelligence and reporting universe
  − Focus on practical, realistic value
  − Want to bypass fluff, hype and false promises
  − Need your feedback
  − Use Q&A
© 2017 First San Francisco Partners www.firstsanfranciscopartners.com
Data Lake vs. Data Warehouse
Descriptive, Prescriptive and Predictive Analytics
Governing Quality Analytics
The Role of a Data Scientist (Interview with a CDS)
Analytics, BI and Data Science: What’s the Progression?
Topics for today’s webinar
§ Frameworks defined
§ Enterprise analytics architecture
§ Overview of standard data insights and analytics components
  − Big Data
  − Sandbox
  − Real-time Analytics
  − From legacy architectures to data insight
§ Key takeaways
§ Q&A
Frameworks defined
▪ The structure for delivering and getting value out of your data, and an architecture for decision-making, including organizational models
▪ Your enterprise analytics architecture needs to reflect holistic thinking
▪ Your starting point and business needs determine how you progress, not a pre-defined curve
▪ Organizations that can barely deliver an accurate production report are doing predictive analytics – Right? Wrong?
Figure: Data Insight and Analytics Maturity (labels: Predictive, Managing, Proactive, Operating)
The Data Insight and Analytics Framework
Figure: Data Insight and Analytics Framework. Elements shown:
− Data Insight and Analytics Strategy
− Technology Infrastructure
− Data Insight and Analytics Operating Model
− Components: “Best Fit” Data Architecture, Data Quality, Demand Management, Presentation, Data Wrangling, Metadata Management
− Governance
− Organizational Alignment
Sample enterprise analytics architecture
Figure: sample enterprise analytics architecture. Components shown:
− Sources
− Data ingestion
− ETL and Movement
− Data Lake
− ODS*, DW*, DM, Big Data*, Sandbox
− Metadata
− Presentation (visualization, reports, algorithms, queries)
− Security and governance
− Operations scheduling, management, Data Quality, Controls
* = Includes Real Time
Enterprise analytics architecture – Big Data
§ More than the Big Data “stack”
§ No longer linear – Production to Access
§ Arranged by latency, access, intended value, data velocity, data volume and data movement capacity
  1. Standard “Big Data”
  2. Sandbox
  3. Real time
  4. Heritage
Big Data
§ Components
  − Data Sources
  − Ingestion
  − Structuring
  − Analytics and Visualization
  − Metadata
§ High Priority Concerns
  − Metadata – Technical, Business, Lineage, Meaning, Interpretation
  − Security and privacy - Access and usage must be managed according to risk, permissions, policy, contractual agreements
  − Data Governance
    § Oversight of semantics, lineage, quality
  − Latency, access, usage
    § Persistent
    § Type of data structure – Hive, HBase
Figure (Big Data components): Data Sources, Ingestion and Transformation, Structuring Data, Analytics and Visualization; alongside Monitor, Technical Metadata, Business Metadata
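The metadata concern on this slide (technical, business, lineage) can be made concrete with a small sketch. Everything named here is an illustrative assumption rather than something from the presentation or from any particular catalog tool: the `DatasetMetadata` class, its fields, the `trace_lineage` helper and the sample dataset names are all hypothetical.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class DatasetMetadata:
    """Hypothetical catalog entry combining technical and business metadata."""
    name: str
    storage_format: str      # technical metadata, e.g. "hive_table"
    business_meaning: str    # business metadata: what the data means
    upstream: list[str] = field(default_factory=list)  # direct lineage parents


def trace_lineage(catalog: dict[str, DatasetMetadata], name: str) -> list[str]:
    """Return every ancestor of `name`, nearest first, without repeats."""
    seen: list[str] = []
    queue = list(catalog[name].upstream)
    while queue:
        parent = queue.pop(0)
        if parent in seen:
            continue
        seen.append(parent)
        if parent in catalog:
            queue.extend(catalog[parent].upstream)
    return seen


# Hypothetical catalog: two raw sources feed a joined table, which feeds model inputs.
catalog = {
    "raw_calls": DatasetMetadata("raw_calls", "json_files", "Support call logs as received"),
    "raw_payments": DatasetMetadata("raw_payments", "csv_files", "Billing payments as received"),
    "customer_activity": DatasetMetadata(
        "customer_activity", "hive_table",
        "Calls and payments joined per customer",
        upstream=["raw_calls", "raw_payments"],
    ),
    "churn_features": DatasetMetadata(
        "churn_features", "hive_table",
        "Model inputs for churn analysis",
        upstream=["customer_activity"],
    ),
}

print(trace_lineage(catalog, "churn_features"))
```

Keeping lineage next to the business meaning is one way to let governance answer "where did this number come from" without reading pipeline code.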
Use case – Big Data
§ Telecom
  − Analyze 500 discrete data elements, including support call patterns, late or delinquent payments and other ongoing vital signs, via predictive analytics
  − Identify “churn” prospects and take steps to prevent it
§ Results
  − 47% reduction in customer churn, protecting $15 million in revenue
  − Predictive analytics has spread organically to other parts of the company, including collections
Sandbox
§ Components – Similar to Big Data
  − Standalone sandbox is also a relevant component
  − Ingestion – Batch
  − Analytics and Visualization - Data Scientist/Analyst only
§ High Priority Concerns
  − Data
    § Discovery
    § Understanding
    § Standardization
    § Usefulness
  − Security and Privacy
    § Raw aspect implies no controls
    § Control the data, not the environment
  − Data Governance focus, Usage, access
  − Metadata, provenance and pedigree
§ Latency, access, usage
  − Sandbox equals non-persistent, non-production
  − Self-service
  − Housekeeping
Figure (Stand Alone Sandbox): Data Sources, Ingestion and Transformation, Analytics and Visualization; alongside Monitor, Technical Metadata, Business Metadata
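The "non-persistent" and "housekeeping" points can be sketched as a sandbox in which every dataset carries its own expiry and provenance, so control sits with the data rather than the environment. This is a minimal illustration under stated assumptions: the `Sandbox` and `SandboxDataset` classes, the 30-day default and the sample names are hypothetical and not taken from the presentation.

```python
from __future__ import annotations

from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class SandboxDataset:
    """Hypothetical sandbox entry; every dataset carries its own expiry."""
    name: str
    owner: str
    source: str            # provenance: where the data was copied from
    expires_at: datetime


class Sandbox:
    def __init__(self, default_ttl: timedelta = timedelta(days=30)):
        self.default_ttl = default_ttl
        self.datasets: dict[str, SandboxDataset] = {}

    def load(self, name: str, owner: str, source: str, now: datetime) -> SandboxDataset:
        """Register a dataset with an expiry so nothing persists by default."""
        ds = SandboxDataset(name, owner, source, now + self.default_ttl)
        self.datasets[name] = ds
        return ds

    def housekeeping(self, now: datetime) -> list[str]:
        """Drop expired datasets and return their names."""
        expired = [n for n, ds in self.datasets.items() if ds.expires_at <= now]
        for n in expired:
            del self.datasets[n]
        return expired


start = datetime(2017, 1, 5)
sandbox = Sandbox(default_ttl=timedelta(days=30))
sandbox.load("sensor_sample", "analyst_a", "plant_sensors", now=start)
sandbox.load("ops_extract", "analyst_b", "erp_export", now=start + timedelta(days=20))

removed = sandbox.housekeeping(now=start + timedelta(days=31))
print(removed)                   # names of purged datasets
print(sorted(sandbox.datasets))  # what is still available
```

Passing `now` explicitly keeps the sketch deterministic; a real housekeeping job would more likely read the clock and run on a schedule.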
Use case – Sandbox
§ Predictive Maintenance, Manufacturing
  − Collected machine sensor data
  − Created a sandbox environment to centralize the parts failure analysis
  − Combined sensor data with operational data
  − After finding insights, operationalized the processes in the line of business
§ Results
  − Shorter time-to-insight: it took only three weeks (down from six months) to develop a parts failure prediction algorithm
Real-time analytics
§ Components - similar to regular Big Data
  − Streaming processing an added component
  − Real time capability on non-Big Data as well
§ High Priority Concerns
  − Speed of ingestion
    § Real-time Data Ingestion
      − Data Streaming
      − Data Messaging
      − In-memory database for extreme low latency requirements
  − Security and privacy
  − Data Governance Focus
    § Metadata
    § Compliance
  − Latency, access, usage
  − Real-time Data Query
    § Enterprise Query and Reporting
    § Fast Query Database to store analytical results
    § Agents, messaging, new events
    § Flexible persistency and accessibility
    § Very low latency, high performance
Figure (Real-time analytics components): Data Sources, Ingestion and Transformation, Streaming Data Processing, Structuring Data, Real time DW or ODS, Analytics and Visualization; alongside Monitor, Technical Metadata, Business Metadata
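The pairing of streaming ingestion with an in-memory store for low-latency queries can be illustrated with a sliding-window counter. This is a sketch only: the `SlidingWindowCounter` class, the 60-second window and the region keys are assumptions made for illustration, and a production system would typically rely on a streaming platform rather than hand-written code.

```python
from __future__ import annotations

from collections import deque


class SlidingWindowCounter:
    """Counts events per key over the last `window_seconds` seconds, in memory."""

    def __init__(self, window_seconds: float):
        self.window_seconds = window_seconds
        self.events: deque[tuple[float, str]] = deque()
        self.counts: dict[str, int] = {}

    def _evict(self, now: float) -> None:
        # Drop events at or before (now - window); the window is (now - window, now].
        while self.events and self.events[0][0] <= now - self.window_seconds:
            _, key = self.events.popleft()
            self.counts[key] -= 1
            if self.counts[key] == 0:
                del self.counts[key]

    def ingest(self, timestamp: float, key: str) -> None:
        """Add one event; assumes timestamps arrive in non-decreasing order."""
        self._evict(timestamp)
        self.events.append((timestamp, key))
        self.counts[key] = self.counts.get(key, 0) + 1

    def query(self, now: float) -> dict[str, int]:
        """Return current counts per key without rescanning the raw events."""
        self._evict(now)
        return dict(self.counts)


window = SlidingWindowCounter(window_seconds=60)
for ts, region in [(0, "north"), (10, "north"), (30, "south"), (65, "north")]:
    window.ingest(ts, region)

result = window.query(now=70)
print(result)
```

Because the counts are maintained incrementally at ingestion time, a query is a dictionary copy rather than a scan, which is the kind of trade-off behind the "fast query database to store analytical results" bullet.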
Use case – Real-time analytics
§ Health
  − Automatically sifts through millions of posts on dozens of social media sites, local news reports, medical workers’ social networks and government websites to track instances of disease
  − Continually plots disease hotspots on a map
§ Results
  − Identified a cluster of “mystery hemorrhagic fever” in Guinea over a week before the Ministry of Health of Guinea notified the World Health Organization (WHO), which confirmed the Ebola outbreak a day later
Relevance of legacy or traditional data insight
§ Legacy structures still have relevance
  − Reporting
  − Standard BI
§ Components
  − Familiar names – ETL, DW, ODS, DM
  − Many aspects of Big Data technology are not relevant to many data uses
§ Traditional Concerns
  − ETL vs. web services pipelines via data layer
  − Understanding need for traditional uses:
    § Departmental use
    § Historical reporting
    § Operational and ad-hoc reporting
  − Support of multi-latency, historical, operational, etc., requirements
Figure (legacy architecture): Sources, ETL and Movement, ODS*, DW*, DM, Presentation (visualization, reports, algorithms, queries)
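The familiar ETL, DW and DM pieces can be shown end to end in a few lines. The sketch below is an assumption-laden toy: it uses an in-memory SQLite database as a stand-in for a warehouse, and the source rows, the `fact_orders` table and the `dm_dept_monthly` view are all hypothetical names chosen for illustration.

```python
import sqlite3

# Hypothetical source rows, standing in for an operational system extract.
source_rows = [
    {"order_id": 1, "dept": " Sales ", "amount": "120.50", "order_date": "2016-12-01"},
    {"order_id": 2, "dept": "sales", "amount": "79.50", "order_date": "2016-12-15"},
    {"order_id": 3, "dept": "Support", "amount": "40.00", "order_date": "2017-01-03"},
]


def transform(row: dict) -> tuple:
    """Standardize department names and convert amounts to numbers."""
    return (
        row["order_id"],
        row["dept"].strip().lower(),
        float(row["amount"]),
        row["order_date"][:7],   # keep year-month for historical reporting
    )


conn = sqlite3.connect(":memory:")   # in-memory stand-in for a warehouse
conn.execute(
    "CREATE TABLE fact_orders ("
    "order_id INTEGER PRIMARY KEY, dept TEXT, amount REAL, month TEXT)"
)
conn.executemany(
    "INSERT INTO fact_orders VALUES (?, ?, ?, ?)",
    [transform(r) for r in source_rows],
)

# A departmental data mart expressed as a view over the warehouse table.
conn.execute(
    "CREATE VIEW dm_dept_monthly AS "
    "SELECT dept, month, SUM(amount) AS total FROM fact_orders GROUP BY dept, month"
)

report = conn.execute(
    "SELECT dept, month, total FROM dm_dept_monthly ORDER BY dept, month"
).fetchall()
print(report)
```

For departmental and historical reporting of this kind, nothing in the flow requires Big Data technology, which is the point of the "not relevant to many data uses" bullet.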
Use case – Legacy or traditional data insight
§ Healthcare
  − A data warehouse to facilitate information and best practices sharing between thousands of providers and research professionals
  − Also deployed predictive analytics and artificial intelligence to derive better insights from Electronic Health Records and improve patient outcomes
§ Results
  − 400,000 patient records centralized in a single data warehouse, which can scale up to 20 million records
  − 42% anticipated improvement in patient outcomes with artificial intelligence
  − 58% anticipated reduction in cost per unit of outcome change
Key takeaways
§ Reference architectures are just that
  − You may not need a lake, data warehouse or sandbox
§ Avoid cobbling together technical components
§ Plan to match your architecture to needs and usage, vs. existing components
§ Web Services are an important tool
  − If using services, please consider a distinct data layer
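One way to read the "distinct data layer" advice is that services depend on a shared data-access contract rather than on storage directly. The sketch below assumes that reading. The `CustomerDataLayer` protocol, the in-memory implementation, both services and the sample customers are hypothetical and only illustrate the separation.

```python
from __future__ import annotations

from typing import Protocol


class CustomerDataLayer(Protocol):
    """Hypothetical data-layer contract that services depend on."""
    def get_customer(self, customer_id: int) -> dict: ...
    def list_customers(self) -> list[dict]: ...


class InMemoryCustomerData:
    """One possible backing store; a warehouse or lake adapter could replace it."""
    def __init__(self, rows: list[dict]):
        self._rows = {r["id"]: r for r in rows}

    def get_customer(self, customer_id: int) -> dict:
        return dict(self._rows[customer_id])

    def list_customers(self) -> list[dict]:
        return [dict(r) for r in self._rows.values()]


def profile_service(data: CustomerDataLayer, customer_id: int) -> str:
    """Service 1: formats a single customer for display."""
    c = data.get_customer(customer_id)
    return f"{c['name']} ({c['segment']})"


def segment_report_service(data: CustomerDataLayer) -> dict[str, int]:
    """Service 2: counts customers per segment through the same data layer."""
    counts: dict[str, int] = {}
    for c in data.list_customers():
        counts[c["segment"]] = counts.get(c["segment"], 0) + 1
    return counts


data_layer = InMemoryCustomerData([
    {"id": 1, "name": "Acme", "segment": "enterprise"},
    {"id": 2, "name": "Birch", "segment": "smb"},
    {"id": 3, "name": "Cedar", "segment": "smb"},
])
print(profile_service(data_layer, 1))
print(segment_report_service(data_layer))
```

Because both services see only the contract, the backing store can change without touching service code, and governance rules can be applied in one place.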
Q&A
Thank you and Happy New Year! See you Thursday, February 2 for Data Lake vs. Data Warehouse
John Ladley
Kelle O’Neal