Large Scale Data Analytics
![Page 2: Large Scale Data Analytics](https://reader034.vdocuments.mx/reader034/viewer/2022050719/55a9b41b1a28abd8698b46b3/html5/thumbnails/2.jpg)
Scenario
• Insurer uses meteorological data for its pricing model
• At present, data from 2,000 weather stations are collected for analysis
• Plan is to use data from 10,000 weather stations (or more)
• Stochastic simulation needs to run to identify patterns in weather data, to determine pricing
• Volumetrics: petabytes of information (for 1 region)
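The scenario above can be illustrated with a toy Monte Carlo run. This is only a sketch under stated assumptions, not the insurer's model: the event probability, the exponential loss distribution, the loading factor and the function names (`simulate_annual_loss`, `estimate_premium`) are all hypothetical, and real weather data would replace the random draws.

```python
import random
import statistics


def simulate_annual_loss(rng, n_stations, event_prob, mean_loss):
    """Simulate one year's total loss across all stations (hypothetical model)."""
    total = 0.0
    for _ in range(n_stations):
        # Assumption: each station independently sees a severe-weather event.
        if rng.random() < event_prob:
            # Assumption: loss size is exponential with the given mean.
            total += rng.expovariate(1.0 / mean_loss)
    return total


def estimate_premium(n_stations, n_trials=100, event_prob=0.02,
                     mean_loss=50_000.0, loading=0.2, seed=42):
    """Average simulated annual loss plus a loading factor."""
    rng = random.Random(seed)
    losses = [simulate_annual_loss(rng, n_stations, event_prob, mean_loss)
              for _ in range(n_trials)]
    return statistics.mean(losses) * (1.0 + loading)


# Current coverage versus the planned coverage from the scenario.
premium_now = estimate_premium(2_000)
premium_planned = estimate_premium(10_000)
```

Each trial loops over every station, so moving from 2,000 to 10,000 stations multiplies the work per trial by five, which is one reason the scenario calls for a distributed platform.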
![Page 3: Large Scale Data Analytics](https://reader034.vdocuments.mx/reader034/viewer/2022050719/55a9b41b1a28abd8698b46b3/html5/thumbnails/3.jpg)
Trends
![Page 4: Large Scale Data Analytics](https://reader034.vdocuments.mx/reader034/viewer/2022050719/55a9b41b1a28abd8698b46b3/html5/thumbnails/4.jpg)
Data Analytics Is Mostly About $$, Customers, Markets
![Page 5: Large Scale Data Analytics](https://reader034.vdocuments.mx/reader034/viewer/2022050719/55a9b41b1a28abd8698b46b3/html5/thumbnails/5.jpg)
How Widespread Is Data Analytics?
![Page 6: Large Scale Data Analytics](https://reader034.vdocuments.mx/reader034/viewer/2022050719/55a9b41b1a28abd8698b46b3/html5/thumbnails/6.jpg)
Expectations On Payback Period ( Aggressive )
![Page 7: Large Scale Data Analytics](https://reader034.vdocuments.mx/reader034/viewer/2022050719/55a9b41b1a28abd8698b46b3/html5/thumbnails/7.jpg)
Large Scale Data Analytics
“Involves using different algorithms, distributed platforms, tools and techniques to analyze big data and provide actionable insights”
![Page 8: Large Scale Data Analytics](https://reader034.vdocuments.mx/reader034/viewer/2022050719/55a9b41b1a28abd8698b46b3/html5/thumbnails/8.jpg)
Big Data
“Data sets that are very large in volume and complex”
New platforms, tools and techniques have emerged to manage Big Data
We broke away from traditional ways to process and analyze them
![Page 9: Large Scale Data Analytics](https://reader034.vdocuments.mx/reader034/viewer/2022050719/55a9b41b1a28abd8698b46b3/html5/thumbnails/9.jpg)
Data Structures
Vector, Matrix, or Complex Structure
Free Text
Image or Binary Data
Data “bags”
Iterative Logic or Complex Branching
Advanced Analytic Routines
Rapidly Repeated Measurements
Extreme Low Latency
Access to all data required
Search Ranking X X X X X X
Ad Tracking X X X X X X X X
Location or Proximity Tracking X X X X X
Social CRM X X X X X X X
Document Similarity Testing X X X X X X X X
Genomic Analysis X X X X X
Customer Cohort groups X X X X X X
Fraud Detection X X X X X X X X X
Smart Utility Metering X X X X X X
Churn Analysis X X X X X X X
Satellite Image Analysis X X X X
Game Gesture Analysis X X X X X X X X
Data Bag Exploration X X X X X X
![Page 10: Large Scale Data Analytics](https://reader034.vdocuments.mx/reader034/viewer/2022050719/55a9b41b1a28abd8698b46b3/html5/thumbnails/10.jpg)
Business Interests : Well Informed Customer Executive
Speech to Text Conversion
Voice Data
Unstructured data Analytical System
Customer Persona
• Customer Persona - Demographics, Top interactions, Channel Preferences, Dissatisfiers
• Customer Lifetime Value
• Recent Contact History
• Customer Sentiment & Trend during the call
Customer’s state of mind
Sentiment Analysis
Social media
Depositions
Complaints
Other Channel information (ATM, Branch)
Big Data Warehouse
Traditional Warehouse
Decision Engine
• Customer Executive Dashboard presents all intelligence required to make a decision
• The decision engine also presents important decisions to be taken for the particular customer issue
![Page 11: Large Scale Data Analytics](https://reader034.vdocuments.mx/reader034/viewer/2022050719/55a9b41b1a28abd8698b46b3/html5/thumbnails/11.jpg)
Well Informed Customer Executive…
Customer calls Banking Call Center
Executive understands the customer problem
Executive authenticates customer and pulls up Customer Persona
Executive reviews risk of attrition against Customer Lifetime Value
Executive reviews last 5 call center and banking transactions
Executive views customer’s state of mind (risk of attrition) through a barometer chart
Analytical Solution - Converts speech to text
Analytical engine listens to customer voice
Decision Engine: Suggested top 5 actions required
Executive performs the actions below based on his analysis and recommendations from the Decision Engine:
1. Reversal of overdraft fee
2. One-time fee waiver on cheque book (predicting customer need based on historic usage cycles)
3. Cash back reward card for a minimum spend of $X through debit card
4. Offer interest revision for investment products or mortgage
5. Promote new mutual funds or credit cards based on customer willingness
Analytical engine monitors sentiment
Executive analyzes Customer Persona (demographics / preferences / satisfiers / dissatisfiers etc.)
![Page 12: Large Scale Data Analytics](https://reader034.vdocuments.mx/reader034/viewer/2022050719/55a9b41b1a28abd8698b46b3/html5/thumbnails/12.jpg)
Business Interests : Fraud Prevention
Envisaged Benefits
▪ New fraud patterns can be identified by building ‘analytical models’ to run against historical data
▪ ‘Web crawling’, ‘contextual text analysis’ and ‘Natural Language Processing’ allow fraud behavior to be identified from social media, which may increase the fraud detection success rate
▪ ‘Real time’ models capture behavioral patterns and analyze them against historical data to evaluate fraud case validity. The model learns by itself and updates the ‘Fraud pattern master sets’.
▪ Brings ‘artificially intelligent’ fraud pattern detection and analysis
▪ ‘Real time’ (refresh rate in the order of 0.5-1 minute) alerts to fraud analysts about ‘self learned’ fraud patterns based on new customer behavior patterns
Big Data Usage
▪ Formation of key value groups in the order of XcY, i.e. combinations of X attributes taken Y at a time (where X is the no. of attributes that are relevant to fraud and Y is the no. of attributes that should be combined to identify patterns)
▪ High speed history data loading from source systems
▪ Efficient real time fraud detection by identifying patterns through customer behavioral events and processing them over X yrs. of history data, e.g. using HBase
Scenario
Formation of fraud pattern reference tables using
▪ Real time data coming from different departments like IVR, Web, Customer profile, Transactions etc.
▪ Real time mining and analysis of history data to form prior patterns (history data over a no. of years, in the range of 50-100 TB)
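To make the "XcY" key value groups concrete, here is a minimal sketch that reads XcY as the binomial coefficient "X choose Y". The attribute names and the helper names (`key_group_count`, `key_groups`, `make_key`) are hypothetical and chosen only for illustration.

```python
import math
from itertools import combinations


def key_group_count(x, y):
    """Number of attribute groups: X choose Y."""
    return math.comb(x, y)


def key_groups(attributes, y):
    """Enumerate every group of y attributes out of the given attributes."""
    return list(combinations(attributes, y))


def make_key(event, group):
    """Build a lookup key for one event from one attribute group."""
    return tuple((attr, event[attr]) for attr in group)


# Hypothetical fraud-relevant attributes (X = 5), combined in pairs (Y = 2).
attrs = ["channel", "amount_band", "merchant_type", "geo", "time_of_day"]
groups = key_groups(attrs, 2)

event = {"channel": "web", "amount_band": "high", "merchant_type": "travel",
         "geo": "abroad", "time_of_day": "night"}
first_key = make_key(event, groups[0])
```

The count grows quickly with X: 5 attributes in pairs give 10 groups, while 50 attributes in triples already give 19,600, which is why the slide treats this as a big data workload.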
![Page 13: Large Scale Data Analytics](https://reader034.vdocuments.mx/reader034/viewer/2022050719/55a9b41b1a28abd8698b46b3/html5/thumbnails/13.jpg)
Fraud Pattern Detection…
Legacy Fraud Data
Customer Profile Data
IVR Audio Data
Web / Online
Card Transaction Data
Fraud Pattern Master Table
Fraud Analyst
History Data Processing to determine Fraud Patterns over X years
Real-time Customer Behavior Analysis for Fraud Detection
Customer Behavior Change Events
Real time Analysis of behavior patterns over historical data
Real time update to Master Table on New Fraud Patterns
Real time alert to Fraud Analyst
RDBMS RDBMS(JSON Files) RDBMS
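The flow in this diagram (history processing, real-time matching, analyst alert, master table update) might look roughly like the sketch below. It is an illustration under assumptions: an in-memory set stands in for the Fraud Pattern Master Table (the deck suggests a store such as HBase), and the attribute names, the `is_fraud` flag and all function names are hypothetical.

```python
from dataclasses import dataclass, field

# Hypothetical attribute group used as the pattern key.
PATTERN_ATTRS = ("channel", "amount_band", "geo")


@dataclass
class FraudPatternMaster:
    """In-memory stand-in for the Fraud Pattern Master Table."""
    patterns: set = field(default_factory=set)

    def learn(self, pattern):
        self.patterns.add(pattern)

    def matches(self, pattern):
        return pattern in self.patterns


def pattern_of(event, attrs=PATTERN_ATTRS):
    """Reduce an event to its pattern key."""
    return tuple((a, event.get(a)) for a in attrs)


def build_master(history):
    """History data processing: collect patterns of known fraud cases."""
    master = FraudPatternMaster()
    for event in history:
        if event.get("is_fraud"):
            master.learn(pattern_of(event))
    return master


def process_event(event, master, alerts):
    """Real-time check of one behavior change event; alert on a match."""
    pattern = pattern_of(event)
    if master.matches(pattern):
        alerts.append(pattern)  # stands in for the alert to the fraud analyst
        return True
    return False


def confirm_fraud(event, master):
    """Update the master table once a new fraud pattern is confirmed."""
    master.learn(pattern_of(event))


history = [
    {"channel": "web", "amount_band": "high", "geo": "abroad", "is_fraud": True},
    {"channel": "ivr", "amount_band": "low", "geo": "home", "is_fraud": False},
]
master = build_master(history)
alerts = []
known = {"channel": "web", "amount_band": "high", "geo": "abroad"}
unseen = {"channel": "atm", "amount_band": "high", "geo": "abroad"}

flagged_known = process_event(known, master, alerts)
flagged_unseen = process_event(unseen, master, alerts)
confirm_fraud(unseen, master)
flagged_after_update = process_event(unseen, master, alerts)
```

Exact-match lookup keeps the per-event check cheap, which suits the 0.5-1 minute alert window mentioned earlier; a production system would likely score partial matches as well.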
![Page 14: Large Scale Data Analytics](https://reader034.vdocuments.mx/reader034/viewer/2022050719/55a9b41b1a28abd8698b46b3/html5/thumbnails/14.jpg)
Fraud Prevention…
![Page 15: Large Scale Data Analytics](https://reader034.vdocuments.mx/reader034/viewer/2022050719/55a9b41b1a28abd8698b46b3/html5/thumbnails/15.jpg)
Benefits
Industry / Benefits
Financial services
▪ Customer Insights – Integrating transactional data (CRM/Payments) and unstructured social feeds
▪ Regulatory Compliance – Risk exposures across asset classes, LOBs and firms
▪ Fraud Detection in Credit Cards & Financial Crimes (AML) in Banks
Travel, Hospitality & Retail
▪ Customer centricity – Customer behavior analysis from omni-channel retailing & social feeds
▪ Markdown Optimization – Improve markdown based on actual customer buying patterns
▪ Market basket analysis – Narrow down market basket analysis by demographics
Life Science
▪ Improve targeting & predictions – Automatic detection of Adverse Drug Effects (ADEs)
▪ Patient data analysis – Longitudinal Patient Data (LPD) analysis
▪ Predictive Sciences – Analyze preclinical side effect profiles of marketed drugs
Healthcare (Payers & Providers)
▪ Cost of Care – Drug effectiveness & cost of care analysis based on Electronic Medical Records (EMR)
▪ Self Service Healthcare – Increase in mHealth & eHealth to allow consumer access to health information
▪ Claims Analytics – Analyze insurance claims data for fraud detection & preferred treatment plans
Communication, Media & Entertainment
▪ Discover churn patterns based on call data records (CDRs) and activity in subscribers’ networks
▪ Digital Asset Management (DAM) – Analyze & capitalize on digital data assets
Manufacturing
▪ Proactive Maintenance & Recommendation – Sensor monitoring for automobiles, buildings & machinery
▪ Energy Efficiency – Leveraging smart meters for utility energy consumption
▪ Location or Proximity Tracking – Location based analytics using GPS data
Hi-Tech
▪ Extend and complement the conventional information supply chain with a big data path
▪ Predictive analysis and real time decision support
![Page 16: Large Scale Data Analytics](https://reader034.vdocuments.mx/reader034/viewer/2022050719/55a9b41b1a28abd8698b46b3/html5/thumbnails/16.jpg)
Hadoop
![Page 17: Large Scale Data Analytics](https://reader034.vdocuments.mx/reader034/viewer/2022050719/55a9b41b1a28abd8698b46b3/html5/thumbnails/17.jpg)
Hadoop - HDFS
![Page 18: Large Scale Data Analytics](https://reader034.vdocuments.mx/reader034/viewer/2022050719/55a9b41b1a28abd8698b46b3/html5/thumbnails/18.jpg)
Hadoop - MapReduce
![Page 19: Large Scale Data Analytics](https://reader034.vdocuments.mx/reader034/viewer/2022050719/55a9b41b1a28abd8698b46b3/html5/thumbnails/19.jpg)
Hadoop - MapReduce
![Page 20: Large Scale Data Analytics](https://reader034.vdocuments.mx/reader034/viewer/2022050719/55a9b41b1a28abd8698b46b3/html5/thumbnails/20.jpg)
Apache Spark
Spark
Iterative Processing
Batch Processing
Machine Learning
SQL
Stream Processing
Graph Processing
![Page 21: Large Scale Data Analytics](https://reader034.vdocuments.mx/reader034/viewer/2022050719/55a9b41b1a28abd8698b46b3/html5/thumbnails/21.jpg)
Hadoop
![Page 22: Large Scale Data Analytics](https://reader034.vdocuments.mx/reader034/viewer/2022050719/55a9b41b1a28abd8698b46b3/html5/thumbnails/22.jpg)
NoSQL Databases
![Page 23: Large Scale Data Analytics](https://reader034.vdocuments.mx/reader034/viewer/2022050719/55a9b41b1a28abd8698b46b3/html5/thumbnails/23.jpg)
NoSQL Databases
![Page 24: Large Scale Data Analytics](https://reader034.vdocuments.mx/reader034/viewer/2022050719/55a9b41b1a28abd8698b46b3/html5/thumbnails/24.jpg)
Modern Data Architecture
![Page 25: Large Scale Data Analytics](https://reader034.vdocuments.mx/reader034/viewer/2022050719/55a9b41b1a28abd8698b46b3/html5/thumbnails/25.jpg)
Lambda Architecture
![Page 26: Large Scale Data Analytics](https://reader034.vdocuments.mx/reader034/viewer/2022050719/55a9b41b1a28abd8698b46b3/html5/thumbnails/26.jpg)
Lambda Architecture
![Page 27: Large Scale Data Analytics](https://reader034.vdocuments.mx/reader034/viewer/2022050719/55a9b41b1a28abd8698b46b3/html5/thumbnails/27.jpg)
Data Analytics Lifecycle
![Page 28: Large Scale Data Analytics](https://reader034.vdocuments.mx/reader034/viewer/2022050719/55a9b41b1a28abd8698b46b3/html5/thumbnails/28.jpg)
Analytics - Trends
• Big Data Analytics In The Cloud
• AWS, AWS-Redshift
• Hadoop
• Enterprise Data Operating System
• Data Analytics Platform
• SQL on Hadoop
• NoSQL
• IoT (Internet of Things)
• Multi-polar Analytics
• Predictive Analytics (Spark)
• In-memory Analytics
• Data Lake
• Deep Learning
• Machine Learning
• Neural Networks
• Data Monetization
![Page 29: Large Scale Data Analytics](https://reader034.vdocuments.mx/reader034/viewer/2022050719/55a9b41b1a28abd8698b46b3/html5/thumbnails/29.jpg)
Q & A
![Page 30: Large Scale Data Analytics](https://reader034.vdocuments.mx/reader034/viewer/2022050719/55a9b41b1a28abd8698b46b3/html5/thumbnails/30.jpg)
Thank You !
“Any Sufficiently Advanced Technology Is Indistinguishable From Magic”
- Arthur C. Clarke