Advanced Analytics & IoT - Neudesic

Post on 20-May-2020

TRANSCRIPT

  • Presented by:

    Orion Gebremedhin, Director of Technology, Data & Analytics

    Marc Lobree, National Architect, Advanced Analytics

    ADVANCED

    ANALYTICS & IOT ARCHITECTURES

  • THE

    RIGHT TOOL FOR THE RIGHT WORKLOAD

    Azure SQL DW

    Storage blob

    HDInsight

    SSIS

    RDBMS Data Stores

    Unstructured data

    Excel (Direct Access)

    SSIS

    EDW

    Flat File Upload

    Azure SQL database

    Local Data Sources

    On-Premises Reporting & Analytics

  • REFERENCE ARCHITECTURE

    HYBRID BIG DATA PROCESSING

    Cube

    SSIS

    EDW

    Tabular

    ES A

    Direct Access/ Report Model Level Integration

    Data Layer Level Integration

    On-Demand-Compute

    Cloud Storage

    AzCopy, PowerShell, SSIS

  • ON PREMISES

    BIG DATA IMPLEMENTATIONS

  • USE CASE:

    ETL OFFLOADING

    • Have you outgrown your data delivery SLAs?

    • Is your business frustrated with data delays?

    • Get the right data at the right time.

  • Neudesic partnered with one of the nation’s largest utility companies, which had recently deployed Smart Utility Meters for its power customers: nearly a million meters, each sending usage data every 15 minutes.

    The result: an Azure hybrid big data processing solution that lets the customer perform gap analytics, a process for identifying gaps in the power usage readings, over 7x faster than with their previous solution. Billions of Smart Meter reads are processed to identify the nature and duration of the gaps and so mitigate revenue losses.
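
    A minimal sketch of the gap-detection idea described above, assuming each meter's reads are available as a list of timestamps at the 15-minute cadence. The function name `find_gaps` and the sample reads are hypothetical; the deck does not describe the customer's actual implementation.

```python
from datetime import datetime, timedelta

INTERVAL = timedelta(minutes=15)  # reporting cadence stated on the slide


def find_gaps(timestamps, interval=INTERVAL):
    """Return (first_missing, last_missing, missing_count) for each hole in one meter's reads."""
    gaps = []
    ordered = sorted(timestamps)
    for prev, curr in zip(ordered, ordered[1:]):
        # Number of expected reads that never arrived between two consecutive reads.
        missing = (curr - prev) // interval - 1
        if missing > 0:
            gaps.append((prev + interval, curr - interval, missing))
    return gaps


# Hypothetical reads for one meter: 00:00, 00:15, then nothing until 01:00, 01:15.
base = datetime(2020, 1, 1)
reads = [base + timedelta(minutes=m) for m in (0, 15, 60, 75)]
for first, last, count in find_gaps(reads):
    print(first.time(), last.time(), count)
```

    For the sample reads, the function reports one gap covering the 00:30 and 00:45 readings, that is, two missing reads. At the scale on the slide, the same per-meter logic would run as a distributed job rather than in a single process.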

  • USE CASE:

    REAL-TIME ANALYSIS

    • Got end users that need data now?

    • Provide business units the data they need at

    the time they need it.

  • REAL TIME

    TRAFFIC MANAGEMENT

    Event Hubs

    Stream Analytics

    Reference Data Vehicle Registration

    Toll Data

    Toll Way – Event Generator

    Toll Violations

    Toll Violation Tickets
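
    The slide only names the components. As one illustration of what the Stream Analytics step might do, the sketch below joins toll events against vehicle registration reference data and emits violations. The violation rule (expired registration), the field names, and the sample plates are all assumptions, and plain Python stands in for a Stream Analytics query.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TollEvent:
    license_plate: str
    toll_id: int
    entry_time: str  # ISO-8601 timestamp from the (hypothetical) event generator


def find_violations(events, registration_expired):
    """Yield toll events whose plate is marked expired in the reference data."""
    for event in events:
        # Plates missing from the reference data are not flagged in this sketch.
        if registration_expired.get(event.license_plate, False):
            yield event


# Hypothetical reference data: plate -> has the registration expired?
registration_expired = {"ABC123": False, "XYZ789": True}

events = [
    TollEvent("ABC123", 1, "2020-01-01T08:00:00"),
    TollEvent("XYZ789", 2, "2020-01-01T08:00:05"),
    TollEvent("NEW000", 1, "2020-01-01T08:00:09"),
]

violations = list(find_violations(events, registration_expired))
print([v.license_plate for v in violations])
```

    Each yielded event would correspond to a row in the Toll Violations output, from which tickets are generated downstream.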

  • Real-Time Analysis

    On-premises: using a data lake to capture all data for everyone.

    HDFS

    Kafka Logs

    OLTP

    PM B DM MDM

    Spark MLlib

    Machine learning

    Kafka

  • USE CASE:

    INTERNET OF THINGS

    • What action does your IoT device drive?

    • Help guide end-users to the action they are

    looking to take.

  • VENDING

    MACHINE MANAGEMENT

    Event Hubs

    Stream Analytics

    Vending Transactions

    Event Hubs

    Machine learning

    Batch Predictions

    Real-time Notifications

    Event Hubs

    Vending Machine

    Vending Machine

    Vehicle Location Info

  • IOT

    WEARABLE MANAGEMENT

    Processing device data in real time.

    Azure Stream Analytics

    Power BI Dashboards

    Power BI Dataset Temporal

    Azure Event Hub or IoT Hub

    API

    Device

    HDInsight Spark SQL Analyze
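
    Real-time device processing of this kind typically aggregates readings over fixed time windows before they reach a dashboard. The sketch below shows a tumbling-window average per device in plain Python. The tuple layout, the 60-second window, and the sample heart-rate values are assumptions for illustration, not details from the deck.

```python
from collections import defaultdict


def tumbling_window_avg(readings, window_seconds=60):
    """Average value per (window_start, device_id) over non-overlapping windows.

    readings: iterable of (epoch_seconds, device_id, value) tuples.
    """
    sums = defaultdict(lambda: [0.0, 0])  # key -> [running total, count]
    for ts, device_id, value in readings:
        window_start = ts - ts % window_seconds  # align to the window boundary
        acc = sums[(window_start, device_id)]
        acc[0] += value
        acc[1] += 1
    return {key: total / count for key, (total, count) in sorted(sums.items())}


# Hypothetical heart-rate readings from one wearable.
readings = [(0, "w1", 60.0), (30, "w1", 80.0), (65, "w1", 100.0)]
averages = tumbling_window_avg(readings)
print(averages)
```

    The first two readings fall in the window starting at 0 seconds and the third in the window starting at 60 seconds, so each window produces one averaged row per device.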

  • USE CASE:

    ITERATIVE EXPLORATION

    • What can we do with all of this data?

    • Mine for answers-one question at a time.

  • ITERATIVE

    EXPLORATION

    Build expert systems, move to supervised learning, and evolve to reinforcement learning.

    Azure Machine Learning API End Point

    Web Service used for Orchestration

    HDInsight, Azure Data Warehouse

    Power BI

  • ITERATIVE

    EXPLORATION

    Monitor and remove noise from textual data.

    Stream Analytics

    Power BI Dashboards

    Machine Learning API End Point

    Azure SQL DB Keyword Analytics

    Power BI Dataset Statistical

    Power BI Dataset Temporal

    Media Services

    Event Hubs

    Web Service used for Orchestration

  • USE CASE:

    SELF SERVICE

    • Are your reports only telling half the story?

    • Quickly deliver large datasets for ad hoc analysis.

  • SELF SERVICE

    Allowing business to fulfill their analytics needs.

    Apache Hadoop

    Spark SQL Analyze

    Semi-structured Files

    SQL Server

    Service Bus

  • HYBRID

    SELF SERVICE

  • USE CASE:

    DATA AS A SERVICE

    • Got savvy end users that need more data?

    • Provide data scientists with what they need

    while making it easy for the business user.

  • Data-as-a-Service

    USING AZURE

    Using Data Lake to capture all data for everyone.

    Data Sources

    Loading Data Lake

    Raw Data Lake

    Building Data Streams

    Self-Service Catalog

    Azure Stream Analytics

    Power BI Dashboards

    Azure Blob Storage

    Azure Event Hub or IoT Hub

    API

    Device

    SQL

    Azure Data Lake Store

    Azure ML

    HDInsight Hive or Spark

    Data Factory

    Azure Data Catalog

    Azure Data Lake Store

    App Service

    Click Stream Logs

    Data Historian (PI Server)

    Data Factory

    Data Factory

  • Advanced Analytics Methodology

  • Solution Development Process

    Visual Analysis, Data Acquisition, Model(s) Selection, Model Comparison

    Build Model + Web Service Location for SQL query

    Understanding Data Model Creation + Testing

    Integration in Data Strategy

    Business Objective

    Understanding Data Model Creation + Testing

    Integration in Data Strategy Consumption Layer

  • Model Selection: Supervised (we know the response).

    Parametric

    Regression

    • Linear

    • Polynomial

    • Stepwise

    • Binomial

    • Splines

    • Partial Least Squares

    • Generalized Linear Models

    Classification

    • Logistic

    • Linear / Quadratic Discriminant Analysis

    Non Parametric

    • K Nearest Neighbors

    • Decision Trees

    • Random Forests

    • Boosting

    • Neural Network

    • Support Vector Machines

    • Generalized Additive Models

    *Some models can fall into either category: parametric or nonparametric, regression or classification.

    Forecasting

    • Moving Averages

    • Exponential Smoothing

    • ARIMA

    • Regressions
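
    Two of the forecasting methods listed above are simple enough to sketch directly. The helpers below are illustrative, with hypothetical names and toy input values; they show a trailing moving average and simple (single) exponential smoothing.

```python
def moving_average(series, window):
    """Trailing moving average: one value per full window of observations."""
    if window <= 0 or window > len(series):
        raise ValueError("window must be between 1 and len(series)")
    return [sum(series[i - window:i]) / window
            for i in range(window, len(series) + 1)]


def exponential_smoothing(series, alpha):
    """Simple exponential smoothing; alpha in (0, 1] weights the newest value."""
    smoothed = [series[0]]  # seed with the first observation
    for x in series[1:]:
        smoothed.append(alpha * x + (1 - alpha) * smoothed[-1])
    return smoothed


print(moving_average([1, 2, 3, 4, 5], window=3))        # [2.0, 3.0, 4.0]
print(exponential_smoothing([10, 20, 30], alpha=0.5))   # [10, 15.0, 22.5]
```

    A larger window or a smaller alpha gives a smoother result that reacts more slowly to recent changes.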

  • Model Selection: MAPE & RMSE & R^2

    MAPE: Mean Absolute Percentage Error

    RMSE: Root Mean Square Error

    R^2: Variation in the response explained by the predictors

    We want to choose the model that minimizes the test error and has a high R^2, that is, a high percentage of the variation in the response explained by the predictors.
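
    The three metrics can be computed with a few lines of standard-library Python. This is a sketch with hypothetical function names and made-up sample values; it assumes no actual value is zero (MAPE is undefined otherwise) and that the actual values are not all identical (R^2 is undefined otherwise).

```python
import math


def mape(actual, predicted):
    """Mean absolute percentage error, in percent."""
    return 100.0 * sum(abs((a - p) / a) for a, p in zip(actual, predicted)) / len(actual)


def rmse(actual, predicted):
    """Root mean square error, in the units of the response."""
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))


def r_squared(actual, predicted):
    """Share of the variation in the response explained by the model."""
    mean = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean) ** 2 for a in actual)
    return 1 - ss_res / ss_tot


# Made-up held-out (test) values and model predictions.
actual = [100.0, 200.0, 300.0]
predicted = [110.0, 190.0, 300.0]
print(mape(actual, predicted), rmse(actual, predicted), r_squared(actual, predicted))
```

    Computing these on held-out test data rather than on the training data is what makes them a fair basis for comparing candidate models.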

  • Examining Weather and Active Meters in the System by Time

    Seasonality of temperature

    Active Meters by time

    Temperature by time

    Constant increase of active meters

  • Usage by Day of Week & Versus Temperature

    Day of Month

    Usage & Temp

    Hourly Usage Trends

    Day of Week Trend

    Temp = Red Usage = Blue

  • Auto-Regressive Integrated Moving Average

    AR(p) = number of autoregressive terms

    I(d) = number of differencing terms

    MA(q) = number of moving average terms

    P, D, Q = the seasonal counterparts of p, d, q

    m = number of periods per season

    ARIMA(p,d,q)x(P,D,Q)[m]

    Stationary Mean & Variance

    Avg. Temperature Time Series
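
    The differencing terms are what bring a series toward a stationary mean. The sketch below applies one seasonal difference (lag m) and then one ordinary difference (lag 1) to a synthetic series. The series (a linear trend plus a repeating period-4 pattern) and the helper name `difference` are invented for illustration and are not the temperature data from the slide.

```python
def difference(series, lag=1):
    """Return series[t] - series[t - lag]; lag=1 is ordinary, lag=m is seasonal."""
    return [series[t] - series[t - lag] for t in range(lag, len(series))]


m = 4                               # periods per season in this toy example
seasonal_pattern = [0, 5, 0, -5]    # repeats every m steps
series = [t + seasonal_pattern[t % m] for t in range(12)]  # trend + seasonality

seasonal_diff = difference(series, lag=m)      # one seasonal difference (D = 1)
fully_diff = difference(seasonal_diff, lag=1)  # then one ordinary difference (d = 1)

print(seasonal_diff)  # [4, 4, 4, 4, 4, 4, 4, 4]
print(fully_diff)     # [0, 0, 0, 0, 0, 0, 0]
```

    The seasonal difference removes the repeating pattern and leaves only the constant per-season trend step; the ordinary difference then removes that as well. On real data the result is noisy rather than exactly constant, and the AR and MA terms model what remains.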

  •  Information Management

     Big Data Storage

     Apache Hadoop

     Real-time intelligence

     Machine learning

     IoT

     Dashboards and Visualizations

     and more!

    Ideate, chart your “quick wins,” ask questions and get answers to your real Big Data challenges.

    It’s insightful, it’s easy and can be done from the comfort of your conference room.

    www.neudesic.com/meetneat

    NEXT STEP

    BECOME THE BI SUPERHERO

    http://www.neudesic.com/meetneat

  • Orion Gebremedhin

    Twitter: @oriongm

    Marc Lobree

    BIG DATA &

    Advanced Analytics Roadshow

    Questions?