NEW OPPORTUNITIES @THE CROSSROADS OF M2M & BIG DATA
Prakash Hiremath, Principal Architect, EMC
2014 EMC Proven Professional Knowledge Sharing 2
Table of Contents
Overview
Introduction to M2M
M2M Key activities
M2M Challenges
Introduction to Big Data
Big Data Technology
Combining M2M & Big Data
M2M & Big Data Solution Architecture
M2M & Big Data Use Cases
  Healthcare
  Retail
  Manufacturing and Supply Chain
  Banking & Finance
  Insurance
  Energy & Utilities
Disclaimer: The views, processes, or methodologies published in this article are those of the
author. They do not necessarily reflect EMC Corporation’s views, processes, or methodologies.
Overview
M2M, short for machine-to-machine, refers to technology that lets machines communicate with one
another. A machine can be any device that connects to a network and securely exchanges
information with other devices over wired or wireless networks.
M2M has evolved over the years, and everyday use cases already exist: sending an SMS code from a
cell phone to record a television program while away from home, or using a smart phone to switch
on the air conditioner before you reach the house. Devices that contain a chip and can
communicate with each other are known as smart devices; examples include smart phones, smart
TVs, and smart refrigerators. Imagine every device on the network communicating with the others
and acting on what it learns, so that tasks done manually today are handled automatically.
Considered by many experts to be the next technology disruption, M2M can be applied in the
healthcare, manufacturing, logistics, automotive, and energy sectors.
The rapid evolution of M2M brings a few challenges, especially around communication
protocols between devices. There is a need for a common open source protocol that all
devices can use to communicate with each other. Work on this is already underway and
should accelerate M2M adoption.
Then there is Big Data, which has already created a buzz across many industries. What is Big
Data? It is characterized by three Vs: volume, velocity, and variety of data. Traditional data
warehouse software can manage data, mostly structured, up to terabytes in size. However, data
is growing very fast into petabytes and, in some cases, exabytes. Earlier, only companies like
Google, Facebook, and Twitter held data at that scale and needed new technology to process it
faster. That is no longer true. Most companies have started tracking the data that their
machines generate. Along with structured data coming primarily from ERP and legacy systems,
companies are now collecting unstructured, machine-generated data that is not related to the
core business but mostly to product or user behavior. Third-party influences such as weather
and government, along with other things outside an organization’s control, also affect the
business.
When a business needs analytics that explain what went wrong in the past or predict what is
likely to come, the answer is Big Data technology. By combining structured, semi-structured,
and unstructured data, typically in volumes beyond the capacity of traditional data
warehousing software, Big Data technology enables data to be processed much faster to create
valuable insights for business.
M2M and Big Data have ushered in a technology disruption for industries, similar to cloud
computing, mobility, and social. When two disruptive technologies meet at the crossroads, a
plethora of new opportunities opens up for enterprises. M2M helps enterprises communicate with
devices and gather large amounts of data, both internal and external. Internal data is the
enterprise’s own data; external data originates outside the enterprise or is beyond its
control. Using Big Data, enterprises can analyze the collected data sets and gain deep insight
that can create substantial business opportunities.
This Knowledge Sharing article explores the new opportunities created by merging M2M and
Big Data technology. Providing an introduction to M2M and Big Data technologies, this article
will cover use cases outlining the opportunities created at the crossroads of M2M and Big Data.
Introduction to M2M
Machine-to-machine (M2M) communication is a technology focused on how multiple machines
communicate with each other over wired or wireless connections. These machines are pervasive
and can act as smart devices. Examples include smart phones, smart meters, and smart home
appliances (e.g. refrigerators, TVs, and air cooling units).
Figure 1 depicts similar devices talking to a central hub where data is collected and value is
created. The example shows smart energy meters in homes sending consumption data to a central
hub, where it is stored for analytics. The stored data can reveal usage patterns that help
balance energy load so that supply can meet demand.
Figure 1: M2M Example
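The hub-side analysis described for the smart meter example can be sketched in a few lines of Python. This is an illustration only: the meter IDs, readings, and the function names `load_by_hour` and `peak_hour` are assumptions made for this sketch and do not come from any real metering system.

```python
from collections import defaultdict

# Hypothetical readings: (meter_id, hour_of_day, kWh consumed in that hour).
readings = [
    ("meter-1", 8, 1.2), ("meter-1", 19, 2.5),
    ("meter-2", 8, 0.8), ("meter-2", 19, 3.1),
    ("meter-3", 13, 0.5), ("meter-3", 19, 2.0),
]

def load_by_hour(rows):
    """Sum consumption across all meters for each hour of the day."""
    totals = defaultdict(float)
    for _meter_id, hour, kwh in rows:
        totals[hour] += kwh
    return dict(totals)

def peak_hour(rows):
    """Return the hour with the highest total consumption."""
    totals = load_by_hour(rows)
    return max(totals, key=totals.get)

hourly = load_by_hour(readings)
print(peak_hour(readings))
```

Knowing which hour carries the highest combined load is the kind of pattern a utility could use when planning supply.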
Figure 2 provides an example of M2M in which devices talk to each other to make smart
decisions. Imagine you have a smart car, a smart phone, and smart electrical appliances that
can communicate with each other or be controlled remotely. You set an alarm on your smart
phone; the alarm is communicated to a smart TV, which powers on and displays the weather and
news relevant to your planned tasks. Your in-car system receives the details of your meeting,
knows where you are travelling, finds the best route, and syncs its entertainment system with
the downloads on your mobile. Imagine your phone signalling your home appliances that you will
arrive in the next 20 minutes, so that the air conditioner and geyser switch on automatically.
You can connect any device to a network as long as it has a chip in it. Gartner describes this
as the Internet of Things (IoT), a broader view of M2M.
Figure 2: M2M Example
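One common way to let devices react to each other's messages is a publish/subscribe pattern. The sketch below is a minimal in-process version of the "arriving home in 20 minutes" scenario. The `Hub` class, the topic name `owner/eta`, and the 20-minute threshold are assumptions for illustration; a real deployment would use a network messaging protocol rather than direct function calls.

```python
from collections import defaultdict

class Hub:
    """A tiny in-process message hub standing in for the home network."""

    def __init__(self):
        self._subscribers = defaultdict(list)

    def subscribe(self, topic, handler):
        self._subscribers[topic].append(handler)

    def publish(self, topic, message):
        # Deliver the message to every handler registered for the topic
        # and collect what each handler reports back.
        return [handler(message) for handler in self._subscribers[topic]]

def air_conditioner(message):
    # Switch on only when the owner is close to home (illustrative threshold).
    return "ac:on" if message["eta_minutes"] <= 20 else "ac:off"

def geyser(message):
    return "geyser:on" if message["eta_minutes"] <= 20 else "geyser:off"

hub = Hub()
hub.subscribe("owner/eta", air_conditioner)
hub.subscribe("owner/eta", geyser)

# The phone announces the estimated arrival time; appliances decide for themselves.
actions = hub.publish("owner/eta", {"eta_minutes": 20})
print(actions)
```

The phone does not need to know which appliances exist. It publishes one message, and each subscribed device applies its own rule.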
M2M Key activities
There are four core activities involved in M2M.
1. Data collection at source – The source can be any device that captures machine data.
2. Data Transfer – In this step, data is encrypted at the source and sent securely over wired or
wireless networks, be it WAN, LAN, WiFi, or the public internet. At the destination, likely a
centralized data center, the data is received, decrypted, and stored in a datastore.
3. Data Assessment – The decrypted data is analyzed by experts to draw conclusions.
This data is monitored continuously.
4. Take Action/Actionable Intelligence – Appropriate action is taken based on the
conclusions. Devices can even be controlled from centralized locations, which can issue
healing commands to them at the source.
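The four activities can be sketched as a small Python pipeline. This is a simplified illustration under stated assumptions: the device name, the 80 °C limit, the `throttle` command, and the shared key are all hypothetical. The Python standard library has no symmetric cipher, so the sketch signs each payload with HMAC to show the secure-transfer step; HMAC provides integrity only, and confidentiality is assumed to come from an encrypted transport such as TLS.

```python
import hashlib
import hmac
import json

SHARED_KEY = b"demo-key"  # hypothetical; real keys would come from a key store

def collect(device_id, temperature_c):
    """Step 1: capture a reading at the source device."""
    return {"device": device_id, "temperature_c": temperature_c}

def transfer(reading):
    """Step 2 (source): serialize and sign the reading before it leaves the device."""
    body = json.dumps(reading, sort_keys=True).encode()
    tag = hmac.new(SHARED_KEY, body, hashlib.sha256).hexdigest()
    return body, tag

def receive(body, tag):
    """Step 2 (destination): verify the signature, then decode the payload."""
    expected = hmac.new(SHARED_KEY, body, hashlib.sha256).hexdigest()
    if not hmac.compare_digest(expected, tag):
        raise ValueError("payload failed integrity check")
    return json.loads(body)

def assess(reading, limit_c=80.0):
    """Step 3: decide whether the reading is outside its normal range."""
    return reading["temperature_c"] > limit_c

def act(reading, overheated):
    """Step 4: choose a command to send back to the device."""
    command = "throttle" if overheated else "none"
    return {"device": reading["device"], "command": command}

body, tag = transfer(collect("pump-7", 91.5))
reading = receive(body, tag)
command = act(reading, assess(reading))
print(command)
```

In this sketch the assessment is a fixed threshold. In practice that step is where analysts or analytics software would draw conclusions from continuously monitored data.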
M2M Challenges
M2M technology is still evolving and will take some time to mature. Challenges to
overcome include:
Protocols – There are too many protocols and vendors in this space. Standardization is
key to communication among a variety of devices and to wide implementation of M2M solutions.
Security – Data must be encrypted before it is transmitted to a centralized hub and
decrypted only at the destination.
Data Volumes – Data collected from various devices needs to be stored and processed.
Since data size can reach petabytes and beyond, robust data processing software and
solutions are needed.
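A back-of-the-envelope calculation shows how quickly device data accumulates. The figures below (one million meters, 200 bytes per reading, one reading every 15 minutes) are illustrative assumptions, not measurements from any deployment.

```python
def yearly_volume_bytes(devices, readings_per_day, bytes_per_reading, days=365):
    """Raw bytes produced per year, before compression or replication."""
    return devices * readings_per_day * bytes_per_reading * days

# Illustrative inputs only: one million meters, one reading every 15 minutes
# (96 per day), 200 bytes per reading.
total = yearly_volume_bytes(1_000_000, 96, 200)
print(total / 1e12, "TB per year")

# The same fleet sampled once per second (86,400 readings per day).
per_second = yearly_volume_bytes(1_000_000, 86_400, 200)
print(per_second / 1e15, "PB per year")
```

Under these assumptions, the 15-minute interval yields about 7 TB a year, while per-second sampling pushes the same fleet to roughly 6.3 PB a year. Sampling frequency, not just device count, drives the storage problem.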
Introduction to Big Data
Big Data is characterized by three Vs: volume, velocity, and variety of data. Traditional data
warehouse software could manage data, mostly structured, up to terabytes in size.
Data is growing very fast into petabytes and, in some cases, exabytes. Earlier it
was thought that only companies like Google, Facebook, and Twitter, which held data at that
scale, needed new technology to process data faster. This is no longer true.
Most companies have started tracking the data that their machines generate. While structured
data continues to come primarily from an organization’s ERP and legacy systems, companies
are now collecting unstructured, machine-generated data that is not related to their core
business but mostly to product or user behavior. Third-party influences such as weather and
government, along with other things beyond their control, also affect the business. When a
business needs analytics that explain what went wrong in the past or predict what is coming,
the answer is Big Data.
Big Data combines structured, semi-structured, and unstructured data which, taken together, is
typically beyond the capacity of traditional data warehousing software and processes. Big Data
analytics primarily uses Hadoop, which consists of MapReduce for processing and HDFS for
storage, to process these large data sets. The results of a Big Data analytics exercise can
create new opportunities for businesses by giving them valuable insights. For example, retail
companies can:
gather data about customers, e.g. when they visit, their buying patterns, and their spending
patterns, and study this behavior over time
combine that data with their marketing strategies and available inventory to sell more
products
leverage upsell and cross-sell opportunities
derive insights based on customer demographics
use the data to find the days on which fewer customers visit, and model marketing
strategies to generate more sales on those days of the week
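The last item in the list, finding the day with the fewest customer visits, can be sketched as follows. The visit dates are made up for illustration, and `visits_by_weekday` and `quietest_day` are hypothetical helper names.

```python
from collections import Counter
from datetime import date

DAYS = ["Monday", "Tuesday", "Wednesday", "Thursday", "Friday", "Saturday", "Sunday"]

# Hypothetical visit log: one entry per customer visit.
visits = [
    date(2014, 1, 6), date(2014, 1, 6), date(2014, 1, 7),
    date(2014, 1, 10), date(2014, 1, 10), date(2014, 1, 10),
    date(2014, 1, 11), date(2014, 1, 11),
]

def visits_by_weekday(dates):
    """Count visits per weekday name (date.weekday() returns 0 for Monday)."""
    return Counter(DAYS[d.weekday()] for d in dates)

def quietest_day(dates):
    """Weekday with the fewest recorded visits; days with no visits are not listed."""
    counts = visits_by_weekday(dates)
    return min(counts, key=counts.get)

print(quietest_day(visits))
```

At retail scale the same grouping would run over millions of records on a distributed platform, but the logic of the question stays this simple.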
Big Data analytics can provide insights in many other domains, such as healthcare,
manufacturing, HR, sales, energy, telecom, and airlines.
Big Data is creating new lines of business for many companies. EMC created a new company,
Pivotal Labs, to focus primarily on big, fast data. Pivotal can process large amounts of data in
real time and provide insights to customers. For instance, an aircraft can generate
terabytes of sensor data over the course of a flight; processing that data and delivering
analytics on it calls for Big Data analytics. This is only one of many business use cases
for Big Data.
Big Data Technology
Big Data analytics involves processing large sets of various kinds of internal and external
organisation data. Key technologies for Big Data analytics are:
Hadoop: The Apache Hadoop software library is a framework that enables distributed
processing of large data sets across clusters of computers using a simple programming model.
It is designed to scale from single servers to thousands of machines, each offering local
computation and storage. Rather than rely on hardware to deliver high-availability, the library
itself is designed to detect and handle failures at the application layer, delivering a highly-
available service on top of a cluster of computers, each of which may be prone to failures.
Hadoop is designed to process terabytes and even petabytes of unstructured and structured
data. It splits large data sets into smaller blocks that are distributed across a cluster of
commodity hardware and processed in parallel. Hadoop is a Java-based framework consisting of
two elements: the Hadoop Distributed File System (HDFS) for storage and MapReduce, a
parallel, distributed data processing framework.
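The MapReduce programming model can be illustrated without a cluster. The sketch below is plain Python, not the Hadoop API. It counts error events per device across two input splits, and the record format and device names are assumptions made for the example.

```python
from collections import defaultdict

def map_phase(record):
    """Emit (key, value) pairs for one input record of the form 'device,status'."""
    device_id, status = record.split(",")
    if status == "ERROR":
        yield device_id, 1

def shuffle(pairs):
    """Group values by key, as the framework does between map and reduce."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups

def reduce_phase(key, values):
    """Combine all values for one key into a single result."""
    return key, sum(values)

# Two hypothetical input splits, as if stored in separate blocks.
splits = [
    ["m1,OK", "m2,ERROR", "m1,ERROR"],
    ["m2,ERROR", "m3,OK", "m1,ERROR"],
]

# On a cluster, each split would be mapped on a different node in parallel.
mapped = [pair for split in splits for record in split for pair in map_phase(record)]
result = dict(reduce_phase(k, v) for k, v in shuffle(mapped).items())
print(result)
```

Because each record is mapped independently and each key is reduced independently, the work can be spread across many machines. That independence is what allows the model to scale.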
NoSQL: NoSQL stands for Not Only SQL. NoSQL databases typically do not rely on SQL
(Structured Query Language) and fixed relational tables to insert, delete, or update data.
Many NoSQL deployments handle data that a relational database handles poorly, such as sparse
data, text, and other forms of unstructured content. Unstructured content includes social
media and network data, audio, video, Internet text, documents, etc. NoSQL databases are
categorized according to the way they store data, falling under categories such as key-value
stores, document store databases, and graph databases.
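The three categories named above differ mainly in the shape of the data they hold. The following sketch models each with plain Python structures. It mimics no particular product, and all keys, documents, and node names are invented for illustration.

```python
# Key-value store: an opaque value looked up by a single key.
key_value = {"sensor:42:last": "21.5"}

# Document store: each record is a self-describing, possibly nested document,
# and two documents need not share the same fields.
documents = [
    {"_id": 1, "device": "sensor-42", "reading": {"temp_c": 21.5}},
    {"_id": 2, "device": "camera-7", "tags": ["lobby", "night"]},
]

# Graph store: nodes plus explicit relationships between them.
edges = {
    "sensor-42": ["gateway-1"],
    "camera-7": ["gateway-1"],
    "gateway-1": ["datacenter"],
}

def find_documents(docs, field, value):
    """Return documents that have the top-level field set to the given value."""
    return [d for d in docs if field in d and d[field] == value]

def reachable(graph, start):
    """All nodes reachable from start by following edges (depth-first)."""
    seen, stack = set(), [start]
    while stack:
        node = stack.pop()
        for nxt in graph.get(node, []):
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return seen

print(find_documents(documents, "device", "camera-7"))
print(reachable(edges, "sensor-42"))
```

The second document has no `reading` field and the first has no `tags`. A relational table would need nullable columns or extra tables to hold both.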
In-memory & Columnar Databases: Computers typically store data on the hard disk and,
when a task runs, load the relevant data and applications into the computer's main
memory, which is where computation happens. Data must be read from disk, transferred to
memory, and then released so that the next batch of data can be loaded. As data volumes
increase, the time needed just for access increases, and still more time is required for the
actual analysis. In-memory computing instead keeps the data in the computer's random
access memory (RAM). As a result, the data is already available and can be accessed
near-instantaneously when it needs to be analysed. Speed is
the most evident benefit of in-memory processing. SAP HANA and Oracle Exadata are
examples of in-memory databases.
As its name implies, a columnar database stores data by column, unlike traditional databases,
which store data by row. Storing data in columns provides benefits such as high data
compression and very fast aggregation. For analytical queries that read a few columns across
many rows, columnar stores typically outperform row-oriented databases. Examples of columnar
databases are HP Vertica and SAP Sybase IQ.
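The difference between the two layouts, and why columns compress well, can be shown with a toy table. The meter data is hypothetical, and the run-length encoder is a simple illustration of one compression technique, not the scheme any particular product uses.

```python
# Row-oriented layout: the fields of each record are stored together.
rows = [
    {"meter": "m1", "region": "north", "kwh": 12.0},
    {"meter": "m2", "region": "south", "kwh": 7.5},
    {"meter": "m3", "region": "north", "kwh": 9.0},
]

# Column-oriented layout: all values of each attribute are stored together.
columns = {
    "meter": ["m1", "m2", "m3"],
    "region": ["north", "south", "north"],
    "kwh": [12.0, 7.5, 9.0],
}

def total_kwh_rows(table):
    # Visits every record to pick out one attribute.
    return sum(record["kwh"] for record in table)

def total_kwh_columns(table):
    # Reads only the single column the aggregate needs.
    return sum(table["kwh"])

def run_length_encode(values):
    """Collapse consecutive repeats into [value, count] pairs."""
    encoded = []
    for v in values:
        if encoded and encoded[-1][0] == v:
            encoded[-1][1] += 1
        else:
            encoded.append([v, 1])
    return encoded

print(total_kwh_rows(rows), total_kwh_columns(columns))
print(run_length_encode(sorted(columns["region"])))
```

Both layouts give the same total, but the columnar version touches one list instead of every record. Values within a column tend to repeat, which is why a sorted column collapses into a handful of pairs.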
Combining M2M & Big Data
M2M creates the opportunity for Big Data, and M2M has little value without it: M2M collects
data from various sources and in various formats, and that data must be stored and processed
before it yields value. Traditional data warehouse software cannot handle data of this kind.
This is where Hadoop and massively parallel processing databases like Greenplum® are needed.
In addition, collected data stored in the cloud can be processed with the software stack from
Pivotal Labs to create Big Data insight in real time.
Evolution in Big Data technology has created more opportunities for M2M implementations.
Many companies are enabling their products to communicate back to centralized data centers
and use a Big Data platform to process the data. This helps them take measures to improve
customer retention. It also provides a competitive edge, because they can understand their
customers’ usage patterns and pre-emptively propose solutions to customer challenges.
M2M and Big Data implementations can be applied in various industry sectors. A later
section describes the industry-specific use cases.
M2M & Big Data Solution Architecture
As shown above, data is collected at the source by field devices, which communicate with a
Gateway server. The Gateway server is primarily responsible for collecting and encrypting
data from the devices and transferring it over the internet to the destination. Data travels
over the public internet, passes through the organization’s firewall, and lands in HDFS; the
processed output is then loaded into a reporting platform such as Greenplum. Analytics tools
are then used to create a dashboard for actionable intelligence.
Instead of being stored in the organization’s data center, the data can be pushed to the cloud
over the public internet, where a Big Data technology stack offered by the cloud provider
generates actionable intelligence. Most cloud providers offer Hadoop and Big Data platforms.
M2M & Big Data Use Cases
Use cases for various industry domains include:
Healthcare
Wearable devices that measure parameters of the human body such as pulse rate,
blood pressure, temperature, and heartbeat are becoming common in the healthcare industry.
These devices can communicate with a central data hub and transfer their readings. Using
analytics on that data, medical practitioners can monitor health parameters and advise
patients on what to do and what to avoid. Analysis can also reveal patterns among people from
different religious backgrounds and geographies, which helps people take preventive measures,
for instance by being better prepared for weather changes.
Retail
Retail chains can capture customer data at the point of sale to better understand buying
patterns. The data collected can also inform product demand forecasts. It can reveal the
average shelf life of products and tie marketing strategy to clearing long-pending stock. In
fact, close analysis of customer data can suggest a life event of a customer, allowing the
marketing strategy to be personalized for each customer, which gives a competitive edge and
helps build customer loyalty.
Manufacturing and Supply Chain
Traditionally, sensors are used in most manufacturing plants for automation. Data collected
from sensors provides insight into product health and can even trace products within
manufacturing plants. Sensors and radio frequency identification (RFID) tags can also help
trace product defects. Collecting and analyzing the data from all sensors can help plant
operations improve quality.
Another example is automatic replenishment of inventory. When inventory drops below a
threshold level, systems can communicate with supplier systems to send material to the
manufacturing plant.
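The replenishment rule described above reduces to a threshold check. In this sketch the item names, reorder points, and target levels are hypothetical, and `replenishment_orders` only computes what would be requested. Sending the order to a supplier system is outside its scope.

```python
def replenishment_orders(stock, reorder_point, target_level):
    """Return {item: quantity} for items whose stock fell below the reorder point."""
    orders = {}
    for item, on_hand in stock.items():
        if on_hand < reorder_point[item]:
            # Order enough to bring the item back up to its target level.
            orders[item] = target_level[item] - on_hand
    return orders

# Illustrative inventory snapshot reported by plant systems.
stock = {"bolts": 120, "gaskets": 40}
reorder_point = {"bolts": 100, "gaskets": 50}
target_level = {"bolts": 500, "gaskets": 200}

orders = replenishment_orders(stock, reorder_point, target_level)
print(orders)
```

Only gaskets have dropped below their threshold here, so only gaskets are ordered.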
Data collected from sales, manufacturing, and supply chain can help form organization strategy
to market and sell products, which aids in maximizing profits.
Banking & Finance
Automated teller machines (ATMs) can be designed to communicate with a central data center
and report customer activities such as withdrawals, account statement requests, and
so on. They can also report the cash balance available in the machine and request a reload of
cash. The data collected over time can provide valuable insight into ATM effectiveness in a
region. ATMs will also be able to talk to smart phones, sending balance amounts and
confirmation of withdrawals.
Insurance
In the insurance domain, M2M and Big Data can be used to determine the risk associated with
the person for whom insurance is proposed. Imagine collecting a driver’s records from
government agencies, then data from the automobile itself while the person is driving,
and finally data on the regions and highways where the driver travels most. Big Data
analytics enables all the collected data to be combined into a risk factor so that insurance
companies can propose a suitable policy for each customer. This is just one of many examples.
M2M and Big Data can be implemented effectively by any company whose products can send data
back for analysis, e.g. home appliances and automobiles.
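One simple way to combine the three data sources into a single risk factor is a weighted score. The sketch below is purely illustrative: the inputs, the scaling constants, and the weights are assumptions chosen for readability, not an actuarial model.

```python
def risk_score(violations, hard_brakes_per_100km, high_risk_road_share,
               weights=(0.5, 0.3, 0.2)):
    """Weighted score in [0, 1]; each input is first scaled to [0, 1]."""
    # Government records: cap at 5 violations (assumed scale).
    record = min(violations / 5, 1.0)
    # Vehicle telematics: cap at 10 hard-braking events per 100 km (assumed scale).
    driving = min(hard_brakes_per_100km / 10, 1.0)
    # Share of distance driven on roads flagged as high risk, clamped to [0, 1].
    roads = min(max(high_risk_road_share, 0.0), 1.0)
    w_record, w_driving, w_roads = weights
    return w_record * record + w_driving * driving + w_roads * roads

score = risk_score(violations=1, hard_brakes_per_100km=2.0, high_risk_road_share=0.25)
print(round(score, 2))
```

An insurer would fit such weights to claims history rather than choose them by hand. The point of the sketch is only that data from separate sources ends up in one comparable number.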
Energy & Utilities
The energy and utilities domain is another area where a lot of M2M activity is happening. For
example, smart energy and water meters can communicate with a central hub and transmit
data directly, which reduces the opportunity for tampering and offers insight into citizens’
usage patterns. Additionally, there is the smart building concept, in which power to a
room is switched on based on room bookings, resulting in energy savings and contributing to
green initiatives. These and other M2M solutions can be extended further to leverage Big Data
technology and create actionable intelligence.
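The smart building rule can be sketched as a check of the current time against the room's bookings. The 10-minute warm-up window and the function name `power_should_be_on` are assumptions added for illustration.

```python
from datetime import datetime, timedelta

def power_should_be_on(bookings, now, warmup=timedelta(minutes=10)):
    """True if `now` falls inside a booking, or within `warmup` before one starts."""
    return any(start - warmup <= now < end for start, end in bookings)

# Hypothetical booking: 09:00 to 10:00 on 3 March 2014.
bookings = [(datetime(2014, 3, 3, 9, 0), datetime(2014, 3, 3, 10, 0))]

print(power_should_be_on(bookings, datetime(2014, 3, 3, 8, 55)))
print(power_should_be_on(bookings, datetime(2014, 3, 3, 10, 0)))
```

Power stays off outside the booked window plus warm-up, which is where the energy savings come from.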
Finally, the smart city is where things are ultimately headed. There, machines talk with each
other, and the data generated by this communication provides insights that can be used to
make smarter decisions automatically.
All figures are original and were created for this article.
EMC believes the information in this publication is accurate as of its publication date. The
information is subject to change without notice.
THE INFORMATION IN THIS PUBLICATION IS PROVIDED “AS IS.” EMC CORPORATION
MAKES NO REPRESENTATIONS OR WARRANTIES OF ANY KIND WITH RESPECT TO
THE INFORMATION IN THIS PUBLICATION, AND SPECIFICALLY DISCLAIMS IMPLIED
WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.
Use, copying, and distribution of any EMC software described in this publication requires an
applicable software license.