Introducing Hortonworks DataFlow Use Cases
© Hortonworks Inc. 2011 – 2015. All Rights Reserved
Growth of Data in Motion
Much of the new data exists in flight, between systems and devices, as part of the Internet of Anything
[Figure: NEW versus TRADITIONAL data growth, compared with the ability to consume data]
The Opportunity: Unlock transformational business value from a full fidelity of data and analytics for all data.
Traditional Data Sources: ERP, CRM, SCM
Internet of Anything: Sensors and machines, Geolocation, Server logs, Clickstream, Social media, Files & emails
Simplistic View of Enterprise Data Flow
[Diagram labels: The Data Flow Thing; Acquire Data; Process and Analyze Data; Store Data]
Hortonworks DataFlow: Collect, Conduct & Curate
• COLLECT any type of data via a highly secure, lightweight agent
• CONDUCT the data reliably with point-to-point, bi-directional information flows
• CURATE the data, transforming it while preserving metadata on where it came from and how it changed
Powered by Apache NiFi
Apache NiFi, Onyara and Hortonworks
• “Niagara Files” software developed at NSA over 8 years became Apache NiFi in 2014
• Onyara was founded in 2015 by the chief architects of Apache NiFi
• Hortonworks acquired Onyara in August 2015 to further extend its Big Data management product line
• Hortonworks DataFlow subscription available now
Meeting IoAT Edge Requirements
GATHER, DELIVER, PRIORITIZE
Track from the edge through to the datacenter
• Small footprints operate with very little power
• Limited bandwidth can create high latency
• Data availability exceeds transmission bandwidth
• Data must be secured throughout its journey
Meeting Data Provenance Requirements
[Diagram: LINEAGE, from BEGIN to END]
Tweet: #hadooproadshow
IT and Cloud Operators
• Understand traceability, lineage
• Enable recovery and replay
Compliance Regulations
• Provide an audit trail
• Remediation capabilities
Business
• Value sources & IT investment
Meeting Security Requirements
Thorough Encryption
• Enterprise-grade authorization services, with the ability to frequently change entitlements
• Different access levels for people and systems with different roles
Data Classification
• Tracing data with meta tags
• Understanding the who/what/when/where
Hortonworks Offers Both HDF and HDP
Hortonworks DataFlow and the Hortonworks Data Platform deliver the industry’s most complete Big Data solution
Internet of Anything
• Hortonworks DataFlow (HDF) powered by Apache NiFi: Perishable Insights
• Hortonworks Data Platform (HDP) powered by Apache Hadoop: Historical Insights
• Enrich Context
• Store Data and Metadata
HDF Complements Hortonworks Data Platform
HDF conducts data into HDP
HDF secures and encrypts data before it arrives in HDP
HDF offers traceability on the data’s flow before it reaches HDP
HDF models flows graphically to dynamically adjust data coming to HDP
Hortonworks DataFlow Use Cases: Administer Flows, Enhance Security and Manage Equipment
DATA FLOW MANAGEMENT
• Data Ingestion
• Data as a Service Provenance
• Data Regulatory Compliance
DATA FLOW MANAGEMENT
DATA INGESTION
Data Ingestion, with bi-directional intelligence and provenance metadata
PROBLEM
• Most ingest tools are unidirectional: data streams in the same way no matter what
• They don’t preserve detail on in-flow data transformations
SOLUTION
• HDF manages bi-directional, point-to-point data flows that are easily configured
• Data reaches its destination with its provenance data intact
IMPACT
• Users can update data flow logic to always receive the data they need
• Provenance data improves confidence in your insights
“The NiFi user interface and ease of extension have made it extremely easy to get up and running and even customize.” Craig Connell, CTO, Leverege
DATA FLOW MANAGEMENT
DATA AS A SERVICE PROVENANCE
Providers of data as a service assign value to data using NiFi’s provenance metadata
PROBLEM
• A new genre of companies provides data as a service
• They have limited ability to prioritize which data is most valuable
SOLUTION
• NiFi’s data provenance capabilities help DaaS companies understand, in much more detail, how their data is consumed
IMPACT
• They can understand which information resources are valuable and which are not
• This helps them invest in capturing the most valuable data sources
DATA FLOW MANAGEMENT
DATA REGULATORY COMPLIANCE
Firms Comply with Financial Regulations by Showing Complete Chain of Custody
PROBLEM
• Financial firms such as retail banks, capital markets firms and insurance companies are required to show chain of custody for certain transactions
SOLUTION
• Apache NiFi’s data provenance capabilities show a complete chain of custody, for compliance with rules such as Basel capital requirements
IMPACT
• Firms can go back to a point in time and show regulators exactly what happened to a key piece of data in a transaction
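A chain of custody can be pictured as a walk back through linked provenance records. The sketch below is a minimal illustration in plain Python, not NiFi’s provenance API: the `ProvenanceEvent` fields, the action names and the `chain_of_custody` helper are all hypothetical, and the records are assumed to form a simple parent chain with no cycles.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ProvenanceEvent:
    # Hypothetical record layout, chosen only for this illustration.
    event_id: int
    parent_id: Optional[int]  # None marks the first event in the chain
    timestamp: str            # ISO 8601 text
    action: str
    actor: str


def chain_of_custody(events, event_id):
    """Walk parent links from event_id back to the origin, oldest first."""
    by_id = {e.event_id: e for e in events}
    chain = []
    current = by_id.get(event_id)
    while current is not None:
        chain.append(current)
        if current.parent_id is None:
            break
        current = by_id.get(current.parent_id)
    chain.reverse()
    return chain


events = [
    ProvenanceEvent(1, None, "2015-09-01T09:00:00", "RECEIVE", "branch-gateway"),
    ProvenanceEvent(2, 1, "2015-09-01T09:00:02", "ENRICH", "risk-service"),
    ProvenanceEvent(3, 2, "2015-09-01T09:00:05", "SEND", "ledger"),
]
for event in chain_of_custody(events, 3):
    print(event.timestamp, event.action, event.actor)
```

Starting from the final event, the walk lists every earlier step in order, which is the kind of point-in-time answer the slide describes.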
ENHANCE SECURITY
• Asset and People Security
• Secure Data Ingestion
• Fraud and Theft Protection
ENHANCE SECURITY
ASSET AND PEOPLE SECURITY
Prescient Edge Helps Its Customers Protect the Physical Safety of Their Personnel
PROBLEM
• Globally distributed firms and government agencies have personnel in risky areas
• Prescient Edge provides analytics to protect employees
SOLUTION
• The company uses Apache NiFi for real-time analytics on emergent threats, allowing its customers to respond quickly and safeguard their teams and assets
IMPACT
• For its Executive Training Program, each traveler receives training on operational risks and the mitigation measures to reduce hazards and provide safety assurance
“With [Apache NiFi], we're able to track the origin, transformation, and persistence of data throughout our analytic processes.” Mike Bishop, Chief Systems Architect, Prescient Edge
ENHANCE SECURITY
SECURE DATA INGESTION
A major US financial firm uses HDF to prioritize data ingest and speed time to protection
PROBLEM
• Digital security depends on the ability to detect threats quickly
• Protection algorithms evaluate metadata with equal priority, slowing time to protection
SOLUTION
• Apache NiFi helps evaluate and prioritize security logs upstream, before they reach the analytics engine
IMPACT
• By prioritizing which data to send to its analytics engine, the company sees faster time to protection for its cyber assets
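The idea of prioritizing logs upstream can be sketched with a priority queue. This is a minimal illustration in plain Python, not the firm’s setup or a NiFi configuration: the `SEVERITY_RANK` table and the `LogQueue` class are hypothetical names invented here.

```python
import heapq
import itertools

# Hypothetical severity ranking; a lower rank is forwarded first.
SEVERITY_RANK = {"critical": 0, "warning": 1, "info": 2}


class LogQueue:
    """Hands out log messages by severity, then by arrival order."""

    def __init__(self):
        self._heap = []
        self._counter = itertools.count()  # tie-breaker that keeps arrival order

    def put(self, severity, message):
        # Unknown severities sort after every known one.
        rank = SEVERITY_RANK.get(severity, len(SEVERITY_RANK))
        heapq.heappush(self._heap, (rank, next(self._counter), message))

    def get(self):
        _rank, _seq, message = heapq.heappop(self._heap)
        return message

    def __len__(self):
        return len(self._heap)


queue = LogQueue()
queue.put("info", "user login")
queue.put("critical", "malware signature matched")
queue.put("warning", "repeated failed logins")
queue.put("critical", "privilege escalation")
while len(queue):
    print(queue.get())
```

Critical entries leave the queue before routine ones, so the analytics engine sees the most urgent records first instead of treating every log equally.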
ENHANCE SECURITY
FRAUD AND THEFT PROTECTION
A huge US retailer uses Apache NiFi to reduce theft and shrinkage by hundreds of millions annually
PROBLEM
• Thieves shoplift merchandise in the morning and then return the stolen goods later the same day for credit to their card
SOLUTION
• Apache NiFi pulls inventory and transactional data into Hadoop more quickly, reducing the time to detect this fraudulent pattern
IMPACT
• The company expects to reduce shrinkage by hundreds of millions of dollars annually
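Once sales and returns sit side by side, one simple check for this pattern is to flag returns that have no matching sale. The sketch below is an assumption made for illustration, not the retailer’s actual detection logic: the record fields (`card`, `sku`) and the `flag_suspect_returns` helper are hypothetical.

```python
def flag_suspect_returns(sales, returns):
    """Return the returns with no recorded sale of the same SKU to the same card."""
    sold = {(sale["card"], sale["sku"]) for sale in sales}
    return [r for r in returns if (r["card"], r["sku"]) not in sold]


sales = [
    {"card": "A-100", "sku": "drill-18v"},
    {"card": "B-200", "sku": "headphones"},
]
returns = [
    {"card": "A-100", "sku": "drill-18v"},   # matches a sale
    {"card": "C-300", "sku": "headphones"},  # no sale to this card
]
for suspect in flag_suspect_returns(sales, returns):
    print(suspect["card"], suspect["sku"])
```

The faster both data sets arrive in one place, the sooner a check like this can run, which is the time-to-detection benefit the slide claims.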
MANAGE EQUIPMENT
• Equipment Repair
• Remote Security Protection
MANAGE EQUIPMENT
EQUIPMENT REPAIR
Global oil company uses Apache NiFi to prioritize which sensor data to send ashore from offshore rigs
PROBLEM
• Offshore oil rigs have physical constraints on their hardware footprints and associated bandwidth
• Far more sensor data is generated than can be transmitted to shore
SOLUTION
• Apache NiFi uses rules-based prioritization to determine which sensor data to transmit back for long-term storage and analysis
IMPACT
• Ability to distinguish important readings from standard readings helps the company isolate important signals and take action to improve efficiency and safety
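Rules-based prioritization under a bandwidth limit can be sketched as ranking readings and sending the highest-ranked ones until a byte budget runs out. This is a minimal illustration in plain Python, not the oil company’s rules or a NiFi configuration: the `Reading` type, the thresholds in `priority` and the `select_for_uplink` helper are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Reading:
    sensor: str
    value: float
    size_bytes: int


def priority(reading):
    """Hypothetical rules: a lower number means send sooner."""
    if reading.sensor == "pressure" and reading.value > 900.0:
        return 0  # out-of-range pressure is treated as urgent
    if reading.sensor == "vibration" and reading.value > 5.0:
        return 1  # elevated vibration comes next
    return 2      # routine reading


def select_for_uplink(readings, budget_bytes):
    """Pick readings in priority order until the byte budget is spent."""
    chosen, used = [], 0
    for reading in sorted(readings, key=priority):  # stable sort keeps arrival order on ties
        if used + reading.size_bytes <= budget_bytes:
            chosen.append(reading)
            used += reading.size_bytes
    return chosen


readings = [
    Reading("temperature", 21.0, 400),
    Reading("pressure", 950.0, 400),
    Reading("vibration", 7.2, 400),
    Reading("pressure", 300.0, 400),
]
for reading in select_for_uplink(readings, budget_bytes=800):
    print(reading.sensor, reading.value)
```

With room for only two readings, the out-of-range pressure and vibration values go ashore and the routine ones stay behind.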
MANAGE EQUIPMENT
REMOTE SECURITY PROTECTION
Firm with a high security profile enriches on-site video data to detect intrusions
PROBLEM
• Digital security cameras present a “needle in a haystack” problem
• Individuals monitoring video feeds can be lulled by hundreds of hours where nothing happens
SOLUTION
• Hortonworks DataFlow can identify “trigger moments” when a human face appears in a video
• Then it can tag the video with a marker
IMPACT
• Clips with recognizable facial features can be shipped back to central clusters that compare security footage with facial features of known criminals
Apache NiFi User Quotes
“The NiFi user interface and ease of extension have made it extremely easy to get up and running and even customize. It is great that it also easily integrates with other parts of the Apache Big Data world like Spark, Kafka and Hadoop.” Craig Connell, Chief Technology Officer, Leverege
“NiFi's well designed, mature API has made our integration process remarkably straightforward. With it, we're able to track the origin, transformation, and persistence of data throughout our analytic processes.” Mike Bishop, Chief Systems Architect, Prescient Edge
“NiFi addresses dataflow challenges we have right now and provides upside for where we're heading. That it is designed for the global enterprise, is also a big win for us.” Alexandar Ryabov, Senior Director of Data Engineering, Wargaming.net