Neo4j GraphDay Seattle, Sept 19: Neo4j in the Enterprise

Neo4j in the Enterprise

#1 Database for Connected Data

Jeff Morris, Head of Product Marketing

[email protected]

9/19/17


Post on 23-Jan-2018



Neo4j Enterprise Edition: Native Graph Platform

Graph Overview

Neo4j 2.3 Release Summary (GA October 2015)

Intelligent Applications at Scale

• Higher concurrent performance at scale with fully off-heap cache

• Improved Cypher performance with smarter query planner

Developer Enablement: Productivity & Governance

• Schema enhancements: property existence constraints

• String-enhanced graph search

• Spring Data Neo4j 4.0

• Numerous productivity improvements

DevOps Enablement for On-Premise & Cloud

• Official Docker support

• PowerShell support

• Mac installer and launcher

• Easy 3rd party monitoring with Neo4j Metrics

• New & improved tooling


Neo4j 3.0: A New Architecture Foundation

• New language drivers

• New binary protocol (Bolt)

• Improved cost-based query optimizer

• New file, config and log structure for tomorrow’s deployments

• New storage engine with no limits

• Java Stored Procedures

[Diagram: Application, native language drivers, Bolt, Neo4j Enterprise Edition. Cypher Engine: parser, rule-based optimizer, cost-based optimizer, runtime]

Neo4j Enterprise: Causal Clustering Architecture

Modern and Fault-Tolerant to Guarantee Graph Safety

Raft-based architecture

• Continuously available

• Consensus commits

• Third-generation cluster architecture

Cluster-aware stack

• Seamless integration among drivers, Bolt protocol and cluster

• No need for external load balancer

• Stateful, cluster-aware sessions with encrypted connections

Streamlined development

• Relieves developers from complex infrastructure concerns

• Faster and easier to develop distributed graph applications

Neo4j Advantage – Scalability
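The "consensus commits" bullet refers to the general Raft rule that a write is committed once a majority of cluster members have acknowledged it. The sketch below shows only that arithmetic. It is not Neo4j's implementation, and the function names are hypothetical.

```python
def majority(cluster_size: int) -> int:
    """Smallest number of members that forms a majority."""
    return cluster_size // 2 + 1


def is_committed(acks: int, cluster_size: int) -> bool:
    """A write counts as committed once a majority has acknowledged it."""
    return acks >= majority(cluster_size)


def tolerated_failures(cluster_size: int) -> int:
    """Members that can be lost while a majority is still reachable."""
    return cluster_size - majority(cluster_size)


for size in (3, 5, 7):
    print(size, majority(size), tolerated_failures(size))
```

Under this rule a three-member cluster keeps accepting writes with one member down, and a five-member cluster with two down.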

Neo4j 3.1 Highlights

Security Foundation

Causal Clustering: State-of-the-Art Cluster Architecture

Database Kernel and Operations Advances

IBM Power8 CAPI Flash Support

Schema Viewer

Highlights of Neo4j 3.2 (May 2017 GA)

• Enterprise scale for global applications

• Continuous improvement in native performance

• Enterprise governance for the connected enterprise

[Diagram: server groups labeled sa, uk, us_east and hk]

Neo4j Performance Improvements by Version

[Chart: Complex Mixed-Workload Throughput, axis from 0 to 14000, for Neo4j 2.2, 2.3, 3.0, 3.1, 3.2 and Neo4j 3.3 (estimated)]

Global Iterative Graph Algorithms

PageRank Community Detection

2016 Presidential Debate #3 Twitter Graph

2016 Presidential Debate #3 Twitter Graph - Minus Bots

Further reading: https://medium.com/@swainjo/election-2016-debate-three-on-twitter-4fc5723a3872

Features in Community and Enterprise Editions

Both Editions: Graph Features | Database Features | Architecture Features

Labeled Property Graph Model | ACID Transactions | Language drivers for Java, Python, C# & JavaScript

Native Graph Processing & Storage | High-performance Native API | HTTPS plug-in

Graph Query Language “Cypher” | High-performance caching | REST API

Neo4j Browser w/ Syntax Highlighting | Cost-based query optimizer | RPM, Azure & AWS Cloud Delivery

Fast Writes via Native Label Index

Fast Reads via Composite Indexes

Enterprise Edition: Graph Features | Database Features | Architecture Features

Database storage reallocation | Query monitoring with enriched metrics | Enterprise Lock Manager accesses all available cores on server

Cypher query tracing | Compiled Cypher Runtime to accelerate common queries | Causal Clustering, core and read-replica design

Node Key schema constraints | User & role-based security | Multi-Data Center Support for global scale

Property existence constraints | LDAP & Active Directory Integration | Driver-based load balancing

Kerberos Security plug-in | Driver-based Causal Clustering API exposes routing logic

Bold is new in 3.2

Neo4j Supported Platforms

On-Premise Platforms Cloud Platforms and Containers

IBM POWER

For Development

… and others

Why Neo4j: Key Technology Benefits

ACID Transactions

• ACID transactions with causal consistency

• Security Foundation delivers enterprise-class security and control

Hardware Efficiency

• Native graph query processing and storage requires 10x less hardware

• Index-free adjacency requires 10x less CPU

Agility

• Native property graph model

• Modify schema as business changes without disrupting existing data

Developer Productivity

• Easy to learn, declarative graph query language

• Procedural language extensions

• Open library of procedures and functions

• Worldwide developer network

… all backed by Neo’s track record of leadership and product roadmap

Performance

• Index-free adjacency delivers millions of hops per second

• In-memory pointer chasing for fast query results
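A minimal sketch of what "index-free adjacency" and "pointer chasing" mean, assuming a toy in-memory model: each node stores direct references to its neighbours, so a hop follows a reference instead of consulting a separate edge index. The class, function names and sample data are hypothetical, and the sketch says nothing about how Neo4j lays out its store. It requires Python 3.9 or later.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    name: str
    # Direct references to neighbouring nodes: no lookup structure in between.
    out: list["Node"] = field(default_factory=list)


def reachable_by_pointers(start: Node, max_hops: int) -> set[str]:
    """Names reachable within max_hops by following stored references."""
    seen = {start.name}
    frontier = [start]
    for _ in range(max_hops):
        next_frontier = []
        for node in frontier:
            for neighbour in node.out:  # one reference followed per hop
                if neighbour.name not in seen:
                    seen.add(neighbour.name)
                    next_frontier.append(neighbour)
        frontier = next_frontier
    return seen - {start.name}


def reachable_by_index(edges: list[tuple[str, str]], start: str, max_hops: int) -> set[str]:
    """Same traversal, but every expansion goes through a separate edge index."""
    index: dict[str, list[str]] = {}
    for src, dst in edges:
        index.setdefault(src, []).append(dst)
    seen = {start}
    frontier = [start]
    for _ in range(max_hops):
        next_frontier = []
        for name in frontier:
            for neighbour in index.get(name, []):  # one index lookup per node
                if neighbour not in seen:
                    seen.add(neighbour)
                    next_frontier.append(neighbour)
        frontier = next_frontier
    return seen - {start}


# Hypothetical sample data: a -> b -> c -> d, plus a -> e.
edges = [("a", "b"), ("b", "c"), ("c", "d"), ("a", "e")]
nodes = {name: Node(name) for pair in edges for name in pair}
for src, dst in edges:
    nodes[src].out.append(nodes[dst])

print(sorted(reachable_by_pointers(nodes["a"], 2)))
print(sorted(reachable_by_index(edges, "a", 2)))
```

Both functions return the same set. The difference the slide points at is the cost of each hop once the index no longer fits in memory or lives on another machine, which a Python dictionary does not reproduce.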

Shopping Recommendations

Examples of companies that use Neo4j, the world’s leading graph database, for recommendation and personalization engines.

Adidas uses Neo4j to combine content and product data into a single, searchable graph database, which is used to create a personalized customer experience

“We have many different silos, many different data domains, and in order to make sense out of our data, we needed to bring those together and make them useful for us,” – Sokratis Kartelias, Adidas

eBay ShopBot Personal Shopping Companion in FB Messenger

“ShopBot uses its Knowledge Graph to understand user requests and generate follow-up questions to refine requests before searching for the items in eBay’s inventory. In a search query for ‘bags’ for example, purple nodes represent ‘categories,’ green ‘attributes’ and pink are ‘values’ for those attributes.” – RJ Pittman Blog, eBay

Walmart uses Neo4j to give customers the best web experience through relevant and personal recommendations

“As the current market leader in graph databases, and with enterprise features for scalability and availability, Neo4j is the right choice to meet our demands.” – Marcos Vada, Walmart

Product recommendations Personalization

LinkedIn Chitu seeks to engage Chinese jobseekers through a game-like user interface that is available on both desktop and mobile devices.

“The challenge was speed,” said Dong Bin, Manager of Development at Chitu. “Due to the rate of growth we saw from our competitors in the Chinese market, we knew that we had to launch Chitu as quickly as possible.”

Social Network

Classic Case Studies

Neo4j in the Enterprise: Native Graph Differentiation

Graph Overview

Neo4j's Connections-First Positioning & Focus

Neo4j is designed for data relationships

[Diagram: Other NoSQL, Relational DBMS, Neo4j Graph DB]

Discrete Data: minimally connected data

Connected Data: focused on data relationships

Development Benefits

• Easy model maintenance

• Easy query

Deployment Benefits

• Ultra high performance

• Minimal resource usage

Theme: Why Non-Native Graphs Fail

Why Neo4j leads the graph market

Graph is an independent paradigm

• Driving simplicity, adoption and business value solutions

• Multi-model vendors increase complexity

• Graph value is in the hops (more than 3)

Simplify

• Express from idea to whiteboard

• Language to translate to computer

• Visualization and user experience

• ACID Transactions in a native architecture

• Scalable database stack that meets market expectations

Cypher: Powerful and Expressive Query Language

MATCH (:Person {name: "Dan"})-[:MARRIED_TO]->(spouse)

[Diagram: (Dan)-[MARRIED_TO]->(Ann), annotated with NODE, RELATIONSHIP TYPE, LABEL, PROPERTY and VARIABLE]

Neo4j Advantage – Developer productivity
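To make the parts of the MARRIED_TO pattern concrete, here is a rough sketch of what it selects, written against a toy in-memory property graph. The dictionary layout and the function name are assumptions made for illustration. This is not how Cypher is executed.

```python
# Nodes carry labels and properties; relationships carry a type and a direction.
nodes = {
    1: {"labels": {"Person"}, "props": {"name": "Dan"}},
    2: {"labels": {"Person"}, "props": {"name": "Ann"}},
}
rels = [(1, "MARRIED_TO", 2)]  # (start node id, relationship type, end node id)


def match_married_to(nodes, rels, name):
    """Names bound to `spouse` for a :Person start node with the given name."""
    spouses = []
    for start, rel_type, end in rels:
        source = nodes[start]
        if (
            "Person" in source["labels"]              # label filter
            and source["props"].get("name") == name   # property filter
            and rel_type == "MARRIED_TO"              # relationship type filter
        ):
            spouses.append(nodes[end]["props"]["name"])  # the `spouse` variable
    return spouses


print(match_married_to(nodes, rels, "Dan"))
```

The relationship is directed, so starting from "Ann" yields nothing in this data set.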


Example HR Query in SQL The Same Query using Cypher

MATCH (boss)-[:MANAGES*0..3]->(sub),

(sub)-[:MANAGES*1..3]->(report)

WHERE boss.name = "John Doe"

RETURN sub.name AS Subordinate,

count(report) AS Total

Productivity Gains with Graph Query Language

The query asks: “Find all direct reports and how many people they manage, up to three levels down”

Project Impact

Less time writing queries:

• More time understanding the answers

• Leaving time to ask the next question

Less time debugging queries:

• More time writing the next piece of code

• Improved quality of overall code base

Code that’s easier to read:

• Faster ramp-up for new project members

• Improved maintainability & troubleshooting
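For readers who want to see what the variable-length MANAGES pattern computes, the sketch below reproduces the result by hand in Python. It assumes the reporting structure is a tree (each person has one manager), so counting paths and counting people give the same number. The names and the sample hierarchy are hypothetical.

```python
# Hypothetical org chart: manager -> direct reports.
manages = {
    "John Doe": ["Alice", "Bob"],
    "Alice": ["Carol", "Dave"],
    "Carol": ["Eve"],
}


def within_hops(manages, start, min_hops, max_hops):
    """People reached from start by following MANAGES min_hops..max_hops times."""
    reached = []
    frontier = [start]
    for depth in range(max_hops + 1):
        if depth >= min_hops:
            reached.extend(frontier)
        # Move one level down the hierarchy.
        frontier = [r for manager in frontier for r in manages.get(manager, [])]
    return reached


def report_counts(manages, boss):
    """Mirror of the Cypher query: subordinate name -> number of reports."""
    totals = {}
    for sub in within_hops(manages, boss, 0, 3):        # [:MANAGES*0..3]
        count = len(within_hops(manages, sub, 1, 3))    # [:MANAGES*1..3]
        if count:  # MATCH yields no row for someone with no reports
            totals[sub] = count
    return totals


print(report_counts(manages, "John Doe"))
```

Because the first pattern starts at zero hops, the boss appears in the result alongside the subordinates, as in the original query.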

Relationship Queries on Non-Native Graph Architectures

Using Graph (unified, in-memory map): lightning-fast queries due to replicated in-memory architecture and index-free adjacency

Using Other NoSQL to Join Data (Machine 1, Machine 2, Machine 3): slow queries due to index lookups + network hops

NoSQL Databases Don’t Handle Relationships

• No data structures to model or store relationships

• No query constructs to support data relationships

• Relating data requires “JOIN logic” in the application

• No ACID support for transactions

… making NoSQL databases inappropriate when data relationships are valuable in real-time

The Importance of ACID Graph Writes

Graph Transactions Over ACID Consistency: Maintains Integrity Over Time

Graph Transactions Over Non-ACID DBMSs: Eventual Consistency Becomes Corrupt Over Time

• Ghost vertices

• Stale indexes

• Half-edges

• Uni-directed ghost edges
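A half-edge arises when a relationship is recorded at one endpoint but not the other. The toy sketch below assumes an edge is stored as two separate writes and simulates a failure between them. The `Graph` class, its methods and the snapshot-based rollback are all hypothetical stand-ins for a real transaction mechanism.

```python
import copy


class Graph:
    """Toy store that records each edge at both of its endpoints."""

    def __init__(self):
        self.out_edges = {}  # node -> set of target nodes
        self.in_edges = {}   # node -> set of source nodes

    def add_node(self, node):
        self.out_edges.setdefault(node, set())
        self.in_edges.setdefault(node, set())

    def add_edge_unsafe(self, src, dst, fail_midway=False):
        """Two independent writes; a failure in between leaves a half-edge."""
        self.out_edges[src].add(dst)
        if fail_midway:
            raise RuntimeError("simulated crash between the two writes")
        self.in_edges[dst].add(src)

    def add_edge_atomic(self, src, dst, fail_midway=False):
        """Apply both writes or neither by restoring a snapshot on failure."""
        snapshot = (copy.deepcopy(self.out_edges), copy.deepcopy(self.in_edges))
        try:
            self.add_edge_unsafe(src, dst, fail_midway)
        except Exception:
            self.out_edges, self.in_edges = snapshot  # roll back
            raise

    def half_edges(self):
        """Edges known to their source but missing at their target."""
        return [
            (src, dst)
            for src, targets in self.out_edges.items()
            for dst in targets
            if src not in self.in_edges.get(dst, set())
        ]


def build():
    graph = Graph()
    graph.add_node("a")
    graph.add_node("b")
    return graph


unsafe, atomic = build(), build()
for graph, write in ((unsafe, unsafe.add_edge_unsafe), (atomic, atomic.add_edge_atomic)):
    try:
        write("a", "b", fail_midway=True)
    except RuntimeError:
        pass
    print(graph.half_edges())
```

The non-atomic store is left with a dangling ("a", "b") entry, while the atomic one returns to its previous consistent state.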

Neo4j Graph Platform

[Diagram: platform components: Transactions, Analytics, Data Integration, API, ETL, SaaS, Database Tooling, Discover & Visualize. Users and consumers: Customers, Business Users, Developers, Admins, Data Scientists, Other Systems, Apps, AI / ML]

The Connected Enterprise Value Proposition: Fastest Path to Graph Success

Graph Database Platform: Enterprise-Grade Innovation Launchpad

• Neo4j Enterprise Edition

• HA, Causal Cluster, MDC

• Better performance

• Hardened product

Innovation Network: The Next Innovation

• Density of the network accelerates innovation opportunity

• Thousands of project successes

• Partners, Service Providers, Vendors, Academics, Researchers

Graph Expertise: Millions of Graph Hours

• Shrink learning curve

• Design advice

• Contextual experience

• Deploy & Ops support

Neo4j Commercial Value

Case Studies

Neo4j Case Studies

Background

• Large Public University – “U-Dub”

• IT staff for 80K+ students and employees

• Transforming IT systems from mainframe to cloud

• Providing IT & data warehousing services to 3 campuses, 6 hospitals, and 6,300 EDW users

Business Problem

• Old SharePoint metadata was too complicated for users, not flexible and not transparent

• $1B project to migrate HR system from mainframe to Workday needed to be smooth

• Future projects needed repeatable predictability

• Needed new glossary, impact analysis, analytics

Solution and Benefits

• Consulted with NDU peers, built simple model

• Built Visualizer with Elasticsearch, Neo4j & D3.js

• Improved predictability, lineage, and impact understanding for over 6,300 users

University of Washington EDUCATION & RESEARCH

Metadata Management, IT & Network Operations

CE Customer since 2016 Q1

Business Problem

• Optimize walmart.com user experience

• Connect complex buyer and product data to gain super-fast insight into customer needs and product trends

• RDBMS couldn’t handle complex queries

Solution and Benefits

• Replaced complex batch process with real-time online recommendations

• Built simple, real-time recommendation system with low-latency queries

• Serve better and faster recommendations by combining historical and session data

Background

• Founded in 1962 and based in Arkansas

• 11,000+ stores in 27 countries with walmart.com online store

• 2M+ employees and $470 billion in annual revenues

Walmart RETAIL

Real-Time Recommendations

Background

• Brazil's largest bank, #38 on Forbes G2000

• $61B annual sales, 95K employees

• Most valuable brand in Brazil

• 28.9M credit card & 25.6M debit card accounts

• High integrity, customer-centric values

Business Problem

• Data silos made assessing credit worthiness hard

• High sensitivity to fraud activity

• 73% of all transactions over internet and mobile

• Needed real-time detection for 2,000 analysts

• Scale to trillions of relationships

Solution and Benefits

• Credit monitoring and fraud detection application

• 4.2M nodes & 4B relationships for 100 analysts

• Grow to 93T relationships for 2000 analysts by 2021

• Real time visibility into money flow across multiple customers

Itau Unibanco FINANCIAL SERVICES

Fraud Detection / Credit Monitoring

CE Customer since 2016 Q1

EE Customer since Q2 2017

Background

• Large global bank

• Deploying Reference Data to users and systems

• 12 data domains, 18 datasets, 400+ integrations

• Complex data management infrastructure

Business Problem

• Master data silos were inflexible and hard to consume

• Needed simplification to reduce redundancy

• Reduce risk when data is in consumers’ hands

• Dramatically improve efficiency

Solution and Benefits

• Data distribution flows improved dramatically

• Knowledge Base improves consumer access

• Ad-hoc analytics improved

• Governance, lineage and trust improved

• Better service level from IT to data consumers

UBS FINANCIAL SERVICES

Master Data Management / Metadata

CE Customer since 2016 Q1

EE Customer since 2015

Background

• SF-based C2C rental platform

• Dataportal democratizes data access for growing number of employees while improving discoverability and trust

• Data strewn everywhere—in silos, in segmented departments, nothing was universally accessible

Business Problem

• Data-driven culture hampered by variety and dependability of data, tribal knowledge and word-of-mouth distribution

• Needed visibility into information usage, context, lineage and popularity across company of 3,000+

Solution and Benefits

• Offers search with context & metadata, user & team-centric pages for origin & lineage

• Nodes are resources: data tables, dashboards, reports, users, teams, business outcomes, etc.

• Relationships reflect consumption, production, association, etc.

• Neo4j, Elasticsearch, Python

Airbnb Dataportal TRAVEL TECHNOLOGY

Knowledge Graph, Metadata Management

CE users since 2017

Background

• San Jose-based communications equipment giant ranks #91 in the Global 2000 with $44B in annual sales

• Needed high-performance system that could provide master-data access services 24x7 to applications company-wide

Solution and Benefits

• New Hierarchy Management Platform (HMP) manages master data, rules and access

• Cut access times from minutes to milliseconds

• Graphs provided flexibility for business rules

• Expanded master-data services to include product hierarchies

Business Problem

• Sales compensation system didn’t meet needs

• Oracle RAC system had reached its limits

• Inflexible handling of complex organizational hierarchies and mappings

• “Real-time” queries ran for more than a minute

• P1 system must have zero downtime

Cisco COMMUNICATIONS

Master Data Management

Background

• French Telecom

• Big Data Governance in support of GDPR

• Environment with Hadoop, Analytics, Recommendation engines, etc.

Business Problem

• Manage people, roles & rights, flow, audit, log management, processes, policies, lineage, metadata, lifecycles, security, etc…

• All because GDPR arrives in May 2018

Solution and Benefits

• Governance system oversees all systems

• Enforces correct policies

• Allows flexibility beyond Hadoop

• Architect has written Neo4j French manual

ORANGE TELECOMMUNICATIONS

Master Data Management / Metadata

CE Customer since 2016 Q1

EE Customer since 2015

Background

• Large Nordic Telecom Provider

• 1M Broadband routers deployed in Sweden

• Half of subscribers are over 55 years old

• Each household connects 10 devices

• Goal to improve customer experience

Business Problem

• Broadband router enhancement to improve customer experience

• Context-based in-home services

• How to build smart home platform that allows vendors to build new “home-centric” apps

Solution and Benefits

• New Features deployed to 1M homes

• API-based platform for easy apps that:

• Automatically assemble Spotify playlists based on who is in the house

• Notify parents when children get home

• Build smart shopping lists

TELIA ZONE TELECOMMUNICATIONS

Smart Home / Internet of Things

EE Customer since 2016 Q4

Business Problem

• Needed new asset management backbone to handle scheduling, ads, sales and pushing linear streams to satellites

• Novell LDAP content hierarchy not flexible enough to store graph-based business content

Solution and Benefits

• Neo4j selected for performance and domain fit

• Flexible, native storage of content hierarchy

• Graph includes metadata used by all systems: TV series-->Episodes-->Blocks with Tags-->Linked Content, tagged with legal rights, actors, dubbing et al

Background

• Nashville-based developer of lifestyle-oriented content for TV, digital, mobile and publishing

• Web properties generate tens of millions of unique visitors per month

Scripps Networks MEDIA AND ENTERTAINMENT

Master Data Management

Business Problem

• Needed to reimagine existing system to beat competition and provide 360-degree view of customers

• Channel complexity necessitated move to graph database

• Needed an enterprise-ready solution

Solution and Benefits

• Leapfrogged competition and increased digital business by 23%

• Handles new data from mobile, social networks, experience and governance sources

• After launch of new Neo4j MDM, Pitney Bowes stock declared a Buy

Background

• Connecticut-based leader in digital marketing communications

• Helps clients provide omni-channel experience with in-context information

Pitney Bowes MARKETING COMMUNICATIONS

Master Data Management

Background

• World's largest hospitality / hotel company

• 7th largest web site on internet

• 1.5 M hotel rooms offered online by 2018

• Revenue Management System that allows property managers to update their pricing rates

Business Problem

• Provide the right room & price at the right time

• Old rate program was inflexible and bogged down as they increased the pricing options per property per day

• Lay the path to be an innovator in the future

Solution and Benefits

• 2016-era rate program embeds Neo4j as "cache"

• Created a graph per hotel for 4500 properties in 3 clusters

• 1000% increase in volume over 4 years

• 50% decrease in infrastructure costs

• "Use Neo4j Support!"

MARRIOTT TRAVEL & HOSPITALITY SERVICES

Pricing Recommendations Engine

EE Customer since 2014 Q2

Neo4j Enterprise Edition: Native Graph Platform

Graph Overview