graphtour - workday: tracking activity with neo4j (english version)


Post on 16-Mar-2018


TRANSCRIPT

Page 1: GraphTour - Workday: Tracking activity with Neo4j (English Version)

Tracking Activity with Neo4j

Page 2: GraphTour - Workday: Tracking activity with Neo4j (English Version)

Who Am I?

• Build Engineer (Build Engineering Team)
• Located in Paris, France
• Responsibilities
  ‒ Development of reusable Gradle plugins
  ‒ Administration of Artifactory
  ‒ Development of custom tools
  ‒ Support to engineering teams (mainly build-related)
  ‒ Sentinel (server to track activity)

Workday Confidential

Page 3: GraphTour - Workday: Tracking activity with Neo4j (English Version)

The Build Engineering Mission

Page 4: GraphTour - Workday: Tracking activity with Neo4j (English Version)

Build Engineering – Our Mission

• Define policies for engineering teams (dependency locking, artifact promotion, artifact metadata)
• Provide reusable tooling (Gradle plugins & other custom tools)
• Administer shared services (Artifactory)
• Provide assistance to engineering teams

Page 5: GraphTour - Workday: Tracking activity with Neo4j (English Version)

Why We Need Monitoring

• To ensure policies are followed
• Engineers enjoy a lot of freedom at Workday!
  ‒ Netflix: The Paved Road
• Is our tooling relevant?
• Gain insight into how development teams are working

We need answers to those questions!

Page 6: GraphTour - Workday: Tracking activity with Neo4j (English Version)

We’re interested in…

• Artifacts (JARs, RPMs, “Deliveries”, etc.)
• CI builds (in Bamboo, TeamCity and Jenkins)
• SCM changes (in BitBucket, GitHub, Gerrit, etc.)
• Dependencies (between artifacts, builds, etc.)
• JIRA issues (tracking of code)
• Promotions (of artifacts)
• Metadata in general

Page 7: GraphTour - Workday: Tracking activity with Neo4j (English Version)

Problem: The Data is Everywhere

• No unified system of record with all this information!
• The data is scattered across different systems (AF, CI, JIRA…)
• … is secured with different credentials (AD, LDAP)
• … is stored in different formats (JSON, XML, CSV, etc.)
• … is not always easily accessible
• Accessing one data source is (usually) easy
• Accessing two data sources is already a bit trickier
• No unified query language for joining the aggregated data

Page 8: GraphTour - Workday: Tracking activity with Neo4j (English Version)

Requirements

Page 9: GraphTour - Workday: Tracking activity with Neo4j (English Version)

Requirements

• Simple access to the information
• Unified and intuitive data model
• Powerful query language
• Data as accurate as possible
  ‒ Frequent updates to the data
  ‒ Updates must be fast (performance)
• Ability to easily refactor the data model
• Ability to expose this information to engineering teams (automation)

Page 10: GraphTour - Workday: Tracking activity with Neo4j (English Version)

But First of All!

• We don’t want to rely on users to provide the information (unless we have no other choice)!
• The information we need usually already exists or can be derived, so let’s use it!

Page 11: GraphTour - Workday: Tracking activity with Neo4j (English Version)

Sentinel – Architecture Overview

Page 12: GraphTour - Workday: Tracking activity with Neo4j (English Version)

Architecture Overview

A foundation to solve current and future problems

[Diagram components: Data Sources (JIRA, Artifactory, Bamboo, BitBucket); Data Miner (aggregation, sanitization, normalization); Neo4j; REST API; Web UI]

Page 13: GraphTour - Workday: Tracking activity with Neo4j (English Version)

The Data Miner

• Command-line tool (written in Groovy)
• Executable fat jar
• Runs from Bamboo every 15 minutes
• Scans the data sources containing the information we need
• Preemptively extracts, sanitizes & normalizes the data
• Detects incremental changes (optimized for performance)
• Crash-proof
• A run executes 59 commands in sequence
• Scan time: 8 minutes (minimum), 23 minutes (average)

Page 14: GraphTour - Workday: Tracking activity with Neo4j (English Version)

The Neo4j Database

• A (NoSQL) graph database
• The graph paradigm fits our needs
• Very flexible and easy to use
• Schema-less
• Excellent performance
• All the useful data in one place
• Cypher (query language)!

Page 15: GraphTour - Workday: Tracking activity with Neo4j (English Version)

The Services We Expose

• UI made of HTML dashboards & dynamic charts
• REST API
• Spring Boot, Thymeleaf, D3.js, Swagger

Page 16: GraphTour - Workday: Tracking activity with Neo4j (English Version)

Neo4j in a Nutshell

Page 17: GraphTour - Workday: Tracking activity with Neo4j (English Version)

Neo4j - Nodes

• Nodes have properties (comparable to a Map<String, ?>)
• … can have 0-N labels (typing, polymorphism)

Example node: core 1.0.5 jar
Labels: Artifact, ArtifactoryFile, Workday
Properties:
  id        com.workday:core
  group     com.workday
  artifact  core
  version   1.0.5
  created   1458713182201

Page 18: GraphTour - Workday: Tracking activity with Neo4j (English Version)

Neo4j - Relationships

• Relationships represent an edge between 2 nodes
• … have a name
• … can be directed
• … can have properties

[Diagram: Artifact (core 1.0.5 jar) and Git Commit (core, 5ce1f767) connected by a HAS_COMMIT relationship]

Page 19: GraphTour - Workday: Tracking activity with Neo4j (English Version)

Neo4j - Cypher

Cypher is the Neo4j query language.

A node: ()
A labeled node: (:Person)
A relationship between 2 nodes: ()--()
A directed, typed relationship: ()-[:PARENT_OF]->()

Example query:

    MATCH (parent:Person)-[:PARENT_OF]->(child:Person)
    RETURN parent.name, COLLECT(child.name)

Page 20: GraphTour - Workday: Tracking activity with Neo4j (English Version)

Extracting the Data

Page 21: GraphTour - Workday: Tracking activity with Neo4j (English Version)

Everything Starts with Artifactory

• Official repository of artifacts, RPMs and Docker images
• REST API to detect new artifacts in repositories

Page 22: GraphTour - Workday: Tracking activity with Neo4j (English Version)

Step 1: Artifacts

• URI: com/workday/core/1.0.5/core-1.0.5-javadoc.jar

  Group       com.workday
  Module      core
  Version     1.0.5
  Type        jar
  Classifier  javadoc

  ID: “com.workday:core:1.0.5:javadoc@jar”

[Diagram: Artifact node (core 1.0.5 jar, javadoc)]
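The URI-to-coordinates mapping on this slide can be sketched in a few lines of Python. This is a minimal illustration, not the Data Miner's actual code (which the talk says is written in Groovy). The names `ArtifactCoordinates` and `parse_artifact_path` are hypothetical, the path is assumed to follow the standard Maven repository layout, and the ID format for artifacts without a classifier is an assumption.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ArtifactCoordinates:
    group: str
    module: str
    version: str
    file_type: str
    classifier: Optional[str]

    @property
    def id(self) -> str:
        # Format from the slide: group:module:version:classifier@type.
        # Dropping the classifier segment when there is none is an assumption.
        base = f"{self.group}:{self.module}:{self.version}"
        if self.classifier:
            base += f":{self.classifier}"
        return f"{base}@{self.file_type}"


def parse_artifact_path(path: str) -> ArtifactCoordinates:
    """Parse <group dirs>/<module>/<version>/<module>-<version>[-<classifier>].<type>."""
    parts = path.strip("/").split("/")
    if len(parts) < 4:
        raise ValueError(f"not a Maven-layout path: {path!r}")
    *group_dirs, module, version, filename = parts
    stem, dot, extension = filename.rpartition(".")
    prefix = f"{module}-{version}"
    if not dot or not stem.startswith(prefix):
        raise ValueError(f"unexpected file name: {filename!r}")
    remainder = stem[len(prefix):]  # "" or "-<classifier>"
    if remainder and not remainder.startswith("-"):
        raise ValueError(f"unexpected file name: {filename!r}")
    classifier = remainder[1:] or None
    return ArtifactCoordinates(
        ".".join(group_dirs), module, version, extension, classifier
    )


coordinates = parse_artifact_path("com/workday/core/1.0.5/core-1.0.5-javadoc.jar")
print(coordinates.id)
```

Keeping the ID derivable from the path alone means a repeated scan of the same file always maps to the same node.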

Page 23: GraphTour - Workday: Tracking activity with Neo4j (English Version)

Step 2: Module Versions

• The artifact relates to a “Module Version”

  Group    com.workday
  Module   core
  Version  1.0.5

  ID: “com.workday:core:1.0.5”

[Diagram: ModuleVersion node (core 1.0.5)]

Page 24: GraphTour - Workday: Tracking activity with Neo4j (English Version)

Step 3: Modules

• The module version relates to a “Module”

  Group   com.workday
  Module  core

  ID: “com.workday:core”

[Diagram: Module node (core)]

Page 25: GraphTour - Workday: Tracking activity with Neo4j (English Version)

All Together With Relationships

[Diagram: a Module node (core) with Version nodes 1.0.5, 1.0.6 and 1.0.7 attached through VERSION_OF relationships. Two groups of Artifact nodes (jar, javadoc jar, sources jar) are attached to Version nodes through ARTIFACT_OF relationships.]
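The chain from artifact to module version to module can be derived from the artifact ID alone. The sketch below is illustrative only: `hierarchy_edges` is a hypothetical helper, and the edge direction (artifact to module version, module version to module) is inferred from the relationship names ARTIFACT_OF and VERSION_OF, not confirmed by the slides.

```python
from typing import List, Tuple

# (source id, relationship name, target id)
Edge = Tuple[str, str, str]


def hierarchy_edges(artifact_id: str) -> List[Edge]:
    """Derive the ARTIFACT_OF and VERSION_OF edges implied by an artifact id."""
    # Drop the "@type" suffix, then keep group, module and version.
    coordinates = artifact_id.split("@", 1)[0]
    parts = coordinates.split(":")
    if len(parts) < 3:
        raise ValueError(f"unexpected artifact id: {artifact_id!r}")
    group, module, version = parts[:3]
    module_version_id = f"{group}:{module}:{version}"
    module_id = f"{group}:{module}"
    return [
        (artifact_id, "ARTIFACT_OF", module_version_id),
        (module_version_id, "VERSION_OF", module_id),
    ]


for edge in hierarchy_edges("com.workday:core:1.0.5:javadoc@jar"):
    print(edge)
```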

Page 26: GraphTour - Workday: Tracking activity with Neo4j (English Version)

Step 4: Artifact Dependencies

• Maven / Ivy descriptors → dependencies
• Dependencies → DEPENDS_ON relationships

[Diagram: services 2.0.0, gson 2.2.2 and core 1.0.5, with two DEPENDS_ON relationships]

Page 27: GraphTour - Workday: Tracking activity with Neo4j (English Version)

Step 5: Artifact Metadata

• Populated at build time (by a custom Gradle plugin)
• Captures information about
  ‒ Gradle, JDK, build machine
  ‒ CI builds
  ‒ SCM changes
• Makes artifacts “self-documented”

Page 28: GraphTour - Workday: Tracking activity with Neo4j (English Version)

Manifest Metadata – SCM Info

• WD-Git-Origin ssh://[email protected]/core/core.git
• WD-Git-Commit e28a60b96f452680c57cb76798def09fd171011f

[Diagram: Artifact (core 1.0.5 jar) and Git Commit (core, e28a60…) connected by a HAS_COMMIT relationship]
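Reading these two attributes back out of a jar manifest could look like the sketch below. It is an assumption-laden illustration, not Sentinel's code: the function names are hypothetical, the origin URL in the usage example is a placeholder host, and the parser only handles the manifest's main section (attributes up to the first blank line, with continuation lines that start with a single space).

```python
from typing import Dict, Optional, Tuple


def parse_main_attributes(manifest_text: str) -> Dict[str, str]:
    """Parse the main section of a MANIFEST.MF into a name -> value dict."""
    attributes: Dict[str, str] = {}
    last_key: Optional[str] = None
    for line in manifest_text.splitlines():
        if not line.strip():
            break  # a blank line ends the main section
        if line.startswith(" ") and last_key is not None:
            # Continuation line: append everything after the leading space.
            attributes[last_key] += line[1:]
            continue
        key, separator, value = line.partition(":")
        if not separator:
            raise ValueError(f"malformed manifest line: {line!r}")
        last_key = key.strip()
        attributes[last_key] = value.strip()
    return attributes


def git_info(attributes: Dict[str, str]) -> Optional[Tuple[str, str]]:
    """Return (origin, commit) when both WD-Git-* attributes are present."""
    origin = attributes.get("WD-Git-Origin")
    commit = attributes.get("WD-Git-Commit")
    if origin and commit:
        return origin, commit
    return None


# Placeholder host; the commit hash is the one shown on the slide,
# wrapped onto a continuation line to exercise that code path.
example = (
    "Manifest-Version: 1.0\n"
    "WD-Git-Origin: ssh://git@git.example.com/core/core.git\n"
    "WD-Git-Commit: e28a60b96f452680c57cb76798def0\n"
    " 9fd171011f\n"
)
print(git_info(parse_main_attributes(example)))
```

The returned pair is enough to create the Git Commit node and the HAS_COMMIT relationship for the artifact.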

Page 29: GraphTour - Workday: Tracking activity with Neo4j (English Version)

Concrete Examples

Page 30: GraphTour - Workday: Tracking activity with Neo4j (English Version)

List of all Workday Artifacts

Group        Module        Latest Version  Age (days)  SCM URL           SCM change  Build URL  Latest JIRAs
com.workday  core          1.0.5           120.2       core.git          e28a60b9    URL        CORE-120
com.workday  foo-services  1.3.0           29.1        foo-services.git  146ae135    URL        FOO-57
com.workday  bar-services  2.2.8           54.8        bar-services.git  b538c156    URL        BAR-70
…            …             …               …           …                 …           …          …

Public dashboard with the latest information (automatically kept up to date)

→ Where’s the build of this jar file?
→ Where are the sources for this jar file?

Page 31: GraphTour - Workday: Tracking activity with Neo4j (English Version)

Identifying Direct Dependents

MATCH (dependent:ModuleVersion)-[:DEPENDS_ON]->(dependency:ModuleVersion)
WHERE dependency.id = "com.workday:core:1.0.5"
RETURN dependent.id AS dependent

Service in the Sentinel REST API

Dependent
com.workday:foo-services:1.3.0
com.workday:foo-services:1.2.5
com.workday:bar-services:2.2.8
com.workday:bar-services:2.2.7
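To make explicit what the query returns, here is an in-memory analogue in Python. It is only an illustration of the lookup, not how Sentinel answers the request (that is the Cypher query against Neo4j). `index_dependents` is a hypothetical helper, and the last edge in the sample data is invented to show that unrelated edges are ignored.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple


def index_dependents(edges: Iterable[Tuple[str, str]]) -> Dict[str, List[str]]:
    """Map each dependency id to the ids that depend on it directly."""
    dependents: Dict[str, List[str]] = defaultdict(list)
    for dependent, dependency in edges:
        dependents[dependency].append(dependent)
    return dict(dependents)


# (dependent, dependency) pairs standing in for DEPENDS_ON relationships.
# The first four mirror the slide's result; the last one is made up.
DEPENDS_ON = [
    ("com.workday:foo-services:1.3.0", "com.workday:core:1.0.5"),
    ("com.workday:foo-services:1.2.5", "com.workday:core:1.0.5"),
    ("com.workday:bar-services:2.2.8", "com.workday:core:1.0.5"),
    ("com.workday:bar-services:2.2.7", "com.workday:core:1.0.5"),
    ("com.workday:baz:4.2.10", "com.workday:core:1.0.6"),
]

for dependent in index_dependents(DEPENDS_ON)["com.workday:core:1.0.5"]:
    print(dependent)
```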

Page 32: GraphTour - Workday: Tracking activity with Neo4j (English Version)

Build Orchestration

[Diagram: on the producing side, Bamboo Build CORE, Artifact core 1.0.5 jar and ModuleVersion core 1.0.5. On the consuming side, Bamboo Build FOO-SERVICES, Artifact foo-services 1.3.0 jar and ModuleVersion foo-services 1.3.0. Relationships shown: BUILT (build and artifact), ARTIFACT_OF (artifact and module version), and DEPENDS_ON (drawn twice between the two sides).]

Page 33: GraphTour - Workday: Tracking activity with Neo4j (English Version)

Automated Release Notes

[Diagram: Version 1 shows Bamboo Build CORE #11, Artifact core v1 jar and Git Commit core 5ce1f767. Version 2 shows Bamboo Build CORE #12, Artifact core v2 jar and Git Commit core ee2a0e22. Relationships shown: BUILT, HAS_REVISION (twice), PARENT_REVISION, and LINKS_TO involving JIRA Issue CORE-120.]

Page 34: GraphTour - Workday: Tracking activity with Neo4j (English Version)

Identify SCM Changes per JIRA

Find all mentions of a JIRA issue in commit messages

Input: JIRA issue
Output: Set of SCM changes

[Diagram: JIRA Issue CORE-120 linked to Git Commits core 5ce1f767, core 5954ff88 and core ee2a0e22]
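Matching issue keys in commit messages can be sketched with a regular expression. This is an assumption about the approach, not a description of Sentinel's implementation: the pattern below accepts any uppercase project key followed by a dash and a number, and the commit messages are invented (only the commit hashes and CORE-120 come from the slide).

```python
import re
from collections import defaultdict
from typing import Dict, Set

# Uppercase project key, dash, issue number (for example CORE-120).
JIRA_KEY = re.compile(r"\b[A-Z][A-Z0-9]+-\d+\b")


def commits_by_issue(commits: Dict[str, str]) -> Dict[str, Set[str]]:
    """Map each JIRA key to the commit hashes whose message mentions it."""
    index: Dict[str, Set[str]] = defaultdict(set)
    for sha, message in commits.items():
        for key in JIRA_KEY.findall(message):
            index[key].add(sha)
    return dict(index)


# Invented commit messages for illustration.
COMMITS = {
    "5ce1f767": "CORE-120 fix null handling",
    "5954ff88": "Refactor parser (CORE-120)",
    "ee2a0e22": "CORE-120, FOO-57: bump version",
    "0a1b2c3d": "Update README",
}

print(sorted(commits_by_issue(COMMITS)["CORE-120"]))
```

A pattern this loose also matches strings such as "UTF-8", so a real scan would likely restrict matches to known project keys.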

Page 35: GraphTour - Workday: Tracking activity with Neo4j (English Version)

Detection of Rule Violations

Rule: “No dynamic dependencies in Maven / Ivy files”
Rationale: Builds must be reproducible
Dynamic versions: 1.+, LATEST, [1.0, 2.0[

HTML dashboard listing the latest violations

[Diagram: ModuleVersion baz 4.2.10 and ModuleVersion pmd-checks 1.+ connected by a DEPENDS_ON relationship]
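A check for the three dynamic forms listed on the slide might look like the following sketch. The function names and the edge tuples are hypothetical, and the rule set is deliberately limited to the slide's examples (a trailing `+`, the literal `LATEST`, and a bracketed range containing a comma); a real policy would probably cover more forms.

```python
import re
from typing import Iterable, List, Tuple

# A range opens with [, ] or ( and closes with ], [ or ), with a comma inside.
# A single bracketed version such as "[1.0]" is an exact pin, not a range.
VERSION_RANGE = re.compile(r"^[\[\]\(].*,.*[\]\[\)]$")


def is_dynamic_version(version: str) -> bool:
    """Return True for the dynamic forms named on the slide."""
    candidate = version.strip()
    return (
        candidate.endswith("+")
        or candidate == "LATEST"
        or bool(VERSION_RANGE.match(candidate))
    )


# (dependent id, dependency module, requested version)
DependencyEdge = Tuple[str, str, str]


def find_violations(edges: Iterable[DependencyEdge]) -> List[DependencyEdge]:
    """Keep only the dependencies declared with a dynamic version."""
    return [edge for edge in edges if is_dynamic_version(edge[2])]


# The first edge mirrors the slide; the second is invented as a compliant case.
EDGES = [
    ("baz:4.2.10", "pmd-checks", "1.+"),
    ("baz:4.2.10", "core", "1.0.5"),
]

print(find_violations(EDGES))
```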

Page 36: GraphTour - Workday: Tracking activity with Neo4j (English Version)

Conclusion

• Service rolled out internally
• Neo4j is the perfect tool for capturing the data we’re interested in
  ‒ Very easy to refactor / enrich the data
• Cypher gives us insight from the aggregated data
• Solid foundation for future services
  ‒ Difficult part: Capturing the data
  ‒ Easy part: Leveraging the data by creating new queries
• Decisions based on facts, not (educated) guesses
• Holistic reporting

Page 37: GraphTour - Workday: Tracking activity with Neo4j (English Version)

Q & A

Thanks for attending

