Reifier Spark Summit 2014 Slides

Post on 30-Jun-2015

DESCRIPTION

Presentation on Reifier: fuzzy matching using Apache Spark and machine learning

TRANSCRIPT

© Nube Technologies

Fuzzy Matching With Spark

About Us

ALICE: This is impossible!

THE MAD HATTER: Only if you believe it is.

The problem

According to Gartner, businesses are losing up to 25% of potential revenue due to the lack of a holistic, multichannel view of their data.

The problem

Challenges

● Quadratic nature of the problem
● No standard notion of similarity
● Omissions, typos and other issues
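The first two challenges can be made concrete with a small sketch. It is not taken from Reifier: the sample records and the two similarity functions (`char_similarity`, `token_jaccard`) are assumptions chosen for illustration, and it uses only the Python standard library rather than Spark. It shows that comparing every record with every other record needs n(n-1)/2 comparisons, and that two reasonable similarity measures can disagree on the same pair.

```python
from difflib import SequenceMatcher
from itertools import combinations

# Hypothetical records, invented for illustration only.
records = [
    "John Smith, 12 Baker Street",
    "Jon Smith, 12 Baker St",
    "Smith John, Baker Street 12",
    "Mary Jones, 4 Elm Road",
]


def char_similarity(a, b):
    """Character-level similarity in [0, 1]; sensitive to word order."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def token_jaccard(a, b):
    """Token-set overlap in [0, 1]; ignores word order, sensitive to typos."""
    tokens_a = set(a.lower().replace(",", " ").split())
    tokens_b = set(b.lower().replace(",", " ").split())
    return len(tokens_a & tokens_b) / len(tokens_a | tokens_b)


def pairs_needed(n):
    """Number of comparisons for an all-pairs match over n records."""
    return n * (n - 1) // 2


# Every unordered pair of records: this is the quadratic part.
pairs = list(combinations(range(len(records)), 2))
pair_count = len(pairs)

for i, j in pairs:
    print(
        f"{records[i]!r} vs {records[j]!r}: "
        f"char={char_similarity(records[i], records[j]):.2f} "
        f"jaccard={token_jaccard(records[i], records[j]):.2f}"
    )
```

With four records there are only six pairs, but `pairs_needed(1_000_000)` is already close to 5 × 10^11. The reordered record scores a perfect token overlap with the first one while its character-level score is lower, which is one way the lack of a standard similarity notion shows up in practice.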

Use case - Cross and Upselling

Lead Generation

BFSI

● Personal credit ratings
● Fraud detection

Other Use Cases

● Yellow Pages
● Catalog and inventory management

Wishlist

● Works with any kind of data
● Scalable
● No manual configuration of rules or algorithms

Spark Advantages

● Distributed
● Scalable
● In memory
● Machine Learning
● Sampling
● No need to orchestrate multiple jobs

Reifier - Label

Are these duplicates? (Y/N)
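The slides do not describe how Reifier uses these Y/N answers, so the following is only a sketch of one simple way labelled pairs could replace hand-written rules: pick the similarity threshold that agrees with the most labels. The labelled pairs, the `similarity` measure and the helpers `fit_threshold` and `is_duplicate` are all hypothetical, and a real system would likely train a richer model on many features.

```python
from difflib import SequenceMatcher

# Hypothetical answers to "Are these duplicates? (Y/N)".
labelled = [
    ("Acme Corp", "ACME Corporation", "Y"),
    ("Acme Corp", "Apex Corp", "N"),
    ("J. Smith", "John Smith", "Y"),
    ("John Smith", "Jane Smythe", "N"),
]


def similarity(a, b):
    """Character-level similarity in [0, 1] (an assumed stand-in measure)."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def fit_threshold(labelled_pairs):
    """Return the observed score that, used as a cut-off, matches the most labels."""
    scored = [(similarity(a, b), label == "Y") for a, b, label in labelled_pairs]
    best_threshold, best_correct = 1.0, -1
    for threshold in sorted({score for score, _ in scored}):
        # A pair is predicted "duplicate" when its score reaches the threshold.
        correct = sum((score >= threshold) == is_dup for score, is_dup in scored)
        if correct > best_correct:
            best_threshold, best_correct = threshold, correct
    return best_threshold


def is_duplicate(a, b, threshold):
    """Apply the learned cut-off to a new, unlabelled pair."""
    return similarity(a, b) >= threshold


threshold = fit_threshold(labelled)
print(f"learned threshold: {threshold:.2f}")
print(is_duplicate("Acme Corp", "acme corp", threshold))
```

The point of the sketch is the workflow rather than the model: the user only answers Y or N, and the decision rule is derived from those answers instead of being configured by hand.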

Reifier Output

Thank You!