large-scale machine learning for e-commerce

Large-Scale Machine Learning for E-commerce Ankur Datta, PhD RIT-Boston Manager [email protected]

Upload: rakuten-inc

Post on 12-Apr-2017

TRANSCRIPT

Page 1: Large-Scale Machine Learning for E-commerce

Large-Scale Machine Learning for E-commerce

Ankur Datta, PhD

RIT-Boston Manager

[email protected]

Page 2: Large-Scale Machine Learning for E-commerce

Data: Structured or Unstructured

Structured Data Unstructured Data

Page 3: Large-Scale Machine Learning for E-commerce

Product Catalog is Unstructured Data

Two Key Tasks:

Organizing and Searching Product Catalog

Source: http://www.macosxautomation.com/automator/examples/ex04/index.html

Page 5: Large-Scale Machine Learning for E-commerce

Organizing Product Catalog

Product Catalog -> Machine Learning -> Taxonomy

Application: Organize information for browsing / search / data analysis

Page 6: Large-Scale Machine Learning for E-commerce

Organizing Product Catalog using Classification

Input (product titles):

• Lips Too Women's 'Too Sliver' Patent Casual Shoes Size 6
• 10 Crosby Women's Ynez Pump
• 1883 by Wolverine Women's Maisie Oxford Tan/Taupe Leather/Suede
• 1803 Women's 'Nome' Crocodile Dress Shoes Size 9

Machine Learning Model

Output (categories): Women’s Shoes, Comfort Shoes, Pumps, Sneakers, Flats

Page 7: Large-Scale Machine Learning for E-commerce

Decision Tree
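
A decision tree classifies an input by asking a sequence of yes/no questions until it reaches a leaf. The sketch below is a hand-built tree over product-title keywords. The keywords, the category names at the leaves, and the helper names `node` and `classify` are illustrative assumptions, not the tree shown in the talk; a real tree would be learned from labeled data.

```python
# A hand-built decision tree over product-title keywords (illustrative only).
# Internal nodes ask "does the title contain this word?"; leaves hold a category.

def node(word, if_yes, if_no):
    """Internal node: test whether `word` appears in the title."""
    return {"word": word, "yes": if_yes, "no": if_no}

# Hypothetical tree; the words and categories are assumptions for illustration.
TREE = node(
    "pump",
    "Pumps",
    node("sneaker", "Sneakers", node("flat", "Flats", "Comfort Shoes")),
)

def classify(title, tree=TREE):
    """Walk from the root to a leaf, answering one yes/no question per node."""
    words = set(title.lower().split())
    current = tree
    while isinstance(current, dict):
        current = current["yes"] if current["word"] in words else current["no"]
    return current

print(classify("10 Crosby Women's Ynez Pump"))  # Pumps
```

Each title follows exactly one root-to-leaf path, so the cost of a prediction grows with the depth of the tree rather than with the number of categories.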

Page 8: Large-Scale Machine Learning for E-commerce

Machine Learning Model: Many Decision Trees

Combined decision for x: $w_1 f_1(x) + w_2 f_2(x) + \cdots + w_M f_M(x)$

(each $f_m$ is one decision tree, weighted by $w_m$)
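
The weighted combination of many trees can be sketched in a few lines. The slide does not say how the trees are trained or how the weights are set, so the three rule-based functions and the weight values below are hypothetical stand-ins; only the weighted-sum structure comes from the slide.

```python
# Weighted vote over M decision functions f_1..f_M with weights w_1..w_M.
# Each f_m returns +1 or -1 here; real trees would be learned from data.

def f1(x):  # hypothetical rule: title mentions "pump"
    return 1 if "pump" in x else -1

def f2(x):  # hypothetical rule: title mentions "heel"
    return 1 if "heel" in x else -1

def f3(x):  # hypothetical rule: title does not mention "sneaker"
    return 1 if "sneaker" not in x else -1

def combined_decision(x, trees, weights):
    """Return the sign of sum_m w_m * f_m(x), plus the raw score."""
    score = sum(w * f(x) for f, w in zip(trees, weights))
    return (1 if score >= 0 else -1), score

trees = [f1, f2, f3]
weights = [0.5, 0.3, 0.2]  # assumed values for illustration
label, score = combined_decision("ynez pump", trees, weights)
print(label, score)
```

Here a label of +1 means "is a pump" and -1 means "is not"; a multi-class system would instead sum weighted votes per category and pick the largest.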

Page 9: Large-Scale Machine Learning for E-commerce

Our Large-Scale Machine Learning System for Classification

1. Normalize text
2. Extract features
3. Many levels of Decision Trees serve as classification models
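
The three steps above can be sketched as a small pipeline. This is a minimal illustration under stated assumptions: the normalization rules, the unigram and bigram features, and the keyword rules standing in for trained decision-tree models at each taxonomy level are all hypothetical, since the slides do not give those details.

```python
import re
from collections import Counter

def normalize(title):
    """Step 1: lowercase, replace punctuation with spaces, trim whitespace."""
    title = title.lower()
    title = re.sub(r"[^a-z0-9]+", " ", title)
    return title.strip()

def extract_features(text):
    """Step 2: bag of unigrams and bigrams with counts."""
    tokens = text.split()
    bigrams = [" ".join(pair) for pair in zip(tokens, tokens[1:])]
    return Counter(tokens + bigrams)

# Step 3: one classifier per taxonomy level. These keyword rules are
# placeholders standing in for trained decision-tree models.
def top_level(features):
    return "Shoes" if features["shoes"] or features["pump"] else "Clothing"

def shoes_level(features):
    return "Pumps" if features["pump"] else "Women's Shoes"

def clothing_level(features):
    return "Dresses" if features["dress"] else "Women's Clothing"

SECOND_LEVEL = {"Shoes": shoes_level, "Clothing": clothing_level}

def classify(title):
    """Run the pipeline: normalize, featurize, then classify level by level."""
    features = extract_features(normalize(title))
    general = top_level(features)
    specific = SECOND_LEVEL[general](features)
    return general, specific

print(classify("10 Crosby Women's Ynez Pump"))  # ('Shoes', 'Pumps')
```

Routing each title through a general classifier first means the second-level model only has to separate categories within one branch of the taxonomy.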

Page 10: Large-Scale Machine Learning for E-commerce

Classification Results of New Product Titles

Product Title: Cross-Front Peplum Layered Dress

General: Women’s Clothing > Clothing

More Specific: Party & Cocktail Dresses > Dresses > Women’s Clothing > Clothing

Product Title: Cut-Out Leather Platform Wedge Espadrilles

General: Shoes

More Specific: Pumps > Women’s Shoes > Shoes
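
The category strings above read from most specific to most general. One way to produce such a path is to store child-to-parent links and walk up from the predicted category. The `PARENT` table below is a small illustrative slice built from the categories on this slide, and `category_path` is a hypothetical helper, not part of the system described in the talk.

```python
# Child -> parent links for a small slice of a taxonomy (illustrative subset).
PARENT = {
    "Party & Cocktail Dresses": "Dresses",
    "Dresses": "Women's Clothing",
    "Women's Clothing": "Clothing",
    "Pumps": "Women's Shoes",
    "Women's Shoes": "Shoes",
}

def category_path(leaf, parent=PARENT):
    """Return 'leaf > ... > root', most specific category first."""
    path = [leaf]
    while path[-1] in parent:
        path.append(parent[path[-1]])
    return " > ".join(path)

print(category_path("Pumps"))             # Pumps > Women's Shoes > Shoes
print(category_path("Women's Clothing"))  # Women's Clothing > Clothing
```

A classifier that is unsure at a deep level can stop early and report the shorter, more general path.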

Page 11: Large-Scale Machine Learning for E-commerce

Product Catalog is Unstructured Data

Two Key Tasks:

Organizing and Searching Product Catalog

Page 12: Large-Scale Machine Learning for E-commerce

Compact desktop computer

Somewhere in US on Wed, 13 Apr 2016 15:59:47 GMT ….

Page 13: Large-Scale Machine Learning for E-commerce

Compact desktop computer

Lenovo thinkcentre

Page 14: Large-Scale Machine Learning for E-commerce

Lenovo all in one

Page 2

Page 3

Page 15: Large-Scale Machine Learning for E-commerce

Purchase!

Page 16: Large-Scale Machine Learning for E-commerce

Ideal Situation

Current Situation

How do we find the most relevant products for a search query?

Text-based search alone does not do the job!
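
A small sketch shows why text matching alone struggles. The scorer below counts the fraction of query terms found in a title; the function `text_score` and the product titles are made up for illustration. Under this simple measure a television and its accessories all match the query equally well.

```python
def text_score(query, title):
    """Fraction of query terms that appear in the title (text-only relevance)."""
    q_terms = query.lower().split()
    if not q_terms:
        return 0.0
    t_terms = set(title.lower().split())
    return sum(term in t_terms for term in q_terms) / len(q_terms)

query = "40inch tv"
titles = [  # hypothetical catalog entries
    "40inch tv wall mount bracket",
    "40inch LED smart tv",
    "tv stand for 40inch tv",
]
for title in titles:
    print(title, text_score(query, title))  # every title scores 1.0
```

Because all three titles contain both query terms, the text score cannot separate the product the shopper wants from related accessories.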

Page 17: Large-Scale Machine Learning for E-commerce

Learning to Rank

Machine Learning Model that learns to rank search results

[Figure: query, document, relevance. Source: http://blog.csdn.net/eastmount/article/details/43080791]

Relevance based on text alone is not enough!

What else can we use?

How about user-behavior signals?
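
The idea of learning a ranking from both text relevance and user behavior can be sketched with a pairwise approach. The talk does not name its algorithm, so the perceptron-style update below is a simple stand-in, and the feature layout `[text_match, click_rate, buy_rate]`, the toy training pairs, and the candidate names are all assumptions.

```python
# Pairwise learning-to-rank sketch with a linear scoring function.
# Each candidate is a feature vector: [text_match, click_rate, buy_rate].
# Feature names and the toy data are assumptions for illustration.

def score(weights, features):
    return sum(w * f for w, f in zip(weights, features))

def train_pairwise(pairs, n_features, epochs=20, lr=0.1):
    """pairs: list of (better, worse) feature vectors for the same query.
    Perceptron-style update: whenever `worse` scores at least as high as
    `better`, move the weights toward (better - worse)."""
    weights = [0.0] * n_features
    for _ in range(epochs):
        for better, worse in pairs:
            if score(weights, better) <= score(weights, worse):
                for i in range(n_features):
                    weights[i] += lr * (better[i] - worse[i])
    return weights

def rank(weights, candidates):
    """Sort (name, features) candidates by descending model score."""
    return sorted(candidates, key=lambda c: score(weights, c[1]), reverse=True)

# Toy preferences: behavior signals can outweigh a stronger text match,
# but with equal behavior the better text match should win.
pairs = [
    ([0.6, 0.30, 0.10], [0.9, 0.02, 0.00]),
    ([0.5, 0.25, 0.08], [0.8, 0.01, 0.00]),
    ([0.9, 0.10, 0.05], [0.2, 0.10, 0.05]),
]
weights = train_pairwise(pairs, n_features=3)

candidates = [
    ("tv wall mount", [0.9, 0.02, 0.00]),
    ("40 inch led tv", [0.6, 0.30, 0.10]),
]
print([name for name, _ in rank(weights, candidates)])
```

Training on pairs needs only relative preferences ("this result should outrank that one"), which user behavior provides more naturally than absolute relevance labels.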

Page 18: Large-Scale Machine Learning for E-commerce

User’s behavior signals

buy, click, add
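
Signals like these have to be turned into numbers before a ranking model can use them. The sketch below aggregates a hypothetical event log into per-query, per-product rates. The log layout, the product identifiers, and the extra "impression" event used as the denominator are assumptions, not details from the talk.

```python
from collections import defaultdict

# Hypothetical event log: (query, product_id, event) with event in
# {"impression", "click", "add", "buy"}. Field names are assumptions.
events = [
    ("40inch tv", "tv-123", "impression"),
    ("40inch tv", "tv-123", "click"),
    ("40inch tv", "tv-123", "add"),
    ("40inch tv", "tv-123", "buy"),
    ("40inch tv", "mount-9", "impression"),
    ("40inch tv", "mount-9", "impression"),
    ("40inch tv", "mount-9", "click"),
]

def behavior_features(events):
    """Count events per (query, product) and convert them to rates."""
    counts = defaultdict(lambda: defaultdict(int))
    for query, product, event in events:
        counts[(query, product)][event] += 1
    features = {}
    for key, c in counts.items():
        shown = max(c["impression"], 1)  # avoid division by zero
        features[key] = {
            "click_rate": c["click"] / shown,
            "add_rate": c["add"] / shown,
            "buy_rate": c["buy"] / shown,
        }
    return features

feats = behavior_features(events)
print(feats[("40inch tv", "tv-123")])
print(feats[("40inch tv", "mount-9")])
```

Rates rather than raw counts keep frequently shown products from dominating simply because they were displayed more often.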

Page 19: Large-Scale Machine Learning for E-commerce

Results of Learning to Rank

Search Query: “40inch tv”

Regular Text Search vs. Search with User-Signals and Learning to Rank

Not relevant

Not relevant

Page 20: Large-Scale Machine Learning for E-commerce

Summary

• E-commerce data is primarily unstructured data
– Product catalog, merchant and item reviews, search queries
• Proper organization and precise search of this data are necessary for a good customer experience
– We built machine learning models for large-scale classification of product catalogs
– Also, we are learning from user behavior to improve our search relevance
