TRANSCRIPT
Making fashion recommendations with human-in-the-loop machine learning
Brad Klingenberg, Stitch Fix | [email protected]
Machine learning meets fashion | KDD 2016 | San Francisco | August 2016
Three lessons
Personal styling recommendations with humans in the loop
Fashion with humans in the loop:
It works really well, but it’s complicated
Lesson 1: You have more than one feedback loop
Lesson 2: Human selection changes your objective function
Lesson 3: Even humans need feature selection
Humans in the loop for fashion at Stitch Fix
Stitch Fix
Styling at Stitch Fix: personal styling from inventory
Styling at Stitch Fix: personalized recommendations (machine learning turns inventory into algorithmic recommendations)
Styling at Stitch Fix: expert human curation (a stylist curates the algorithmic recommendations)
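The two-stage process can be sketched as a toy pipeline. All names, data, and scoring logic below are illustrative stand-ins, not Stitch Fix's actual system: an algorithm ranks inventory for a client, and a stylist curates the top of the ranked list.

```python
# Toy two-stage pipeline: machine ranking followed by human curation.
# All names and data are illustrative.

def algorithmic_recommendations(inventory, client, score, k=10):
    """Stage 1: rank the inventory for this client by a learned score."""
    return sorted(inventory, key=lambda item: score(item, client), reverse=True)[:k]

def stylist_curation(recommendations, judge, n=5):
    """Stage 2: a stylist reviews the ranked list and keeps a few items."""
    return [item for item in recommendations if judge(item)][:n]

inventory = [{"id": i, "formality": i / 10} for i in range(10)]
client = {"prefers_formality": 0.7}

# Stand-in score: closeness of the item's formality to the client's preference.
score = lambda item, c: -abs(item["formality"] - c["prefers_formality"])
recs = algorithmic_recommendations(inventory, client, score, k=5)

# Stand-in stylist judgement: drop overly casual items.
fix = stylist_curation(recs, judge=lambda item: item["formality"] >= 0.5, n=3)
```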
Lesson 0: Humans and machines are a winning combination
Combining art & science
Humans and machines are a winning combination.
Human judgement
● helps leverage unstructured data
● provides a human element to the process (empathy, creativity)
● frees the algorithm developer from edge cases
Lesson 1: You have more than one feedback loop
Learning through feedback
Traditional recommenders learn through a single feedback loop: clients react to recommendations, and the model learns from those reactions.
With humans in the loop there is a second loop: the system also learns from human interaction, that is, which recommendations the stylists select.
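As a sketch of the two loops (the event schema and function names are hypothetical, not an actual API), a human-in-the-loop system records at least two kinds of feedback: the stylist's selection among the algorithm's recommendations, and the client's response to what was shipped.

```python
# Hypothetical event log illustrating the two feedback loops in a
# human-in-the-loop recommender. Schema and names are illustrative.

def log_stylist_selection(log, client_id, recommended, selected):
    """Loop 1: the stylist's choice among algorithmic recommendations."""
    for item in recommended:
        log.append({"client": client_id, "item": item,
                    "event": "recommended", "selected": item in selected})

def log_client_feedback(log, client_id, item, kept):
    """Loop 2: the client's response to what was actually shipped."""
    log.append({"client": client_id, "item": item,
                "event": "shipped", "kept": kept})

log = []
log_stylist_selection(log, "c1",
                      recommended=["dress_a", "top_b", "jeans_c"],
                      selected=["dress_a"])
log_client_feedback(log, "c1", "dress_a", kept=True)
```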
Lesson 2: Human selection changes your objective function
Training a model
What should you predict? Historical shipment data is plentiful.
Naive approach: ignore selection and train on historical shipment data.
Advantages
● “traditional” supervised problem
● simple historical data
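A minimal sketch of the naive approach, with invented data: score each item by its empirical success rate in historical shipments, ignoring how items were selected for shipment in the first place.

```python
from collections import defaultdict

# Naive approach: train only on what was shipped, ignoring selection.
# Each record is (item, success); the "model" is the per-item success rate.
# Data is invented for illustration.
shipments = [("dress_a", 1), ("dress_a", 1), ("dress_a", 0),
             ("top_b", 1), ("top_b", 0), ("top_b", 0)]

counts = defaultdict(lambda: [0, 0])  # item -> [successes, trials]
for item, success in shipments:
    counts[item][0] += success
    counts[item][1] += 1

# Note these are success rates *given the stylist's selection*, which is
# exactly the subtlety the next slides unpack.
success_rate = {item: s / n for item, (s, n) in counts.items()}
```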
Problem 1: selection can censor your data
Censoring through selection

                      Arms flaunted
                      Yes        No
    Success   Yes      p          ?
              No      1-p         ?

Outcomes are only observed where the stylist actually made a selection; for the unselected combinations the success probability is censored (the “?” cells).
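The censoring can be illustrated with a toy simulation (all numbers invented): if stylists only send a sleeveless item to clients they judge a good match, we can estimate the success probability p for those clients, but the unselected clients contribute no outcome data at all.

```python
import random

random.seed(0)

# Toy population: each client either suits sleeveless items or not.
clients = [{"suits_sleeveless": random.random() < 0.5} for _ in range(1000)]

observed = []
for c in clients:
    # Stylist policy: only select the sleeveless item for matching clients.
    if c["suits_sleeveless"]:
        success = random.random() < 0.8  # invented success rate p
        observed.append(success)
    # Non-matching clients are never sent the item: their outcome is censored.

p_observed = sum(observed) / len(observed)
coverage = len(observed) / len(clients)
# p is estimable for the selected clients; the "?" cells remain unknown.
```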
Success given selection
Problem 2: success probabilities can make for terrible recommendations
The model does not estimate the “probability the stylist should choose this item”; it estimates the probability the stylist will be able to send this to the right client.
Success given selection

Example: a low-coverage item
[diagram: a population of clients; only a few have Y_i = 1 (would like the item), the rest have Y_i = 0]
An edgy dress. Not many people will like it, but it is easy for a stylist to identify those clients, so the item gets a high score.

Example: a high-coverage item
[diagram: a population of clients; many have Y_i = 1, the rest have Y_i = 0]
A more neutral item. Many clients will like it, but it is hard for stylists to identify those clients, so the item gets only an average score.
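Under assumed numbers, the two examples can be simulated. In this toy model (coverage levels and targeting accuracies are invented), the low-coverage item scores higher given selection than the high-coverage item, even though far more clients would like the latter.

```python
import random

random.seed(1)
N = 100_000

def success_given_selection(coverage, targeting_accuracy):
    """Toy model: a fraction `coverage` of clients would like the item
    (Y_i = 1). With probability `targeting_accuracy` the stylist finds a
    client who likes it; otherwise the item goes to a random client.
    Returns the observed success rate given selection."""
    hits = 0
    for _ in range(N):
        if random.random() < targeting_accuracy:
            likes = True                        # stylist found a Y_i = 1 client
        else:
            likes = random.random() < coverage  # fell back to a random client
        hits += likes
    return hits / N

edgy_dress = success_given_selection(coverage=0.05, targeting_accuracy=0.9)
neutral_top = success_given_selection(coverage=0.60, targeting_accuracy=0.1)
# The edgy dress scores higher given selection despite 12x lower coverage.
```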
A useful recommendation?
The low-coverage item gets the high score and the high-coverage item the low score, even though the high-coverage item would succeed with many more clients.
The lesson
In both of these cases, effective recommendations require understanding human selection.
Generally, selection data will be much larger and more complicated to collect and work with:
● Negative cases: logging the set of things that was available to be selected but was not selected
● Presentation effects, similar to the way search engine results are studied
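Logging negative cases means recording the entire candidate set shown to the stylist, not just the picks, along with display position for modelling presentation effects. A minimal sketch with a hypothetical schema:

```python
# Hypothetical selection-logging schema: record every candidate that was
# available, flagging which were chosen, so "not selected" becomes data too.

def log_selection_event(shown, selected):
    """Record the full candidate set: each row keeps the item, whether it
    was selected, and its display position (for presentation effects, as
    in search-result logging)."""
    return [{"item": item, "selected": item in selected, "position": pos}
            for pos, item in enumerate(shown)]

event = log_selection_event(shown=["dress_a", "top_b", "jeans_c"],
                            selected={"dress_a"})
negatives = [e for e in event if not e["selected"]]
```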
Lesson 3: Even humans need feature selection
Recall the styling process: algorithmic recommendations followed by human curation.
Creating features for the human classifier
Unstructured data can be overwhelming, even for humans.
Feature engineering: creating useful summaries (derived features) for human consumption.
It is important to focus on
● Interpretability
● Evidence
● Orthogonality
For fashion data especially, this is hard!
Creating features for the human classifier: A/B testing
Ultimately, this is an empirical question. Run experiments with
● Production systems
● Simulations for human classifiers
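One way to read "simulations for human classifiers" is to model the stylist as a noisy decision-maker and compare candidate feature sets offline. The sketch below is entirely invented (feature-quality parameter, noise model, and data are assumptions): better summaries let the simulated stylist find the truly best item more often.

```python
import random

random.seed(2)

def simulated_stylist(items, noise=0.2):
    """Toy simulated stylist: picks the item with the best summary score,
    with Gaussian noise standing in for human judgement."""
    return max(items, key=lambda it: it["summary_score"] + random.gauss(0, noise))

def experiment(feature_quality, trials=2000):
    """How often does the simulated stylist find the truly best item,
    given summaries of a certain quality? All numbers are invented."""
    wins = 0
    for _ in range(trials):
        items = [{"id": i, "true_value": random.random()} for i in range(5)]
        best = max(items, key=lambda it: it["true_value"])
        for it in items:
            # Better features -> summaries track the true value more closely.
            it["summary_score"] = (feature_quality * it["true_value"]
                                   + (1 - feature_quality) * random.random())
        wins += simulated_stylist(items)["id"] == best["id"]
    return wins / trials

good_features = experiment(feature_quality=0.9)
weak_features = experiment(feature_quality=0.3)
```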
Personal styling recommendations with humans in the loop
Fashion with humans in the loop:
It works really well, but it’s complicated
Lesson 1: You have more than one feedback loop
Lesson 2: Human selection changes your objective function
Lesson 3: Even humans need feature selection
Thanks!
Questions?