Predicting Candidate Performance From Text (NLP + DBN) Ben Taylor, Chief Data Scientist

Post on 12-Apr-2017

INTRODUCTIONS

Ben Taylor @bentaylordata

Background Personal

•  Sequoia Capital

•  Largest Video Interviewing Platform

•  Forbes #10 most promising companies

•  Global: 189 countries

NATURAL LANGUAGE PROCESSING (NLP)

GRIT  MOTIVATION  ENGAGEMENT  PERFORMANCE
1     55          80          95%
0     75          10          22%
0     50          20          57%
1     20          90          91%
0     40          60          11%

Basic Tutorial On How To Build A Numeric Feature Model
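
The slides do not say which model is fit on these numeric columns, so the following is only a minimal sketch using a 1-nearest-neighbour lookup as a stand-in. The rows are copied from the table above; the names `TRAINING`, `squared_distance`, and `predict` are illustrative, and the features are left unscaled for brevity.

```python
# Rows copied from the slide: (grit, motivation, engagement) -> performance.
TRAINING = [
    ((1, 55, 80), 0.95),
    ((0, 75, 10), 0.22),
    ((0, 50, 20), 0.57),
    ((1, 20, 90), 0.91),
    ((0, 40, 60), 0.11),
]


def squared_distance(a, b):
    """Squared Euclidean distance between two feature tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def predict(features):
    """Return the performance of the closest training row (1-nearest neighbour)."""
    _, performance = min(
        TRAINING, key=lambda row: squared_distance(row[0], features)
    )
    return performance


# A hypothetical new candidate who looks most like the first row.
print(predict((1, 50, 85)))
```

Any regression or classification model could take the place of the lookup; the point is only that numeric columns can be fed to a model directly.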

BUILDING A MODEL

ESSAY                       GRIT  MOTIVATION  ENGAGEMENT  PERFORMANCE
I want to work here         1     55          80          95%
I have great teamwork       0     75          10          22%
Synergy                     0     50          20          57%
I have so much grit         1     20          90          91%
They fired that individual  0     40          60          11%

Now what?!?

BUILDING A MODEL

ESSAY                       PERFORMANCE
I want to work here         95%
I have great teamwork       22%
Synergy                     57%
I have so much grit         91%
They fired that individual  11%

Now what?!?

BUILDING A MODEL

Map: Bad = 0, Good = 1, Better = 2, Best = 3

Tokenize: Female = 1, Male = 1

Female  Male
1       0
0       1
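
The two encodings above can be sketched as follows. This is an illustration under assumptions, not the speaker's code: `ORDINAL`, `encode_ordinal`, and `one_hot` are made-up names, and the one-hot columns are simply sorted alphabetically.

```python
# Ordered categories map to a single integer column (values from the slide).
ORDINAL = {"Bad": 0, "Good": 1, "Better": 2, "Best": 3}


def encode_ordinal(labels):
    """Replace each ordered label with its integer rank."""
    return [ORDINAL[label] for label in labels]


def one_hot(values):
    """Give every distinct category its own 0/1 column."""
    categories = sorted(set(values))
    rows = [[1 if value == c else 0 for c in categories] for value in values]
    return categories, rows


print(encode_ordinal(["Good", "Best", "Bad"]))
categories, rows = one_hot(["Female", "Male"])
print(categories)
print(rows)
```

Mapping suits categories with a natural order; one column per category avoids implying an order where none exists.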

I  want  to  work  here  have  great  PERF.
1  1     1   1     1     0     0      95%
1  0     0   0     0     1     1      22%
0  0     0   0     0     0     0      57%
1  0     0   0     0     1     0      91%
0  0     0   0     0     0     0      11%

Tokenize the text into unique word columns
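
A minimal sketch of that tokenization, assuming whitespace splitting and case-sensitive words. The function names are illustrative. The slide shows only the first seven word columns; this sketch builds a column for every unique word, so the rows that look all-zero on the slide get non-zero entries further right.

```python
ESSAYS = [
    "I want to work here",
    "I have great teamwork",
    "Synergy",
    "I have so much grit",
    "They fired that individual",
]


def build_vocabulary(texts):
    """Collect unique words in order of first appearance."""
    vocabulary = []
    for text in texts:
        for word in text.split():
            if word not in vocabulary:
                vocabulary.append(word)
    return vocabulary


def bag_of_words(text, vocabulary):
    """Return one 0/1 entry per vocabulary word: 1 if the word occurs in the text."""
    words = set(text.split())
    return [1 if word in words else 0 for word in vocabulary]


vocabulary = build_vocabulary(ESSAYS)
matrix = [bag_of_words(essay, vocabulary) for essay in ESSAYS]

print(vocabulary[:7])
for row in matrix:
    print(row[:7])
```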

BUILDING A MODEL

Bag of words modeling, sequence and ordering is lost

BUILDING A MODEL

I  want  Want to  to go  work here  PERF.
1  1     1        1      1          95%
1  0     0        0      0          22%
0  0     0        0      0          57%
1  0     0        0      0          91%
0  0     0        0      0          11%

Band-Aid: Concept of n-grams
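
A short sketch of why n-grams help. The two example sentences are hypothetical and not from the slides: they contain the same words, so a bag of words cannot tell them apart, but their bigrams differ.

```python
def ngrams(text, n):
    """Return the list of n-word windows in the text, in order."""
    words = text.split()
    return [" ".join(words[i:i + n]) for i in range(len(words) - n + 1)]


# Hypothetical sentences with identical words in a different order.
a = "the dog bit the man"
b = "the man bit the dog"

same_bag = sorted(a.split()) == sorted(b.split())
same_bigrams = sorted(ngrams(a, 2)) == sorted(ngrams(b, 2))

print(ngrams("I want to work here", 2))
print(same_bag, same_bigrams)
```

Each n-gram becomes its own column, so only local ordering is recovered, and the column count grows quickly with n.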

BUILDING A MODEL

RESUME MODELING

Upload Your Resume

Now painstakingly fill out this form containing all of the exact same information

Document modeling review

UNSTRUCTURED

Name: Sally Taylor
GPA: 3.95
Previous Job: Data Scientist
School: Yale
Hobbies: Sky diving

STRUCTURED

F  3.95  Data Scientist  Yale     Sky diving
M  2.93  HR Analyst      SLCC     Poetry
F  3.41  Data Munger     Harvard  Cycling

MUNGED

1  3.95  5  310  56
0  2.93  7  520  91
1  3.41  6  240  56
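
The structured-to-munged step can be sketched as below. The slides do not show how the integer codes (5, 310, 56 and so on) were assigned, so this sketch simply numbers each category in order of first appearance and its output will not match the slide's numbers. The field names and helper names are assumptions.

```python
# Structured rows from the slide; the dictionary keys are illustrative.
RESUMES = [
    {"gender": "F", "gpa": 3.95, "job": "Data Scientist", "school": "Yale", "hobby": "Sky diving"},
    {"gender": "M", "gpa": 2.93, "job": "HR Analyst", "school": "SLCC", "hobby": "Poetry"},
    {"gender": "F", "gpa": 3.41, "job": "Data Munger", "school": "Harvard", "hobby": "Cycling"},
]

CATEGORICAL_FIELDS = ("job", "school", "hobby")


def make_encoder():
    """Return a function that maps each new category to the next integer id."""
    ids = {}

    def encode(value):
        if value not in ids:
            ids[value] = len(ids)
        return ids[value]

    return encode


def munge(resumes):
    """Turn structured resume rows into purely numeric rows."""
    encoders = {field: make_encoder() for field in CATEGORICAL_FIELDS}
    rows = []
    for resume in resumes:
        row = [1 if resume["gender"] == "F" else 0, resume["gpa"]]
        row += [encoders[field](resume[field]) for field in CATEGORICAL_FIELDS]
        rows.append(row)
    return rows


munged = munge(RESUMES)
for row in munged:
    print(row)
```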

Resume Extension

Resume format consolidation

GPA Inclusion (18%)

GPA Replacement

Predict Every Point

Folds = 9

Fold = 1, Fold = 2, … Y_pred
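
One reading of this slide is out-of-fold prediction: with 9 folds, each point is predicted by a model trained on the other 8, so every point ends up with an entry in Y_pred. The sketch below assumes that reading. The round-robin fold assignment and the mean-predicting toy model are placeholders, not the speaker's setup.

```python
def out_of_fold_predictions(xs, ys, n_folds, fit, predict):
    """Predict every point using a model that never saw it during training."""
    y_pred = [None] * len(xs)
    for fold in range(n_folds):
        # Assumption: points are assigned to folds round-robin by index.
        test_idx = [i for i in range(len(xs)) if i % n_folds == fold]
        train_idx = [i for i in range(len(xs)) if i % n_folds != fold]
        model = fit([xs[i] for i in train_idx], [ys[i] for i in train_idx])
        for i in test_idx:
            y_pred[i] = predict(model, xs[i])
    return y_pred


def fit_mean(xs, ys):
    """Toy model: remember the mean of the training targets."""
    return sum(ys) / len(ys)


def predict_mean(model, x):
    """Toy model: always predict the stored mean."""
    return model


xs = list(range(18))
ys = [float(x) for x in xs]
y_pred = out_of_fold_predictions(xs, ys, 9, fit_mean, predict_mean)
print(y_pred)
```

Comparing `y_pred` with `ys` then gives an error estimate that covers the whole dataset without any point being scored by a model trained on it.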

Mimicking the human recruiter

Feature Hunt

ONE FEATURE AT A TIME

INCREMENTAL GAINS

DEEP LEARNING

ENGINEERS AND MANUAL FEATURES ARE EXPENSIVE, USING DEEP LEARNING TO AUTOMATE

AUTOMATIC FEATURE GENERATION

Unstructured

ESSAY
I want to work here
I have great teamwork
Synergy
I have so much grit
They fired that individual

Structured

I  want  Want to  to go  work here  PERF.
1  1     1        1      1          95%
1  0     0        0      0          22%
0  0     0        0      0          57%
1  0     0        0      0          91%
0  0     0        0      0          11%

ESSAY

3 2 1 4 5

3 7 67 345

54

3 7 99 10234

78 203 501 14

1 2 3 4 5
0 0 0 1 0
1 0 0 0 0
0 1 0 0 0
0 0 1 0 0

LSTM

RAW TEXT

WORD SEQUENCE

ENCODING
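
The raw text, word sequence, and encoding stages can be sketched as follows. The integer ids on the slide come from a vocabulary that is not shown, so this sketch assigns ids in order of first appearance and its numbers differ from the slide's. The function names, the padding id 0, and the padding step are assumptions. The LSTM itself is left out.

```python
ESSAYS = [
    "I want to work here",
    "I have great teamwork",
    "Synergy",
    "I have so much grit",
    "They fired that individual",
]


def build_index(texts):
    """Map each unique word to an integer id, starting at 1 (0 is padding)."""
    index = {}
    for text in texts:
        for word in text.split():
            if word not in index:
                index[word] = len(index) + 1
    return index


def to_sequence(text, index):
    """Replace each word with its id, keeping word order."""
    return [index[word] for word in text.split()]


def pad(sequence, length):
    """Right-pad with zeros up to length; longer sequences are returned as-is."""
    return sequence + [0] * (length - len(sequence))


def one_hot(sequence, size):
    """One row per token, with a single 1 in the column of its id (ids 1..size)."""
    return [[1 if token == col else 0 for col in range(1, size + 1)]
            for token in sequence]


index = build_index(ESSAYS)
sequences = [to_sequence(essay, index) for essay in ESSAYS]
longest = max(len(s) for s in sequences)
padded = [pad(s, longest) for s in sequences]

for row in padded:
    print(row)
print(one_hot(sequences[0], 5))
```

Unlike the bag-of-words columns, these sequences keep word order, which is what a sequence model consumes.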

BEGIN SCRATCHING AT LAYOUT

AUTOMATIC FEATURE GENERATION (LAYOUT)

CNN: bit.ly/pacon

INTERVIEW MODELING

Would you ever hire from just a resume?

INTERVIEW MODELING

SOFT/TECHNICAL COMPETENCIES

Resume can overstate and understate

Audio, Video, Text

QUESTIONS