#siop15 presentation on performance sorting using video interviews

Model Driven Candidate Sorting Based On Video Interview Cues, Benjamin Taylor, Chief Data Scientist

Upload: benjamin-taylor

Post on 08-Aug-2015


TRANSCRIPT

Model Driven Candidate Sorting Based On Video Interview Cues

Benjamin Taylor
Chief Data Scientist

Outline
• Introduction
• Case study objective
• Big data landscape
• Problem setup
• Results/Conclusion
• Future work

@bentaylordata  

Introduction
• Chemical Engineering (BS/MS/PhD Candidate)
• 5 years Intel/Micron
– Photolithography, process control, yield modeling
• AIQ Hedge fund
– 600 GPU chip cluster, algorithmic stock modeling
– distributed metaheuristic algorithms
• HireVue, Chief Data Scientist
– HR analytics, interview modeling

Case Study Objective
• Given 400 recorded video interviews for sales positions and post-hire performance data, can improved sorting efficiency be demonstrated out-of-sample?

Input Data Set: V=400
Target Data Set, n=400

Personal Email     Perf
[email protected]   Exceeds
[email protected]   Meets
[email protected]   Below
[email protected]   Meets

[Image: big data, Hadoop]

Big data landscape
• Big data platforms have motivated innovations around unstructured data handling. These innovations have involved new algorithms and better unstructured wrangling methods.

Big data landscape
• Unstructured data
– Data that does not have a predefined data model or schema, e.g. tool logs, resumes, cover letters, images, audio, video, Twitter, LinkedIn
• Structured data
– Data that fits within a predefined data model. The most common structured data formats involve a column/row architecture. The most familiar examples include spreadsheet software such as Excel.

Problem setup
• Unstructured data challenge
– How do we convert the video into a manageable, machine-ready format? AKA unstructured > structured data.

0.23,0.15,0.98,0.63,0.45,0.36…
1D vector representation

Method?

Problem Setup
• What is done for text modeling?

UNSTRUCTURED
Name: Sally Taylor
GPA: 3.95
Previous Job: Data Scientist
School: Yale
Hobbies: Sky diving

STRUCTURED
F   3.95   Data Scientist   Yale      Sky diving
M   2.93   HR Analyst       SLCC      Poetry
F   3.41   Data Munger      Harvard   Cycling

TOKENIZED
1   3.95   5   310   56
0   2.93   7   520   91
1   3.41   6   240   56
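The structured-to-tokenized step for text can be sketched as follows. This is a minimal illustration, assuming each categorical column gets its own integer vocabulary; the field names (`gender`, `job`, and so on) and the helpers `build_vocab` and `tokenize` are hypothetical, and the resulting IDs are arbitrary, so they will not match the numbers printed on the slide.

```python
# Hypothetical records mirroring the slide's STRUCTURED table.
records = [
    {"gender": "F", "gpa": 3.95, "job": "Data Scientist", "school": "Yale", "hobby": "Sky diving"},
    {"gender": "M", "gpa": 2.93, "job": "HR Analyst", "school": "SLCC", "hobby": "Poetry"},
    {"gender": "F", "gpa": 3.41, "job": "Data Munger", "school": "Harvard", "hobby": "Cycling"},
]

# Columns treated as categorical (an assumption; gpa is already numeric).
CATEGORICAL = ["gender", "job", "school", "hobby"]


def build_vocab(rows, column):
    """Assign each distinct value in a column an integer ID in first-seen order."""
    vocab = {}
    for row in rows:
        vocab.setdefault(row[column], len(vocab))
    return vocab


def tokenize(rows):
    """Replace categorical values with integer IDs; numeric values pass through."""
    vocabs = {col: build_vocab(rows, col) for col in CATEGORICAL}
    out = []
    for row in rows:
        out.append([vocabs[c][row[c]] if c in vocabs else row[c] for c in row])
    return out, vocabs


tokens, vocabs = tokenize(records)
for row in tokens:
    print(row)
```

Each output row is purely numeric, which is the form a model can consume. Keeping `vocabs` around matters in practice, since new records must be encoded with the same mapping.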

Problem Setup
• Piecemeal the structuring: final outputs are scalars

[Diagram labels: Audio, Video, Text; Signal Processing, Personality, Expression, Signal Processing]

us = unstructured data
ts = time series data
s = scalar data
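One way to read "final outputs are scalars" is that each time series (ts) is collapsed into a few summary numbers (s). The sketch below assumes simple descriptive statistics as the reduction; the slides do not say which reductions were actually used, and `summarize` and the sample signal are hypothetical.

```python
import statistics


def summarize(series):
    """Collapse one time series into a fixed set of scalar features."""
    return {
        "mean": statistics.fmean(series),
        "std": statistics.pstdev(series),  # population standard deviation
        "min": min(series),
        "max": max(series),
        "range": max(series) - min(series),
    }


# Hypothetical per-frame signal, e.g. audio energy sampled over one answer.
audio_energy = [0.2, 0.4, 0.6, 0.4, 0.4]
features = summarize(audio_energy)
print(features)
```

Because every interview yields the same set of keys regardless of its length, recordings of different durations end up with feature vectors of equal size.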

Feature Gen
Raw Audio Indicators
• Engagement
• Motivation
• Distress
• Aggression

Model
Personality Models

Feature Gen
Video Indicators

Signal Processing
F989   F990   F991
scalar

Combining All Features

Feature Mapping: As the features are produced, they are stored in a matrix X where each column represents a feature and each row represents an interview.

[Figure: example feature matrix X with numeric entries (e.g. 56.341, -200.45, 0, 1) and NA entries]
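The feature mapping described here can be sketched as below, assuming features arrive as per-interview dictionaries. The feature names, interview IDs, and values are all hypothetical (the slide does not label its columns), and `None` stands in for the NA entries.

```python
# Hypothetical feature names; the slide does not label its columns.
FEATURES = ["audio_mean", "audio_std", "video_f989", "engagement"]

# Hypothetical per-interview feature values.
interviews = {
    "interview_001": {"audio_mean": 0.41, "audio_std": 0.12, "video_f989": 1.0, "engagement": 0.8},
    "interview_002": {"audio_mean": 0.35, "audio_std": 0.20},  # some features never produced
}


def build_matrix(per_interview, feature_names):
    """Return (row_ids, X) with one row per interview and one column per feature.

    Features that were never produced for an interview are stored as None,
    playing the role of NA.
    """
    row_ids = sorted(per_interview)
    X = [[per_interview[rid].get(name) for name in feature_names] for rid in row_ids]
    return row_ids, X


row_ids, X = build_matrix(interviews, FEATURES)
for rid, row in zip(row_ids, X):
    print(rid, row)
```

Fixing the column order in one list keeps every row aligned, so column j always means the same feature across interviews.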

How To Build A Model

Model
Best Fitness?

A Lesson On K-folding

Folds = 9
Cut your data up into fixed folds
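Cutting 400 interviews into 9 fixed folds might look like the sketch below. It assumes a single seeded shuffle followed by round-robin assignment; the slides do not specify how the folds were drawn, and `make_folds` and `train_test_splits` are hypothetical helper names.

```python
import random


def make_folds(n_samples, n_folds=9, seed=0):
    """Shuffle sample indices once, then deal them into n_folds fixed folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    return [indices[i::n_folds] for i in range(n_folds)]


def train_test_splits(folds):
    """Yield (train_indices, test_indices), holding out each fold exactly once."""
    for k, test in enumerate(folds):
        train = [i for j, fold in enumerate(folds) if j != k for i in fold]
        yield train, test


folds = make_folds(400, n_folds=9)
splits = list(train_test_splits(folds))
print([len(fold) for fold in folds])
```

Fixing the seed keeps the folds identical across model runs, so different models are compared on the same held-out interviews.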

A Lesson On K-folding

Folds = 9   Fold = 1   Fold = 2…   Y_pred

Fitness Metric?
Top Performer Accuracy   AUC
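AUC as a fitness metric can be computed directly from the pooled out-of-fold predictions (Y_pred). The sketch below uses the pairwise definition of AUC rather than any particular library; the labels and scores are hypothetical, with 1 standing for a top performer.

```python
def auc(y_true, y_score):
    """Area under the ROC curve.

    Equals the probability that a randomly chosen positive outscores a
    randomly chosen negative, counting ties as one half.
    """
    pos = [s for y, s in zip(y_true, y_score) if y == 1]
    neg = [s for y, s in zip(y_true, y_score) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical labels (1 = top performer) and pooled out-of-fold scores.
y_true = [1, 0, 1, 0, 0, 1]
y_pred = [0.9, 0.3, 0.6, 0.7, 0.2, 0.8]
score = auc(y_true, y_pred)
print(round(score, 3))
```

An AUC of 0.5 corresponds to random ordering and 1.0 to a perfect sort, which makes it a natural measure of sorting quality. The pairwise loop is quadratic, which is acceptable at a few hundred interviews.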

Results:

Conclusion: Using structured features from audio and video, we are able to show predictive sorting value in our out-of-sample interviews.

Model          AUC score
Bernoulli NB   0.75
Other          0.79

67.50% reduction in interview evaluation
>300% increase in concentration
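For readers unfamiliar with the Bernoulli NB model named in the results, here is a minimal from-scratch sketch. It assumes features have already been binarized to 0/1 and uses Laplace smoothing (`alpha`); the class, the toy data, and the binarization assumption are illustrative and not taken from the talk.

```python
import math


class BernoulliNB:
    """Bernoulli Naive Bayes for binary (0/1) features with Laplace smoothing."""

    def fit(self, X, y, alpha=1.0):
        self.classes = sorted(set(y))
        n_features = len(X[0])
        self.log_prior = {}
        self.p = {}  # p[c][j] = smoothed P(feature j == 1 | class c)
        for c in self.classes:
            rows = [x for x, label in zip(X, y) if label == c]
            self.log_prior[c] = math.log(len(rows) / len(X))
            self.p[c] = [
                (sum(r[j] for r in rows) + alpha) / (len(rows) + 2 * alpha)
                for j in range(n_features)
            ]
        return self

    def log_joint(self, x, c):
        """log P(c) + sum over features of log P(x_j | c)."""
        total = self.log_prior[c]
        for xj, pj in zip(x, self.p[c]):
            total += math.log(pj if xj == 1 else 1.0 - pj)
        return total

    def predict_proba(self, x, positive=1):
        """Posterior probability of the positive class for one sample."""
        logs = {c: self.log_joint(x, c) for c in self.classes}
        m = max(logs.values())  # subtract the max for numerical stability
        norm = sum(math.exp(v - m) for v in logs.values())
        return math.exp(logs[positive] - m) / norm


# Hypothetical binarized features and labels (1 = top performer).
X = [[1, 1], [1, 0], [0, 1], [0, 0], [1, 1], [0, 0]]
y = [1, 1, 0, 0, 1, 0]
model = BernoulliNB().fit(X, y)
p_top = model.predict_proba([1, 1])
p_low = model.predict_proba([0, 0])
print(round(p_top, 3), round(p_low, 3))
```

Sorting candidates by this posterior probability produces the ranked list that a metric such as AUC then evaluates.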

Future Work:

Feature Engineering
Auto Feature Engineering

Future work involves offloading the feature engineering tasks to a more automated process such as deep learning or more advanced ensemble modeling methods.

My Contact Info:
Twitter: @bentaylordata
Email: [email protected]
LinkedIn: bentaylordata