how to model text like a rockstar

Upload: benjamin-taylor

Post on 17-Jul-2015

Category:

Data & Analytics


TRANSCRIPT

Page 1: How To Model Text Like A Rockstar

Ben Taylor, Chief Data Scientist

How To Model Text Like A Rock Star

Page 2: How To Model Text Like A Rockstar
Page 3: How To Model Text Like A Rockstar

Chemical Engineering (BS/MS/PhD…*)

Twitter: @bentaylordata
LinkedIn: bentaylordata

Page 4: How To Model Text Like A Rockstar
Page 5: How To Model Text Like A Rockstar

Modeling Numeric Data Is Easy

Page 6: How To Model Text Like A Rockstar

Text Applications?

Stock
Resumes
Cover letters
Logs

Page 7: How To Model Text Like A Rockstar

The Basics Of Document Modeling

UNSTRUCTURED

STRUCTURED

Tokenized
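
As a minimal sketch of the unstructured-to-tokenized step, the snippet below lowercases a raw string and splits it into word tokens with a regular expression. The `tokenize` helper and its pattern are illustrative assumptions, not the tokenizer used in the talk.

```python
import re

# Hypothetical pattern chosen for illustration: runs of letters/digits,
# optionally followed by an apostrophe suffix such as "'s".
TOKEN_PATTERN = re.compile(r"[a-z0-9]+(?:'[a-z]+)?")


def tokenize(text):
    """Lowercase the text and return its word tokens in order."""
    return TOKEN_PATTERN.findall(text.lower())


raw = "Man, they are killing those Devils worse than I thought."
tokens = tokenize(raw)
print(tokens)
```

Once text is a list of tokens, it can be counted and arranged into rows and columns, which is the structured form a model can consume.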

Page 8: How To Model Text Like A Rockstar

Rich Text

From: Mamatha Devineni Ratnam <[email protected]>
Subject: Pens fans reactions
Organization: Post Office, Carnegie Mellon, Pittsburgh, PA
Lines: 12
NNTP-Posting-Host: po4.andrew.cmu.edu

I am sure some bashers of Pens fans are pretty confused about the lack
of any kind of posts about the recent Pens massacre of the Devils. Actually,
I am bit puzzled too and a bit relieved. However, I am going to put an end
to non-PIttsburghers' relief with a bit of praise for the Pens. Man, they
are killing those Devils worse than I thought. Jagr just showed you why
he is much better than his regular season stats. He is also a lot
fo fun to watch in the playoffs. Bowman should let JAgr have a lot of
fun in the next couple of games since the Pens are going to beat the pulp out of Jersey anyway.
I was very disappointed not to see the Islanders lose the final
regular season game. PENS RULE!!!

rec.sport.hockey

From: [email protected] (Matthew B Lawson)
Subject: Which high-performance VLB video card?
Summary: Seek recommendations for VLB video card
Nntp-Posting-Host: midway.ecn.uoknor.edu
Organization: Engineering Computer Network, University of Oklahoma, Norman, OK, USA
Keywords: orchid, stealth, vlb
Lines: 21

My brother is in the market for a high-performance video card that supports
VESA local bus with 1-2MB RAM. Does anyone have suggestions/ideas on:

- Diamond Stealth Pro Local Bus
- Orchid Farenheit 1280
- ATI Graphics Ultra Pro
- Any other high-performance VLB card

Please post or email. Thank you!

- Matt

--
| Matthew B. Lawson <------------> ([email protected]) |
--+-- "Now I, Nebuchadnezzar, praise and exalt and glorify the King --+--
| of heaven, because everything he does is right and all his ways |
| are just." - Nebuchadnezzar, king of Babylon, 562 B.C. |

comp.sys.ibm.pc.hardware

Page 9: How To Model Text Like A Rockstar

Weak Text Example

Page 10: How To Model Text Like A Rockstar

Load Example Dataset

Page 11: How To Model Text Like A Rockstar

Load Example Dataset

Page 12: How To Model Text Like A Rockstar

CountVectorizer

Page 13: How To Model Text Like A Rockstar

CountVectorizer
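
The slide title presumably refers to scikit-learn's `CountVectorizer`; that is an assumption, since the slide content itself is not in the transcript. The standard-library sketch below is not that class. It only illustrates the bag-of-words idea behind it: build a vocabulary, then count each term per document. The `count_vectorize` helper and the toy documents are hypothetical.

```python
from collections import Counter


def count_vectorize(docs):
    """Return (vocabulary, matrix): one row per document, one column per term."""
    # Whitespace splitting is a simplification made for this sketch.
    tokenized = [doc.lower().split() for doc in docs]
    vocabulary = sorted({tok for toks in tokenized for tok in toks})
    matrix = []
    for toks in tokenized:
        counts = Counter(toks)  # missing terms count as 0
        matrix.append([counts[term] for term in vocabulary])
    return vocabulary, matrix


docs = ["pens rule", "video card for local bus", "pens beat devils pens win"]
vocab, X = count_vectorize(docs)
for row in X:
    print(row)
```

Each row has the same length as the vocabulary, so documents of different lengths become fixed-width numeric vectors.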

Page 14: How To Model Text Like A Rockstar

Term Frequency

Page 15: How To Model Text Like A Rockstar

Term Frequency
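
One common definition of term frequency divides each raw count by the total number of tokens in the document, so long and short documents become comparable. The sketch below assumes that definition; the talk may use a different normalization, and the `term_frequencies` helper and the counts are made up for illustration.

```python
def term_frequencies(count_row):
    """Divide each raw count by the row total; an all-zero row stays all zeros."""
    total = sum(count_row)
    if total == 0:
        return [0.0 for _ in count_row]
    return [count / total for count in count_row]


counts = [1, 0, 0, 1, 0, 0, 2, 0, 0, 1]  # illustrative counts for one document
tf = term_frequencies(counts)
print(tf)
```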

Page 16: How To Model Text Like A Rockstar

How Can I Be Amazing?

Page 17: How To Model Text Like A Rockstar

<notebook>

Page 18: How To Model Text Like A Rockstar

Weak Text Example

Now let's really mess this up: reduce one class by an order of magnitude.
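
A minimal sketch of shrinking one class by roughly a factor of ten follows. The `downsample_class` helper, the seed, and the toy labels are assumptions made for illustration; the notebook from the talk is not part of this transcript.

```python
import random


def downsample_class(samples, labels, target_label, keep_fraction=0.1, seed=0):
    """Keep only keep_fraction of the target class; leave other classes intact."""
    rng = random.Random(seed)
    target_idx = [i for i, label in enumerate(labels) if label == target_label]
    # Keep at least one example when the class is non-empty.
    n_keep = min(len(target_idx), max(1, round(len(target_idx) * keep_fraction)))
    keep = set(rng.sample(target_idx, n_keep))
    return [
        (sample, label)
        for i, (sample, label) in enumerate(zip(samples, labels))
        if label != target_label or i in keep
    ]


samples = [f"doc {i}" for i in range(200)]
labels = ["hockey"] * 100 + ["hardware"] * 100
reduced = downsample_class(samples, labels, "hardware")
print(len(reduced))
```

Fixing the seed keeps the reduced dataset reproducible, which helps when comparing models trained before and after the imbalance is introduced.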

Page 19: How To Model Text Like A Rockstar

Does this model have any value?
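
One way to approach this question, assuming a 10-to-1 class imbalance like the one introduced on the previous slide, is to compare the model against a baseline that always predicts the majority class. The helper and labels below are hypothetical and only illustrate why raw accuracy can mislead on imbalanced data.

```python
from collections import Counter


def majority_baseline_accuracy(labels):
    """Accuracy of a classifier that always predicts the most common label."""
    if not labels:
        raise ValueError("labels must not be empty")
    counts = Counter(labels)
    return max(counts.values()) / len(labels)


labels = ["hockey"] * 100 + ["hardware"] * 10
baseline = majority_baseline_accuracy(labels)
print(f"{baseline:.3f}")
```

A model only adds value on this data if it clearly beats that baseline, ideally on a per-class measure rather than on overall accuracy.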

Page 20: How To Model Text Like A Rockstar
Page 21: How To Model Text Like A Rockstar

Problem Setup

• Piecemeal the structuring: final outputs are scalars

[Diagram labels: Audio, Video, Text; Signal Processing; Personality; Expression Signal Processing. Connections are marked us, ts, or s.]

us = unstructured data
ts = time series data
s = scalar data
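
The problem setup says the structuring is done piecemeal and that the final outputs are scalars. As a hedged sketch of the last step, the snippet below collapses one time series into a few scalar summary features. The `summarize_series` helper, the chosen statistics, and the sample values are assumptions, not the features used in the talk.

```python
from statistics import mean, pstdev


def summarize_series(series):
    """Collapse one time series (ts) into a dict of scalar (s) features."""
    return {
        "mean": mean(series),
        "std": pstdev(series),  # population standard deviation
        "min": min(series),
        "max": max(series),
    }


signal = [1.0, 2.0, 3.0, 4.0]  # made-up time series, for illustration only
features = summarize_series(signal)
print(features)
```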

Page 22: How To Model Text Like A Rockstar

FeatureGen

Raw Audio Indicators

@bentaylordata

Page 23: How To Model Text Like A Rockstar

• Engagement
• Motivation
• Distress
• Aggression

Model

Personality Models

@bentaylordata

Page 24: How To Model Text Like A Rockstar

FeatureGen

Video Indicators

@bentaylordata

Signal Processing

F989 F990 F991

scalar

Page 25: How To Model Text Like A Rockstar

@bentaylordata

Combining All Features

X

56.341 -200.45 0 1 2 4 60.71 12 52.15 -350.12 1 1

Feature Mapping: As the features are produced, they are stored in a matrix where each column represents a feature and each row represents an interview.

2 4 60.71 12 52.15 -350.12 1 0

2 3 16.16 21 25.51 -105.21 0 0

NA NA NA NA NA
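
The feature mapping described on this slide, with one row per interview, one column per feature, and NA where a feature was not produced, can be sketched as follows. The feature names and values are hypothetical, and `None` stands in for NA.

```python
def build_feature_matrix(interviews, feature_names):
    """Rows are interviews, columns are features; missing values become None."""
    return [[record.get(name) for name in feature_names] for record in interviews]


feature_names = ["audio_f1", "video_f1", "text_f1"]  # hypothetical feature names
interviews = [
    {"audio_f1": 1.5, "video_f1": -2.0, "text_f1": 0},
    {"audio_f1": 3.25, "text_f1": 1},  # video feature not produced -> None (NA)
]
X = build_feature_matrix(interviews, feature_names)
for row in X:
    print(row)
```

Fixing the column order up front means features can arrive in any order, or not at all, without shifting the other columns.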

Page 26: How To Model Text Like A Rockstar

Features Extracted

70% retention of top performers, 75% reduction in total interview volume

10,000 interviews, 75% reduction: 2,500 interviews reviewed

30% increase in sourcing to hit goals: 10,000 => 13,000, 2,500 => 3,250

Total savings: (10,000 - 3,250) / 10,000 = 67.50%
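
The savings arithmetic on this slide can be reproduced with a short calculation. The function name and parameter names are assumptions introduced for illustration; the numbers are the ones on the slide.

```python
def interview_savings(total, review_reduction, sourcing_increase):
    """Return (sourced, reviewed, savings) after the reduction and extra sourcing."""
    reviewed_before = total * (1 - review_reduction)       # 10,000 -> 2,500
    sourced = total * (1 + sourcing_increase)              # 10,000 -> 13,000
    reviewed = reviewed_before * (1 + sourcing_increase)   # 2,500 -> 3,250
    savings = (total - reviewed) / total                   # relative to the original volume
    return sourced, reviewed, savings


sourced, reviewed, savings = interview_savings(10_000, 0.75, 0.30)
print(f"{sourced:.0f} sourced, {reviewed:.0f} reviewed, {savings:.2%} saved")
```

Note that the savings are measured against the original 10,000 interviews, not against the 13,000 that are sourced after the increase.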

Page 27: How To Model Text Like A Rockstar

PEOPLE ARE WHO THEY ARE
NOT WHAT THEY WRITE

VOICES . EXPERIENCES . PASSION . POTENTIAL

1. Tyler Penman
2. Riley Kurts
3. Jace Kendall
4. Ric Fratus
5. Jennifer Lee
6. Benjamin Dickson

Page 28: How To Model Text Like A Rockstar

@bentaylordata

[email protected]

Questions?