applications (2 of 2) - github pageslintool.github.io/.../session14-slides.pdf · applications (2...


Applications (2 of 2): Recognition, Transduction, Discrimination, Segmentation, Alignment, etc.

Kenneth Church

Kenneth.Church@jhu.edu

Dec 9, 2009 1

Solitaire → Multiplayer Games: Auctions (Ads)
http://www.scienceoftheweb.org/15-396/lectures/lecture09.pdf

[Figure labels: Mainline Ad, Right Rail]

Right Rail: Avoid distortions from commercial interests

A Single Auction → A Stream of Continuous Auctions

• Standard Example of Second Price Auction
– Single Auction for a Single Apple

• Theoretical Result
– Second Price Auction → Truth Telling
– http://en.wikipedia.org/wiki/Vickrey_auction
– Optimal Strategy:
  • Bid what the apple is worth to you
  • Don't worry about what it is worth to others
– First Price Auction ↛ Truth Telling

• Does theory generalize to a continuous stream?
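The truth-telling result for a single second-price auction can be checked with a small sketch. The function names and the integer valuations (in cents) below are illustrative assumptions, not part of the slides.

```python
def second_price_auction(bids):
    """Sealed-bid second-price (Vickrey) auction.

    bids: dict mapping bidder name -> bid amount (at least two bidders).
    Returns (winner, price): the highest bidder wins and pays the
    second-highest bid.
    """
    if len(bids) < 2:
        raise ValueError("need at least two bidders")
    ranked = sorted(bids.items(), key=lambda kv: kv[1], reverse=True)
    winner = ranked[0][0]
    price = ranked[1][1]
    return winner, price


def utility(value, my_bid, other_bids):
    """Payoff to a bidder "me" who values the apple at `value` and bids `my_bid`."""
    bids = dict(other_bids)
    bids["me"] = my_bid
    winner, price = second_price_auction(bids)
    return value - price if winner == "me" else 0


# Hypothetical competing bids for the apple, in cents.
others = {"alice": 60, "bob": 40}
```

With these numbers, a bidder who values the apple at 100 gains nothing by overbidding (the price is still 60) and loses the apple by underbidding below 60; a bidder who values it at 50 takes a loss by overbidding past 60.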


Pricing: Cost Per Click (CPC)

• B_i = your bid
• B_{i+1} = next bid
• CTR_i = your click through rate
• CTR_{i+1} = next click through rate
• CPC_i = your price
– (if we show your ad and user clicks)
• Improvement: CTR → Q (Prior)
• Single Auction:
– CPC_i = B_{i+1}
• Continuous Stream:
– CPC_i = B_{i+1} × CTR_{i+1} / CTR_i
• Truth Telling?

• Equilibrium
– Advertisers
  • Awareness
  • Sales
  • New Customers
  • ROI
– Users
  • Minimize pain
  • Obtain Value
– Market Maker
  • Maximize Revenue
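A minimal sketch of the continuous-stream pricing rule, CPC_i = B_{i+1} × CTR_{i+1} / CTR_i. The slide does not say how ads are ordered; ranking by bid × CTR and charging the last ad nothing (no reserve price) are assumptions made here for illustration, as are the function name and the sample ads.

```python
def cost_per_click(ads):
    """Price each ad by the next ad's bid, scaled by the CTR ratio.

    ads: list of (name, bid, ctr) tuples.
    Assumption: ads are ranked by bid * ctr, highest first.
    Returns a list of (name, cpc) in ranked order.
    """
    ranked = sorted(ads, key=lambda ad: ad[1] * ad[2], reverse=True)
    prices = []
    for i, (name, bid, ctr) in enumerate(ranked):
        if i + 1 < len(ranked):
            _, next_bid, next_ctr = ranked[i + 1]
            cpc = next_bid * next_ctr / ctr  # CPC_i = B_{i+1} * CTR_{i+1} / CTR_i
        else:
            cpc = 0.0  # assumption: no reserve price for the last slot
        prices.append((name, cpc))
    return prices


# Hypothetical ads: (name, bid in dollars, click through rate).
example = cost_per_click([("a", 2.0, 0.10), ("b", 1.0, 0.10), ("c", 4.0, 0.02)])
```

When two adjacent ads have the same CTR, the rule reduces to the single-auction case CPC_i = B_{i+1}. Under the ranking assumption, no advertiser is charged more than its own bid.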

Multi-Player Games → Many Technical Opportunities

• Economics
– http://www.wired.com/culture/culturereviews/magazine/17-06/nep_googlenomics?currentPage=all

• Machine Learning
– Learning to Rank
– Estimate CTR (Q/Priors)
– Sparse Data:
  • What is the CTR for a new ad?
– Errors can be expensive
  • If CTR is too low for new ad → Penalize Growth
  • If too high → Reward Bad Guys to do Bad Things

• Truth Telling for Continuous Auctions?
– Probably not, especially if participants can estimate Q better than market maker

• Machine Learning: Solitaire → Multi-Player Games
– Can I estimate Q better than you can? → Man-eating tiger
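One common way to handle the sparse-data question (what is the CTR for a new ad?) is to shrink the observed rate toward a prior. The slides do not specify an estimator; the pseudo-count formula below is a swapped-in illustration, and `prior_ctr` and `prior_strength` are hypothetical parameters.

```python
def smoothed_ctr(clicks, impressions, prior_ctr, prior_strength):
    """Estimate CTR by blending observed clicks with a prior (Q).

    prior_strength acts as a number of pseudo-impressions at rate prior_ctr.
    With no data the estimate equals the prior; with lots of data it
    approaches clicks / impressions.
    """
    if impressions < 0 or clicks < 0 or clicks > impressions:
        raise ValueError("need 0 <= clicks <= impressions")
    if prior_strength <= 0:
        raise ValueError("prior_strength must be positive")
    return (clicks + prior_strength * prior_ctr) / (impressions + prior_strength)


# Hypothetical prior: 2% CTR, worth 100 impressions of evidence.
new_ad = smoothed_ctr(0, 0, prior_ctr=0.02, prior_strength=100)
lucky_ad = smoothed_ctr(1, 1, prior_ctr=0.02, prior_strength=100)
mature_ad = smoothed_ctr(5000, 100000, prior_ctr=0.02, prior_strength=100)
```

The choice of prior is where the errors on the slide come from: a prior that is too low penalizes new ads, and one that is too high rewards ads that have not earned their position.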


Applications

• Recognition: Shannon's Noisy Channel Model
– Speech, Optical Character Recognition (OCR), Spelling
• Transduction
– Part of Speech (POS) Tagging
– Machine Translation (MT)
• Parsing: ???
• Ranking
– Information Retrieval (IR)
– Lexicography
• Discrimination:
– Sentiment, Text Classification, Author Identification, Word Sense Disambiguation (WSD)
• Segmentation
– Asian Morphology (Word Breaking), Text Tiling
• Alignment: Bilingual Corpora, Dotplots
• Compression
• Language Modeling: good for everything


Speech → Language

Shannon's Noisy Channel Model

• I → Noisy Channel → O
• I΄ ≈ ARGMAX_I Pr(I|O) = ARGMAX_I Pr(I) Pr(O|I)
– Pr(I): Language Model (Application Independent)
– Pr(O|I): Channel Model

Trigram Language Model

Word        Rank   More likely alternatives
We          9      The This One Two A Three Please In
need        7      are will the would also do
to          1
resolve     85     have know do…
all         9      The This One Two A Three Please In
of          2      The This One Two A Three Please In
the         1
important   657    document question first…
issues      14     thing point to

Channel Model

Application                           Input        Output
Speech Recognition                    writer       rider
OCR (Optical Character Recognition)   all          a1l
Spelling Correction                   government   goverment
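The decision rule I΄ ≈ ARGMAX_I Pr(I) Pr(O|I) can be sketched for the spelling row of the channel model table. All probabilities below are made-up toy values, and the dictionary-based models and the `decode` function are hypothetical stand-ins for a real language model and channel model.

```python
def decode(observed, candidates, language_model, channel_model):
    """Return the candidate input I maximizing Pr(I) * Pr(O|I).

    language_model: dict mapping I -> Pr(I)
    channel_model:  dict mapping (O, I) -> Pr(O|I)
    Missing entries are treated as probability 0.
    """
    def score(intended):
        prior = language_model.get(intended, 0.0)
        likelihood = channel_model.get((observed, intended), 0.0)
        return prior * likelihood

    return max(candidates, key=score)


# Toy values (assumptions): the typo is rarely an intended word,
# but the channel usually passes a word through unchanged.
language_model = {"government": 1e-4, "goverment": 1e-9}
channel_model = {
    ("goverment", "government"): 0.01,  # dropped "n"
    ("goverment", "goverment"): 0.95,   # typed exactly as intended
}

best = decode("goverment", ["government", "goverment"],
              language_model, channel_model)
```

Here the language model outweighs the channel model: "government" scores 1e-4 × 0.01 = 1e-6, against 1e-9 × 0.95 for leaving the typo alone.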

Speech → Language

Using (Abusing) Shannon's Noisy Channel Model: Part of Speech Tagging and Machine Translation

• Speech
– Words → Noisy Channel → Acoustics
• OCR
– Words → Noisy Channel → Optics
• Spelling Correction
– Words → Noisy Channel → Typos
• Part of Speech Tagging (POS):
– POS → Noisy Channel → Words
• Machine Translation: "Made in America"
– English → Noisy Channel → French

Didn't have the guts to use this slide at Eurospeech (Geneva)


Spelling Correction


Evaluation

Performance

The Task is Hard without Context

Easier with Context

• actuall, actual, actually
– … in determining whether the defendant actually will die.
• constuming, consuming, costuming
• conviced, convicted, convinced
• confusin, confusing, confusion
• workern, worker, workers
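A rough sketch of how context can break the tie between candidate corrections such as "actual" and "actually". The slides' own context model is not recoverable from this transcript; the bigram-count scoring, the add-one adjustment, and every count below are assumptions for illustration only.

```python
def pick_with_context(left, candidates, right, bigram_counts):
    """Choose the candidate that fits best between its neighbors.

    bigram_counts: dict mapping (word1, word2) -> count.
    Each count gets +1 so an unseen pair does not zero out the score.
    """
    def score(word):
        left_fit = bigram_counts.get((left, word), 0) + 1
        right_fit = bigram_counts.get((word, right), 0) + 1
        return left_fit * right_fit

    return max(candidates, key=score)


# Hypothetical counts for "... the defendant actuall will die."
bigram_counts = {
    ("defendant", "actually"): 12,
    ("actually", "will"): 30,
    ("defendant", "actual"): 1,
    ("actual", "will"): 2,
}

choice = pick_with_context("defendant", ["actual", "actually"], "will",
                           bigram_counts)
```

With these toy counts "actually" scores 13 × 31 = 403 against 2 × 3 = 6 for "actual". With no counts at all, the candidates tie, which mirrors the point that the task is hard without context.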


Easier with Context


Context Model

Future Improvements

• Add More Factors
– Trigrams
– Thesaurus Relations
– Morphology
– Syntactic Agreement
– Parts of Speech
• Improve Combination Rules
– Shrink (Meaty Methodology)


Conclusion (Spelling Correction)

• There has been a lot of interest in smoothing
– Good-Turing estimation
– Kneser-Ney
• Is it worth the trouble?
• Ans: Yes (at least for recognition applications)
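As a reminder of what Good-Turing estimation does, here is a minimal sketch of the basic adjusted count r* = (r + 1) N_{r+1} / N_r, where N_r is the number of word types seen exactly r times. Falling back to the raw count when N_{r+1} = 0 is a simplifying assumption made here; practical systems smooth the N_r values first. The function name and the toy counts are illustrative.

```python
from collections import Counter


def good_turing(counts):
    """Basic Good-Turing adjustment.

    counts: dict mapping word type -> observed count (all >= 1).
    Returns (adjusted, p_unseen) where adjusted maps each observed
    count r to r*, and p_unseen = N_1 / N is the probability mass
    reserved for unseen types.
    """
    if not counts:
        raise ValueError("need at least one observed type")
    freq_of_freq = Counter(counts.values())  # N_r
    total = sum(counts.values())             # N
    adjusted = {}
    for r, n_r in freq_of_freq.items():
        n_next = freq_of_freq.get(r + 1, 0)
        # Assumption: keep the raw count when N_{r+1} is zero.
        adjusted[r] = (r + 1) * n_next / n_r if n_next else float(r)
    p_unseen = freq_of_freq.get(1, 0) / total
    return adjusted, p_unseen


# Toy counts: three types seen once, two seen twice, one seen three times.
adjusted, p_unseen = good_turing(
    {"a": 1, "b": 1, "c": 1, "d": 2, "e": 2, "f": 3}
)
```

For the toy data, N_1 = 3, N_2 = 2, N_3 = 1 and N = 10, so a count of 1 is discounted to 2 × 2 / 3 ≈ 1.33, a count of 2 to 3 × 1 / 2 = 1.5, and 3/10 of the probability mass is set aside for unseen words.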


Aligning Words
