Computational Approach To Recognizing Wordplay In Jokes Julia M. Taylor & Lawrence J. Mazlack Applied Artificial Intelligence Laboratory University of Cincinnati


Post on 06-May-2015


Page 1: ISHS04

Computational Approach To Recognizing Wordplay In Jokes

Julia M. Taylor & Lawrence J. Mazlack
Applied Artificial Intelligence Laboratory
University of Cincinnati

Page 2: ISHS04

Introduction

Computational recognition of humor is difficult
  Requires natural language understanding
  World knowledge

This is an initial investigation into computational humor recognition using wordplay
  Learns statistical patterns of text
  Recognizes utterances similar in pronunciation to a given word
  Determines if found utterances transform a text into a joke

Page 3: ISHS04

Subclass of Humor: Joke

A joke is a short humorous piece of literature in which the funniness culminates in the final sentence

Most jokes have:
  Setup: the first part of the joke, which establishes certain expectations
  Punchline: a much shorter part of the joke, which causes some form of conflict
    Forces another interpretation
    Violates expectation

“Is the doctor home?” the patient asked in his bronchial whisper. “No,” the doctor’s young and pretty wife whispered in reply. “Come right in.”

Page 4: ISHS04

Computational Humor

Computational humor generators, examples:
  Light bulb joke generator
  Joke generator that focuses on witticisms based around idioms
  Generator of humorous parodies of existing acronyms
  Generator of a humorous sentence based on an alphanumeric password
  Pun generators

Computational humor recognizer

Page 5: ISHS04

Wordplay Jokes

Depend on words that are similar in sound but have different meaning

  Same pronunciation, same spelling
  Same pronunciation, different spelling
  Similar pronunciation, different spelling

The difference in meaning creates conflict or breaks prediction

Nurse: I need to get your weight today.
Impatient patient: Three hours and twelve minutes.

weight=wait

Page 6: ISHS04

Statistical Language Recognition

N-grams
  Model that uses conditional probability to predict the Nth word based on the N-1 previous words
  Probabilities depend on the training corpus
  Find a word with largest P(word | “is”)
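The N-gram idea above can be sketched for the bigram case (N = 2), where P(word | previous) is estimated as count(previous word) / count(previous *). This is a minimal illustration, not the authors' implementation: the function names and the toy corpus are assumptions made for the example.

```python
from collections import Counter, defaultdict

def train_bigrams(sentences):
    """Count how often each word follows each preceding word."""
    follow = defaultdict(Counter)
    for sentence in sentences:
        words = sentence.lower().split()
        for prev, nxt in zip(words, words[1:]):
            follow[prev][nxt] += 1
    return follow

def conditional_probability(follow, word, prev):
    """Estimate P(word | prev) from the counts; 0.0 if prev was never seen."""
    total = sum(follow[prev].values())
    return follow[prev][word] / total if total else 0.0

def most_likely_next(follow, prev):
    """Return the word with the largest P(word | prev), or None if prev is unseen."""
    if not follow[prev]:
        return None
    return follow[prev].most_common(1)[0][0]

# Toy training corpus (hypothetical); the result depends entirely on it.
corpus = [
    "what is your opinion",
    "what is shortage",
    "this is a good joke",
    "this is a good milk",
]
model = train_bigrams(corpus)
print(most_likely_next(model, "is"))
print(conditional_probability(model, "a", "is"))
```

With a different training corpus the most likely word after “is” changes, which is the dependence on the corpus noted above.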

A newspaper reporter goes around the world with his investigation. He stops people on the street and asks them: “Excuse me, what is your opinion of the meat shortage?” An American asks: “What is ‘shortage’?” A Russian asks: “What is ‘opinion’?” A Pole asks: “What is ‘meat’?” A New York taxi-driver asks: “What is ‘excuse me’?”

The Parisian Little Moritz is asked in school: “How many deciliters are there in a liter of milk?” He replies: “One deciliter of milk and nine deciliters of water.” – In France, this is a good joke; in Hungary, this is a good milk.

Page 7: ISHS04

Possible Methods for Joke Recognition

Determine if a given text is a joke
Given a joke, determine the punchline location

Hotel clerk: Welcome to our hotel

Max: Thank you. Can you tell me what room I’m in?

Clerk: The lobby

Page 8: ISHS04

Restricted Domain: Knock Knock Jokes

Line1: “Knock, Knock”

Line2: “Who’s there?”

Line3: any phrase

Line4: Line3 followed by “who?”

Line5: One or several sentences containing one of the following

Type1: Line3

Type2: A wordplay on Line3

Type3: A meaningful response to a wordplay of Line4

Page 9: ISHS04

Restricted Domain: Knock Knock Jokes

Type1: Line3
--Knock, Knock
--Who’s there?
--Water
--Water who?
--Water you doing tonight?

Type2: A wordplay on Line3
--Knock, Knock
--Who’s there?
--Ashley
--Ashley who?
--Actually, I don’t know.

Type3: A meaningful response to a wordplay of Line4
--Knock, Knock
--Who’s there?
--Tank
--Tank who?
--You are welcome.

Page 10: ISHS04

Script-based Semantic Theory of Humor

The text is compatible with 2 different scripts
The 2 scripts are opposite

--Knock, Knock
--Who’s there?
--Water
--Water who?
--Water you doing tonight?

Scripts overlap in the phonetic representation of “water” and “what are”

Scripts differ in meaning

Page 11: ISHS04

Experimental Design

Definitions:
  Wordplay: a word or phrase that sounds similar to another but has a different spelling (and meaning)
    “What are” is a wordplay on “water”
  Keyword: the word that the wordplay is based on (Line3)
    “Water” is a keyword

Recognize only Type1 jokes

Page 12: ISHS04

Experimental Design

--Knock, Knock
--Who’s there?
--Water
--Water who?
--Water you doing tonight?

Step1: joke format validation
Step2: computational generation of sound-alike sequences
Step3: validation of the meaning of a chosen sound-alike sequence
Step4: last sentence validation with sound-alike sequence

Page 13: ISHS04

Experimental Design

Training set: 66 Knock Knock jokes
  Enhance similarity table of letters
  Select N-gram training texts
    66 texts containing wordplay from 66 training jokes
Test set:
  130 Knock Knock jokes
  66 non-jokes that have similar structure to Knock Knock jokes
    Water is cold.

Page 14: ISHS04

Experimental Design

Similarity Table
  Contains combinations of letters that sound similar
  Based on similarity table of cross-referenced English consonant pairs
  Modified by:
    o translating phonemes to letters
    o adding vowels that are close in sound
    o adding other combinations of letters that may be used to recognize wordplay

Segment of similarity table

Page 15: ISHS04

Experimental Design

Training texts were entered into N-gram database

Wordplay validation: bigram table
  Pairs of words from training texts with count of their occurrences
  (training texts 1) (texts were 1) (were entered 1) (entered into 1) …

Punchline validation: trigram table
  Three words in a row from training texts with count of their occurrences
  (training texts were 1) (texts were entered 1) (were entered into 1)
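The bigram and trigram tables can be sketched as counters keyed by word tuples. This is a minimal illustration using the slide's own sentence as the training text; the function name `ngram_counts` and the lowercasing step are assumptions, not details given in the slides.

```python
from collections import Counter

def ngram_counts(text, n):
    """Return a Counter mapping each run of n consecutive words to its count."""
    words = text.lower().split()
    return Counter(tuple(words[i:i + n]) for i in range(len(words) - n + 1))

training_text = "Training texts were entered into N-gram database"
bigram_table = ngram_counts(training_text, 2)    # used for wordplay validation
trigram_table = ngram_counts(training_text, 3)   # used for punchline validation

print(bigram_table[("training", "texts")])
print(trigram_table[("training", "texts", "were")])
```

A tuple that never occurred in the training text has count 0, so a lookup doubles as a membership check.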

Page 16: ISHS04

Step 1: Joke Format Validation

Line1: “Knock, Knock”

Line2: “Who’s there?”

Line3: any phrase

Line4: Line3 followed by “who?”

Line5: One or several sentences containing Line3

--Knock, Knock
--Who’s there?
--Water
--Water who?
--Water you doing tonight?

--Knock, Knock
--Who’s there?
--Water
--Water who?
--What are you doing tonight?
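A format check along these lines could look as follows. It is a sketch under assumptions: the joke arrives as exactly five lines, punctuation and case are ignored, and the names `normalize` and `validate_format` are hypothetical. Under this reading, the second example above is rejected because its last line does not contain Line3.

```python
import re

def normalize(line):
    """Lowercase a line, turn punctuation into spaces, and collapse whitespace."""
    cleaned = re.sub(r"[^a-z0-9' ]", " ", line.lower().replace("\u2019", "'"))
    return " ".join(cleaned.split())

def validate_format(lines):
    """Return the keyword (Line3) if the five lines fit the Type1 pattern, else None."""
    if len(lines) != 5:
        return None
    line1, line2, line3, line4, line5 = (normalize(line) for line in lines)
    if line1 != "knock knock" or line2 != "who's there":
        return None
    if not line3 or line4 != line3 + " who":
        return None
    # Type1 requirement: Line3 must appear as whole words inside Line5.
    if f" {line3} " not in f" {line5} ":
        return None
    return line3

joke = ["Knock, Knock", "Who's there?", "Water", "Water who?",
        "Water you doing tonight?"]
print(validate_format(joke))
```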

Page 17: ISHS04

Step 2: Generation of Wordplay Sequences

Repetitive letter replacements of Line3
  Similarity used for letter replacements
Resulting utterances are ordered according to their similarity with Line3
Utterances with highest similarity are checked for decomposition into several words
  Words have to be in Webster's Second International (234,936 words)

Segment of similarity table

Page 18: ISHS04

Step 2: Generation of Wordplay Sequences

Letters  Sound-alike  Similarity
a        e            .23
a        o            .23
e        a            .23
e        i            .23
e        o            .23
en       e            .23
k        sh           .11
l        r            .56
r        m            .44
r        re           .23
t        d            .39
t        th           .32
t        z            .17
w        m            .44
w        r            .42
w        wh           .23
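The repeated letter replacement can be sketched with the table segment above. The slides do not say how the similarity of a whole utterance is computed from individual replacements, so this sketch assumes it is the product of the scores of the replacements used; that scoring rule, the `max_steps` limit, and the function names are all assumptions made for illustration.

```python
# Table segment from the slide: (letters, sound-alike letters, similarity).
SIMILARITY = [
    ("a", "e", .23), ("a", "o", .23), ("e", "a", .23), ("e", "i", .23),
    ("e", "o", .23), ("en", "e", .23), ("k", "sh", .11), ("l", "r", .56),
    ("r", "m", .44), ("r", "re", .23), ("t", "d", .39), ("t", "th", .32),
    ("t", "z", .17), ("w", "m", .44), ("w", "r", .42), ("w", "wh", .23),
]

def single_replacements(text):
    """Yield (new_text, score) for every one-step replacement allowed by the table."""
    for src, dst, score in SIMILARITY:
        start = text.find(src)
        while start != -1:
            yield text[:start] + dst + text[start + len(src):], score
            start = text.find(src, start + 1)

def sound_alikes(keyword, max_steps=3):
    """Return (utterance, score) pairs reachable in up to max_steps replacements, best first."""
    best = {keyword: 1.0}
    frontier = [keyword]
    for _ in range(max_steps):
        next_frontier = []
        for text in frontier:
            for new_text, score in single_replacements(text):
                total = best[text] * score   # assumed scoring: product of scores
                if total > best.get(new_text, 0.0):
                    best[new_text] = total
                    next_frontier.append(new_text)
        frontier = next_frontier
    del best[keyword]                        # the keyword itself is not a wordplay
    return sorted(best.items(), key=lambda item: -item[1])

candidates = dict(sound_alikes("water"))
print(candidates["whator"], candidates["whatare"])
```

Under the product assumption, "whator" (two replacements) ranks ahead of "whatare" (three replacements), which is consistent with the order in which the slides try the two candidates. A real implementation would also need to avoid re-replacing letters it has already substituted.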

Page 26: ISHS04

Step 2: Generation of Wordplay Sequences

Decomposition of whator is what or
  what or is different from water
  wordplay found; return what or
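Decomposing a generated utterance into dictionary words can be sketched as a recursive split. The tiny word set below is a stand-in for Webster's Second International, and the names `decompose` and `find_wordplay` are hypothetical.

```python
def decompose(utterance, dictionary):
    """Return every way to split utterance into a sequence of dictionary words."""
    if not utterance:
        return [[]]
    splits = []
    for end in range(1, len(utterance) + 1):
        head = utterance[:end]
        if head in dictionary:
            for rest in decompose(utterance[end:], dictionary):
                splits.append([head] + rest)
    return splits

def find_wordplay(utterance, keyword, dictionary):
    """Return the first decomposition that differs from the keyword, or None."""
    for split in decompose(utterance, dictionary):
        phrase = " ".join(split)
        if phrase != keyword:
            return phrase
    return None

# Stand-in word list (assumption); the slides use a 234,936-word dictionary.
WORDS = {"what", "or", "are", "water", "a", "hat"}

print(find_wordplay("whator", "water", WORDS))
print(find_wordplay("whatare", "water", WORDS))
```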

Page 27: ISHS04

Step 3: Wordplay Validation

Check if the wordplay is meaningful
  If wordplay is at least two words
    o Decompose wordplay into word pairs: what or
    o Check if word pair is in the bigram database: no
  If wordplay is one word
    o Check that the word is in the dictionary
If wordplay is meaningful, go to Step 4. Otherwise, return to Step 2.

Page 28: ISHS04

Step 2: Generation of Wordplay Sequences

Decomposition of whatare is what are
  what are is different from water
  wordplay found; return what are

Page 29: ISHS04

Step 3: Wordplay Validation

Check if the wordplay is meaningful
  If wordplay is at least two words
    o Decompose wordplay into word pairs: what are
    o Check if word pair is in the bigram database: yes
Proceed to Step 4.
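The Step 3 check can be sketched as a lookup against the bigram table for multi-word wordplay and against the dictionary for single words. The table contents and counts below are made up for illustration, and `is_meaningful` is a hypothetical name.

```python
def is_meaningful(wordplay, bigram_table, dictionary):
    """Multi-word wordplay needs every adjacent pair in the bigram table;
    single-word wordplay needs to be a dictionary word."""
    words = wordplay.lower().split()
    if len(words) >= 2:
        return all(pair in bigram_table for pair in zip(words, words[1:]))
    return len(words) == 1 and words[0] in dictionary

# Hypothetical bigram counts and dictionary, not taken from the training data.
bigrams = {("what", "are"): 3, ("are", "you"): 5}
dictionary = {"what", "are", "or", "meter"}

print(is_meaningful("what or", bigrams, dictionary))   # pair absent: back to Step 2
print(is_meaningful("what are", bigrams, dictionary))  # pair present: on to Step 4
```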

Page 30: ISHS04

Step 4: Last Sentence Validation with Wordplay

Wordplay is meaningful
Could occur

In the beginning of last sentence:

What are you doing?

In the middle of last sentence:

Please tell me what are you doing?

At the end of last sentence:

The question started with “what are”.

Page 31: ISHS04

Step 4: Last Sentence Validation with Wordplay

In the beginning of last sentence:
  If wordplay is at least 2 words
    o What are you doing?
    o Check if (what are you) and (are you doing) are in trigram table
  If wordplay is only one word
    o Meter you doing?
    o Check if (meter you doing) is in trigram table
If at least one of the needed sequences is not in trigram table, return to Step 2.
Otherwise, the text is a joke.

Page 32: ISHS04

Step 4: Last Sentence Validation with Wordplay

In the middle of last sentence:
  If wordplay is at least 2 words
    o Please tell me what are you doing?
    o Check if (tell me what), (me what are), (what are you) and (are you doing) are in trigram table
  If wordplay is only one word
    o Please tell me meter you doing?
    o Check if (tell me meter) and (meter you doing) are in trigram table
If at least one of the needed sequences is not in trigram table, return to Step 2.
Otherwise, the text is a joke.

Page 33: ISHS04

Step 4: Last Sentence Validation with Wordplay

At the end of last sentence:
  If wordplay is at least 2 words
    o The sentence ended with what are?
    o Check if (ended with what) and (with what are) are in trigram table
  If wordplay is only one word
    o The sentence ended with meter?
    o Check if (ended with meter) is in trigram table
If at least one of the needed sequences is not in trigram table, return to Step 2.
Otherwise, the text is a joke.
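The three positional cases appear to follow one rule: for each word of the wordplay, check the trigram that ends with it and the trigram that starts with it, whenever those trigrams fit inside the sentence. That rule is inferred from the slide examples rather than stated by the authors, and the function names below are hypothetical.

```python
def trigrams_to_check(words, start, length):
    """List the trigrams needed for a wordplay occupying words[start:start + length]."""
    needed = []
    for i in range(start, start + length):
        if i - 2 >= 0:                      # trigram ending with this wordplay word
            needed.append(tuple(words[i - 2:i + 1]))
        if i + 3 <= len(words):             # trigram starting with this wordplay word
            needed.append(tuple(words[i:i + 3]))
    return list(dict.fromkeys(needed))      # drop duplicates, keep order

def last_sentence_is_valid(words, start, length, trigram_table):
    """True if every needed trigram is present; otherwise Step 2 is retried."""
    return all(t in trigram_table for t in trigrams_to_check(words, start, length))

middle = "please tell me what are you doing".split()
print(trigrams_to_check(middle, 3, 2))
```

For the middle-of-sentence example this yields the same four trigrams listed on the slide, and for the one-word case it yields (tell me meter) and (meter you doing).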

Page 34: ISHS04

Results

66 training jokes
  59 jokes were recognized
  7 unrecognized, no wordplay found

66 non-jokes
  62 correctly recognized as non-jokes
  1 found wordplay that makes sense
  3 incorrectly recognized as jokes

130 test jokes
  8 jokes were not expected to be recognized
  12 identified as jokes with expected wordplay
  5 identified as jokes with unexpected wordplay
  80 expected wordplay found

Page 35: ISHS04

Possible Enhancements

Improve last sentence validation
  Increasing size of text used for N-gram training
  Parser
  N-grams with stemming

Improve wordplay generator
  Use of phoneme comparison

Use wider domain
  All types of Knock Knock jokes
  Other types of wordplay jokes

Page 36: ISHS04

Conclusion

Initial investigation into computational humor recognition using wordplay

The program was designed to
  Recognize wordplay in jokes: 67%
  Recognize jokes containing wordplay: 12%