knock knock jokes

Post on 20-Feb-2015

UNIVERSITY OF CINCINNATI

Date: May 24, 2004

I, Julia Michelle Taylor, hereby submit this work as part of the requirements for the degree of:

Master of Science

in:

Computer Science

It is entitled:

Computational Recognition of Humor in a Focused Domain

This work and its defense approved by:

Chair: Dr. Lawrence Mazlack
Dr. Carla Purdy
Dr. John Schlipf
Dr. Michele Vialet

Computational Recognition of Humor in a Focused Domain

A thesis submitted to the Division of Research and Advanced Studies of the University of Cincinnati in partial fulfillment of the requirements for the degree of

MASTER OF SCIENCE

in the Department of Electrical and Computer Engineering and Computer Science of the College of Engineering

2004

by Julia Taylor

B.S., University of Cincinnati, 1999
B.A., University of Cincinnati, 1999

Committee Chair: Dr. Lawrence Mazlack

Abstract. With advancing developments in artificial intelligence, humor researchers have begun to look at approaches for computational humor. Although there appears to be no complete computational model for recognizing verbally expressed humor, it may be possible to recognize jokes based on statistical language recognition techniques. This is an investigation into computational humor recognition. It considers a restricted set of all possible jokes that have wordplay as a component and examines the limited domain of Knock Knock jokes. The method uses Raskin's Theory of Humor as its theoretical foundation. The original phrase and the complementary wordplay have two different scripts that overlap in the setup of the joke. The algorithm learns statistical patterns of text in N-grams and provides a heuristic focus for the location where wordplay may or may not occur. It uses a wordplay generator to produce an utterance that is similar in pronunciation to a given word, and a wordplay recognizer determines whether the utterance is valid by using N-grams. Once a possible wordplay is discovered, a joke recognizer determines whether the found wordplay transforms the text into a joke.
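The generate-then-validate idea in the abstract can be illustrated with a minimal sketch. It is not the thesis implementation: the substitution table `SIMILAR`, the function names, and the toy training text are all assumptions made for illustration, and the sketch uses only bigrams and single-letter substitutions as a crude stand-in for similarity in pronunciation.

```python
from collections import Counter

def bigram_counts(text):
    """Count adjacent word pairs (bigrams) in a lowercased, whitespace-split text."""
    words = text.lower().split()
    return Counter(zip(words, words[1:]))

# Hypothetical table of similar-sounding substitutions (illustrative only).
SIMILAR = {"w": ["wh"], "t": ["d"], "d": ["t"]}

def candidate_wordplays(word):
    """Yield strings formed by swapping one letter for a similar-sounding one."""
    for i, ch in enumerate(word):
        for repl in SIMILAR.get(ch, []):
            yield word[:i] + repl + word[i + 1:]

def valid_wordplays(word, next_word, counts):
    """Keep candidates that form a bigram with next_word seen in the training text."""
    return [c for c in candidate_wordplays(word) if counts[(c, next_word)] > 0]

# Toy training text (an assumption; the thesis trains on a much larger corpus).
training = "tell me what are you doing here and what are you waiting for"
counts = bigram_counts(training)
print(valid_wordplays("wat", "are", counts))  # ['what']
```

Here the candidate "what" survives because the bigram "what are" occurs in the training text, while "wad" is rejected because "wad are" never does.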

Acknowledgments

I would like to express my sincere gratitude to Dr. Lawrence Mazlack, who not only made this project possible, but also very enjoyable. His advice, patience, ideas, and many late evenings of arguments and inventions are only a few reasons in a very long list. Thank you!

I would like to thank the thesis committee, Dr. John Schlipf, Dr. Michele Vialet and Dr. Carla Purdy. This work has greatly benefited from your suggestions.

Thanks are due to the Electronic Text Center at the University of Virginia Library for permission to use their texts in the experiments. To Dr. Graeme Ritchie, thank you for your comments in the initial stage of the project, and for making your research available. I would also like to thank Adam Hoffman for allowing the flexibility in time that made it possible to complete this thesis. The list would not be complete without G.I. Putiy, who has been an inspiration for many years.

I would like to thank my parents, Michael and Tatyana Slobodnik, and my brother Simon for their love, encouragement, and support in too many ways to describe.

Last but not least, a sincere thank you to my husband, Matthew Taylor, without whose love, help, understanding and support I would be completely lost.

Table of Contents

List of Tables
1 Introduction
2 Background
  2.1 Theories of Humor
    2.1.1 Incongruity-Resolution Theory
    2.1.2 Script-based Semantic Theory of Humor
    2.1.3 General Theory of Verbal Humor
    2.1.4 Veatch's Theory of Humor
  2.2 Wordplay Jokes
  2.3 Structure of Jokes
    2.3.1 Structural Ambiguity in Jokes
      2.3.1.1 Plural and Non-Count Nouns as Ambiguity Enablers
      2.3.1.2 Conjunctions as Ambiguity Enablers
      2.3.1.3 Construction "A Little" as Ambiguity Enabler
      2.3.1.4 Can, Could, Will, Should as Ambiguity Enablers
    2.3.2 The Structure of Punchline
  2.4 Computational Humor
    2.4.1 LIBJOG
    2.4.2 JAPE
    2.4.3 Elmo
    2.4.4 WISCRAIC
    2.4.5 Ynperfect Pun Selector
    2.4.6 HAHAcronym
    2.4.7 MSG
    2.4.8 Tom Swifties
    2.4.9 Jester
    2.4.10 Applications in Japanese
3 Statistical Measures in Language Processing
  3.1 N-grams
  3.2 Distant N-grams
4 Possible Methods for Joke Recognition
  4.1 Simple Statistical Method
  4.2 Punchline Detector
  4.3 Restricted Context
5 Experimental Design
6 Generation of Wordplay Sequences
7 Wordplay Recognition
8 Joke Recognition
  8.1 Wordplay in the Beginning of a Punchline
  8.2 Wordplay at the End of a Punchline
  8.3 Wordplay in the Middle of a Punchline
9 Training Text
  9.1 First Approach
  9.2 Second Approach
  9.3 Third Approach
  9.4 Fourth Approach
  9.5 Fifth Approach
10 Experimentation and Analysis
  10.1 Training Set
  10.2 Alternative Training Set Data Test
  10.3 General Joke Testing
    10.3.1 Jokes in the Test Set with Wordplay in the Beginning of Punchline
    10.3.2 Jokes in the Test Set with Wordplay in the Middle of a Punchline
  10.4 Testing Non-Jokes
11 Summary
12 Possible Extensions