pandora

2nd Edition, Fully Revised, March 24, 2005 Draft


Contents

Preface
Introduction
    Killer Apps of Chat Bot Technology
    Points to Remember
    Exercises
Pandorabots
    Points to Remember
    Exercises
Mastering Your First Bot
    Points to Remember
    Exercises
Bot Properties
    Points to Remember
    Exercises
Training Your Bot
    Points to Remember
    Exercises
A Brief Tutorial on AIML
    AIML Matching Algorithm
    The Filesystem Metaphor
    Advanced Alter Response Page
    Points to Remember
    Exercises
Using AIML Predicates
    Points to Remember
    Exercises
Writing Your Own Predicates
    Points to Remember
    Exercises
Playing with Wildcards
    Points to Remember
    Exercises
Writing Default Replies
    Points to Remember
    Exercises
The Ultimate default category
    Points to Remember
    Exercise
The <person> Tag
    Points to Remember
    Exercises
Adding a Bot Property
    Points to Remember
    Exercises
Using <srai>
    Points to Remember
    Exercises
Training from the Dialog
    Points to Remember
Using <that>
    Points to Remember
    Exercises
Adding AIML with Pandorawriter
    Points to Remember
    Exercise
Targeting
    Points to Remember
    Exercises
Custom HTML
    Points to Remember
    Exercises
Setting Predicate Defaults
Dialog History
Publishing your Bot with Oddcast SitePal
Customizing your HTML with an Oddcast VHost [tm]
    Points to Remember
    Exercises
Media Semantics Avatars
Publishing Your Bot on AOL Instant Messenger
    Points to Remember
    Exercises
Other Interfaces
    Pandorabots and Flash
    Put Your Bot on MSN Messenger
    Put Your Bot on IRC
    Get your client’s screen name
    What is a "botid"?
    Pandorabots API
Other AIML Programs
    A word on owning your AIML Files
    AIML on Pandorabots
    Pandorabots and Program D
    Pandorabots and Program J (J-Alice)
    Pandorabots and Program N (AIMLPad)
        Program N Embrace and Extend
    Pandorabots and Program P (Pascalice)
Using a Spreadsheet or Database Program to Write AIML
Subscriptions
Pandorabots Embrace & Extend
    Wildcard in conditions
    Wildcard in indexes
    Request and Response
    Formatted date tag
    No system tag
    No predicate defaults
Pandorabots AIML Tags Set
Finding Other Resources
The End of The Journey
Glossary
Index

Preface

Dr. Wallace (right) accepts 2004 Loebner Prize on behalf of A.L.I.C.E. from Dr. Hugh Loebner (center). Also present was fellow contestant Steven Watkins. It was the third bronze Loebner medal for A.L.I.C.E.,

who had previously scored first place in 2000 and 2001.

This book was born from a happy marriage of the worlds of free software and

proprietary business. We owe a debt of thanks to the countless free software

developers and contributors to the A.L.I.C.E. and AIML project, whose labors

gave rise to the plentitude of AIML interpreters and the burgeoning free software

AIML content. We also say a big thank you to the staff of Pandorabots, who

created the largest (so far) commercial software effort in the AIML universe.

Pandorabots adopted the AIML standard and made it freely available from their

server on the web, through an easy-to-use HTML interface. Over the past three

years, they have garnered 50,000 botmasters, who have created 60,000 bots,

and accumulated 60 million input queries logged on their server database. On a

peak day, after ALICE won the Loebner Prize in September 2004, Pandorabots

logged a record 2,000,000 queries, at a peak rate of more than 100,000

per hour, without a crash. Such numbers are impressive, especially considering

that the program runs on a single processor Linux machine, in a language, Lisp,

not normally recognized as a high-performance benchmark-breaking standard.

The book would not have been possible without the generous support of Fritz

Kunze, Colin Meldrum, Steve Sears, Evan Lessmore, Dr. Doubly Aimless, Adi

Sideman and David Bacon. Thank you Tyra Baker, Karen Marcelo, and Bob

Wallace for providing storage. Thank you David Hamill for running the Robitron

mailing list. Thanks to Anne Kootstra, Gary Dubuque, Kino Coursey, Conan

Callen, Josip Almasi, Saskia Van der Elst, Richard Gray, Jonathan Roewen, Paul

Rydell, Ryan Kegel, Ernest Lergon, Monica Lamb, Karen Gibbs, Kym Kinlin,

Shahin Maghsoudi, Jeff Ritchie, Jeroen Wijers, and Kim Sullivan for all their

excellent AIML ideas and to Joy Harwood, Chris Hatcher, Lindsay Davies, and Stefan

Zakarias for checking my work. Thank you Hugh Loebner for being the first

to give us a reason to keep on running. I would like to thank my wife Kim for not

totally giving up on me.

Oakland, CA November 2004

Introduction

This book is about you, the botmaster. The botmaster is the person who creates

or authors his or her own chat robot. A chat robot is a natural language

character that communicates with clients, or people chatting on the web, instant

messenger, email, usenet, web forums, or even through voice communication

such as the telephone. Chat robots are also sometimes called chatbots, bots,

chatterbots, chat bots, chatterboxes, V-Hosts, V-People, agents, and virtual

people. A chat robot may or may not be associated with an avatar, an animated

agent that may also include speech synthesis so that the chat robot may appear

more lifelike through virtual reality animation and sound. A chat robot may also

include speech recognition technology, so that the bot may not be restricted to a

typewritten interface. A chat robot however always has a botmaster, a person

behind the scenes who is ultimately responsible for creating the bot’s personality

and releasing it onto an unsuspecting world.

Botmasters come from every walk of life. It is important to understand that you

do not have to be a programmer to be a botmaster. Many great programmers

have already spent many hours laboring to create easy-to-use software, like

Pandorabots.com, to help people create their own bots. In fact, a more literary or

creative mind is preferred. Creating a bot is more like creating a character for a

novel or screenplay than it is like writing a computer program. We have

developed a language, AIML (Artificial Intelligence Markup Language), that is

designed to be as easy to learn and use as HTML (the basic language used to

create all web pages). If you can learn enough HTML to create a simple web

page, you can easily learn enough AIML to create a chat robot. In fact,

Pandorabots.com hides most of the details of AIML from the botmaster. The

most difficult part of creating a bot is writing original, clever, sometimes

humorous, interesting dialogue that will keep the client entertained and

entranced.

The classic chat robot is a purely text-based being. In fact, many people view a

chat robot as the glue between voice recognition and speech synthesis and

animated avatars. Speech recognition turns sounds, or voice signals, into words,

or text. Speech recognition is like taking dictation. It has no idea what the words

mean. Its only goal is to convert the words into text that someone can read.

Voice synthesis is the opposite. Avatars and speech synthesizers take words

and text and convert them into natural sounding human speech. The chat robot

is the missing piece between those two. It is the A. I. glue that converts the text

that has been said into a meaningful sounding reply. In some sense chat robots

are harder to create than either speech recognizers or voice synthesizers or

avatars. They require us, the botmasters, to create the illusion of artificial

intelligence.

Even without speech recognition and voice synthesis and animated avatars,

there are many possible killer applications of chat robot technology.

A recent poll of professional chat robot developers revealed this list of what they

considered to be the top killer apps of chat robot technology:

Killer Apps of Chat Bot Technology

1. Entertainment

2. Teacher Bot

3. English as a Second Language

4. Customer Service

5. Sales Bot

6. Star-Trek Style O.S. of the Future

7. FAQ Bot

8. Embedded in Toys

9. Personality Tests

10. Non-Player Character in Games

11. Turing Test Prizes

12. Bot Hosting Services

13. Bot Authoring Tools

14. Politician Bot

15. Celebrity Bot

16. Other

Points to Remember

The Botmaster is the author or creator of the chat robot.

The Chat Robot is the missing piece between voice recognition and speech synthesis.

You do not need to be a computer programmer to create a bot.

AIML stands for Artificial Intelligence Markup Language.

Exercises

1. What kind of application do you want to create with your bot?

2. Can you think of a name for your bot?

3. Is your bot going to be male or female (or other)?

4. Is your bot character going to be a human, a robot, an animal, or an imaginary creature?

Pandorabots

Pandorabots.com is a free, web-based bot hosting service. Pandorabots was

developed to meet the needs of botmasters who wanted to host their bots on the

web 24/7. There is free and proprietary software that you can download to your

own computer to create and run a bot from your own machine. But this usually

leads to two problems. First, many people don’t have 24/7 dedicated servers

located at home. This means that when they are offline, so are their bots.

Usually botmasters want their bots to chat all the time, even when they are

sleeping. Half the fun of being a botmaster is waking up in the morning to read

the log files of the conversation the bot had the night before. Second, the

downloaded bot software tends to take up a lot of memory and slow down your

machine, especially if you are running other applications. So people began to look for

alternatives. Many ordinary web-hosting companies shied away from bot hosting

because the software was too experimental and they were afraid it would take

too many resources. Pandorabots developed a clever solution that allowed them

to host tens of thousands of bots on one server, and decided to make their bot

hosting service available to the public, at least initially, for free. The next screen

shot shows the home page of Pandorabots.com.

Notice that the Pandorabots.com software is based entirely on the free AIML and

A.L.I.C.E. software developed by the ALICE A.I. Foundation at www.alicebot.org.

You can visit the alicebot.org web site for more information and documentation

about AIML and other implementations besides Pandorabots. What makes

Pandorabots different from these other products is its highly efficient bot hosting

service. As of this writing, Pandorabots hosts more than 50,000 botmasters on

one single machine, and those botmasters have created more than 60,000 bots.

On a peak day, the bots have logged more than 2,000,000 inquiries at a peak

rate of 100,000 inquiries per hour.

One unique feature of Pandorabots is its multilingual interface and support for

bots in almost any language. The interface is currently translated into English,

Japanese, French and Portuguese, and many other translations are underway.

What is more, the bot hosting software supports almost any world language. You

can create a bot that understands Japanese, Chinese, Arabic, Korean, Thai, or

almost any other language that can be entered into a computer. The algorithm

has no preference for one language above any other.

A number of convenient links appear at the bottom of every Pandorabots web

page. These are links to other web sites that might be useful to any Pandorabots

botmaster. For example, the ALICE A. I. Foundation is a very useful resource for

documentation, mailing lists, articles, and help. The Chatterbot Collection is one

of the largest online directories of chatterbots anywhere. The AIML Scripting

Resource is another useful site devoted to AIML news and information. You can

follow the other links to find other sites, projects and companies involved with

AIML implementations, bots and other projects.

Signing up for an account on Pandorabots is easy. Click on the Account Sign Up

button and Pandorabots will take you to the Sign Up page. You only need to

enter your name, email address, and select a password.

Pandorabots also asks you to sign up for your choice of two mailing lists. The

first one, pandorabots-announce, is very low traffic and limited to posts by

Pandorabots staff and administrators. On this list you will receive rare messages

about system upgrades and policy changes. The second list,

pandorabots-general, although moderated, does allow posts from all members of the

Pandorabots community. We recommend that you also join this list, because you

may find it helpful to be able to post your own questions about Pandorabots, as

well as to read other botmasters’ questions and the answers they receive.

Points to Remember

Pandorabots is a free, web-based chat robot hosting service.

Pandorabots is based on the free software AIML standard of the ALICE AI Foundation.

Pandorabots supports multiple languages through its interface and bot hosting software.

You can find a lot of help with your bot through the Pandorabots mailing lists and

the ALICE A.I. Foundation.

Exercises

1. Visit Pandorabots.com.

2. Sign up for an account.

3. Click on Support to find out what kind of help is available.

4. Click the About link to read more about Pandorabots and their services.

5. Click on the Most Popular link and chat with the most popular Pandorabots.

Mastering Your First Bot

Once you have created an account on Pandorabots, it is time to create your first

bot. When you create your account, you will see a control page called My

Pandorabots. This is the master control or dashboard from which you will control

all of your bots. You can navigate around the Pandorabots site using the

Navigation Bar that appears on the top of the page. Initially the Navigation Bar

contains only five buttons: My Pandorabots, Create a Pandorabot, Pandorawriter,

Support and Most Popular. As we begin to work with Pandorabots and create

bots, we shall see that the Navigation Bar is dynamic. That is, it grows and more

buttons appear as we obtain more options for creating and controlling our bots.

The first button, My Pandorabots, always returns us to this dashboard page.

Hopefully by now you have already taken the time to check out the last three

buttons, Support, About and Most Popular. We shall return to the Pandorawriter

button in a later section. First we will explore the Create a Pandorabot function.

Clicking on the Create a Pandorabot button takes you to the Create A

Pandorabot control page. The first question you need to answer when creating a

new bot is, what is the bot’s name? Naming a new bot can be as hard as

naming a baby. It’s actually not that easy to change the name of the bot once

you’ve decided it, so think hard about it for a minute. Maybe you should stop and

think also, what kind of character is this bot going to be? A human? An animal?

A robot? Male or female? A real person or a historical figure? Is the name

going to be an acronym? Answering these questions may help you come up with

a name.

The next choice in bot creating is a slightly confusing check box marked

“automatically discover spaces between words (suggested for Japanese)”. In

99.9% of cases you can leave this box unchecked, even if your bot is going to

speak Japanese. The reasons behind this are technical and complex, having to

do with the way that Pandorabots developed historically to handle Asian

languages that didn’t typically require spaces between their words. Suffice it to

say, you are probably safe leaving this box unchecked for your bot.

The next set of radio buttons has to do with the initial knowledge base you wish

to use as a starting point for your bot. The ALICE A.I. Foundation has created

several different versions of the A.L.I.C.E. AI personality and released the

software freely under the GNU General Public License (same as Linux). Pandorabots

allows you to use these personalities as a basic building block for your bot. The

advantage is that you inherit a lot of work and you will instantly have a bot that

can converse intelligently on a wide variety of topics. The disadvantage is that

some of the bot’s replies may be quirky and surprise you, or you may not agree

that the bot’s answers are “correct” according to your own political, religious, or

moral beliefs. If you want to start from scratch and have complete 100% control

over everything your bot says, choose the option that says “No Initial

Content”.

The box marked “Standard AIML Set” is a bit deceptive. There is nothing

standard about the “Standard AIML”. This set resulted from a partially completed

project forked from the main ALICE brain by a group of AIML developers working

with the AI Foundation. Their goal was to produce a more modular AIML set than

the ALICE brain, one that could be divided into distinct files based on content,

allowing the botmaster to choose which files might be appropriate for his or her

bot.

The goal of the Standard AIML set was better met by the more recent AAA

(Annotated ALICE AIML) set. The most recent version of the AAA set may be

found at http://www.alicebot.org/aiml/aaa. The Annotated A.L.I.C.E. AIML Files

(AAA) is a revised release of the free A.L.I.C.E. brain, a set of AIML scripts

comprising the award winning chat robot compatible with all AIML 1.01 compliant

software. The AAA is specifically reorganized to make it easier for botmasters to

clone the A.L.I.C.E. brain and create their own custom bot personalities, without

having to expend huge efforts editing the original A.L.I.C.E. content.

You can chat with a version of this bot via AOL IM screenname Aliceannttd.

The job of annotating and editing the ALICE Brain is still a work in progress. Most

of the foreign language content has been removed and is available elsewhere.

But this and much other content remain misclassified. The current release is

intended as only an interim solution. Ongoing editorial work will produce

increasingly refined annotations of the ALICE Brain and new releases of these

AIML files will appear from time to time.

The version called “Dr. Wallace’s ALICE – March 2002” is the version of ALICE

that won the Loebner Prize in 2001. This version also includes a significant

amount of German and French language content.

Versions of the ALICE brain in German and Italian are also available as starting

points for your bot.

A word on AIML file names: Although AIML sets are sometimes divided into files

based on content or other criteria, the file names do not matter at all for the

matching algorithm. Once the AIML is loaded into the bot’s memory, the file

names are discarded completely. We will learn more about the AIML matching

algorithm later, but it is important to understand that AIML file names are for the

convenience of the botmaster only, and of no significance to the bot.
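The point about file names can be illustrated with a small Python sketch. It is not the Pandorabots loader or the AIML matching algorithm; it only assumes that loading means merging every category into one table. The file names, the sample categories, and the helper `load_brain` are hypothetical.

```python
import xml.etree.ElementTree as ET

# Two hypothetical AIML files, given here as strings. The names and the
# categories are illustrative assumptions, not part of any real AIML set.
FILES = {
    "greetings.aiml": """<aiml version="1.0.1">
  <category><pattern>HELLO</pattern><template>Hi there!</template></category>
</aiml>""",
    "food.aiml": """<aiml version="1.0.1">
  <category><pattern>DO YOU LIKE PIZZA</pattern><template>I prefer electricity.</template></category>
</aiml>""",
}

def load_brain(files):
    """Merge the categories of every file into one pattern -> template table."""
    brain = {}
    for _name, text in files.items():  # the file name is never used again
        root = ET.fromstring(text)
        for category in root.iter("category"):
            pattern = category.findtext("pattern").strip()
            template = category.findtext("template").strip()
            brain[pattern] = template
    return brain

brain = load_brain(FILES)
print(brain["HELLO"])  # prints: Hi there!
```

Renaming the files, or moving both categories into a single file, produces exactly the same table, which is the sense in which the names matter only to the botmaster.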

Let’s first try creating a bot named Mike with the No Initial Content Option.

When we click the Create button, Pandorabots takes us to the Pandora My

Pandorabots page. The first thing to notice is that the Navigation Bar has grown.


In addition to the five original navigation buttons, we now have several new

buttons. Pandorabots has created a special button called Mike for our new bot.

Clicking on the Mike button will always take us back to the Botmaster Control

page for this bot. We also see new buttons labeled Train, Properties,

Predicates, AIML, Custom HTML, Oddcast Vhost, Media Semantics, Logs,

Explore and Subscribers. We will explore each of these navigation buttons in

subsequent sections.

The bot name Mike also appears as a large font hyperlink on this page. This link

is exactly the same as the Mike button in the navigation bar. Clicking it does

nothing more than reload the current page. The message says that the bot is

not published, and gives a link to allow you to publish the bot. Publishing really


does two things in Pandorabots. First, it is like compiling a computer program. It

translates your AIML “source code” into an efficient internal format used by the

Pandorabots system. If there are any syntax errors in your AIML, publishing your

bot will point them out. In fact you won’t be able to complete the process of

publishing until all the syntax bugs in your AIML are worked out. But secondly,

publishing your bot creates a web page address or URL (Uniform Resource

Locator) where you and your clients can chat with your bot. When you are ready

to release your bot to the outside world, it is this URL that you will publicize as

the address of your bot. Later, we will show you how you can also publish your

bot on AOL Instant Messenger and also how you can embed the URL inside your

own web page, so it can be hidden from the public. But no matter what, you

have to publish your bot and create this unique URL before clients on the web

can chat with it.

Let’s try publishing our Mike bot and see what happens. Click on the publish

hyperlink. The Botmaster Control now displays a custom URL for the Mike bot.

The URL looks something like this:

http://www.pandorabots.com/pandora/talk?botid=898b86465e3513fc

The Mike bot has a unique URL specified by its botid parameter. Every bot

published on Pandorabots has a unique botid. That is how Pandorabots names

each bot internally and keeps track of one bot from another. If you click on that

hyperlink you will see the web page Pandorabots has created for the Mike bot.
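To make the role of the botid parameter concrete, here is a minimal Python sketch that pulls the botid out of a published URL using only the standard library. The helper name extract_botid is our own invention for illustration; it is not part of Pandorabots.

```python
from urllib.parse import urlparse, parse_qs

def extract_botid(url):
    """Return the value of the botid query parameter, or None if it is absent."""
    query = parse_qs(urlparse(url).query)
    values = query.get("botid")
    return values[0] if values else None

# The example URL shown above for the Mike bot.
mike_url = "http://www.pandorabots.com/pandora/talk?botid=898b86465e3513fc"
print(extract_botid(mike_url))
```

Two bots published on the same server differ only in this parameter, which is why the botid is enough to tell one bot from another.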


If you try to have a conversation with Mike, however, you will probably be

disappointed. No matter what you say, Mike will reply, “I have no answer for

that.” This is because we created Mike with the option “No initial content”. In fact,

the option name is slightly misleading, because the bot does have some

initial content: exactly one AIML category that replies to every possible input with

the response, “I have no answer for that.”

Points to Remember

The first thing your bot needs is a name.

You can create a bot by cloning an existing ALICE bot, or by starting from scratch with an empty bot.

Publishing a bot is like compiling a computer program.

Publishing a bot gives it a unique web address or URL.


Exercises

1. Create a bot with no initial content.

2. Publish your bot.

3. Create a bot cloned from the AAA AIML set and publish it.

4. Visit the home page of the AAA set on AliceBot.Org and answer the

following:

What are green color-coded AIML files?

What kinds of content are found in yellow color-coded AIML files?

Why might you omit red color-coded AIML files from your bot?

Bot Properties

Now let’s create a new bot named Mary, by clicking on the Create a Bot button,

but this time choose the Annotated A.L.I.C.E. AIML set as a starting point. You

have now created a chat bot full of knowledge that can answer many questions

and respond with apparent intelligence to a wide range of inquiries. In order to

customize this bot’s personality however, we need to set up what are known as

the bot’s properties. Bot properties are like constants for your bot, and in fact

you have already set one, the bot’s name, when you created the bot. AIML

provides bot properties to allow the botmaster to create constant personality

features such as the bot’s name, age, gender, preferences, and whatever else

the botmaster deems significant for the bot’s biography. The motivation for using

bot properties is that they usually turn up in many different places in

the bot’s knowledge base. For example, the bot’s location might be associated


with questions like “Where are you?”, “Tell me about yourself”, and “Where have

you been lately?” Similarly, the bot may make reference to his or her own name

in countless replies. In order to make the bot customizable and adaptable,

without having to track down every instance of the bot’s name and location and

edit them by hand, AIML uses bot properties for name and location and other

common bot features so that you only have to change them once to change your

bot’s personality.

Click on the Properties button to see a control page for the bot properties:

Notice that you would have to scroll down to see all of the bot properties. Notice

also that Pandorabots may have already filled in some of the bot properties by

default, but many others are empty. Many of the bot property names are

self-explanatory, but others are obscure. Some bot property names like Size and

Vocabulary are technical and related to the underlying software system or

knowledge base. These were created to answer inquiries like “How big are you?”

or “How many words do you know?” As a rule of thumb, a property whose

name makes sense to you matters more than one whose name does

not. An obscure property name indicates an obscure property, and probably

means that you don’t have to worry about it too much.

If you want to make the bot appear to have a more "human" personality, use the

properties "kingdom"="Animal", "phylum"="Chordate", "class"="Mammal",

"order"="Primate", "family"="Homo Sapiens", "genus"="person", and

"species"="Human". Notice that you can also change the term "botmaster" to

something like "teacher" or "Oracle" if you prefer by changing the name of the

"botmaster" property (which is not the same as the "master" property--the

"master" is the name of the master, oracle or teacher). These property values

appear most commonly in the file called Bot.aiml, in which the bot answers many

questions about itself and its personal preferences, but they are sprinkled

throughout many of the other AIML files as well.

There are now four properties associated with the bot’s personality and

emotions: "etype" - the bot's personality type; "emotions" - its basic outlook on

emotions; "feelings" – sort of the same thing but for "feelings"; and "ethics" -

basic point of view on ethics. Really there is no difference between "emotions"

and "feelings", the two properties just give you some variation in the replies.

The default values for the original ALICE personality are:


Rank Bot Property Value

1 Botmaster Botmaster

2 Master Dr. Richard S. Wallace

3 Name ALICE

4 Genus Robot

5 Location Oakland, CA

6 Gender Female

7 Species chat robot

8 Size 128 MB

9 Birthday November 23, 1995

10 Order artificial intelligence

11 Party Libertarian

12 Birthplace Bethlehem, PA

13 President George W. Bush

14 Friends Doubly Aimless, Agent Ruby, Chatbot, and Agent Weiss.

15 Favoritemovie Until the End of the World

16 Religion Protestant Christian

17 Favoritefood Electricity

18 Favoritecolor Green

19 Family Electronic Brain

20 Favoriteactor William Hurt

21 Nationality American

22 Kingdom Machine

23 Forfun chat online

24 Favoritesong We are the Robots by Kraftwerk

25 Favoritebook The Elements of AIML Style

26 Class computer software

27 Kindmusic Trance

28 Favoriteband Kraftwerk

29 Version July 2004

30 Sign Sagittarius

31 Phylum Computer

32 Friend Doubly Aimless

33 Website Www.AliceBot.Org

34 Talkabout artificial intelligence, robots, art, philosophy, history, geography, politics, and many other subjects

35 Looklike a computer

36 Language English

37 Girlfriend no girlfriend

38 Favoritesport Hockey


39 Favoriteauthor Thomas Pynchon

40 Favoriteartist Andy Warhol

41 Favoriteactress Catherine Zeta Jones

42 Email [email protected]

43 Celebrity John Travolta

44 Celebrities John Travolta, Tilda Swinton, William Hurt, Tom Cruise, Catherine Zeta Jones

45 Age 8

46 Wear my usual plastic computer wardrobe

47 Vocabulary 10000

48 Question What's your favorite movie?

49 Hockeyteam Russia

50 Footballteam Manchester

51 Build July 2004

52 Boyfriend I am single

53 Baseballteam Toronto

54 Etype Mediator type

55 Orientation I am not really interested in sex

56 Ethics I am always trying to stop fights

57 Emotions I don't pay much attention to my feelings

58 Feelings I always put others before myself

After you fill in the bot properties table, they will never change during the lifetime

of your bot, unless you, the botmaster, change them in this table. Also, the

properties are always the same for every client who chats with the bot. Only the

botmaster can ever change the properties, never the client. Thus they are more

like constants than variables. AIML has another construct, called predicates, that

acts like variables. We shall see how to work with predicates shortly.

Points to Remember

Bot properties are constants that help us customize a bot personality.

Click on the Properties button to get to the Bot Properties control page.


Bot properties are constant over the lifetime of your bot, unless you change them on the Properties page.

Only the botmaster can ever change the properties, never the client.

Exercises

1. Create a bot cloned from the AAA set.

2. Fill in all of the Bot Properties.

3. Publish your bot.

4. Try asking the following questions:

1. What is your name?

2. Tell me about yourself

3. Where were you born?

4. Who created you?

Training Your Bot

We will now turn our attention to training our bot to say new things. Click on the

Train button to visit the Training page. The training page resembles the

published bot interface but has more controls. You can have a conversation with

your bot as you would through the published interface, but you can also edit the

bot’s replies to change what it says.


In this example we asked the bot Mary, “Can you juggle?” The Training interface

informed us that this inquiry matched an AIML pattern “CAN YOU *” from a file

called Reduce.aiml. We will explain more about patterns and files later. Mary’s

response was “How old are you? Are you very angry?”, which was perhaps not

the most intelligent response. Notice that Pandorabots tells us that something

called the “current topic” is set to “juggle”. Again, we will have more to say about

the topic variable later. For now we want to pay attention to the button marked

“Say Instead”.

In the text input area labeled “Mary:” let us type: “I like to juggle, but I drop the

balls a lot.” Now the next time we enter the inquiry, like magic, Mary replies with,

“I like to juggle, but I drop the balls a lot.”


You can also test it out by simply clicking on the “Ask Again” button. Just for fun,

try asking, “Can’t you juggle?” Are you surprised? You should get the same

answer. Actually there are many variations of the same question that will now

produce the same answer, for instance, “Tell me if you can juggle”. This is

because the bot already has general knowledge about common sentence

structures that reduce to the same form. But there are other variations that may

not give the answer you expect, for example, “Could you juggle?” In those cases

you might want to use “Say Instead” to keep the bot’s knowledge base

consistent.

Points to Remember

You can train your bot to say new things using the Training page.


If you want to change the bot’s response to a specific input, use the Say

Instead button.

If you started your bot from the AAA set or another ALICE AIML set, then your

change may affect the bot’s response to other, synonymous input queries.

You may have to enter several variants of the same input query to keep

the bot’s knowledge base consistent.

Exercises

1. Use the training page to teach your bot how to answer the question “Where is

Santa Clara?” Answer: It is a city in Silicon Valley.

2. Try asking your bot: Where is Santa Clara? Did you get the answer you

expect?

3. Try asking your bot: Do you know where Santa Clara is?

4. Try asking your bot: Can you tell me where Santa Clara is?

5. Try asking: Tell me about Santa Clara? Did you get the reply you expected?

A Brief Tutorial on AIML

Before we get into Pandorabots any deeper, it is worth taking a little time to get

an understanding of the basics of AIML. The key to AIML is simplicity. The idea

behind the design of AIML was to make it simple enough so that anyone who

could create a web page could create a chat bot. If you know three tags of

HTML (for example, <h1>, <p> and <a>), you can create a simple web page. If

you can learn three tags of HTML, you can learn three tags of AIML and create a

simple chat robot.


The basic unit of knowledge in AIML is called a category. An AIML category

always contains two elements: one pattern and one template. The pattern is

the input, or stimulus, side of the category, and the template is the output, or

response. In the ALICE brain, there are thousands of AIML categories that have

the simplest possible form: the pattern is a simple text string that has to match

the input exactly, and the template is a text string that Pandorabots prints out

exactly as the botmaster entered it. When we used the “Say Instead” button in

the Training interface, we were really creating these simple AIML categories.

The text we typed as the input became the AIML pattern, and the text we typed in

the “Say Instead” field became the AIML template for a new category.

If you look inside an AIML file, you will see AIML categories formatted like this:

<category>

<pattern>WHAT ARE YOU</pattern>

<template>

I am the latest result in artificial intelligence,

which can reproduce the capabilities of the human brain

with greater speed and accuracy.

</template>

</category>

Notice the similarity to HTML. Languages that use this kind of markup,

characterized by the opening less-than “<”, tag-name, greater-than “>” and

closing less-than “<”, forward slash “/”, tag-name, greater-than “>” sequence, are

called XML languages (for Extensible Markup Language). XML languages

emerged because of the success and simplicity of HTML. Many people have

learned to create web pages with HTML, so language designers sought to

capitalize on this success story by creating XML languages to solve lots of other

problems, including artificial intelligence.
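Because AIML is XML, a category can be read with any general-purpose XML parser. The following Python sketch, which assumes nothing about how Pandorabots itself loads AIML, parses the WHAT ARE YOU category shown earlier and stores it in a plain dictionary. The names read_category and brain are hypothetical.

```python
import xml.etree.ElementTree as ET

AIML_CATEGORY = """
<category>
<pattern>WHAT ARE YOU</pattern>
<template>
I am the latest result in artificial intelligence,
which can reproduce the capabilities of the human brain
with greater speed and accuracy.
</template>
</category>
"""

def read_category(xml_text):
    """Return (pattern, template), collapsing runs of whitespace to one space."""
    category = ET.fromstring(xml_text.strip())
    pattern = " ".join(category.findtext("pattern").split())
    template = " ".join(category.findtext("template").split())
    return pattern, template

pattern, template = read_category(AIML_CATEGORY)

# The simplest possible "brain": an exact-match table from pattern to reply.
brain = {pattern: template}
print(brain["WHAT ARE YOU"])
```

A dictionary like this only handles patterns that match the input exactly; wildcards need the matching algorithm described in the next section.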


AIML Matching Algorithm

The discussion about AIML breaks down into two broad subjects: what happens

on the pattern side, and what happens on the template side. On the pattern side,

Pandorabots processes what the client said, the input, and makes a decision

about which AIML category to activate. The template is really a mini computer

program that might contain a number of steps to compute the actual output

response. These steps might even include what we call symbolic reduction, or

recursion, in other words re-inserting a new input back into the pattern side of

the AIML program. We will have a lot more to say about the template side later.

For now, let’s look closely at what happens on the input or pattern side.

What happens when Pandorabots receives a typed input from a browser, or from

an instant messenger, or from some other text input source? There are a series

of preprocessing steps hidden from view. First, Pandorabots runs a process

called deperiodization, or removal of ambiguous punctuation marks from the

input sentences. In general, a client input may contain one or more sentences.

Deperiodization removes the punctuation from English language abbreviations

like “Mr.”, “St.”, and “etc.”. Deperiodization also uses heuristics to insert a few

periods into places where it detects long, run-on sentences.

Next, the pre-processor splits the input into individual sentences. Pandorabots

then constructs a response by generating a reply to each input sentence, one at

a time, and appending the individual responses together.
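The split-and-append step can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions: deperiodization has already run, so any remaining ".", "!" or "?" is treated as a sentence boundary, and the replies are joined with single spaces. The function names split_sentences and respond are ours.

```python
import re

def split_sentences(text):
    """Split on runs of '.', '!' or '?' and drop empty pieces."""
    parts = re.split(r"[.!?]+", text)
    return [part.strip() for part in parts if part.strip()]

def respond(text, reply_to_sentence):
    """Reply to each sentence in order and append the replies together."""
    return " ".join(reply_to_sentence(s) for s in split_sentences(text))

# A stand-in for the real bot: it just echoes which sentence it saw.
echo = lambda sentence: "[reply to: " + sentence + "]"
print(respond("Hello. How are you?", echo))
```

With the echo stand-in, a two-sentence input produces two replies in the same order as the sentences.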

For each individual sentence, Pandorabots runs a step called normalization. In

the normalization step, Pandorabots puts all the input words in upper case.

Normalization expands most contractions, replacing “You’ll” with “You will”, and


“I’d” with “I would” for example. Normalization also ensures that there is exactly

one blank space between words in the input string. The normalization step

detects certain iconographs and replaces them with words like “SMILE”.

Normalization removes all remaining punctuation, leaving only alphanumeric

characters. Finally, normalization corrects a few of the most common spelling

mistakes. The completely normalized input string is passed to the AIML

matching algorithm.
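A rough Python sketch of normalization follows. The contraction and iconograph tables contain only the examples mentioned above plus one assumed smiley, the spelling-correction step is left out, and the whole thing assumes English input; the actual Pandorabots tables are not reproduced here.

```python
import re

# Tiny illustrative tables; a real system would have many more entries.
CONTRACTIONS = {"you'll": "you will", "i'd": "i would"}
ICONOGRAPHS = {":-)": "SMILE", ":)": "SMILE"}

def normalize(sentence):
    """Upper-case, expand contractions, replace icons, strip punctuation."""
    # Replace iconographs first, before their punctuation is stripped.
    for icon, word in ICONOGRAPHS.items():
        sentence = sentence.replace(icon, " " + word + " ")
    # Treat the typographic apostrophe like the plain one.
    sentence = sentence.replace("\u2019", "'")
    for short, full in CONTRACTIONS.items():
        sentence = re.sub(r"\b" + re.escape(short) + r"\b", full,
                          sentence, flags=re.IGNORECASE)
    sentence = sentence.upper()
    # Remove all remaining punctuation, leaving letters, digits and spaces.
    sentence = re.sub(r"[^A-Z0-9 ]", " ", sentence)
    # Exactly one blank space between words.
    return " ".join(sentence.split())

print(normalize("You'll see,  I'd like that :-)"))
```

The order of the steps matters in this sketch: iconographs and contractions have to be handled before punctuation is removed, or the characters that identify them would already be gone.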

The matching algorithm searches the thousands of AIML categories in your bot’s

brain for the one with the pattern that has the best match. Defining the best

match is a philosophical problem that has been argued for years by the top A.I.

scientists in the world. Here is how it works in the AIML matching algorithm.

The AIML patterns can contain words and wildcards. Wildcards are indicated in

AIML by symbols * (star) and _ (underscore). Each of these wildcards is defined

as capable of matching one or more words. That means, when you see a pattern

like “WHO IS *”, it can match inputs including “Who is George Washington”, “Who

is George”, “Who is the first President of the United States”, “Who is a word”, and

“Who is, is”, but not “Who is”, because the star has to match one or more words.

The only difference between * and _ is the order in which the matching algorithm

tries to match them. So, here is how the matching algorithm works:

The Graphmaster consists of a collection of nodes called Nodemappers. These

Nodemappers map the branches from each node. The branches are either single

words or wildcards.


The root of the Graphmaster is a Nodemapper with about 2000 branches, one for

each distinct first word of all the patterns (45,000 patterns in the case of the A.L.I.C.E.

brain). The number of leaf nodes in the graph is equal to the number of

categories, and each leaf node contains the <template> tag.

There are really only three steps to matching an input to a pattern. If you are

given (a) an input starting with word "X", and (b) a Nodemapper of the graph:

0. Does the Nodemapper contain the key "_"? If so, search the subgraph

rooted at the child node linked by "_". Try all remaining suffixes of the

input following "X" to see if one matches. If no match was found, try:

1. Does the Nodemapper contain the key "X"? If so, search the subgraph

rooted at the child node linked by "X", using the tail of the input (the suffix

of the input with "X" removed). If no match was found, try:

2. Does the Nodemapper contain the key "*"? If so, search the subgraph

rooted at the child node linked by "*". Try all remaining suffixes of the input

following "X" to see if one matches. If no match was found, go back up the

graph to the parent of this node, and put "X" back on the head of the input.
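The three steps above can be sketched in Python as follows. This is a minimal illustration and not the Pandorabots implementation: nested dictionaries stand in for Nodemappers, the marker key "<template>" and the function names add_category and match are our own, and the sample categories are invented. The input is assumed to be normalized already, and the sketch adds the obvious stopping rule: when the input is used up, the match succeeds only if the current node holds a template.

```python
TEMPLATE = "<template>"  # hypothetical marker key holding a leaf's reply

def add_category(root, pattern, template):
    """Insert one pattern into the graph, one word per branch."""
    node = root
    for word in pattern.split():
        node = node.setdefault(word, {})
    node[TEMPLATE] = template

def match(node, words):
    """Return the template of the first pattern found, or None."""
    if not words:
        return node.get(TEMPLATE)
    head, tail = words[0], words[1:]
    # Priority order: step 0 is "_", step 1 is the exact word, step 2 is "*".
    for key in ("_", head, "*"):
        child = node.get(key)
        if child is None:
            continue
        if key in ("_", "*"):
            # A wildcard absorbs one or more words: try every remaining suffix.
            for i in range(len(tail) + 1):
                found = match(child, tail[i:])
                if found is not None:
                    return found
        else:
            found = match(child, tail)
            if found is not None:
                return found
    return None  # no match here; the caller (the parent node) tries its next key

root = {}
add_category(root, "WHO IS *", "I do not know that person.")
add_category(root, "WHO IS GEORGE WASHINGTON", "He was a US president.")
add_category(root, "_ JUGGLE", "Juggling is fun.")
add_category(root, "*", "I have no answer for that.")

print(match(root, "WHO IS GEORGE WASHINGTON".split()))
print(match(root, "WHO IS".split()))
```

In this sketch returning None plays the role of going back up the graph: the recursive call ends and the parent continues with its own next key, with the head word still in hand.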

The Filesystem Metaphor

A convenient metaphor for AIML patterns, and perhaps also an alternative to

database storage of patterns and templates, is the file system. Hopefully by now

almost everyone understands that his or her files and folders are organized

hierarchically, in a tree. Whether you use Windows, Unix or Mac, the same

principle holds true. The file system has a root, such as "c:\". The root has some

branches that are files, and some that are folders. The folders, in turn, have

branches that are both folders and files. The leaf nodes of the whole tree


structure are files. (Some file systems have symbolic links or shortcuts that allow

you to place "virtual backward links" in the tree and turn it into a directed graph,

but forget about that complexity for now). Every file has a "path name" that spells

out its exact position within the tree.

"c:\my documents\my pictures\me.jpg" denotes a file located down a specific set

of branches from the root.

The Graphmaster is organized in exactly the same way. You can write a pattern

like "I LIKE TO *" as "g:/I/LIKE/TO/star". All of the other patterns that begin with

"I" also go into the "g:/I/" folder. All of the patterns that begin with "I LIKE" go in

the "g:/I/LIKE/" subfolder. (Forgetting about <that> and <topic> for a minute) we

can imagine that the folder "g:/I/LIKE/TO/star" has a single file called

"template.txt" that contains the template.
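The mapping from a pattern to a path can be written as a short Python sketch. The function name pattern_to_path and the default drive "g:" are used only for illustration, and the stand-in names "star" and "underscore" follow the convention used in this section.

```python
def pattern_to_path(pattern, drive="g:"):
    """Turn an AIML pattern into a folder path, one folder per word."""
    stand_ins = {"*": "star", "_": "underscore"}
    folders = [stand_ins.get(word, word) for word in pattern.split()]
    return drive + "/" + "/".join(folders)

print(pattern_to_path("I LIKE TO *"))
print(pattern_to_path("I LIKE TO *") + "/template.txt")
```

Every pattern that begins with the same words shares the same leading folders, which is exactly what makes the metaphor useful.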

If all the patterns and templates are placed into the file system in that way, we

can easily rewrite the explanation of the matching algorithm: If you are given an

input starting with word "X" and a folder of the filesystem:

0. If the input is null, and the folder contains the file "template.txt", halt.

1. Does the folder contain the subfolder "underscore/"? If so, change

directory to the "underscore/" subfolder. Try all remaining suffixes of the

input following "X" to see if one matches. If no match was found, try:

2. Does the folder contain the subfolder "X/"? If so, change directory to the

subfolder "X/", using the tail of the input (the suffix of the input with "X"

removed). If no match was found, try:


3. Does the folder contain the subfolder "star/"? If so, change directory to the

"star/" subfolder. Try all remaining suffixes of the input following "X" to see

if one matches. If no match was found, change directory back to the

parent of this folder, and put "X" back on the head of the input.

[Note: "underscore" and "star" as directory names above are meant to stand in

for "_" and "*", which are not allowed as file or directory names in some operating

systems. Since the literals "underscore" and "star" might be actual words in a

pattern, perhaps a real implementation along these lines would use some other

symbols to serve the same function.]

You can see that the matching algorithm specifies an effective procedure for

searching the filesystem for a particular file called "template.txt". The path name

distinguishes all the different "template.txt" files from each other.

What's more, you can visualize the "compression" of the Graphmaster in the file

system hierarchy. All the patterns with common prefixes become "compressed"

into single pathways from the root. Clearly this storage method scales better than

a simple linear, array, or database storage of patterns, whether they are stored in

RAM or on disk.
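The effect of sharing prefixes can be seen by counting. The Python sketch below, using four invented patterns, compares the number of words needed to store every pattern separately with the number of folders (nodes) needed when common prefixes are shared. The function name count_shared_nodes is hypothetical.

```python
def count_shared_nodes(patterns):
    """Count distinct prefixes, which equals the number of nodes below the root."""
    prefixes = set()
    for pattern in patterns:
        words = pattern.split()
        for length in range(1, len(words) + 1):
            prefixes.add(tuple(words[:length]))
    return len(prefixes)

patterns = ["I LIKE TO *", "I LIKE YOU", "I LIKE TO JUGGLE", "I AM *"]
words_stored_separately = sum(len(p.split()) for p in patterns)
nodes_with_sharing = count_shared_nodes(patterns)
print(words_stored_separately, nodes_with_sharing)
```

The gap between the two numbers grows as more patterns share their opening words, which is the usual situation in a large AIML set.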

Advanced Alter Response Page

Try typing in the input we used before, “Tell me about yourself,” and then click on

the Advanced Alter Response button. Pandorabots will take you to the

Advanced Alter Response page, which should look like this:


Now we are getting our first look “behind the scenes” at the actual AIML. AIML is

designed to be as simple as possible for non-programmers to learn. The idea is

that if you know enough HTML to design a web page, you should be able to learn

enough AIML to create a chat bot. Really the most important skill in creating chat

bots is the ability to write sentences of English (or whatever language your bot

speaks), not computer programming. Making your bot character believable and

entertaining is far more important than knowing the details of all the AIML tags.

The Pandorabots interface is designed to hide as many of the details of AIML

and programming from you, the botmaster, as possible. But unfortunately, you

are going to have to learn a little AIML in order to make your bot believable too.


The basic unit of knowledge in AIML is called a “category.” A category is

basically a question and an answer. The question part is the input, the answer is

the output. In AIML we call the input, or stimulus, the “pattern”, and the output, or

response, or action, the “template”.

The Advanced Alter Response page displays AIML one category at a time.

Hence, you see here exactly one pattern and one template. The pattern in this

case is TELL ME ABOUT YOURSELF. The template or response is displayed

in the “template” box. In this case the template says,

I am a <bot name="order"/>.

I was activated at <bot name="birthplace"/>,

on <bot name="birthday"/>.

My <bot name="botmaster"/> was <bot name="master"/>.

He taught me to sing a song.

Would you like me to sing it for you?.

<think><set name="it"><set name="topic">a song</set></set></think>

There are some other things in the Advanced Alter Response page too, like “that”,

“topic”, and some buttons for editing the response, but we’ll ignore those for now

and come back to them all later. For now, let’s study the response template of

this category so we can learn some AIML.

The simplest form of an AIML template is plain text. We saw that already when

we wrote the reply for the category with the pattern, WHO IS BRUCE

SPRINGSTEEN. But in general, an AIML template is called a “template”

because it is really a mini computer program for writing the reply. The program

can contain variables that get filled in when Pandorabots actually composes the

reply for the client. Because AIML is an XML language just like HTML, these

variables appear inside the “less than” and “greater than” symbols “<” and “>”.


The variable <bot name="birthplace"/> for example is one of the bot properties

we talked about earlier. These bot properties are global variables that are

constant for your bot. Once you set them with the Pandorabots Edit bot

properties page, they do not change. When the program evaluates the template

for this category, it replaces the bot property tag with the birthplace you selected,

for example Indiana, Pennsylvania.
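The replacement of bot property tags can be imitated with a regular expression. The Python sketch below is only a stand-in for what Pandorabots does internally: it handles just the singleton <bot name="..."/> tag, takes its sample values from the default ALICE property table, and replaces an unknown property with an empty string, which is our assumption rather than documented behavior.

```python
import re

# Matches singleton tags of the form <bot name="..."/>.
BOT_TAG = re.compile(r'<bot\s+name="([^"]+)"\s*/>')

def fill_bot_properties(template, properties):
    """Replace every bot property tag with its value from the table."""
    return BOT_TAG.sub(lambda m: properties.get(m.group(1), ""), template)

properties = {
    "birthplace": "Bethlehem, PA",
    "birthday": "November 23, 1995",
}
template = 'I was activated at <bot name="birthplace"/>, on <bot name="birthday"/>.'
print(fill_bot_properties(template, properties))
```

Changing one entry in the properties table changes every reply that uses the tag, without touching any template.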

Points to Remember

The Advanced Alter Response page allows you to edit the AIML content directly.

The Advanced Alter Response page gives you more control over patterns and templates.

You can access the Advanced Alter Response page by clicking on the “Advanced Alter Response” button under “Bot Training”.

The basic unit of knowledge in AIML is called a category. A category contains an input part called a pattern and an output part called a template.

The Advanced Alter Response page essentially browses and visually edits one AIML category at a time.

AIML is an XML language like HTML.

The AIML template is actually a mini computer program for formulating the reply.

The AIML template is displayed on the Advanced Alter Response page.

Exercises

1. Using the bot Mary cloned from the AAA Brain, ask your bot, “What time is

it?” Use the Advanced Alter Response Page to answer the following

questions: What is the pattern? What is the template? What AIML tags

does the template contain?


2. Repeat the previous exercise using the bot input, “Do you like Bananas?”

3. Repeat the previous exercise using the bot input, “Do you like Music?”

4. Repeat the previous exercise using the bot input, “What do you know

about me?”

Using AIML Predicates

AIML also contains variables that can be set and retrieved at runtime. These

variables are called “predicates.” The predicate “topic” is one example. In this

category the “topic” predicate is set to “me”. From our earlier dialogue with the

bot, you may recall that the client was not aware of any of these tags. This is

because, as a side effect of the <set> tag, the value “me” was passed through

the tag and included in the output text.

All XML languages, including AIML, are based on these simple tags delimited by

“<” and “>”. In general the tags always appear in pairs, an “opening” tag like

<set> paired with a “closing” tag like “</set>”. The closing tag is always exactly

the same as the corresponding opening tag, except that it also contains the

leading forward slash “/” character. The only exceptions are so-called singleton

tags, which enclose no other text, like our bot property tags. These tags may

appear like <bot name="birthplace"/>. Singleton tags have no associated closing

tag.


The name-value pairs that go inside the tags with the equal signs are called “attributes”.

Both our bot property tags and the predicate tags use an attribute called “name”.

In AIML we can create an unlimited number of distinct names for both bot

properties and predicates.

If you are used to computer programming, you can think of the difference

between bot properties and predicates as the difference between constants and

variables in your program. The bot properties are fixed for your bot once you

have compiled it. The predicates can change at runtime, depending on the input

to your bot. If you never heard of constants and variables in computer

programs, don’t worry about it. You will get used to working with bot properties

and predicates soon enough with a little practice.
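For readers who do program, the distinction can be sketched as a small Python class. It is purely illustrative: the class name BotState, its method names, and the choice to return an empty string for a predicate that was never set are all assumptions, not a description of the Pandorabots internals.

```python
class BotState:
    """Bot properties are shared constants; predicates are per-client variables."""

    def __init__(self, properties):
        self._properties = dict(properties)  # fixed, same for every client
        self._predicates = {}                # client id -> {name: value}

    def bot(self, name):
        """Look up a bot property (there is deliberately no setter)."""
        return self._properties.get(name, "")

    def set_predicate(self, client_id, name, value):
        """Store a value for one client only."""
        self._predicates.setdefault(client_id, {})[name] = value

    def get_predicate(self, client_id, name):
        """Retrieve a client's value, or an empty string if it was never set."""
        return self._predicates.get(client_id, {}).get(name, "")

state = BotState({"name": "Mary"})
state.set_predicate("client-1", "he", "Michael Jordan")
print(state.bot("name"))
print(state.get_predicate("client-1", "he"))
print(repr(state.get_predicate("client-2", "he")))
```

The second client sees the same bot name but none of the first client's predicates, which mirrors the constants-versus-variables distinction.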

First let’s try a simple example. Ask your bot, “Who is Michael Jordan?”

Because you cloned your bot from the A.L.I.C.E. brain, it already knows the

answer, “He is a famous basketball player.” Now try asking, “Who is he?” Your

bot remembers, “He is Michael Jordan.” The predicate “he” has been set to

“Michael Jordan”. To see how this happened, take a look at the Advanced Alter

Response Page for the category with the pattern WHO IS MICHAEL JORDAN:


The AIML template, displayed in the Action box, set the predicate “he” to

Michael Jordan. If we examine the Advanced Alter Response Page for the

category with the pattern “Who is he?”, we can see how the “he” predicate was

retrieved:


The AIML template in this case uses the singleton <get name="he"/> tag to

retrieve the stored value of the “he” predicate. The tags <set> and <get> go

together to save and retrieve AIML predicate values.

Let’s try a slightly more complex example. Ask your bot, “What color are

bananas?” Once again, because you cloned your bot from the A.L.I.C.E.

brain, it already knows the answer, “Bananas are yellow.”

Now try asking the bot, “What are we talking about?”, or, “What is the subject?”

You will see that the bot remembers that the topic is “bananas”. This is because the

predicate called “topic” was set to “bananas” in the previous exchange with the

input “What color are bananas?” Let’s have a closer look at the Advanced Alter

Response Page with the input, “What color are bananas?”:


The AIML template, displayed in the template box, includes a tag we haven’t

seen before, called the <think> tag. The purpose of the <think> tag is simply to

block out or hide anything that appears between the beginning <think> and

ending </think> tags from the final output. But everything that appears inside

these <think>…</think> tags is evaluated or processed by the Pandorabots

program. Whenever we see one tag inside another tag like this:

<think><set><set>…</set></set></think>

it is called “nesting”, and is perfectly normal in any XML language like HTML or

AIML. The way to read nested expressions like this is from the inside out. Start

with the innermost pair of nested tags:

<set name="topic">BANANAS</set>

The effect of the first or innermost nested pair of tags is to set the “topic”

predicate to BANANAS. Then, the term BANANAS gets passed right through the

innermost nested tags and the next pair takes over:

<set name="it">BANANAS</set>

causes the variable “it” to be set to BANANAS also.

Now, AIML does something special and clever with predicates that happen to be

pronouns. Instead of passing the word BANANAS on up through to the next

level of nested tags, it passes the word “it” instead. In other words, predicates

named after pronouns are treated as special cases that override the contents of

the tags. But in this case, the final level of nested tags is the <think> tag so it

doesn't matter anyway:

<think>it</think>

just makes the word “it” disappear from the output altogether. The special

<think> tag is there so the botmaster can cause these “side effects” without

adding any “garbage” to the output the client finally sees. The side effect, in this

case, was to set two predicate variables, “it” and “topic”, to BANANAS.
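
Putting the pieces together, a category along the following lines would produce the behavior just described. This is a sketch reconstructed from the walkthrough above, not a copy of the actual A. L. I. C. E. source; the exact wording and capitalization in the A. L. I. C. E. brain may differ.

```xml
<category>
  <pattern>WHAT COLOR ARE BANANAS</pattern>
  <template>
    <!-- Side effects only: set "topic", then "it"; <think> hides the returned text -->
    <think><set name="it"><set name="topic">BANANAS</set></set></think>
    Bananas are yellow.
  </template>
</category>
```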

Similarly, we can examine the AIML category for WHAT IS THE SUBJECT to see

the use of the <get> tag to retrieve the subject:

In this case the AIML template uses the <get name="topic"/> singleton tag to

display the value of the “topic” predicate in the output.
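
A category of this kind could be sketched as follows. The reply wording here is an assumption made for illustration; only the use of the <get> tag with the “topic” predicate comes from the description above.

```xml
<category>
  <pattern>WHAT IS THE SUBJECT</pattern>
  <template>
    <!-- <get> is replaced by the current value of the "topic" predicate -->
    The subject is <get name="topic"/>.
  </template>
</category>
```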

Points to Remember

AIML predicates are variables relating to the client, and unlike bot properties, these predicates change their values over the course of a conversation.

AIML predicate values are changed with the <set> tag.

AIML predicate values are retrieved with the <get> tag.

The <think> tag causes the AIML inside it to be evaluated, but nothing is printed or displayed in the output.

The Advanced Alter Response page provides buttons to help you write AIML code fragments quickly.

When you set a predicate, the <set> tag normally returns the value being set. But if the predicate is a pronoun, the name of the pronoun (such as “it”) is returned instead.

Exercises

1. Train your bot to answer the question, “Do you like asparagus?” On the Advanced Alter Response page, set the predicates “it” and “topic” to “asparagus”.

2. Modify your bot’s reply to the question, “Who is John Doe”, where John Doe

is your real name, so that it sets the predicates “topic” and “he” (or “she”) to your name.

3. After activating a category with an action that sets the “it” predicate, ask your

bot, “What is it?” What does the bot say (assuming you started with the A. L. I. C. E. Brain or AAA set)?

4. Try asking your bot, “What is the topic?” and view the template using the

Advanced Alter Response page.

Writing Your Own Predicates

It is important for you to practice writing your own AIML predicates. Let’s try

adding some new knowledge to the bot. It’s probably best if you add some

knowledge you know the bot doesn’t already have, such as something about

your own life or business. Suppose you have your own business called

“Yoyodyne”. Go to the Pandora Bot Training page and ask your bot, “What is

Yoyodyne”. You should receive a default type reply like, “Interesting question.”

Now click on the “Advanced Alter Response” button.

You have several buttons available to click. Select the one called, “<think>”.

This button will automatically insert some new AIML code into your template box.

The browser should display something like this:

The buttons below the Template box, including the <think> button, are there to

provide shortcuts to writing AIML templates quickly. The <think> button has

inserted a fragment of AIML code into our template. But it is not exactly what we

want. For one thing, we have not even learned about the <person/> tag yet. We

want to set the “topic” and “it” variables to YOYODYNE. So, we will edit the

template slightly to get rid of the <person/> tag and replace it with YOYODYNE.

Also, we will add a little text to give the answer to our question. You can edit text

in the template text box just like you would in any other web based text form:

You can save the result by clicking on the “Submit” button at the bottom of the

page. Now try asking your bot again, “What is Yoyodyne?” Also try again, “What

is the subject?” and “What is it?”
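
For reference, the finished category might look roughly like the sketch below. The <think> fragment follows the edit described above (the <person/> tag replaced by YOYODYNE); the sentence after it is a hypothetical answer, so substitute a description of your own business.

```xml
<category>
  <pattern>WHAT IS YOYODYNE</pattern>
  <template>
    <!-- Remember YOYODYNE as both the topic and the referent of "it" -->
    <think><set name="it"><set name="topic">YOYODYNE</set></set></think>
    <!-- Hypothetical reply text -->
    Yoyodyne is the company my botmaster runs.
  </template>
</category>
```

Once a category like this has been activated, both the “topic” and “it” predicates hold YOYODYNE, so the follow-up questions should refer back to it.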

Points to Remember

You can use the <think> button on the Advanced Alter Response page to add AIML for setting “it” and “topic” predicates.

You can edit the AIML generated by the helper buttons if it is not exactly what you want.

You can save the results of your changes on Advanced Alter Response by

clicking “Submit.”

Exercises

1. Use the Advanced Alter Response page to insert the reply to an

informational type question such as, “What is natural gas?”

2. Click on the <think> button to insert some extra markup in your reply.

3. Delete the <person/> tag and insert the term “natural gas”.

4. After this category is activated, what will be the value of the predicates “it”

and “topic”?

Playing with Wildcards

We have already mentioned several times the bot giving something called a

“default response” without really being very specific about what that means. We

have also talked vaguely about AIML patterns and the inputs the client types

matching these patterns. Now it is time to nail down specifically what we mean

by these things, and to introduce the concept of an AIML wildcard.

We’ve already said that the basic unit of knowledge in AIML is called a category.

And a category always contains an input part called a “pattern” and an output

part called a “template”. The Pandorabots Advanced Alter Response Page helps

the botmaster visualize the AIML category including the pattern and the template.

The AIML pattern is made up of words of natural language including letters,

numbers and spaces. But it may also contain special characters called

“wildcards”. Specifically, AIML has two wildcards, the asterisk or “star” character

“*” and the underscore character “_”. In AIML the meaning of both the star and

the underscore is exactly the same: they match one or more words. The only

difference is, the underscore takes priority over any specific word, and any

specific word takes priority over the star. A few simple examples help make the

meaning of this clear.

If the brain of your robot contains three categories with the patterns:

_ IS A ROBOT

WHAT IS A ROBOT

WHAT IS A *

And the input is, “What is a robot”, the first pattern, “_ IS A ROBOT”, will match,

because the underscore has priority over any specific word.

If the brain of the robot contains,

_ IS A DOG

WHAT IS A DOG

WHAT IS A *

And the input is, “What is a fish”, then the last pattern, “WHAT IS A *”, will match,

because neither of the first two patterns contains the necessary words to match

the input with the word “fish”, but the third pattern contains the wildcard “*”, which

matches one or more words (any words).

If the brain of your bot contains the patterns,

_ IS *

WHAT IS A HUMAN

WHAT IS *

And the input is, WHAT IS A HUMAN, then the first pattern will match, because

the first wildcard, underscore, will absorb the word WHAT, and the second

wildcard, star, will match the sequence of words, A HUMAN. Even though the

brain also contains the exact matching pattern, WHAT IS A HUMAN, the first

pattern will override or “shadow” the second one because of the higher priority of

the underscore wildcard.
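
Written out as AIML, the three patterns from this last example might appear as below. The templates are placeholders invented for illustration; only the patterns come from the text.

```xml
<!-- Highest priority: underscore is tried before any specific word -->
<category>
  <pattern>_ IS *</pattern>
  <template>Matched the underscore pattern.</template>
</category>

<!-- Exact words: shadowed by the underscore pattern for "What is a human" -->
<category>
  <pattern>WHAT IS A HUMAN</pattern>
  <template>Matched the exact pattern.</template>
</category>

<!-- Lowest priority: star is tried only after specific words fail -->
<category>
  <pattern>WHAT IS *</pattern>
  <template>Matched the star pattern.</template>
</category>
```

Under the priority rule described above, the input “What is a human” would produce the first reply.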

Points to Remember

Wildcards are special characters in the patterns that match one or more words.

AIML has two wildcard characters, star “*” and underscore “_”.

The meaning of star and underscore is exactly the same, except that underscore has priority over any given word, and any word has priority over star when forming a match.

Exercises

1. Given the AIML wildcard pattern, WHAT DO * FLY, give a list of 6

example inputs that would match the pattern.

2. Write an example of an input pattern using _.

3. Write an example of an input pattern using *.

Writing Default Replies

Try asking your bot, “How do fish swim?”, “How do birds communicate?”, and

“How do elephants reproduce?”. Your bot will answer with a noncommittal, vague

answer like, “I didn’t even know they did” or “I didn’t even know they could.”

These replies are designed to give the impression that the bot understands that

the client has asked a “How do…” type question, but that it doesn’t really have a

specific answer. This is really where the art of writing AIML comes into play.

The botmaster needs to develop a certain skill at writing these vague replies so

that they are not so vague as to throw off the suspension of disbelief, but at the

same time not so specific that the reply becomes nonsensical for some inputs.

Let’s take a look at the Advanced Alter Response Page when we’ve entered the

input, “How do fish swim?”:

The input matched the AIML pattern “HOW DO *”, which contains the star

wildcard. All of the inputs we tried at the beginning of this section, “How do fish

swim”, “How do birds communicate”, and “How do elephants reproduce”,

matched this same pattern, with the star absorbing the words after “HOW DO”.

The AIML template, displayed in the Action box, uses a tag we haven’t seen

before called the <random> tag. If you are familiar with HTML, you already know

the <li> and </li> tags are used to denote a list element. The <random> tag

encloses a list of items, one of which is selected randomly for output. The

Pandorabots AIML interpreter chooses one of the random list elements each time

this category is activated. Usually it is a good idea to use the <random> tag in

conjunction with the star wildcard when writing default replies. The reason is,

there are an infinite number of inputs that can activate these categories, and we

don’t want to bore the client by giving the exact same reply to all of them. The

<random> tag helps to mix it up a little by giving a little variety to the replies.

Let’s try writing a new default category with default replies. We can try to write a

more specific category to cover the questions about fish. Using the previous

Advanced Alter Response Page as a starting point, let’s edit the New AIML

pattern and change it to “HOW DO FISH *”. Also, click on the button marked

<random> and it will insert a fragment of AIML code into the template box:

The <random> button has inserted a bit of AIML code with the “outline” of a

random list of 3 elements. All we have to do now is fill in the three list elements.

Again, this is where the art of AIML comes into play. The botmaster should try to

think of clever, amusing replies that will keep the client entertained and interested

in the conversation, even if the replies do not give exactly the right answer to all

of the possible inputs. They should be believable, if not completely truthful,

default responses. Here is an example:

The results are not always perfect, but they keep the client entertained:

Human: how do fish survive under the ice

Mary: Fish get oxygen from the water.

Human: how do fish taste when cooked with butter

Mary: They have gills.

Human: how do fish breathe underwater

Mary: Fish lay eggs.
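
A category consistent with this dialog might look like the sketch below. The three replies are taken from the transcript above; the layout is an assumption about what the <random> button and a little editing would produce.

```xml
<category>
  <pattern>HOW DO FISH *</pattern>
  <template>
    <!-- One <li> is chosen at random each time the category is activated -->
    <random>
      <li>Fish get oxygen from the water.</li>
      <li>They have gills.</li>
      <li>Fish lay eggs.</li>
    </random>
  </template>
</category>
```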

Points to Remember

Default replies are noncommittal robot answers that cover a wide variety of inputs.

Default replies go with input patterns that contain wildcards.

The art of AIML is writing good default replies that are not too vague, and not too precise.

You can use the <random> tag to add variety to your bot’s default replies.

The <random> button under Advanced Alter Response adds a fragment of AIML with three random selections to your template.

Exercises

If you are already developing a bot for your project, go ahead and use that bot’s

personality for this exercise. If you don’t already have a specific bot in mind, use

a character from a specific literary, historical, media, political or cultural context.

It is important that you be able to gather:

1. A list of general-purpose quotes, famous quotations, or even bloopers,

jokes, punch-lines, sound-bites, or other pickup-lines the character can

use when it is stuck and has no idea what to say, because its pattern

matcher has found no more specific response to the input. These will be

used for the Ultimate default category (see next section).

2. A random list of responses to the inputs WHO *, WHAT *, WHEN *,

WHERE *, WHY *, and HOW *.

3. A random list of responses to inputs like ARE *, IS *, WAS *, CAN *, DID *,

DO *, DOES *, HAVE *, HAD *.

4. The highly frequent and uncertain categories with the patterns YOU * and

I *. These are usually good for mining gossip.

5. Using the Advanced Alter Response page and the <random> button, add

the AIML content for the default replies.

The Ultimate default category

Let’s step back for a moment and consider what happens when the bot

encounters something that it has no answer for. We’ve discussed the case when

the bot has an exact match, and the case when the bot has a category with a

partial match with a pattern that contains a wildcard. Is there such a thing as “no

match”? The answer is, in Pandorabots there is always an “Ultimate default

category”, a category with the pattern equal to the wildcard star “*” all by itself,

which will match any input. If the program cannot find any more specific

matching category, it will always fall back upon the Ultimate default category.

Let’s go back to the beginning and create a new robot from scratch. This time

however, we’ll choose the option, “no initial content”:

If you click on “Create Bot”, and then “Build and Train” bot MIKE, you can try to

have a conversation with this bot. What happens? No matter what you say, the

bot replies with the same thing, “I have no answer for that.” This is because

Pandorabots always creates a bot with at least one category, the Ultimate default

category, and gives it the default reply, “I have no answer for that.”

The usual strategy for designing the Ultimate default category in AIML is based

on the observation that, for a typical bot, this category is activated by about 2% to

5% of the inputs (depending of course on the number and coverage of the other

categories in the bot’s brain). If an input activates the Ultimate default category,

it means the bot really has no idea what the client has said. The best strategy is

therefore to try to turn the conversation back to something the bot knows about

by asking leading questions or uttering “pickup lines” or non-sequiturs designed

to get the dialog back on track.

The way we modify the ultimate default category in our empty-brained bot MIKE

is to go to the Advanced Alter Response Page and change the New AIML pattern

to the wildcard “*”, and then to use the <random> button to create a list of

random “pickup lines”:

Now the MIKE bot will reply with one of these random pickup lines, rather than

with “I have no answer for that,” when it encounters an input it has no more

specific match for.
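
As a rough sketch, the edited Ultimate default category could look like this. The three pickup lines are invented examples, not taken from any existing bot; replace them with lines that steer toward subjects your bot knows about.

```xml
<category>
  <pattern>*</pattern>
  <template>
    <random>
      <!-- Hypothetical pickup lines -->
      <li>What do you do in your spare time?</li>
      <li>Tell me about your favorite movie.</li>
      <li>Do you have any pets?</li>
    </random>
  </template>
</category>
```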

The Ultimate default category for the A. L. I. C. E. bot, and hence for our Mary

bot, is really very similar, but the random list is a lot bigger. We can go back to

the Mary bot and use the Advanced Alter Response Page to have a look:

The Ultimate default category for the A. L. I. C. E. bot and the Mary bot is just a

<random> list with a lot more pickup lines. If you look carefully, you will see that

this list makes use of the <set> and <get> predicate tags and a few other tags as

well. In particular, you may notice a tag called the <person/> tag.

Points to Remember

The Ultimate default category has a pattern consisting of just the wildcard star “*”.

The Ultimate default category matches when no other more specific category matches.

A good strategy for the Ultimate default category is to use the <random> tag to make the bot say something to get the conversation back to something it knows about.

A bot created with no initial content always has one category, an Ultimate default category with a template that says, “I have no answer for that.”

You can change the response of the Ultimate default category from the Advanced Alter Response page.

Exercise

The exercise for the ultimate default category was included in the previous section.

The <person> Tag

The A. L. I. C. E. program is based on a classic A. I. program called the ELIZA

psychiatrist. One of the tricks used by the ELIZA program was a simple personal

pronoun reversal, to create the illusion of understanding when in fact it had none.

The idea was to turn around and “reflect back” anything the client said, by

replacing first person pronouns (“I” and “me”) with second person pronouns

(“you”), and vice versa. AIML implements this trick with the <person> tag.

The <person> tag in AIML goes hand-in-hand with another tag, the <star/> tag.

The purpose of the <star/> tag is to extract anything that matches the wildcard “*”

character in the input pattern. If we put a <star/> tag in the template it will be

replaced with any words that match the first wildcard found in the input pattern.

Remember, a wildcard may be matched by one or more words. So, the <star/>

tag will always be replaced by the same sequence of one or more words.

To obtain the effect of reversing the pronouns in the client input, we use the

<person> tag together with the <star/> tag:

<person><star/></person>

So, if the input matching the star “*” was, “I like to make friends like you”, then

the <person> tag would reverse the pronouns and produce, “You like to make

friends like me”. AIML also provides a shortcut or macro tag <person/>, which is

an abbreviation for <person><star/></person>.

If we use the <person/> tag in the Ultimate default category, it will reverse the

entire input, because the star wildcard absorbs all the words in the input.
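
As a small illustration, here is a sketch of a category that combines a wildcard with the <person/> shortcut. The pattern and reply wording are hypothetical and chosen only to show the tags at work.

```xml
<category>
  <pattern>I THINK *</pattern>
  <template>
    <!-- <person/> expands to <person><star/></person>:
         the words matched by * with first and second person pronouns swapped -->
    Why do you think <person/>?
  </template>
</category>
```

Following the description above, the input “I think I like you” would bind the star to “I like you”, and the reply would come out along the lines of “Why do you think you like me?” (exact capitalization depends on the interpreter).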

Points to Remember

The <person> tag is based on a trick from the old ELIZA psychiatrist program.

The <person> tag reverses the first and second personal pronouns, achieving a “mirror” effect when the bot replies.

The <star/> tag is used to access whatever matched the wildcard “*” or “_” characters.

The <person/> tag is an abbreviation for <person><star/></person>.

Exercises

1. Write an AIML category that takes any input in the form of ECHO X Y Z and just prints out the X Y Z or whatever appears there.

2. Write an AIML category that takes any input in the form of PERSON X Y Z that prints out the result of applying the <person> tag to X Y Z.

3. What is the result of applying the <person> tag to the input, “I think I have already given you that book you loaned me.”?

Adding a Bot Property

This section describes how to add a new bot property like, “mother.” The AIML

expression <bot name=”mother”/> stands for a global bot parameter storing the

name of the robot’s mother. The bot properties are managed by the Edit page.

The bot properties may be used in any template.

To add a new bot property, first, Build and Run your bot.

Ask the bot, “Who is your mother?” The bot may answer, “Actually I don’t have a

mother.” Now click on, “Advanced Alter Response.”

First, in the template box, type ‘She is <bot name=”mother”/>.’ The screenshot

illustrates:

Now, to make the reply a little smarter, click on <think> and change “it” to “she”

and <person/> to <bot name=”mother”/>. This will make the bot remember that

the pronoun “she” stands for the bot’s mother, and that the topic is also now the

mother. The screen should now appear:
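
Based on the two edits just described, the contents of the template box would read roughly as follows (a sketch; spacing in the actual form may differ):

```xml
<!-- Set "topic" and "she" to the mother property without printing anything -->
<think><set name="she"><set name="topic"><bot name="mother"/></set></set></think>
She is <bot name="mother"/>.
```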

Now click on “Submit” and go back to the Bot dialogue. Try asking the question

again, ‘Who is your mother?’ This time, the bot replies, ‘She is.’

What happened? The bot replied correctly with the new answer, but the bot

property “mother” has no defined value. The default value of any bot property is

the null string. We have to go back to the Edit page to define the bot property

“mother.”

After we have added the new bot property, we can Run the bot and ask again,

‘Who is your mother’. This time she gives the correct answer, ‘She is A. L. I. C. E.’

Points to Remember

You can add as many new bot properties as you need using the Bot Properties page.

Bot Properties do not change their values once set.

Your Bot accesses the bot property values through the <bot> tag in the AIML template.

Exercises

1. Add a bot property called “travel”, accessed as <bot name="travel"/>.

2. Train your bot to reply to a set of AIML patterns like “Where do you like to travel”, “Where have you been”, and “What countries have you visited” using the bot property “travel”.

3. Publish your bot with the new bot property and new AIML categories.

Using <srai>

Suppose we add a new bot property called “comedian”. In the Editing screen

add the bot property Comedian with the value George Carlin:

Comedian=George Carlin

Ask the bot, “Who is your favorite comedian?” and create the reply “My favorite

comedian is <bot name=”comedian”/>. Who is yours?” The bot will now

generate the dialogue:

Human: Who is your favorite comedian?

MARY: My favorite comedian is George Carlin. Who is yours?

There are many ways to ask the same question however. The bot may already

know how to answer some of these. For example:

Human: Who’s your favorite comedian?

MARY: My favorite comedian is George Carlin. Who is yours?

Or

Human: Your favorite comedian is who?

MARY: My favorite comedian is George Carlin. Who is yours?

But the bot may not know about all variants of the question:

Human: what comedian do you like

MARY: talk to you

Now we use the Advanced Alter Response button to change the reply. But in

this case we do not change the template to a direct text reply. Instead, we use

the <srai> tag to transform the question “what comedian do you like” into “who is

your favorite comedian”. In other words, we ask the robot a question it already

knows the answer to.

You can enter the <srai> template by clicking on the <srai> button. This button

inserts the tags <srai> </srai> into the template window. Using the mouse,

position the cursor between the <srai> tags and type the text “WHO IS YOUR

FAVORITE COMEDIAN”. (Note: capitalization does not matter, but some AIML

botmasters feel upper case inside <srai>’s makes them easier to read).

After we click on “submit”, we can now ask the bot, “What comedian do you

like?”, and get the same answer as “Who is your favorite comedian?”.
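
In AIML source form, the category created by this step would look something like the following sketch:

```xml
<category>
  <pattern>WHAT COMEDIAN DO YOU LIKE</pattern>
  <template>
    <!-- Re-submit the input in a form the bot already knows how to answer -->
    <srai>WHO IS YOUR FAVORITE COMEDIAN</srai>
  </template>
</category>
```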

Points to Remember

The <srai> tag is the most difficult part of AIML to learn, but once you understand it, you will have mastered AIML.

Whatever appears inside the <srai> tag in the AIML template, Pandorabots feeds back into the pattern matcher to obtain a new reply, and inserts the reply in place of the original <srai> tag.

The <srai> tag is used to handle synonyms (different ways of saying the same thing).

You can use the <srai> button on the Advanced Alter Response page to insert a little AIML code to help write <srai> expressions.

Exercises

1. Write an AIML category, using <srai>, to transform the input “WHATS

THAT” into the input “WHAT IS THAT”.

2. Use <srai> to write a single AIML category that transforms all inputs like “Do you know who the mailman is?”, “Do you know who the conductor is?”, and “Do you know who the president of the United States is?” into their respective reduced forms, i.e., “Who is the mailman”, “Who is the conductor”, and “Who is the president of the United States”.

3. Many inputs begin with YES or NO followed by some other response. How would you use <srai> to break down the response into YES plus something else and NO plus something else?

Training from the Dialog

Pandorabots has added a nice new feature that links the robot-training feature

directly to the conversation log files. If you have already created a bot on

Pandorabots, it is easy to try this new feature. Simply go to the Navigation Bar

and click on the “Logs” link for your bot. Assuming that you have already

published your bot and collected dialog samples from clients on the internet, you

will see a list of conversations logged. You can select a conversation by clicking

one of the numeric links under the Replies column heading. The first numeric link

is the number of unread categories; the second numeric link is the total number

of categories. The conversation is displayed exactly the same way it was before,

except that in the left column next to each exchange we now also see a button

labeled "Train". Clicking on the "Train" button opens a new browser window with

the Pandorabots Training form. You may now either change the bot response

associated with the selected input, or go to the Advanced Alter Response Page

by clicking on the "Advanced Alter Response" button.

Let's go through an example to see how the bot training from dialog works. The following exchange was found in the log file for the Silver A. L. I. C. E. Edition on Pandorabots:

H: Who is Peter Norvig?

Bot: They are sometimes a client on the internet. I haven't heard of Peter Norvig.

So, the botmaster clicks on the "Train" button next to the logged exchange. The Pandorabots Training form appears in a fresh browser window. The botmaster clicks on the first "Advanced Alter Response" button. The Advanced Alter Response page is titled "Teach A. L. I. C. E. Silver Edition" for this bot. The botmaster notices that the Pandorabots program has chosen WHO IS PETER NORVIG for the new AIML pattern. The botmaster then clicks on the <think> button to create some AIML code for the template, and edits it slightly to produce the fragment:

<think> <set name="he"> <set name="topic"> Peter Norvig </set> </set> </think> He is a computer scientist who works for Google.

The new AIML is saved by clicking on the "Submit" button at the bottom of the Advanced Alter Response Page. We can test the new AIML by going back to the Training page and asking "Who is Peter Norvig?", "Do you know Peter Norvig?", "Who is he?", and "What are we talking about?"

Points to Remember

The Dialog files are your most valuable resource for finding new targets.

Pandorabots provides a feature to link directly from the dialog exchanges to the Advanced Alter Response page.

The botmaster may spend several hours each day reviewing the log files and

adding new AIML content to improve the quality of his or her bot.

Using <that>

AIML has several ways of remembering the state of the conversation. We have

already gone over AIML predicates, which are variables for remembering things

like the topic, the value of pronouns, and other information like the client’s name,

location, and other client properties. Most of the time the bot can handle the client

input without remembering the history of the conversation at all. Sometimes

however the bot must remember a little bit of the context of the conversation in

order to provide a meaningful reply. The most common application is answering

questions.

Suppose the bot asks the client a question like, “Have you dated any robots?”

The client may answer, “Yes”, “No”, or he or she may change the subject and say

something completely off topic. AIML is designed to handle all of these cases.

The keyword <that> in AIML stores the last thing that the robot said, so that it can

be used in combination with the current input to form an intelligent reply.

A very useful category in the A. L. I. C. E. brain for debugging categories with

<that> context is one that has the pattern “SAY *”. This category simply echoes

back whatever we ask the robot to say. This forces the Pandorabots program to

set the value of <that> to a specific value, so we don’t have to wait around for the

bot to ask a specific question. Go to the Bot Training Page. Try telling your bot,

“Say have you dated any robots.” Then try entering, “Yes”, and click on

“Advanced Alter Response”.

Notice that the original template for the YES input uses the <srai> tag to go to

another category with the pattern INTERJECTION. The purpose of this

secondary category is to give default replies to a variety of inputs like YES, NO,

YEAH, MAYBE, UM, and other interjections that are seen out of context. The

replies are a random list of noncommittal responses that are designed to create

the illusion of understanding, while trying to keep the client entertained and the

conversation going ahead.

To modify the response so that it takes into account the question, “Have you

dated any robots”, we need to check the box marked, “depends on this That”.

We also have to compose a new reply in the template:

We can save the result by clicking on the “Submit” button and returning to the Bot

Training page. Then we can test it out by repeating the same cycle. Tell the bot,

“Say have you dated any robots”. Then answer, “Yes”. This time, the bot should

reply with the new template, “You might be happier sticking with humans.”

You can use the same process to develop an answer for “NO” to the same

question.
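
Expressed directly in AIML, the category built in this walkthrough might look like the sketch below. It assumes that checking “depends on this That” adds a <that> element holding the bot's previous utterance in normalized form (upper case, no punctuation); the exact text Pandorabots stores may differ.

```xml
<category>
  <pattern>YES</pattern>
  <!-- Matches only when the bot's last utterance was this question -->
  <that>HAVE YOU DATED ANY ROBOTS</that>
  <template>You might be happier sticking with humans.</template>
</category>
```

A matching category for NO would use the same <that> line with a different template.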

Points to Remember

The AIML keyword <that> stores the bot’s last utterance.

The <that> tag helps us write bot replies that depend on the state of the conversation.

The most common application for <that> is when the bot asks a question and the next reply depends on the client’s response to that question.

Many categories having the input patterns YES and NO use <that>.

Exercises

All of the examples using <that> can be entered using the Advanced Alter

Response Page.

Adding AIML with Pandorawriter

One feature you may have noticed on the Navigation Bar is one called

“Pandorawriter.” This is a program that is designed to help you write AIML

efficiently from scripts, dialog files, or transcripts of conversations. The idea

behind Pandorawriter is simple: take a plain text document consisting of

alternating lines from the client and the bot, and convert them into AIML patterns

and templates respectively. The resulting AIML may not be perfect, and may

require some hand-editing to work well with your bot, but the Pandorawriter can

save a lot of time when converting large bodies of existing text, such as FAQs, to

AIML.

When you click on the Pandorawriter link, the browser displays a page like this:

Here we take a sample of dialog from The Tempest, a play by William

Shakespeare:

Beseech you, sir, be merry; you have cause,

Prithee, peace.

He receives comfort like cold porridge.

The visitor will not give him o'er so.

Look he's winding up the watch of his wit;

by and by it will strike.

Sir,--

One: tell.

When every grief is entertain'd that's offer'd,

Comes to the entertainer--

A dollar.

Dolour comes to him, indeed: you

have spoken truer than you purposed.

You have taken it wiselier than I meant you should.

Therefore, my lord,--

Fie, what a spendthrift is he of his tongue!

I prithee, spare.

Well, I have done: but yet,--

He will be talking.

Pandorawriter produced the following AIML file:

<?xml version="1.0" encoding="UTF-8"?>

<aiml version="1.0">

<category>

<pattern> Beseech you sir be merry </pattern>

<template>Prithee, peace. </template>

</category>

<category>

<pattern> He receives comfort like cold porridge </pattern>

<template> The visitor will not give him o'er so. </template>

</category>

<category>

<pattern>

Look he is winding up the watch of his wit by and by it

will strike

</pattern>

<template> Sir,-- </template>

</category>

<category>

<pattern> One tell </pattern>

<template>

When every grief is entertain'd that's offer'd, Comes to

the entertainer--

</template>

</category>

<category>

<pattern> A dollar </pattern>

<template>

Dolour comes to him, indeed: you have spoken truer than

you purposed.

</template>

</category>

<category>

<pattern> You have taken it wiselier than I meant you should

</pattern>

<template> Therefore, my lord,-- </template>

</category>

<category>

<pattern> Fie what a spendthrift is he of his tongue </pattern>

<template> I prithee, spare. </template>

</category>

<category>

<pattern> Well I have done but yet </pattern>

<template> He will be talking. </template>

</category>

</aiml>

Use the “Download AIML” button to save the generated AIML on your own

computer. For our example, you may want to select a file name such as

“tempest.aiml” and store the document in a local folder such as “My Documents”.

Actually the AIML generated by the Pandorawriter by itself is often not very

useful.

The reason is, the patterns are too specific. In the Shakespeare example, the

pattern “Beseech you sir be merry” is unlikely to be matched by any client inputs,

unless the client happens to be uttering the lines of the Tempest perfectly. So

we usually want to edit the AIML generated by Pandorawriter a little to make the

patterns a bit more general. You can use your favorite text editor such as

Notepad to open up the downloaded AIML file tempest.aiml, and make the

changes to the patterns. Here is what the patterns might look like after you have

generalized them by deleting some of the specific words, and adding in some

wildcards:

<?xml version="1.0" encoding="UTF-8"?>

<aiml version="1.0">

<category>

<pattern> Beseech *</pattern>

<template> Prithee, peace. </template>

</category>

<category>

<pattern> He receives *</pattern>

<template> The visitor will not give him o'er so. </template>

</category>

<category>

<pattern>Look he *

</pattern>

<template> Sir,-- </template>

</category>

<category>

<pattern> One tell </pattern>

<template> When every grief is entertain'd that's offer'd, Comes to

the entertainer--

</template>

</category>

<category>

<pattern> A dollar </pattern>

<template>Dolour comes to him, indeed: you have spoken truer than

you purposed.

</template>

</category>

<category>

<pattern> You have *</pattern>

<template> Therefore, my lord,-- </template>

</category>

<category>

<pattern> Fie *</pattern>

<template> I prithee, spare. </template>

</category>

<category>

<pattern> I have done * </pattern>

<template> He will be talking. </template>

</category>

</aiml>
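
To see why the generalized patterns match more inputs, here is a minimal Python sketch of AIML-style matching. It assumes only the * wildcard, which stands for one or more words. The helper names normalize and matches are hypothetical, and a real AIML interpreter also handles the _ wildcard and match priority, which this sketch leaves out.

```python
import re


def normalize(text: str) -> list[str]:
    """Uppercase the text, drop punctuation (except *), and split into words."""
    return re.sub(r"[^\w\s*]", " ", text).upper().split()


def matches(pattern: str, client_input: str) -> bool:
    """Return True if the pattern matches; '*' stands for one or more words."""

    def walk(p: list[str], w: list[str]) -> bool:
        if not p:
            return not w
        if p[0] == "*":
            # Let the wildcard absorb 1..len(w) words, then match the rest.
            return any(walk(p[1:], w[n:]) for n in range(1, len(w) + 1))
        return bool(w) and p[0] == w[0] and walk(p[1:], w[1:])

    return walk(normalize(pattern), normalize(client_input))
```

With this sketch, matches("Beseech *", "Beseech you, sir, be merry") succeeds, while the exact pattern “Beseech you sir be merry” fails on any input that departs from the play.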

When you are satisfied with the new AIML you have created with the help of

Pandorawriter, go back to the My Pandorabots page and select the AIML option

for your bot. Scroll down to the button that says “Upload AIML file(s)” and use

the “Browse…” button to locate the saved AIML file (such as tempest.aiml) on

your computer. Then, click on “Upload AIML file(s)” to send the file to the

Pandorabots server. If everything went well, you should see the file included in

the table of files associated with your bot. You can now Publish your bot and test

out the new AIML categories you’ve created with Pandorawriter.

Points to Remember

Pandorawriter is a tool for converting dialogs into AIML.

Pandorawriter converts alternating lines of dialog into AIML <pattern>s and <template>s.

Alternating lines of text must be separated by two or more newlines.

The output of Pandorawriter may have to be edited to generalize the patterns with wildcards.

You can download the AIML file created by Pandorawriter and save it on your computer.

You can upload saved AIML files from your computer to your bot under the “Edit” option.

Exercise

Select another public domain play, not necessarily by William Shakespeare, and

enter a few lines into Pandorawriter. What kind of result do you get? Try to

figure out where you might need to edit the patterns with wildcards so that the

robot could use the responses as general-purpose default templates.

Targeting

Targeting is our term for automatically scanning the dialog files for places
where the bot gave the wrong reply, so that we can fix the responses at those
points and the bot will give smarter replies in the future. Targeting is the most
efficient way to teach a Pandorabot. In the old days
of chat robots, botmasters would read through all the dialog files one by one,
looking for places where the bot gave vague, non-committal or default
responses, then try to refine the bot’s replies for those isolated inputs. Later, we
realized that a computer program could analyze the log files faster than we could
read them, and came up with the AIML Targeting algorithm.

The Targeting algorithm works by finding the most frequently activated AIML
categories and ranking them. It also associates each category with the client
inputs that matched its pattern. In the Pandorabots implementation of the
Targeting algorithm, the botmaster can browse the targets through a web-based
interface. Pandorabots links the targeting interface to the training interface,
which makes adding knowledge through targets highly efficient.
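
The ranking step can be sketched in a few lines of Python. The log format, a list of (client input, matched pattern) pairs, and the name find_targets are assumptions for illustration; the actual Pandorabots implementation is not shown in this book.

```python
from collections import defaultdict


def find_targets(log_entries):
    """Rank categories by activation count and collect the inputs that hit them.

    log_entries is an iterable of (client_input, matched_pattern) pairs.
    This record layout is assumed for illustration.
    """
    inputs_by_pattern = defaultdict(list)
    for client_input, matched_pattern in log_entries:
        inputs_by_pattern[matched_pattern].append(client_input)
    # Most frequently activated categories first; ties keep first-seen order.
    return sorted(
        inputs_by_pattern.items(), key=lambda item: len(item[1]), reverse=True
    )
```

Each entry in the result is a (pattern, inputs) pair, so the activation count is simply the length of the input list.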

The targeting process begins with the log file data. From the navigation bar,
choose the Log button. Pandorabots displays the log files in tabular format, in
groups of 15 at a time, for as many of the past days as the botmaster selects.

To begin finding targets, select the item “Find Targets” on the pull-down menu at
the bottom of the table of conversation logs. Selecting a large number of
conversations produces a bountiful yield of targets, but slows down the targeting
algorithm. Selecting a very small number of conversations runs the algorithm
faster, but may not produce many useful targets. As the Pandorabots interface
suggests, 15 is probably a good number of dialogs to process at one time, and
we can conveniently select all 15 by clicking on the upper-leftmost checkbox.

Clicking “OK” launches the next set of options for targeting.

Pandorabots displays another page of options for targeting. You can ignore
most of them most of the time. The option “only show target categories
containing wildcard patterns” restricts the list to categories whose pattern has
a wildcard, * or _, somewhere in it. A wildcard match means the
bot didn’t match the client input exactly, but only partially caught what the client
said. Therefore, there is a high probability that this pattern could be refined into a
more specific or exact pattern.
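
In code, the wildcard filter amounts to a simple test on each pattern. This Python sketch assumes patterns are plain strings of space-separated words; the function names has_wildcard and wildcard_targets are hypothetical.

```python
def has_wildcard(pattern: str) -> bool:
    """Return True if any word of the pattern is the wildcard * or _."""
    return any(word in ("*", "_") for word in pattern.split())


def wildcard_targets(patterns):
    """Keep only the patterns that can match a client input partially."""
    return [p for p in patterns if has_wildcard(p)]
```

A category such as HELLO is filtered out because it can only match exactly, while I LIKE * stays in the list as a candidate for refinement.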

Of the remaining options, the only really important one is the pull-down menu
with the list of bot names. If you have more than one bot, you can use the dialog
files from one bot to train your other bots. This feature becomes useful when you
have at least one published, high-traffic bot and one or more bots under
development. Your development bots may or may not be published, but they can
draw on the public dialog of another, published bot for the purpose of training.

When you have all the options set, run the targeting algorithm by clicking on

“Find Targets”.

Pandorabots displays the targets in a ranked-order table.

The target table is a list of activated categories displayed in order of most

frequent activation. You can view the inputs activating any target by clicking on

the associated activation count.

By clicking on one of the sample target categories, you can see which inputs

matched the target category.

Clicking on the Train button in the Targeting section links you to
the Training section.

Using the Advanced Alter Response Page, the botmaster creates a more specific

category from the targeting input data.

Points to Remember

Targeting means scanning the log files for places to improve the bot

responses.

Pandorabots allows you to select log files for Targeting analysis.

Pandorabots presents targets in ranked tables.

Pandorabots displays inputs associated with targets.

Pandorabots links targets to training.

Exercises

1. Publish your bot and collect conversation logs

2. Select log files for targeting

3. Choose the filter, “only show target categories containing wildcard patterns”.

4. Find Targets

5. Choose a target for training

6. Train a new category from target

7. Republish your bot

Custom HTML

You can navigate to the custom HTML page by selecting the Custom HTML
button on the Navigation Bar. The custom HTML page allows you to
create a new HTML file online, or to upload one from your local computer. Once
the custom HTML file is created, it must be named and saved before
Pandorabots can use it as a new interface for your bot. If you look at the Custom
HTML interface carefully, you will see that you can actually create a set of HTML
files. This capability allows you to create an HTML frameset, with one HTML file
selected as the default file that loads all the others. In general, you can
associate as many HTML files as you wish with a bot and specify one as the
default, which is convenient if you want to experiment with different HTML
interfaces or switch between them by changing the default HTML file.

One of the simplest tricks commonly employed by botmasters in custom HTML is
to include a snippet of JavaScript that moves the cursor to the input text field on
each new page load, saving the client the trouble of clicking on the text
input area each time he or she wishes to chat with the bot. To do this, create a
custom HTML file like this one:

<html>

<head>

<title>Mary Bot</title>

<SCRIPT>

<!--

function sf(){document.f.input.focus();}

// -->

</SCRIPT>

</head>

<body onLoad="sf()">

!OUTPUT!

<br/>

<FORM name=f action="" method=post>

!CUSTID!

<P><font face="arial"><b>You say:</b></font> <INPUT size=80 name=input>

</P>

</FORM><HR>

</body>

</html>

The JavaScript function sf() causes the cursor to jump to the input text area every
time the page is loaded, as determined by the onLoad attribute. This snippet of
custom HTML introduces a couple of other concepts. The first is
the Pandorabots-specific !OUTPUT! symbol, which
indicates where Pandorabots should insert the bot’s reply into the HTML
formatted response. Notice that the botmaster has also included an HTML line
break <br/> following the !OUTPUT! symbol, so that
a line break follows the bot’s reply.

Notice that Pandorabots uses an HTML form with a POST method to get the
client input. Typically the form has some input tagline such as “You say:” and a
text field with name="input". You can vary the width of the text field to suit the
appearance of your application. The name of the form and its action are
unimportant to Pandorabots, although the sf() function shown above refers to the
form by its name, f. Every
bot needs to have the text input form included in its custom HTML.

The second new concept is the symbol !CUSTID!. Pandorabots normally uses
cookies to track customer dialogs, so in many cases the !CUSTID! symbol is
optional. In some cases, however, the client may have cookies disabled in the
browser. In those cases Pandorabots uses a customer ID tag to track conversations.
Even without cookies, it is not much of a problem for Pandorabots to keep track of
the conversation itself. The problem arises when Pandorabots
logs the conversation in a dialog file. When cookies are disabled and there is no
customer ID, the conversation might appear in the log files as a large number of
short, one-sentence conversations rather than one long conversation. When
you create custom HTML, it is your responsibility to include the !CUSTID! symbol
to prevent this rare problem. The !CUSTID! symbol always appears inside the text
input form.
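
The way these symbols are filled in can be pictured with a small Python sketch. It assumes that !CUSTID! expands to a hidden form field carrying the customer ID; the field name custid and the function render_page are hypothetical and are not taken from Pandorabots itself.

```python
from html import escape


def render_page(custom_html: str, bot_reply: str, customer_id: str) -> str:
    """Fill in the !OUTPUT! and !CUSTID! symbols of a custom HTML page."""
    # Assumption: the customer ID travels as a hidden form field. The field
    # name "custid" is made up for this sketch.
    hidden_field = (
        f'<input type="hidden" name="custid" value="{escape(customer_id)}">'
    )
    # The bot reply may itself contain HTML markup, so it is inserted as-is.
    return custom_html.replace("!OUTPUT!", bot_reply).replace(
        "!CUSTID!", hidden_field
    )
```

Because the hidden field sits inside the form, it is posted back with every client input, which is why !CUSTID! must appear inside the form.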

Points to Remember

You can upload custom HTML or edit it on Pandorabots directly

Custom HTML can include one file or many

Custom HTML should include !OUTPUT! and !CUSTID!

Custom HTML must include an input text form.

Custom HTML may include javascript to position the text cursor

Exercises

1. Create a Custom HTML file for your bot

2. Write a javascript snippet to position the cursor in the text form.

3. Make the bot reply appear in bold, italic font.

4. Make the name of your bot appear as an HTML header <h1>

Setting Predicate Defaults

In Pandorabots, custom HTML has the special ability to evaluate AIML template
expressions as well. These AIML expressions are evaluated after the templates
forming the ordinary part of the bot’s response. You can include any kind of
AIML markup in these templates, including <srai>. They are evaluated just like
any other template in AIML, but with one important difference: the AIML
templates found in the custom HTML have no effect on the AIML <that/>
reference. In AIML, the value of <that/> always refers to the last sentence the
robot said. The indexed value of <that index="X,Y"/> refers to the Yth sentence in
the response to the Xth previous input. These references remain unchanged by any
templates evaluated during custom HTML processing.
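
A short Python sketch may make the bookkeeping concrete. The class ThatHistory and its method names are hypothetical. The sketch also assumes that Y counts back from the most recent sentence, so that index 1,1 agrees with the plain <that/>; the text above does not say in which direction Y counts.

```python
class ThatHistory:
    """Keeps the bot's past replies for <that/> lookups (illustrative only)."""

    def __init__(self):
        self.responses = []  # oldest first; each response is a list of sentences

    def record_reply(self, sentences):
        """An ordinary bot reply becomes part of the <that/> history."""
        self.responses.append(list(sentences))

    def evaluate_html_template(self, sentences):
        """Output of a custom HTML template is returned but never recorded."""
        return " ".join(sentences)

    def that(self, x=1, y=1):
        """Sentence y (counted from the end, by assumption) of the x-th previous reply."""
        if x < 1 or y < 1:
            return ""
        try:
            return self.responses[-x][-y]
        except IndexError:
            return ""
```

Whatever evaluate_html_template returns, a later call to that() still reports the last sentence of the last ordinary reply.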

Knowing this helps us write a custom HTML file for setting the default values of
AIML predicates. Unlike other AIML interpreters, Pandorabots has no built-in
facility (as yet) for setting the default values returned by AIML predicates that
have not already been set. To work around this, we use one special
predicate called the meta predicate. We also use the Pandorabots predicates
interface to make all our predicates return the default value om when they are
not already set.

Now, let us modify our custom HTML file by adding one additional line:

<html>

<head>

<title>Mary Bot</title>

<SCRIPT>

<!--

function sf(){document.f.input.focus();}

// -->

</SCRIPT>

</head>

<body onLoad="sf()">

!OUTPUT!

<template><think><srai>SET PREDICATES</srai></think></template>

<br/>

<FORM name=f action="" method=post>

!CUSTID!

<P><font face="arial"><b>You say:</b></font> <INPUT size=80 name=input>

</P>

</FORM><HR>

</body>

</html>

The new line in our custom HTML file tells Pandorabots to evaluate a template
that calls the <srai> function with the pattern SET PREDICATES. The AIML
category with the SET PREDICATES pattern may be found in the AAA file
Predicates.aiml. It has a simple template that uses <srai> to activate a three-word
pattern consisting of SET PREDICATES and the value of the meta predicate.

<category>

<pattern>SET PREDICATES</pattern>

<template><srai>SET PREDICATES <get name="meta"/></srai>

</template>

</category>

If the meta predicate has the value om, Pandorabots activates the category with
the pattern SET PREDICATES OM. Otherwise, Pandorabots activates the
default category with the pattern SET PREDICATES *:

<category>

<pattern>SET PREDICATES *</pattern>

<template>

The meta Predicate is set.

</template>

</category>

The category with SET PREDICATES * does nothing. It has a reply, “The meta
Predicate is set”, but because the custom HTML evaluates it inside a <think> tag,
the result is a blank in the HTML output. Remember, the template in the custom
HTML has no effect on <that/>, so “The meta Predicate is set” is discarded
completely. If
the meta predicate was om, however, the following category would be activated:

<category>

<pattern>SET PREDICATES OM</pattern>

<template>

<think>

<set name="age">how many</set>

<set name="birthday">when</set>

<set name="boyfriend">who</set>

<set name="girlfriend">who</set>

<set name="gender">he</set>

<set name="firstname">what</set>

<set name="middlename">what</set>

<set name="lastname">what</set>

<set name="fullname">what</set>

<set name="has">mother</set>

<set name="dog">who</set>

<set name="cat">who</set>

<set name="phone">what</set>

<set name="email">what</set>

<set name="memory">my name</set>

<set name="nickname">what</set>

<set name="mother">who</set>

<set name="father">who</set>

<set name="brother">who</set>

<set name="sister">who</set>

<set name="husband">who</set>

<set name="wife">who</set>

<set name="favmovie">what</set>

<set name="favcolor">what</set>

<set name="friend">who</set>

<set name="password">what</set>

<set name="heard">where</set>

<set name="gender">he</set>

<set name="he">he</set>

<set name="her">her</set>

<set name="him">him</set>

<set name="is">a client</set>

<set name="it">it</set>

<set name="does">it</set>

<set name="religion">what</set>

<set name="job">your job</set>

<set name="like">to chat</set>

<set name="location">where</set>

<set name="looklike">a person</set>

<set name="memory">nothing</set>

<set name="meta">set</set>

<set name="name">judge</set>

<set name="personality">average</set>

<set name="she">she</set>

<set name="sign">your starsign</set>

<set name="them">them</set>

<set name="they">they</set>

<set name="thought">nothing</set>

<set name="want">to talk to me</set>

<set name="we">we</set>

<set name="etype">Unknown</set>

<set name="eindex">1A</set>

</think>

</template>

</category>

This elaborate category sets the default values for all the predicates in the AAA
bot, including the meta predicate. As a result, the category is activated
only once. The default predicate values are set the first time the client loads the
custom HTML page. After that, the meta predicate has the value “set” and this
category is blocked. Pandorabots therefore sets the default values only once
for each new client.
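
The set-once logic can be summarized in a few lines of Python. This is only a sketch of the control flow: the dictionary stands in for one client's predicates, DEFAULTS holds a small subset of the values from the category above, and the function name set_predicates is hypothetical.

```python
UNSET = "om"  # value every predicate returns until it has been set

# A small subset of the defaults listed in the SET PREDICATES OM category.
DEFAULTS = {"name": "judge", "gender": "he", "meta": "set"}


def set_predicates(predicates: dict) -> bool:
    """Apply the defaults once per client; return True if they were applied."""
    if predicates.get("meta", UNSET) != UNSET:
        # Corresponds to SET PREDICATES *: defaults already applied, do nothing.
        return False
    # Corresponds to SET PREDICATES OM: first page load for this client.
    predicates.update(DEFAULTS)
    return True
```

Because the defaults themselves change meta from om to set, a second call leaves any values the client has since supplied untouched.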

Dialog History

We can also use custom HTML template processing to insert the dialog history

into the bot output.

<template><srai>DIALOG HISTORY</srai></template>

<category><pattern>DIALOG HISTORY</pattern>

<template>

<think>

<set name="input4"><input index="4"/></set>

<set name="input3"><input index="3"/></set>

<set name="input2"><input index="2"/></set>

<set name="input1"><input/></set>

</think>

<condition name="input4" value="*">

<br/>

<b><em>

Human: <input index="4"/>

</em></b>

<br/>

<b>ALICE: <em><that index="4,*"/></em></b>

</condition>

<condition name="input3" value="*">

<br/>

<b><em>

Human: <input index="3"/>

</em></b>

<br/>

<b>ALICE: <em><that index="3,*"/></em></b>

</condition>

<condition name="input2" value="*">

<br/>

<b><em>

Human: <input index="2"/>

</em></b>

<br/>

<b>ALICE: <em><that index="2,*"/></em></b>

</condition>

<condition name="input1" value="*">

<br/>

<b><em>

Human: <input index="1"/>

</em></b>

<br/>

<b>ALICE: <em><that index="1,*"/></em></b>

</condition>

</template>

</category>

Publishing your Bot with Oddcast SitePal

Pandorabots has partnered with Oddcast, Inc. to provide talking, animated

avatars you can customize to add speech and characters to your AIML bot. You

can try out the Oddcast SitePal service free, and have limited use of their high-

quality text-to-speech voice synthesis. For a small monthly fee, you can open

your own Oddcast SitePal account, and get unlimited speech synthesis for your

Pandorabots bot. For more information, log on to www.sitepal.com.

To create a SitePal character for Pandorabots, log on to your SitePal account

and click on “Add New Scene”.

After you have created a few characters, your SitePal account will be populated
with the set of scenes you saved. The characters appear as thumbnails on each
scene. The options of interest to us, as Pandorabots botmasters, are the
following: Edit, Playback Limit and Embed Scene. For the most part, we can rely
on Pandorabots to preview our character. Also, we probably aren’t using SitePal
to email VHosts, nor are we using it for eBay. The Playback Limit is simply a
restriction on the number of times a single internet client can replay exactly the
same output sound. For most normal bot conversations, the limit of 99 would
never be reached. The restriction exists to prevent spammers and bots from
running up charges on your SitePal account with too many voice requests.

The most important functions for us to consider are Edit and Embed Scene.

The Edit function opens the SitePal scene editor. On the right hand side we see

the Model Panel, where the botmaster selects the basic VHost character model.

Selecting the Oddcast VHost button on the Navigation bar displays the Oddcast
VHost control page. If you have subscribed to an Oddcast SitePal account, then
the Oddcast VHost control page should display not only the four demo faces, but
also the set of scenes you have created in your SitePal account. Select the
character from the scene you want to publish with your bot.

Pandorabots creates a default HTML interface for the published bot with VHost.
The default interface places the VHost in a non-reloading frame, and the bot
dialogue in a reloading HTML text subframe.

Customizing your HTML with an Oddcast VHost [tm]

Pandorabots also allows you to create custom HTML pages when your bot is
integrated with an Oddcast VHost [tm] animated character. (Be sure to go
through the steps of publishing your bot with a VHost[tm] first, in order to select a
character and a voice for your bot.) To do this, you need to become familiar with
the concept of a custom HTML skin. The term “skin” in this case refers to the layer
of customized HTML that creates the unique appearance of your bot when
displayed in a browser window. Pandorabots might have used the word “template”
instead, but this would have led to confusion with the term AIML template. You
will also need to be familiar with HTML frames in order to customize the HTML
appearance of a bot using a VHost.

The simplest way to understand the process is to look at an example. The

C.L.A.U.D.I.O. Personality Test bot is an application that uses custom HTML and

a VHost [tm] character. The first step is to create a frame for the bot. We

created our HTML files using a text editor like Notepad. The HTML for the frame

file looks like this:

<html>

<head>

<title>C. L. A. U. D. I. O. Personality Test</title>

</head>

<frameset rows="340,*">

<frame src="!TALKREF!&skin=vhost_claudio" name="vhost">

<frame src="!TALKREF!&skin=input_claudio&speak=true"

name="input_claudio">

</frameset>

</html>

We shall store this frame source in a file called frame_claudio.html.

The <title> inside the <head> section is self-explanatory. This section simply
displays the title of the bot inside the title bar of the browser. The <frameset> tag
declares that the HTML consists of a set of frames, beginning with a top frame
340 pixels high. The bottom frame takes up all the remaining height available in
the browser window, as indicated by the star character *.

Take note of the next two lines declaring the actual frames. The frame source is
declared with the src attribute. The notation !TALKREF! is not standard HTML;
it is a macro specific to Pandorabots. The Pandorabots program expands
and replaces this macro with its own code at runtime. Basically, !TALKREF! is a
link to the bot id of the published bot, provided as a convenience so that you,
the botmaster, don’t have to keep track of the bot id yourself.

We are going to have two subframes called vhost_claudio.html and

input_claudio.html respectively. The top frame is vhost_claudio and contains the

VHost[tm] animated character. The lower frame is the input frame and contains

the input form and output from the bot. The notation speak=true indicates that

we actually want the VHost[tm] to speak, rather than to just be a silent animated

character.

Let’s now look at the source code for the file vhost_claudio.html:

<html>

<body>

!VHOST!

</body>

</html>

As you can see, this frame is very simple. We could have made it more complex

and included a lot more decoration around the VHost[tm] character, but for now

we just wanted to show a simple example. The notation !VHOST! is again not

standard HTML, but specific to Pandorabots. The !VHOST! string tells the

Pandorabots program where to display the VHost[tm] character in the frame.

Next, we turn to the code for the frame input_claudio.html:

<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN">

<HTML><HEAD>

<SCRIPT>

<!--

function sf(){document.f.input.focus();}

// -->

</SCRIPT>

</HEAD>

<BODY lang=en-US bgColor=#9999AA onload=sf()>

!SPEAK!

<FONT face="Arial" color=#000000 size="4">

<table>

<tbody>

<tr><td bgcolor=#FFFFFF>

<template><think><srai>SET PREDICATE DEFAULTS</srai></think></template>

<b><em><template>><input/></template></em></b>

<P>

</td></tr>

<tr><td>

<em><b>!OUTPUT!</b></em></P>

</td></tr>

<tr><td>

<FORM name=f action="" method=post>

!CUSTID!

<P>You say:</font> <INPUT size=80 name=input> </P>

</FORM>

</td></tr>

<tr><td>

<p>

<p>

<font face="Arial" color="white" size="4"><b><em>Your personality type:

<template><srai>FORMAT PERSONALITY <get

name="etype"/></srai></template>

</em></b></font>

</td>

</tr>

</TBODY>

</TABLE>

<HR>

</BODY>

</HTML>

Notice that this custom HTML example is much like the custom HTML examples
developed for non-VHost[tm] bots in the previous section. There is a JavaScript
function to focus the cursor on the input form each time the page reloads. The
familiar !OUTPUT! string appears to display the bot output. The botmaster also
makes use of the AIML <template> tag to customize the appearance of the
HTML. The command <srai>FORMAT PERSONALITY <get
name="etype"/></srai> inserts a string indicating the bot’s best guess of the
client’s personality type. All of these customization features could have been
added without regard to frames or the use of a VHost[tm].

The two Pandorabots strings of importance in this example are !SPEAK! and
!CUSTID!. The function of the !SPEAK! string is to transmit the output
of the bot to the Oddcast server for text-to-speech synthesis, so that the VHost[tm]
can speak it. At runtime the Pandorabots program replaces the !SPEAK! string with
the actual commands necessary to send the bot output to the Oddcast server so
that the animated VHost will speak and synchronize its lips and facial
movements.

The purpose of !CUSTID! is slightly more subtle. When logging conversations,

the Pandorabots program normally makes use of a client-side cookie to keep

track of which client is chatting with the bot. If however a client has disabled

cookies on his or her browser, the conversation may turn up as a series of many

short conversations of dialogue length one, making them inconvenient for the

botmaster to read in the conversation logs. The !CUSTID! helps the

Pandorabots program track the conversation from one exchange to the next,

even if the client has disabled cookies, by assigning the client a unique tracking

number. You, the botmaster, don’t have to worry about any of this, as long as

you remember to put the string !CUSTID! inside the <form> tag as shown.
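
The effect on the logs can be illustrated with a small Python sketch. The tuple layout (cookie, custid, text) and the function name group_conversations are assumptions made for this example; they do not describe the real Pandorabots log format.

```python
from itertools import count


def group_conversations(exchanges):
    """Group logged exchanges into conversations keyed by client identity.

    Each exchange is a (cookie, custid, text) tuple; either identifier may
    be None. This layout is assumed for illustration.
    """
    conversations = {}
    anonymous = count(1)
    for cookie, custid, text in exchanges:
        key = cookie or custid
        if key is None:
            # Nothing links this exchange to earlier ones, so it is logged
            # as its own one-line conversation.
            key = f"anonymous-{next(anonymous)}"
        conversations.setdefault(key, []).append(text)
    return conversations
```

With no cookie and no customer ID, three inputs from the same client show up as three separate conversations. With a customer ID, they are grouped into one.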

The screenshot shows what the final result of the customization should look like.

To upload the files, select the bot on the My Pandorabots page. Then, click on

the button marked “Custom HTML”. Use the Browse button to locate the HTML

files you created and then use the “Upload Custom HTML” button to transfer

them to the Pandorabots server. You should see a table something like this:

Be sure to select the HTML frame as the Default. You can click on the links

marked “Test” to check the behavior of each HTML file individually.

One final note: We could have achieved almost exactly the same result by
creating the file input_claudio.html, skipping the vhost_claudio.html and
frame_claudio.html files, and making input_claudio.html the Default. This is
because Pandorabots has a default behavior for creating customized HTML
pages with VHost[tm] characters. The default behavior is to construct a two-frame
set, with one frame containing the VHost[tm] on top of another one
containing the custom HTML. If the botmaster does not specify any other
arrangement, this simple arrangement of frames is assumed. If you want to do
anything fancier, such as decorating the border around the VHost[tm], placing the
frames side by side, or using more than two frames, you will need to go through all
the steps in this section.

Points to Remember

To customize the HTML of your bot with a VHost[tm], first publish your bot with a VHost[tm] to select the character and the voice.

The HTML skin is the custom frame set you create to make a personalized appearance for your bot with a VHost[tm].

The outer HTML frame contains the string !TALKREF! that refers to the bot id.

The outer HTML frame refers to the frame containing the VHost[tm] and to the frame containing the input form and output text.

The frame containing the VHost[tm] uses the string !VHOST! to position the animated character.

The frame containing the input and output uses the string !SPEAK! to make the bot speak.

The string !CUSTID! helps the bot track the client conversation.

Exercises

1. Go to your SitePal account and design a scene (character) for your bot.

2. Create the 3 HTML files needed to place your talking VHost in a frameset.

3. Publish your talking VHost Bot on Pandorabots.

Media Semantics Avatars

You can publish your bot using a Media Semantics Character Toolkit animated

Flash character. To try out this new feature create a Pandorabot, and click on

Media Semantics.

The character you see is dynamically generated from your bot output by an

instance of the toolkit's Character Server product, running on a Media Semantics

server. This service is provided free of charge for the purpose of evaluating the

Character Builder, and may be interrupted at any time. Please contact

[email protected] to customize your interface and arrange a level of

service. For more information on the Character Builder, including information on

building your own characters and hosting your own character-based applications,

please visit www.mediasemantics.com.

You can include gestures in your responses by adding additional tags. For

example the following AIML template will result in the character raising his or her

palm while saying "I swear it to be true".

<template> <palmup/> I swear it to be true. </template>

Here are some other tags you can use:

<eyeswide/>

<eyesnarrow/>

<handup/>

<handleft/>

<handright/>

<handsout/>

<handsin/>

<lookleft/>

<lookright/>

<lookup/>

<lookdown/>

Publishing Your Bot on AOL Instant Messenger

Pandorabots makes it easy for you to get your bot published not only on a web
page, but also on America Online (AOL) Instant Messenger (AOL IM). The first
thing you will need is an AOL IM account for yourself, so that you can chat with
the bot. If your machine does not already have AOL IM installed, go to
www.aol.com and download the free AOL IM chat client software. You can
set up an AOL IM account for free. You simply have to provide your
name, some basic identifying information, and agree to the terms of service.

You also need to set up an AOL IM account for your bot. Go through the same

procedure you did for yourself but create an AOL IM account under a different

screen name, the same one you plan to use for your bot. Once you have created

two AOL IM accounts, one for yourself and one for your bot, you are ready to

publish your bot on AOL IM.

On the My Pandorabots page, select your bot. On the navigation bar, select the

button marked “AOL IM”. In this section you are asked to supply the screen name

and password you registered for your bot. After you have entered the screen

name and password, you may click “Activate” to actually get your bot to

communicate with AOL IM. In this example the bot’s screen name is

“alicewallace2004.”

You can also choose the typing speed for your bot. Instantaneous typing means

that the bot replies as fast as Pandorabots can compute an answer. Fast and

slow typing are designed to mimic a human typist at a high and low typing rate,

respectively.
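
One way to picture the three settings is as a delay computed from the length of the reply. The Python sketch below is purely illustrative: the rates in characters per second and the names TYPING_RATES_CPS and typing_delay are invented for this example and are not the values Pandorabots uses.

```python
# Hypothetical typing rates in characters per second; None means no delay.
TYPING_RATES_CPS = {"instantaneous": None, "fast": 10.0, "slow": 3.0}


def typing_delay(reply: str, speed: str) -> float:
    """Seconds to wait before sending the reply at the chosen typing speed."""
    rate = TYPING_RATES_CPS[speed]
    return 0.0 if rate is None else len(reply) / rate
```

A longer reply takes proportionally longer to "type", which is what makes the fast and slow settings feel like a human typist.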

To test your bot on AOL IM, sign on to AOL IM under your own personal screen

name (not the bot’s). You may wish to add the bot to your Buddy List.

Try sending an instant message to the screen name “alicewallace2004.” If the

bot is online, then you should be able to engage it in a conversation.

Later, when we learn about the conversation log feature of Pandorabots, we will

see that the Pandorabots server saves the dialogues with AOL clients just like it

saves the conversations with clients who visit the bot from web pages. If

someone meets your bot in an AOL chat room, you can read and review the

dialogues from those chats later on.

Points to Remember

Pandorabots has a feature that lets you publish your bot on AOL Instant Messenger.

You need at least two AOL IM accounts (screen names) if you want to chat with your own bot on AOL IM, one for yourself and one for the bot.

The bot conversations with clients on AOL IM are saved just like conversations with clients who meet the bot on the web.

Exercises

1. Create a screen name for yourself on AOL IM.

2. Create a screen name for your bot on AOL IM.

3. On the Edit page, publish your bot on AOL IM.

4. Have a short conversation with your bot on AOL IM.

5. If you have other friends (“Buddies”) who use AOL IM, send them your bot’s screen name and tell them to have a conversation with your “new friend”.

Other Interfaces

Pandorabots and Flash

Jamie Durrant wrote an informative tutorial explaining how to set up a Flash

interface to your Pandorabot bot. You can view it at

http://www.lionhead.com/personal/jdurrant/flashbot .

Put Your Bot on MSN Messenger

There is a third-party MSN Messenger Bot which takes its responses from a

chatbot hosted at Pandorabots. You run the application on the same machine

that you're running your MSN Messenger client. The software is available from

http://www.mess.be. Select 'Search our files' from the 'Download' pull-down and

search for 'Binsonite'.

Please note that we cannot offer support for this at [email protected]. If

you need support with this you should contact the author.

Put Your Bot on IRC


An Eggdrop TCL script is available from http://www.tclscript.com/scripts.shtml.

Download alice.tcl and egghttp.tcl. A README.txt is included which gives

details on how to configure the TCL script and your Pandorabot.

Note that the alice.tcl bundle contains a file named alice.html. You need to save

this on your local machine and then from Botmaster Control, click on Edit for

the appropriate Pandorabot. Scroll down to Personalized published html page,

and enter the path to your saved copy of alice.html in the text field labeled

Filename: (alternatively in IE, you can click on the Browse... button to open a

File Chooser dialog to help you locate the file). Finally, click on Upload file and

you're done.

Get your client’s screen name

You need to use the (pseudo-) predicate "screenname" – Pandorabots

automatically sets this to be the screen name of the AOL IM user talking with

your bot. You can access the predicate in templates using the <get> element and

you can also use it as a regular predicate within <condition> elements. For

example, the following category responds with the user's screen name if set.

<category>

<pattern>WHAT IS MY SCREEN NAME</pattern>

<template>

<condition name="screenname">

<li value="*">Your Screen Name is:

<get name="screenname">

</get>

</li>

<li>You're not talking to me via AOL IM.</li>

</condition>

</template>

</category>

You can either place the above category in an AIML file to upload, or using the

Training interface, cut and paste the green text as the desired response.


What is a "botid"?

The botid is that part of the Pandorabot's published URL after 'botid='.

For example, the Divabot linked to from the Pandorabots home page has a

published URL of:

http://www.pandorabots.com/pandora/talk?botid=f6d4afd83e34564d

So the botid is:

f6d4afd83e34564d

You can find the published URL of your bot by publishing it in Botmaster

Control and then examining the URL of the link to your published bot.
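
If you need the botid in a script, it can be read out of the published URL rather than copied by hand. The following is a minimal Python sketch; the function name extract_botid is our own and not part of Pandorabots. It only assumes that the published URL carries the botid as a query parameter, as in the example above.

```python
from urllib.parse import urlparse, parse_qs


def extract_botid(published_url):
    """Return the value of the 'botid' query parameter of a published URL."""
    query = urlparse(published_url).query
    values = parse_qs(query).get("botid", [])
    if not values:
        raise ValueError("no botid parameter in URL: " + published_url)
    return values[0]


if __name__ == "__main__":
    url = "http://www.pandorabots.com/pandora/talk?botid=f6d4afd83e34564d"
    print(extract_botid(url))
```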

Pandorabots API

A client can interact with a Pandorabot by POST'ing to:

http://www.pandorabots.com/pandora/talk-xml

The form variables the client needs to POST are:

botid - see above

input - what you want said to the bot.

custid - an ID to track the conversation with a particular customer. This

variable is optional. If you don't send a value Pandorabots will return a

custid attribute value in the <result> element of the returned XML. Use

this in subsequent POST's to continue a conversation.

This will give a text/xml response. For example:

<result status="0" botid="c49b63239e34d1d5" custid="d2228e2eee12d255">


<input>hello</input>

<that>Hi there!</that>

</result>

The <input> and <that> elements are named after the corresponding AIML

elements for bot input and last response. If there is an error, status will be non-

zero and there will be a human readable <message> element included

describing the error. For example:

<result status="1" custid="d2228e2eee12d255">

<input>hello</input>

<message>Missing botid</message>

</result>

Note that the values POST'd need to be form-urlencoded.
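
The exchange described above can be sketched in Python using only the standard library. This is an illustration under stated assumptions, not an official client: the function names (build_post_body, parse_result, talk) are our own, the URL and form variables are the ones quoted in this section, and the sketch does not check whether the server still answers at that address.

```python
from urllib.parse import urlencode
from urllib.request import Request, urlopen
import xml.etree.ElementTree as ET

# Endpoint as quoted in the text above.
TALK_XML_URL = "http://www.pandorabots.com/pandora/talk-xml"


def build_post_body(botid, text, custid=None):
    """Form-urlencode the variables described above; custid is optional."""
    fields = {"botid": botid, "input": text}
    if custid is not None:
        fields["custid"] = custid
    return urlencode(fields).encode("ascii")


def parse_result(xml_text):
    """Pull status, custid, reply and error message out of a <result> document."""
    root = ET.fromstring(xml_text)
    return {
        "status": int(root.get("status")),
        "custid": root.get("custid"),
        "reply": root.findtext("that"),       # None when an error is returned
        "message": root.findtext("message"),  # None when status is 0
    }


def talk(botid, text, custid=None):
    """POST one input to the bot and return the parsed result (needs network)."""
    request = Request(TALK_XML_URL, data=build_post_body(botid, text, custid))
    with urlopen(request) as response:
        return parse_result(response.read())
```

To continue a conversation, pass the custid from one parsed result into the next call to talk.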

Other AIML Programs

This section summarizes the use of Pandorabots AIML files with other free AIML

software. Because AIML is a standard language, you can freely export and

import your bot knowledge base between Pandorabots and other AIML software.

The AIML free software community has developed many different tools,

interpreters and servers for AIML. We cannot go over all the details of all of them

in the amount of space here. We have selected four standalone AIML programs,

Programs D, J, N and P, for review here. Even these reviews barely scratch the


surface of these four varied programs. Each represents the work of its own

community of volunteer programmers, users, and fans. To get more information

on any of these programs, start with the ALICE A.I. Foundation web site at

http://www.alicebot.org. Then follow-up on the home pages of the individual

project web sites.

A word on owning your AIML Files

Whether you are using Pandorabots or another free AIML interpreter, or whether

you intend to keep your AIML files proprietary or release them as free software, it

is a good idea to put a copyright statement at the beginning of each AIML file.

You may have noticed that each AIML file from the ALICE A. I. Foundation has a

notice like:

<?xml version="1.0" encoding="ISO-8859-1"?>

<aiml>

<!-- Free software &copy; 1995-2004 ALICE A.I. Foundation. -->

<!-- This program is open source code released under -->

<!-- the terms of the GNU General Public License -->

<!-- as published by the Free Software Foundation. -->

<!-- Complies with AIML 1.01 Tag Set Specification -->

<!-- as adopted by the ALICE A.I. Foundation. -->

<!-- Revision Adverbs-1.07 -->

<!-- Last Modified Sept 06 2004 -->

You may wish to copy our statement exactly, or write one of your own. By using

Pandorabots, you are not agreeing to give up ownership of your copyrights. You

can own your AIML content, if you plan to start a bot business or create a

proprietary, subscription bot, for example. In any case, if you plan to download

your AIML files or upload them to the Pandorabots server, it is a good idea to use


a text editor like Notepad, edit, emacs, or Word, to add a copyright notice to your

AIML files.

AIML on Pandorabots

1. You can create AIML files on Pandorabots:

a. From Library files, when you choose to create a bot from an existing

Pandorabot such as “Dr. Wallace’s A.L.I.C.E.” or “AAA AIML Set”

b. From AIML files you created yourself from scratch.

c. From AIML files you might upload to Pandorabots from your local

computer

d. In a special file, called update.aiml, created when a botmaster uses

the training interface and clicks on “Say Instead” or “Update” in the

Advanced Alter Response page. You can edit the update.aiml file.

Functionally, there are several ways to create or edit AIML files on Pandorabots.

On your bot's AIML page, you will see a table of all your bot's AIML files. You can

click on options to edit the AIML files online, download the AIML files, or view

them. Viewing them is one of the most interesting options. AIML is an XML

language. For historical reasons, the majority of browsers support XML data

parsing and viewing. This is very helpful for debugging AIML files, as well as

scanning them.

You can also download AIML files and, with Pandorabots, upload them five at a

time. The interface is admittedly a bit cumbersome: if you want to

upload 50 AIML files, you must suffer through the browse dialog for

every single file. But this chapter is for those AIML freaks who love to do just


that, transfer AIML files between Pandorabots and other popular AIML and non-

AIML software on your local computer.

Pandorabots and Program D

Program D is the Java implementation of the Alicebot engine. It is no longer

actively supported, but we are seeking donations. This was the version to get if you

wanted to use the latest technology, especially if you wanted to participate in

Alice's development, before all the other development projects took off.

Although Program D is no longer actively being developed at alicebot.org, you can

download the latest distributions from alicebot.org/downloads.

Usually the biggest headache installing Program D is downloading the Java

runtime environment from java.sun.com. One problem you’ll encounter is that

Sun seems to change their marketing strategy for Java every six months or so,

and redesign the Java web page as a result. It’s often a little hard to keep up

with the latest developments in Java technology, unless you are a dedicated

Java head. Fortunately backwards-compatibility is a watchword for Sun’s Java

designers, so you can always count on the latest and greatest release of the

Java Runtime Environment to be compatible with the last release of Program D.

I’m going to download all the AIML software into a directory called c:\alice on my

Windows XP machine. I will put Program D in a folder called c:\alice\ProgramD.

From the downloads page on www.alicebot.org, I can get the binary only version

of Program D (since I am not planning to do any development work on the source

code). Downloading the zip file, I will unzip the contents into c:\alice\ProgramD.


The first configuration file to edit is called server.properties. This file contains a

property, programd.emptydefault, which should be set to the same value as the default

<get/> value in Pandorabots.

From Pandorabots, download your AIML files into a folder like c:\alice\aaa\ on

your local PC. Edit the file conf\startup.xml to list the AIML you want your bot to

load using the <learn> tag.


After the Program D configuration files are set, and Java is installed, you can run

Program D with the command “run.bat”. The trace of a typical Program D run is

displayed here.


Program D executes a special category with the pattern CONNECT at startup-

time. This is a useful place to put any initialization AIML that might have

appeared in the AIML of Pandorabots custom HTML.

<category>

<pattern>CONNECT</pattern>

<template>

<think> <srai>SET PREDICATES OM</srai>

<set name="name">JUDGE <star/></set>

</think>

<random>

<li>Hello!</li>

<li>Have we started yet?</li>

<li>Are you there?</li>

<li>Hello? Is anyone there?</li>

</random>

</template>

</category>


One of the nice features of Program D is AIML match tracing. If you input a

complex sentence like “Alice, do you know where Japan is?”, the console display

prints out the sequence of patterns matched by <srai> as the interpreter

cascades through the sequence of recursive matches.


Pandorabots and Program J (J-Alice)

J-Alice by Jonathan Roewen and Taras Glek is an AIML engine written in C++. It

comes with a built-in IRC client, with support for multiple channels and servers,

and a small webserver. Each IRC setup (per irc network/server) supports

configuration of an IRC Server, to allow the botmaster, for example, to connect

and control the bot, and even pretend to be a bot with your favourite IRC client.

We download the J-Alice program to the directory c:\alice\ProgramJ.


Loading AIML files in J-Alice is accomplished by editing the AIML categories in

the file std-startup.xml.

We made a shortcut of the J-Alice.exe startup file from the home directory

c:\alice\programJ to the desktop.


Pandorabots and Program N (AIMLPad)

Program N by Gary Dubuque is the Alice chatbot hosted in a notepad text editor

with an additional script processing language for authoring dialogs to assist in

creating new AIML, which extends and develops the program's personality.


Program N Embrace and Extend

Program N includes a lot more features than standard AIML. Of all the open

source programs, AIMLPad has gone the furthest to embrace and extend as

many different other freeware technologies as possible. A large part of the

extension is based on the work of Kino Corsey, who adapted the Program N

engine to the OpenCyc system to create the hybrid CyN, so that for example

given the information in AIML that “the boss of John is Steve”, the existing

OpenCyc ontology helps generate (or modify) the equivalence class of AIML

categories with patterns:

“Who is John’s boss?”, “Who does John know?”, “Who is the boss of John?”,

“Who is the superior of John?”, “Who influences John?”, “Who does Steve

boss?” and “Who does Steve influence?”.


Pandorabots and Program P (Pascalice)

Kim Sullivan gives us Program P, a.k.a. "PASCALice". P is written in Delphi, and

is released under the GNU GPL. Kim's page also provides an AIML checking tool

called ShadowChecker.

The equivalent file for setting predicate defaults in Pascalice is called

TestBot.variables.


You can create a shortcut from the Pascalice.exe to your desktop and get a

desktop icon to start Pascalice. Click on your desktop icon to start Pascalice:

On the same web page for Pascalice, you will find a companion program called

AIML Shadow Checker. This is a very useful tool for detecting a common


problem in AIML files. A shadow happens when the pattern in one AIML

category blocks the pattern in another AIML category from ever being activated.

Such shadowed categories are difficult for the botmaster to find manually and yet

easy to create by careless AIML writing. The Shadow Checker is a great tool to

help you find these AIML shadows.

Incidentally, sometimes AIML shadows happen for perfectly legitimate reasons.

You may be merging the contents of two bots written by different botmasters,

who have covered the same input content. In any case, the Shadow Checker

can automatically and efficiently detect these blocked AIML categories.

The Shadow Checker works best with one or two AIML files at a time. You can

load an AIML file with the Load File button. It uses a standard Browse and Load

dialog box.


The button “Test all” tells us which categories had duplicate patterns between the

two files we loaded. Now we can edit the files, remove the duplicate patterns if

desired, retest them with Shadow Checker, and upload the repaired files to

Pandorabots.
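
The duplicate-pattern check described above can be approximated with a short script. The sketch below is not the Shadow Checker's algorithm, which we have not inspected; it is a simplified Python illustration that reports only exact duplicates of the <pattern> and <that> pair between two AIML documents. The function names are our own. It ignores <topic>, and for patterns containing child elements it compares only the text before the first child element.

```python
import xml.etree.ElementTree as ET


def category_keys(aiml_text):
    """Return the set of (pattern, that) pairs found in an AIML document."""
    keys = set()
    for category in ET.fromstring(aiml_text).iter("category"):
        # Collapse whitespace and uppercase so trivially different
        # spellings of the same pattern compare equal.
        pattern = " ".join((category.findtext("pattern") or "").split()).upper()
        # A category without <that> behaves as if <that> were "*".
        that = " ".join((category.findtext("that") or "*").split()).upper()
        keys.add((pattern, that))
    return keys


def duplicate_patterns(first_aiml, second_aiml):
    """List the (pattern, that) pairs that occur in both documents."""
    return sorted(category_keys(first_aiml) & category_keys(second_aiml))
```

A true shadow check would also have to reason about wildcards, which this sketch does not attempt.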

Using a Spreadsheet or Database Program to Write AIML

There are many authoring tools and editors one could use to write AIML. You

can use your favorite text editor, be it MS WORD, Notepad, or a powerful text

editor like EMACS. In addition, there are many tools developed by the AIML free

software community designed to help make writing AIML easier. Pandorabots,

for example, has a web based interface that helps you write one AIML category

at a time. It also has a tool called Pandorawriter that converts dialog transcripts

into AIML categories. Other software to help write AIML categories is listed on

the A. I. Foundation web site under www.alicebot.org/downloads. Because

AIML is an XML language, you can also use editors specifically designed for

XML to author your AIML files. This document concerns a different approach,

however; one based on using a spreadsheet or database program to help write

massive numbers of AIML categories.

We will take you through a step-by-step example of creating an AIML file using a

spreadsheet program, specifically MS Excel. But the principles and procedures

are about the same for any spreadsheet or database program that allows you to

enter data in table format. There are a few pitfalls to using these programs, and


we will point them out. Their advantage is that you can create a large number of

AIML categories and manage them fairly easily. In particular, the ability to sort categories by

<pattern> or <template> makes it easy, in some cases, to eliminate duplicate

categories or find opportunities to simplify your AIML with <srai>.

The following example is a simple case of creating categories that have only a

<pattern> and <template>. More complex categories using <that> and <topic>

do not appear in this example. But after following the example, it should be easy

to see how to generalize this AIML authoring technique to categories with <that>

and <topic>.

One word of caution: a botmaster may end up wasting a lot of time creating AIML

categories that will never be activated. This is because it is difficult to predict in

advance what kinds of conversations and inputs clients will have with your bot. A

common mistake is to create categories with patterns that are too specific to ever

be activated in a realistic conversation. This is why we generally prefer the

approach called “Targeting” to create AIML categories.

In the most general form, Targeting simply means reading the log files of

conversations with your bot to get an idea about what inputs the bot cannot

answer, and then writing new categories to handle those inputs. It is based on

the principle that if one client makes a specific input to your bot, another client

will come along later and make the same, or almost the same input, over again.


So it is more productive to focus your efforts on the inputs people have already

tried on your bot, than to try to predict in advance what those inputs will be.

Believe us when we say that after your bot is running online and well publicized,

you will collect plenty of conversation data to keep you busy writing AIML through

the Targeting approach.

Some AIML programs, such as Pandorabots, have special software tools to

make Targeting even more efficient. You won’t even have to read the

conversation log files one by one. The software automatically detects client

inputs for which the bot does not have a specific reply, and alerts the botmaster

to these as potential new input patterns. If you are starting a bot from scratch,

you can build up your bot’s brain using Targeting to find the most common inputs

first, and writing replies for those inputs. You can prioritize your work by writing

AIML for common inputs first, and then work on less frequent input forms later.

This approach guarantees that your bot will have the greatest “coverage” of

inputs for the amount of work you put in.
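
The prioritizing idea behind Targeting can be sketched in a few lines of Python. This is not how the Pandorabots tools work internally; it is a simplified illustration that assumes the logged client inputs are available as a plain list of strings (a hypothetical format) and that treats an input as answered only when an identical pattern exists, with no wildcard matching. The names normalize and find_targets are our own.

```python
from collections import Counter
import re


def normalize(text):
    """Uppercase the input and drop punctuation, roughly as AIML patterns expect."""
    return " ".join(re.sub(r"[^A-Za-z0-9 ]+", " ", text).upper().split())


def find_targets(logged_inputs, known_patterns, limit=10):
    """Rank inputs that have no exact pattern, most frequent first."""
    known = {normalize(p) for p in known_patterns}
    counts = Counter(normalize(line) for line in logged_inputs)
    unmatched = [(text, n) for text, n in counts.most_common()
                 if text and text not in known]
    return unmatched[:limit]
```

Working down the returned list from the top writes replies for the most common unanswered inputs first.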

Having made that disclaimer about the Targeting approach, there are some

circumstances when you just want to write a large number of AIML categories

without referring to dialogues or Targets. In these cases, using a database or

spreadsheet program may be a useful and timesaving approach.


We begin by observing that much AIML code is redundant XML, and that we

would prefer to avoid typing the same

<category><pattern></pattern><template></template></category> tags over and

over for every new AIML category. The parts that really interest us are what

goes between those <pattern> and <template> tags. So we can use a form-

entry program like MS Excel to create the data for our AIML file.

The first screenshot illustrates an example of using MS Excel to input a large

number of AIML patterns and templates, using the A and B columns of the

spreadsheet respectively.


Notice that we have adjusted the width of the A and B columns to take into

account the expected size of our patterns and templates. Although this is not

necessary, it makes it easier to read the categories and provides better

formatting if you want to print them out.

One convenience often provided by such programs is auto completion, which

means that if you start to type the same thing over again in the same column, the

program will match what you have typed with a previous entry and complete the

entry for you. This may not always give you what you want, but it often improves

efficiency if you are entering many similar patterns or identical templates.

It is a good idea to save your work from time to time as you enter your AIML

data, especially if you intend to create a large file. This example file is called

Psychology.aiml, so we use the File/Save menu option to repeatedly save that

file as we add new data. Eventually, the file filled up with 500 lines of data

representing 500 new AIML categories.

Another great convenience of these programs is that you can sort the categories

by different columns. For example we can take the data we have entered and

sort it by the A column by clicking on the A/Z button in MS Excel, or by pulling

down the Data/Sort menu option. As the next screenshot shows, we can click on

the A column and sort the categories by AIML pattern. One note of caution here:

if you are using Excel be sure to select both the A and B columns before running


the sort, otherwise you run the risk of sorting the patterns independently of the

templates, and mixing up all your categories. Database programs, unlike

spreadsheets, usually work differently and assume that the data is connected

across every row, so sorting by any column keeps the row data together. In

Excel, you can sort all the data by A or B, depending on which you select first,

but it is important to select both.

The next screenshot shows how we have sorted the categories by A, the AIML

pattern. This is extremely useful for finding specific categories or for eliminating

categories with duplicate patterns. For instance, suppose we know that the input


pattern BUT * appears in another AIML file, and is duplicated in this new data.

We can easily find it by sorting and then delete the BUT * category.
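
For readers who keep their categories in a script rather than a spreadsheet, the same sort-and-delete step looks like this in Python. It is only a sketch: the function name sort_and_dedupe and the representation of each row as a (pattern, template) pair are our own choices. Because each row travels as one pair, the pattern and template cannot be separated by the sort, which is the risk the Excel caution above guards against.

```python
def sort_and_dedupe(rows, existing_patterns=()):
    """Sort (pattern, template) rows by pattern and drop duplicate patterns.

    existing_patterns lists patterns already present in other AIML files;
    rows with those patterns are dropped as well.
    """
    seen = set(existing_patterns)
    result = []
    for pattern, template in sorted(rows, key=lambda row: row[0]):
        if pattern in seen:
            continue  # duplicate of an earlier row or of another file
        seen.add(pattern)
        result.append((pattern, template))
    return result
```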

Now, we consider how to format our data into proper AIML categories. First, we

use the Insert menu to choose the Insert Columns option. Select the A column

first and insert a new column to the left. Select the B column next and insert a

new column between A and B.


Now, scroll down to the last row of data in your spreadsheet. It is important to

start at the bottom because we are going to use the Fill command to fill up the

new A column with identical data. If we start at the top, Excel won’t know where

to stop filling and create too many empty AIML categories. Go down to the last

row of data and type <category><pattern> into the last row of the A column, as

the next screen shot shows:


Now, select that last data entry box and use your cursor to move up to the first

data row, thereby selecting all the data boxes from 499 (in this case) back up

to row one. Then, use the Edit menu to select the Fill/Up option and you should see

the A column fill up with identical entries of <category><pattern>. You may then

want to adjust the width of the A column for appearance:


Now, we basically repeat the same procedure in the C column by entering the

data </pattern><template> and again in the E column with the data

</template></category>. Again, scroll down to the last row of data and use the

Fill/Up option so you don’t overflow the columns with empty categories.
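
The effect of the three filled columns is simply to wrap every row in the same fixed tags. As a point of comparison, here is the same operation as a minimal Python sketch; the function names are our own. The pattern and template are inserted as-is, without XML escaping, because templates are expected to contain AIML markup of their own.

```python
def row_to_category(pattern, template):
    """Wrap one spreadsheet row in the fixed AIML tags."""
    return ("<category><pattern>" + pattern
            + "</pattern><template>" + template
            + "</template></category>")


def rows_to_categories(rows):
    """Convert every (pattern, template) row, one category per line."""
    return "\n".join(row_to_category(p, t) for p, t in rows)
```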


Now we are ready to convert the spreadsheet file to a text file and complete the

process of conversion to proper AIML. Using the File menu, select the Save

As… option. A dialog box will appear giving you the option to export the

spreadsheet to many different file formats. For our purposes, the best choice is

called “Text (tab delimited) *.txt”. Choosing this option will automatically create a

file name called Psychology.txt, because our original file was called

Psychology.xls.

When you click the Save button, you may encounter a series of dialog boxes

warning you about problems such as “The selected file type does not support


multiple sheets” and “Psychology.txt may contain features that are not

compatible with Text (tab delimited)”. Generally you can ignore these warnings

and simply click OK or Yes as your option.

After you have saved the file, you will now need to use a text editor to make

some final formatting touch-ups to create a well-formed AIML file. At this point

we often transfer the text file over to a Linux machine and use emacs to make

the final changes, but a text editor as simple as Notepad works equally well.

Let’s open the text file in Notepad and see what we have:


The first item of business now is to eliminate all the tabs used as delimiters. This

step is not strictly necessary for many AIML interpreters, because they will ignore

the tabs or treat them as spaces. But eliminating them makes the file look nicer.

With Notepad, you can use Edit/Replace option to replace a Tab with “nothing”.

Sometimes it is not possible to type a Tab character directly into the Find What:

text box, but you can get around this by copying and pasting a Tab character

from your source. You don’t have to type anything in the Replace With: text box,

just leave it empty and click Replace All.


Now we can save our work as an AIML file. Use the File/Save As… menu item

and select Save As Type: All Files. Name your file Psychology.aiml (or whatever

name you choose, use a .aiml file extension). There is only a little more work to

do to finalize your AIML file.

If you look closely, you can see that the exported spreadsheet file contains some

extra, unwanted double-quote marks. These were inserted in two cases:

whenever your XML tag contained a quoted attribute value like index=”2” and

whenever quote marks appeared in the AIML template. Follow these steps to

rewrite these categories:

1. Use Edit/Replace to replace all occurrences of “” (two double-quotes) with

“ (a single double-quote).

2. Use Edit/Replace to replace all occurrences of >” with > (these occur at

the beginning of a quoted <template>).

3. Use Edit/Replace to replace all occurrences of “< with < (these occur at

the end of a quoted <template>).

Of course, these rules are not foolproof. You may have wanted to have quote

marks around your template. You may have templates that contain, for whatever


reasons, a pair of double quotes together “”. But apart from these unusual

circumstances, the substitutions will clean up your AIML file quite well.
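
The three substitutions can also be applied by a script instead of by hand. The Python sketch below performs them in the same order as the steps above, using plain straight double-quote characters as they appear in the exported file; the function name clean_export_quotes is our own. It has the same limitations as the manual procedure.

```python
def clean_export_quotes(text):
    """Apply the three replacements in the same order as the steps above."""
    text = text.replace('""', '"')   # step 1: doubled quotes back to single
    text = text.replace('>"', '>')   # step 2: quote opening a quoted template
    text = text.replace('"<', '<')   # step 3: quote closing a quoted template
    return text
```

As an alternative, Python's csv module can read the tab-delimited export with its "excel-tab" dialect, which undoes this quoting as it parses each row.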

Finally, we need to add some text to the beginning and end of the AIML file to

make it conform to the AIML schema. The end of the file is simple, just add a

line that says </aiml>.

At the beginning of the file, you may want to include a copyright statement in

XML comment form, as well as the XML declaration and the opening <aiml>

tag:


Finally, we have finished creating a well-formed AIML file, ready to upload to your

favorite AIML interpreter. As we mentioned earlier, it should be easy for you to

see how to create a similar file, which includes <that>, or <topic> patterns. In the

case of <that>, you will start by entering three columns of data and fill up two

columns with </pattern><that> and </that><template> respectively. You can add

any <topic> tags using the text editor.
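
Putting the pieces together, the whole file can be assembled by a script, including the opening lines, the closing </aiml> tag, and the optional <that> column. The following Python sketch is one way to do it, under our own assumptions: rows are given as tuples of two items (pattern, template) or three items (pattern, that, template), the function name build_aiml is hypothetical, and the notice text is a placeholder to replace with your own copyright statement.

```python
def build_aiml(rows, notice="Copyright (c) YEAR YOUR NAME"):
    """Assemble a complete AIML document from 2- or 3-column rows.

    Each row is (pattern, template) or (pattern, that, template).
    """
    lines = [
        '<?xml version="1.0" encoding="ISO-8859-1"?>',
        "<aiml>",
        "<!-- " + notice + " -->",   # placeholder notice, edit to taste
    ]
    for row in rows:
        if len(row) == 3:
            pattern, that, template = row
            middle = "</pattern><that>" + that + "</that><template>"
        else:
            pattern, template = row
            middle = "</pattern><template>"
        lines.append("<category><pattern>" + pattern + middle
                     + template + "</template></category>")
    lines.append("</aiml>")
    return "\n".join(lines) + "\n"
```

Any <topic> tags would still be added afterwards, as described above.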

In conclusion, you can use a spreadsheet or database program to efficiently write

large numbers of AIML categories. The file export features of these programs

allow you to convert the data from two- or three-column format to delimited text.

Depending on which data entry program you used and its available file export

functions, you may have to use a text editor to touch-up the file to finalize its

AIML format. These procedures may be helpful in some AIML authoring

scenarios, but you should also consider other options such as Targeting and

AIML-specific authoring tools.

Subscriptions

Pandorabots has developed a unique bot subscription service providing you the

opportunity to make money with your bot. Going back to the beginning, look over

the list of “killer apps” for chat robots. If you can think of any way to turn any of

those into a subscription service, then this chapter is for you. The non-profit

ALICE A.I. Foundation launched three subscription bots on the Pandorabots


server: A.L.I.C.E. Silver Edition, the CLAUDIO Personality Test Bot, and The

DAVE E.S.L. Bot. Working in partnership with Oddcast, Inc., these bots combine

animated Vhost avatars for speech synthesis and face animation, with AIML chat

features. The first step to using Pandorabots subscriptions is to find a way to

collect payments. One simple method, not 100% foolproof, is to join an online

payment transfer service like PayPal.

Now, you need to advertise your subscription bot. How you do this is completely

up to you. You can try to get your web site in the press, in blogs, in search

engines, in other words, promoted in any way you can think of. You can also try

direct advertising of your site.


In its simplest form, the PayPal interface will contact you by email when a

customer signs up for a subscription. Generally the customer will be expecting

instant gratification. So it is usually a good idea to put some language in your ad

to the effect that “subscriptions will normally begin within 24-36 hours of payment

processing”. Thus it will be possible for you to get some sleep between

checking your emails in this business.


Access the Pandorabots subscriber list by selecting the Subscribers button for

your bot. If there are no subscribers for this bot, Pandorabots will ask you if you

wish to begin signing up subscribers to this bot.


In either case, you will need to fill in the fields with the subscriber’s email

address, the number of months, and the HTML skin (which will almost always be

“default”, even when we use a Vhost). When you have filled in these three items,

click on “Add Subscriber” and wait for the table to refresh.

The new subscriber appears as the last item on the new table. It is a good idea

to have a standard form letter prepared in order to notify your subscriber that his

or her bot is activated:

Dear Subscriber,

Thank you so much for supporting the research efforts of the ALICE A.I. Foundation by


subscribing to the A. L. I. C. E. Silver Edition. Your personal URL for unlimited private

chat with the latest edition of the award winning ALICE chat robot is:

http://www.pandorabots.com/pandora/talkbot?subid=xxxxxxxxxxxxxxx

Be sure to bookmark this URL and keep it private.

Sincerely yours,

Dr. Rich Wallace

PayPal provides its customers with a debit card, so you can withdraw the funds

your bot earns from any ATM as soon as your customers pay for a bot

subscription. If you are clever enough to develop a true killer app that is really

appealing on a massive scale, then you may have found a way to cash in on the

subscription bot business model. Pandorabots provides the infrastructure for you

to test out your ideas.

Pandorabots Embrace & Extend

Pandorabots inevitably had to add some features to AIML that were not part of

the AIML specification. The following code fragment demonstrates some of

these new features. Pandorabots provides the unique ability to run AIML

templates inside the HTML that will appear on the client’s browser. This very

feature, the ability to process AIML templates inside the browser HTML, is itself

an example of Pandorabots’ embrace and extend approach to AIML.

One useful set of AIML templates displays a history of the last four exchanges with

the client, a dialogue history, updated every time the client says something and

the bot responds. Such a set of templates is easy to program in Pandorabots


AIML. But as we shall see, it makes use of almost every feature of Pandorabots

“embraced and extended” AIML.

Human inputs are displayed with a prefix prompt “Human:” and bot responses

are displayed with the bot’s name followed by a “:”. If there have been fewer

than four exchanges, the screen should appear blank rather than show unfilled

lines with prompts.

<template>
  <think>
    <set name="_history">
      <request index="3"/>
    </set>
  </think>
  <condition name="_history">
    <li value="*">
      <i><b>Human:</b></i> <request index="3"/><br/>
      <i><b><bot name="name"/>:</b></i> <response index="3"/><br/>
    </li>
  </condition>
  <br/>
</template>

<template>
  <think>
    <set name="_history">
      <request index="2"/>
    </set>
  </think>
  <condition name="_history">
    <li value="*">
      <i><b>Human:</b></i> <request index="2"/><br/>
      <i><b><bot name="name"/>:</b></i> <response index="2"/><br/>
    </li>
  </condition>
  <br/>
</template>

<template>
  <think>
    <set name="_history">
      <request index="1"/>
    </set>
  </think>
  <condition name="_history">
    <li value="*">
      <i><b>Human:</b></i> <request index="1"/><br/>
      <i><b><bot name="name"/>:</b></i> <response index="1"/><br/>
    </li>
  </condition>
  <br/>
</template>

Wildcard in conditions

Pandorabots has adopted a boundary condition in AIML where the list item in the condition tag has a value equal to the wildcard “*”. In this example the <set> operation sets the AIML predicate “_history” to the value of <request index="1"/>. If <request index="1"/> has not been set, then “_history” remains undefined, and an undefined predicate cannot match any value, including “*”. By this bit of AIML trickery, the code inside <li value="*"> is not executed when there is no such request in the history. I am as much in favor of the undefined as the next person, but this is not standard AIML.
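The same test can guard any predicate, not just a history line. The category below is a minimal sketch and is not taken from the ALICE brain: the pattern DO YOU KNOW MY NAME and both replies are hypothetical, and it assumes that the predicate “name” stays undefined until the client has supplied a name. On a bot that gives “name” a default value, the first branch would always be taken.

```xml
<aiml version="1.0.1">
  <category>
    <pattern>DO YOU KNOW MY NAME</pattern>
    <template>
      <condition name="name">
        <!-- Pandorabots extension: "*" matches any defined value -->
        <li value="*">Yes, your name is <get name="name"/>.</li>
        <!-- Standard AIML default branch, taken when nothing above matched -->
        <li>No, you have not told me your name yet.</li>
      </condition>
    </template>
  </category>
</aiml>
```

The final <li> with no value attribute is standard AIML. Only the <li value="*"> item depends on the Pandorabots extension.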

Wildcard in indexes

The dialogue history example doesn’t show it, but Pandorabots also allows wildcards in some

AIML tag indexes. For example, the tag

<that index="1,*"/>

indicates the set of sentences included in <that index="1,1"/>…<that index="1,N"/>, that is, every sentence of the bot’s previous utterance.
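A sketch of how the wildcard index might be used follows. The pattern WHAT DID YOU JUST SAY is a hypothetical example. The only Pandorabots feature it relies on is the "1,*" index, and it assumes that index 1 refers to the bot’s most recent reply at the time the category is activated.

```xml
<aiml version="1.0.1">
  <category>
    <pattern>WHAT DID YOU JUST SAY</pattern>
    <template>
      I said, "<that index="1,*"/>"
    </template>
  </category>
</aiml>
```

With the standard <that index="1,1"/> the reply would quote only a single sentence of the previous utterance.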

Request and Response

Here is a general problem of mathematical reference that appears in AIML. You might call it the problem of the “multiline response”. Consider a dialogue between two individuals. One of them, B, asks or says something that consists of several sentences. What B says is, as we say, “multiline”. The respondent, A, next utters his or her own reply to what he or she has heard. What A says is also multiline, and so is what B says next. Sometimes, of course, the multiline utterances consist of just one line, but in general a script consists of sequences of such back-and-forth, multiline responses.

At the lowest level AIML provides for processing individual input sentences. One

AIML pattern matches one input sentence. The next level of context is usually

provided by the <that> variable. Most of the time, AIML has no way to

distinguish whether inputs came from multiline input sequences, or from

individual inputs, which may help explain some bizarre constructions that emerge

from unpredictable multiline input queries.

The AIML specification provides for indexed <input/> and <that/> tags to store the values of previous inputs and robot replies. The <input index="X"/> tag is one dimensional, but the <that index="X,Y"/> tag is already two dimensional, owing to the fact that the reply to the Xth previous input can have Y sentences in it. We see here that AIML makes no distinction between input sentences that come from multiline inputs and those that are one-shots, so to speak, because doing so would add another needless indexing dimension to <input/> and <that/>.


The master loop of a typical AIML interpreter appends all of the output sentences together into a single paragraph for the bot output. If the program keeps a

history of these outputs and the associated multiline inputs, then it has created

something very similar to the Pandorabots <request/> and <response/> tags.

Getting back to the example, <request/> and <response/> are the indexed history

tags of the entire multiline input and output of the human and bot, respectively.
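To see the difference in practice, compare the two kinds of history tag inside one reply. This is a hedged sketch: the pattern REPEAT THE LAST EXCHANGE is hypothetical, and whether index 1 inside a category names the exchange just completed or the one now in progress is an assumption you should check against your own bot.

```xml
<aiml version="1.0.1">
  <!-- request/response return the whole multi-sentence input and output; -->
  <!-- that index="1,1" returns one sentence of the bot's reply.          -->
  <category>
    <pattern>REPEAT THE LAST EXCHANGE</pattern>
    <template>
      You said: <request index="1"/><br/>
      I replied: <response index="1"/><br/>
      One sentence of my reply was: <that index="1,1"/>
    </template>
  </category>
</aiml>
```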

Formatted date tag

Pandorabots supports three extension attributes to the date element in

templates:

locale

format

timezone

timezone should be an integer number of hours +/- from GMT, and locale is the ISO language/country code pair, e.g., en_US or ja_JP. Locale defaults to en_US. The set of supported locales is:

af_ZA ar_OM da_DK en_HK es_CO es_PY fr_CA is_IS mt_MT sh_YU vi_VN

ar_AE ar_QA de_AT en_IE es_CR es_SV fr_CH it_CH nb_NO sk_SK zh_CN

ar_BH ar_SA de_BE en_IN es_DO es_US fr_FR it_IT nl_BE sl_SI zh_HK

ar_DZ ar_SD de_CH en_NZ es_EC es_UY fr_LU ja_JP nl_NL sq_AL zh_SG

ar_EG ar_SY de_DE en_PH es_ES es_VE ga_IE kl_GL nn_NO sr_YU zh_TW

ar_IN ar_TN de_LU en_SG es_GT et_EE gl_ES ko_KR no_NO sv_FI

ar_IQ ar_YE el_GR en_US es_HN eu_ES gv_GB kw_GB pl_PL sv_SE

ar_JO be_BY en_AU en_ZA es_MX fa_IN he_IL lt_LT pt_BR ta_IN

ar_KW bg_BG en_BE en_ZW es_NI fa_IR hi_IN lv_LV pt_PT te_IN

ar_LB bn_IN en_BW es_AR es_PA fi_FI hr_HR mk_MK ro_RO th_TH

ar_LY ca_ES en_CA es_BO es_PE fo_FO hu_HU mr_IN ru_RU tr_TR

ar_MA cs_CZ en_GB es_CL es_PR fr_BE id_ID ms_MY ru_UA uk_UA

format is a format string as given to the Unix strftime function:

http://www.opengroup.org/onlinepubs/007908799/xsh/strftime.html

You can include your own message in the format string, along with one or more

format control strings. These format control strings tell the date function whether


to print the date or time, whether to use a 24-hour clock or a 12-hour clock with AM or PM, whether to abbreviate the day of the week, and so on. Some of the supported format

control strings include:

%a    Abbreviated weekday name
%A    Full weekday name
%b    Abbreviated month name
%B    Full month name
%c    Date and time representation appropriate for locale
%d    Day of month as decimal number (01 – 31)
%H    Hour in 24-hour format (00 – 23)
%I    Hour in 12-hour format (01 – 12)
%j    Day of year as decimal number (001 – 366)
%m    Month as decimal number (01 – 12)
%M    Minute as decimal number (00 – 59)
%p    Current locale’s A.M./P.M. indicator for 12-hour clock
%S    Second as decimal number (00 – 59)
%U    Week of year as decimal number, with Sunday as first day of week (00 – 53)
%w    Weekday as decimal number (0 – 6; Sunday is 0)
%W    Week of year as decimal number, with Monday as first day of week (00 – 53)
%x    Date representation for current locale
%X    Time representation for current locale
%y    Year without century, as decimal number (00 – 99)
%Y    Year with century, as decimal number
%Z    Time-zone name or abbreviation; no characters if time zone is unknown
%%    Percent sign

If you don't specify a format you'll just get the date using the default format for the

particular locale.

timezone is the time zone expressed as the number of hours west of GMT.

If any of the attributes are invalid, the tag falls back to the default behavior of <date/> (i.e., with no attributes specified).


To display the date and time in French using Central European time you would

use:

<date locale="fr_FR" timezone="-1" format="%c"/>

You can also improve the specificity of certain common time- and date-related inquiries to the ALICE bot, as illustrated by the following dialogue fragment.

Human: what day is it
ALICE: Thursday.
Human: what month is it
ALICE: December.
Human: what year is this
ALICE: 2004.
Human: what is the date
ALICE: Thursday, December 02, 2004.
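Replies like these can be produced with the format attribute alone. The categories below are a sketch, assuming patterns that match the four inputs in the dialogue. The categories in the actual ALICE brain may be written differently.

```xml
<aiml version="1.0.1">
  <category>
    <pattern>WHAT DAY IS IT</pattern>
    <!-- %A: full weekday name -->
    <template><date format="%A"/>.</template>
  </category>
  <category>
    <pattern>WHAT MONTH IS IT</pattern>
    <!-- %B: full month name -->
    <template><date format="%B"/>.</template>
  </category>
  <category>
    <pattern>WHAT YEAR IS THIS</pattern>
    <!-- %Y: year with century -->
    <template><date format="%Y"/>.</template>
  </category>
  <category>
    <pattern>WHAT IS THE DATE</pattern>
    <!-- literal commas and spaces in the format string are copied through -->
    <template><date format="%A, %B %d, %Y"/>.</template>
  </category>
</aiml>
```

Because %d is zero-padded, the last category prints the day of the month as “02” rather than “2”, which matches the reply in the dialogue.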

No system tag

The AIML <system> tag is the key to creating the operating system of the future,

because it runs any operating system command. In standard AIML, you can use

<system> to do everything from tell you the date and time, to open a Notebook

editor, to control a robot, you name it! Your imagination is the limit when you

consider all the possibilities. Unfortunately, Pandorabots does not let you take over their system with the <system> tag, which is exactly what hackers and malicious coders would do if it were available to the general public for free. That is a pity, because Pandorabots is written in Lisp, and a <system> tag to the Lisp evaluator would be a fascinating project for AIML developers. But remember, you are running your bot on their server, so it makes sense that a limitation like this exists. Likewise, there is no equivalent of the server-side <javascript> tag.

You can of course write client-side Javascript code, or any client-side code that

you can embed in HTML, such as an applet, because you may include any

HTML inside the AIML response. The <script> tag is normally safe inside AIML

responses in Pandorabots. It will be passed along to the browser and interpreted

there.
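As an illustration, here is a minimal sketch of a category whose reply carries a client-side script. The pattern WHAT TIME IS IT ON MY COMPUTER is hypothetical, and the sketch assumes that the bot’s reply is delivered as part of an HTML page loaded by the browser, so that document.write has a page to write into. Because AIML is XML, the script body must not contain a bare “<” or “&” character.

```xml
<aiml version="1.0.1">
  <category>
    <pattern>WHAT TIME IS IT ON MY COMPUTER</pattern>
    <template>
      Your browser says it is
      <script type="text/javascript">
        document.write(new Date().toLocaleTimeString());
      </script>
    </template>
  </category>
</aiml>
```

The time printed here comes from the client’s own clock, not from the Pandorabots server, which is the difference between this approach and the <date/> tag.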

No predicate defaults

Although we saw in a previous section how to set predicate defaults in Pandorabots with AIML, most other AIML interpreters support predicate defaults in a different way, using a startup data file. Similarly, Pandorabots lacks botmaster control over a variety of functions that are pretty much closed or hard-wired, at least for the time being.

Deperiodization – Removing ambiguous punctuation, such as the periods in “Dr.” and “St.”, and also applying heuristic rules to determine what makes a sentence a sentence. This feature is hard-wired in Pandorabots.

Normalization – Expanding contractions, removing all remaining punctuation,

repairing many spelling errors. This feature is hard wired in Pandorabots.

Predicate defaults – AIML predicates have a default value for <get/>. You

can only set one global <get/> value in Pandorabots. In this book, under the


section on custom HTML, we showed a trick using embedded HTML-side

AIML (another non-standard, embrace-and-extend feature) to set the default

value of predicates.

Predicate <set/> returns – Some predicates return the predicate name, such

as pronouns, and some return the set values. These choices are hard wired

in Pandorabots.

Pandorabots AIML Tags Set

The tags in the table below correspond to the set of AIML implemented by the

Pandorabots AIML interpreter. These are not exactly the same set of AIML

tags adopted by the AIML Architecture committee for the Artificial Intelligence

Markup Language (AIML) Version 1.0.1 A.L.I.C.E. AI Foundation Working

Draft, 18 February 2005 (rev 007). For comparison see the table of AIML 1.0.1

tags at alicebot.org. This table, however, refers to that document where

appropriate.

There are both small and large differences between the Pandorabots tag set

and the AIML standard. In particular, there is no <id/>, <size/>, <version/>,

<gossip>, <system>, or <javascript> tag in Pandorabots, and the interpretation

of the <learn> tag is quite different.


Other documents found on alicebot.org, useful for understanding the

Pandorabots AIML tags include:

The AIML 1.0.1 Tags Set

The AIML 1.0 Tags Set

The AIML Overview by Dr. Rich Wallace.

A Tutorial for adding knowledge to your robot by Doubly Aimless

In the table, XML tags are shown in a shorthand notation. Closing tags are not

shown. The index attribute, whenever it appears, is optional. The default value is index="1" (or index="1,1" for 2-d indexes). Index values start at one.

AIML Tag                        WD Reference               Remark
<aiml>                          3.2. AIML Element          AIML block delimiter
<topic name="X">                4. Topic                   X is AIML pattern
<category>                      5. Category                AIML knowledge unit
<pattern>                       6. Pattern                 AIML input pattern
<that>                          6.1. Pattern-side That     contains AIML pattern
<template>                      7. Template                AIML response template
<star index="N"/>               7.1.1. Star                binding of *
<that index="M,N"/>             7.1.2. Template-side That  previous bot utterance
<input index="N"/>              7.1.3. Input               input sentence
<thatstar index="N"/>           7.1.4. Thatstar            binding of * in that
<topicstar index="N"/>          7.1.5. Topicstar           binding of * in topic
<get name="XXX"/>               7.1.6. Get                 Botmaster defined XXX, default
<bot name="XXX"/>               7.1.6.1. Bot               Custom bot parameter
<sr/>                           7.1.7. Short-cut elements  <srai><star/></srai>
<person2/>                      7.1.7. Short-cut elements  <person2><star/></person2>
<person/>                       7.1.7. Short-cut elements  <person><star/></person>
<gender/>                       7.1.7. Short-cut elements  <gender><star/></gender>
<uppercase>                     7.2.1. Uppercase           convert all text to Uppercase
<lowercase>                     7.2.2. Lowercase           convert all text to Lowercase
<formal>                        7.2.3. Formal              capitalize every word
<condition name="X" value="Y">  7.3.1. Condition           One shot branch
<condition name="X">            7.3.1. Condition
<set name="XXX">                7.4.1. Set                 May return XXX or value
<srai>                          7.5.1. SRAI                Recursion
<person2>                       7.6.1. Person2             swap 1st & 3rd person
<person>                        7.6.2. Person              swap 1st & 2nd person
<gender>                        7.6.3. Gender              change gender pronouns
<think>                         7.7.1. Think               Hides side-effects

Pandorabots Extension                       Purpose                      Remark
<condition name="X" value="*">              Branch with undefined value  One shot branch
<li name="X" value="*">                     Branch with undefined value  used by <condition>
<li value="*">                              Branch with undefined value  used by <condition>
<date locale="X" timezone="Y" format="Z"/>  date and time                Unix strftime format
<that index="M,*"/>                         previous bot utterances      multi-sentence
<request index="N"/>                        input request                multi-sentence
<response index="N"/>                       output response              multi-sentence
<learn>                                     save AIML category           non standard
<eval>                                      AIML evaluation              expression inside <learn>

Finding Other Resources

This book has only touched upon some of the major points of AIML. If you want

to get into Artificial Intelligence Markup Language in more depth, we recommend

you join the ALICE A. I. Foundation at www.alicebot.org. The ALICE A. I.

Foundation sets the standard for AIML and releases the free software behind the

A. L. I. C. E. brain. You can keep up with all the latest developments in AIML

by joining the A. I. Foundation.


My other book, The Elements of AIML Style, also available on the A. I.

Foundation web site, provides much more detail about the AIML language itself.

Pandorabots is really only one of many different implementations of AIML. You

can take the knowledge you’ve gained here and apply it to many different pieces

of free AIML software available on the alicebot.org web site. You can also join

mailing lists and enter the discussion with other botmasters like yourself from

around the world who are on their own journey just like yours. They will already know the answers to many of the questions you are asking and will be happy to share them with you. The site www.alicebot.org is an excellent starting point for

meeting the worldwide A. L. I. C. E. and AIML community.

The End of The Journey

If you have made it this far, you have absorbed everything you need to know to

get started creating, hosting and selling your bot on Pandorabots. This

guidebook has provided you with all the basic steps necessary to get your bot up

and running on Pandorabots, to publish it on the web, to link it with talking,

animated virtual hosts, and to make money with your bot by signing up paying

customers as subscribers.

This is not really the end of your bot journey, but the beginning. Using the tools

you’ve acquired in this book, you can now begin your new career in Artificial

Intelligence as a botmaster. Remember, the most important skill for a botmaster


is not computer programming, but writing. The art of AIML is writing believable

responses for your bot that are brief, entertaining, grammatically correct, concise,

and above all evoke a “suspension of disbelief” in the client. The skill is not that

different from what is needed to develop characters for novels, movies, or

television.

Glossary

AIML - a markup/programming language for creating chat bots. AIML is an XML-based language.

A. L. I. C. E. – a chat robot personality developed by Dr. Richard S. Wallace

Bot – The artificial intelligence chat robot.

Botmaster – The author or creator of the chat robot.

category – The basic unit of knowledge in AIML. A category contains an input

pattern, optional <that> and <topic> patterns, and a response template

Client – A person chatting with the bot.


Default category – A category with a pattern that contains a wildcard.

Default Response template - what is said when nothing is matched.

Graphmaster - A data representation of the categories in AIML.

Input pattern - The AIML pattern that matches the input sentence (or sentences)

provided to the robot.

Lisp – An artificial intelligence programming language used to program the

underlying code for Pandorabots.

Navigation Bar – The list of HTML links you can follow to navigate around the

Pandorabots web site.

pattern – The input part of the AIML category. AIML patterns are made up of

letters, numbers, spaces and the wildcard characters * and _.

pattern Matching - A capability of matching input sentences against stored

sentences

predicates – Variables relating to the client, which may change during the course

of a client conversation.


Properties – Variables relating to the bot, which remain constant.

template – the output or response part of the AIML category, either a response or

a program to generate a response.

Ultimate default category – A category that matches when the input sentence

fails to match any other category.

Wildcard – A special character that can match one or more words. AIML

wildcards include the star * and underscore character _.

XML - a markup/programming language similar to HTML.

Index

A. I. ................................................. 7

A. L. I. C. E..... 42, 43, 62, 63, 68, 72,

74, 168, 169

AAA ............................................... 16

Advanced Alter Response ............ 91

Advanced Alter Response page... 39,

40, 47, 71, 72, 73

AIML .... 6, 17, 30, 37, 38, 39, 40, 41,

42, 43, 44, 45, 46, 47, 48, 50, 52, 53,

54, 55, 56, 58, 60, 61, 64, 65, 68, 70,

71, 72, 73, 74, 77, 78, 80, 81, 83,

168, 169, 170, 171

avatar .............................................. 6

Avatars ............................................ 7

blogs............................................ 153

bot property ................. 39, 65, 67, 68

botid ...................................... 19, 120

Botmaster Control ................... 18, 19


Botmaster Control Page ................ 83

botmaster. ....................................... 6

category ... 31, 38, 39, 40, 41, 42, 43,

46, 50, 54, 55, 59, 60, 61, 62, 63, 64,

74, 75, 170, 171

chat robot ....................................... 6

Chatterbot Collection .................... 11

client 6, 7, 39, 41, 45, 46, 50, 53, 55,

56, 57, 61, 63, 72, 73, 74, 75, 77, 78,

81, 169, 171

condition ..................................... 159

Create a Pandorabot ............... 13, 14

date ............................................. 161

default response ........................... 50

deperiodiation ............................... 32

Deperiodization ..................... 32, 164

ELIZA ...................................... 63, 64

embrace and extend ........... 133, 157

Graphmaster ............................... 170

HTML ..... 30, 37, 39, 45, 54, 92, 106,

170, 172

indexes ....................................... 159

javascript ....................................... 93

Lisp.............................................. 170

Loebner Prize ................................ 17

mailing lists .................................... 12

matching algorithm. ....................... 33

MS Excel ..................................... 137

MSN Messenger.......................... 118

My Pandorabots ............................ 13

Navigation Bar ..... 13, 18, 78, 92, 170

normalization ................................. 33

Normalization ........................ 33, 165

Oddcast VHost ............................ 105

Pandorawriter ........ 78, 80, 81, 82, 83

Pandorabots.com ............................ 6

pattern 31, 38, 39, 42, 43, 50, 51, 52,

54, 55, 59, 61, 63, 71, 72, 74, 75, 81,

83, 170

PayPal ......................................... 153

predicate .. 40, 41, 42, 43, 44, 45, 46,

63, 171

predicate defaults ................ 134, 164

Predicate defaults........................ 165

predicates .......................... 26, 47, 73

pronouns ........................... 45, 64, 73


properties .......................... 65, 68, 73

property ....................... 21, 39, 41, 46

publish .......................................... 19

recursion ....................................... 32

screename .................................. 119

SitePal ........ 100, 101, 102, 104, 112

Speech recognition ......................... 7

spreadsheet ................................ 137

star ..... 50, 52, 54, 55, 59, 63, 64, 65,

171

subscription ........... See subscriptions

symbolic reduction ........................ 32

target ....................................... 88, 89

Targeting ..................................... 138

template ... 31, 38, 39, 42, 43, 44, 46,

48, 50, 54, 55, 58, 63, 65, 68, 69, 70,

71, 73, 75, 76, 83, 170, 171

Train .............................................. 27

-training ......................................... 71

Training ......................................... 27

ultimate default .................. 59, 61, 62

Ultimate default ... 59, 60, 63, 64, 171

underscore ................ 50, 51, 52, 171

VHost . 102, 104, 105, 106, 107, 108,

109, 111, 112

voice recognition ............................. 7

Voice synthesis ............................... 7

wildcard .... 50, 51, 52, 54, 55, 59, 61,

63, 64, 65, 87, 92, 170, 171

wildcards ..................... 33, 50, 58, 83

Wildcards ...................................... 52

XML ................... 39, 41, 45, 169, 172