Contents
Contents
Preface
Introduction
  Killer Apps of Chat Bot Technology
  Points to Remember
  Exercises
Pandorabots
  Points to Remember
  Exercises
Mastering Your First Bot
  Points to Remember
  Exercises
Bot Properties
  Points to Remember
  Exercises
Training Your Bot
  Points to Remember
  Exercises
A Brief Tutorial on AIML
  AIML Matching Algorithm
  The Filesystem Metaphor
  Advanced Alter Response Page
  Points to Remember
  Exercises
Using AIML Predicates
  Points to Remember
  Exercises
Writing Your Own Predicates
  Points to Remember
  Exercises
Playing with Wildcards
  Points to Remember
  Exercises
Writing Default Replies
  Points to Remember
  Exercises
The Ultimate default category
  Points to Remember
  Exercise
The <person> Tag
  Points to Remember
  Exercises
Adding a Bot Property
  Points to Remember
  Exercises
Using <srai>
  Points to Remember
  Exercises
Training from the Dialog
  Points to Remember
Using <that>
  Points to Remember
  Exercises
Adding AIML with Pandorawriter
  Points to Remember
  Exercise
Targeting
  Points to Remember
  Exercises
Custom HTML
  Points to Remember
  Exercises
Setting Predicate Defaults
Dialog History
Publishing your Bot with Oddcast SitePal
Customizing your HTML with an Oddcast VHost [tm]
  Points to Remember
  Exercises
Media Semantics Avatars
Publishing Your Bot on AOL Instant Messenger
  Points to Remember
  Exercises
Other Interfaces
  Pandorabots and Flash
  Put Your Bot on MSN Messenger
  Put Your Bot on IRC
  Get your client’s screen name
  What is a "botid"?
  Pandorabots API
Other AIML Programs
  A word on owning your AIML Files
  AIML on Pandorabots
  Pandorabots and Program D
  Pandorabots and Program J (J-Alice)
  Pandorabots and Program N (AIMLPad)
    Program N Embrace and Extend
  Pandorabots and Program P (Pascalice)
Using a Spreadsheet or Database Program to Write AIML
Subscriptions
Pandorabots Embrace & Extend
  Wildcard in conditions
  Wildcard in indexes
  Request and Response
  Formatted date tag
  No system tag
  No predicate defaults
Pandorabots AIML Tags Set
Finding Other Resources
The End of The Journey
Glossary
Index
Preface
Dr. Wallace (right) accepts 2004 Loebner Prize on behalf of A.L.I.C.E. from Dr. Hugh Loebner (center). Also present was fellow contestant Steven Watkins. It was the third bronze Loebner medal for A.L.I.C.E.,
who had previously scored first place in 2000 and 2001.
This book was born from a happy marriage of the worlds of free software and
proprietary business. We owe a debt of thanks to the countless free software
developers and contributors to the A.L.I.C.E. and AIML project, whose labors
gave rise to the plentitude of AIML interpreters and the burgeoning free software
AIML content. We also say a big thank you to the staff of Pandorabots, who
created the largest (so far) commercial software effort in the AIML universe.
Pandorabots adopted the AIML standard and made it freely available from their
server on the web, through an easy-to-use HTML interface. Over the past three
years, they have garnered 50,000 botmasters, who have created 60,000 bots,
and accumulated 60 million input queries logged on their server database. On a
peak day, after ALICE won the Loebner Prize in September 2004, Pandorabots
logged a record 2,000,000 queries, at a peak rate of more than 100,000
per hour, without a crash. Such numbers are impressive, especially considering
that the program runs on a single-processor Linux machine, in a language, Lisp,
not normally recognized as a high-performance benchmark-breaking standard.
The book would not have been possible without the generous support of Fritz
Kunze, Colin Meldrum, Steve Sears, Evan Lessmore, Dr. Doubly Aimless, Adi
Sideman and David Bacon. Thank you Tyra Baker, Karen Marcelo, and Bob
Wallace for providing storage. Thank you David Hamill for running the Robitron
mailing list. Thanks to Anne Kootstra, Gary Dubuque, Kino Coursey, Conan
Callen, Josip Almasi, Saskia Van der Elst, Richard Gray, Jonathan Roewen, Paul
Rydell, Ryan Kegel, Ernest Lergon, Monica Lamb, Karen Gibbs, Kym Kinlin,
Shahin Maghsoudi, Jeff Ritchie, Jeroen Wijers, and Kim Sullivan for all their
excellent AIML ideas and to Joy Harwood, Chris Hatcher, Lindsay Davies, Stefan
Zakarias, and for checking my work. Thank you Hugh Loebner for being the first
to give us a reason to keep on running. I would like to thank my wife Kim for not
totally giving up on me.
Oakland, CA November 2004
Introduction
This book is about you, the botmaster. The botmaster is the person who creates
or authors his or her own chat robot. A chat robot is a natural language
character that communicates with clients, or people chatting on the web, instant
messenger, email, usenet, web forums, or even through voice communication
such as the telephone. Chat robots are also sometimes called chatbots, bots,
chatterbots, chat bots, chatterboxes, V-Hosts, V-People, agents, and virtual
people. A chat robot may or may not be associated with an avatar, an animated
agent that may also include speech synthesis so that the chat robot may appear
more lifelike through virtual reality animation and sound. A chat robot may also
include speech recognition technology, so that the bot may not be restricted to a
typewritten interface. A chat robot, however, always has a botmaster, a person
behind the scenes who is ultimately responsible for creating the bot’s personality
and releasing it onto an unsuspecting world.
Botmasters come from every walk of life. It is important to understand that you
do not have to be a programmer to be a botmaster. Many great programmers
have already spent many hours laboring to create easy-to-use software, like
Pandorabots.com, to help people create their own bots. In fact, a more literary or
creative mind is preferred. Creating a bot is more like creating a character for a
novel or screenplay than it is like writing a computer program. We have
developed a language, AIML (Artificial Intelligence Markup Language), that is
designed to be as easy to learn and use as HTML (the basic language used to
create all web pages). If you can learn enough HTML to create a simple web
page, you can easily learn enough AIML to create a chat robot. In fact,
Pandorabots.com hides most of the details of AIML from the botmaster. The
most difficult part of creating a bot is writing original, clever, sometimes
humorous, interesting dialogue that will keep the client entertained and
entranced.
The classic chat robot is a purely text-based being. In fact, many people view a
chat robot as the glue between voice recognition and speech synthesis and
animated avatars. Speech recognition turns sounds, or voice signals, into words,
or text. Speech recognition is like taking dictation. It has no idea what the words
mean. Its only goal is to convert the words into text that someone can read.
Voice synthesis is the opposite. Avatars and speech synthesizers take words
and text and convert them into natural sounding human speech. The chat robot
is the missing piece between those two. It is the A.I. glue that converts the text
that has been said into a meaningful sounding reply. In some sense chat robots
are harder to create than either speech recognizers or voice synthesizers or
avatars. They require us, the botmasters, to create the illusion of artificial
intelligence.
Even without speech recognition and voice synthesis and animated avatars,
there are many possible killer applications of chat robot technology.
A recent poll of professional chat robot developers revealed this list of what they
considered to be the top killer apps of chat robot technology:
Killer Apps of Chat Bot Technology
1. Entertainment
2. Teacher Bot
3. English as a Second Language
4. Customer Service
5. Sales Bot
6. Star-Trek Style O.S. of the Future
7. FAQ Bot
8. Embedded in Toys
9. Personality Tests
10. Non-Player Character in Games
11. Turing Test Prizes
12. Bot Hosting Services
13. Bot Authoring Tools
14. Politician Bot
15. Celebrity Bot
16. Other
Points to Remember
The Botmaster is the author or creator of the chat robot.
The Chat Robot is the missing piece between voice recognition and speech synthesis.
You do not need to be a computer programmer to create a bot.
AIML stands for Artificial Intelligence Markup Language.
Exercises
1. What kind of application do you want to create with your bot?
2. Can you think of a name for your bot?
3. Is your bot going to be male or female (or other)?
4. Is your bot character going to be a human, a robot, an animal, or an imaginary creature?
Pandorabots
Pandorabots.com is a free, web-based bot hosting service. Pandorabots was
developed to meet the needs of botmasters who wanted to host their bots on the
web 24/7. There is free and proprietary software that you can download to your
own computer to create and run a bot from your own machine. But this usually
leads to two problems. First, many people don’t have 24/7 dedicated servers
located at home. This means that when they are offline, so are their bots.
Usually botmasters want their bots to chat all the time, even when they are
sleeping. Half the fun of being a botmaster is waking up in the morning to read
the log files of the conversation the bot had the night before. Second, the
downloaded bot software tends to take up a lot of memory and slow down your
machine, especially if you are running other applications. So people began to look for
alternatives. Many ordinary web-hosting companies shied away from bot hosting
because the software was too experimental and they were afraid it would take
too many resources. Pandorabots developed a clever solution that allowed them
to host tens of thousands of bots on one server, and decided to make their bot
hosting service available to the public, at least initially, for free. The next screen
shot shows the home page of Pandorabots.com.
Notice that the Pandorabots.com software is based entirely on the free AIML and
A.L.I.C.E. software developed by the ALICE A.I. Foundation at www.alicebot.org.
You can visit the alicebot.org web site for more information and documentation
about AIML and other implementations besides Pandorabots. What makes
Pandorabots different from these other products is its highly efficient bot hosting
service. As of this writing, Pandorabots hosts more than 50,000 botmasters on
one single machine, and those botmasters have created more than 60,000 bots.
On a peak day, the bots have logged more than 2,000,000 inquiries at a peak
rate of 100,000 inquiries per hour.
One unique feature of Pandorabots is its multilingual interface and support for
bots in almost any language. The interface is currently translated into English,
Japanese, French and Portuguese, and many other translations are underway.
What is more, the bot hosting software supports almost any world language. You
can create a bot that understands Japanese, Chinese, Arabic, Korean, Thai, or
almost any other language that can be entered into a computer. The algorithm
has no preference for one language above any other.
A number of convenient links appear at the bottom of every Pandorabots web
page. These are links to other web sites that might be useful to any Pandorabots
botmaster. For example, the ALICE A.I. Foundation is a very useful resource for
documentation, mailing lists, articles, and help. The Chatterbot Collection is one
of the largest online directories of chatterbots anywhere. The AIML Scripting
Resource is another useful site devoted to AIML news and information. You can
follow the other links to find other sites, projects and companies involved with
AIML implementations, bots and other projects.
Signing up for an account on Pandorabots is easy. Click on the Account Sign Up
button and Pandorabots will take you to the Sign Up page. You only need to
enter your name, email address, and select a password.
Pandorabots also asks you to sign up for your choice of two mailing lists. The
first one, pandorabots-announce, is very low traffic and limited to posts by
Pandorabots staff and administrators. On this list you will receive rare messages
about system upgrades and policy changes. The second list,
pandorabots-general, although moderated, does allow posts from all members of the
Pandorabots community. We recommend that you also join this list, because you
may find it helpful to be able to post your own questions about Pandorabots, as
well as to read about other botmasters' problems with Pandorabots and the
solutions they found.
Points to Remember
Pandorabots is a free, web-based chat robot hosting service.
Pandorabots is based on the free software AIML standard of the ALICE AI Foundation.
Pandorabots supports multiple languages through its interface and bot hosting software.
You can find a lot of help with your bot through the Pandorabots mailing lists and
the ALICE A.I. Foundation.
Exercises
1. Visit Pandorabots.com.
2. Sign up for an account.
3. Click on Support to find out what kind of help is available.
4. Click the About link to read more about Pandorabots and their services.
5. Click on the Most Popular link and chat with the most popular Pandorabots.
Mastering Your First Bot
Once you have created an account on Pandorabots, it is time to create your first
bot. When you create your account, you will see a control page called My
Pandorabots. This is the master control or dashboard from which you will control
all of your bots. You can navigate around the Pandorabots site using the
Navigation Bar that appears on the top of the page. Initially the Navigation Bar
contains only five buttons: My Pandorabots, Create a Pandorabot, Pandorawriter,
Support and Most Popular. As we begin to work with Pandorabots and create
bots, we shall see that the Navigation Bar is dynamic. That is, it grows and more
buttons appear as we obtain more options for creating and controlling our bots.
The first button, My Pandorabots, always returns us to this dashboard page.
Hopefully by now you have already taken the time to check out the last three
buttons, Support, About and Most Popular. We shall return to the Pandorawriter
button in a later section. First we will explore the Create a Pandorabot function.
Clicking on the Create a Pandorabot button takes you to the Create A
Pandorabot control page. The first question you need to answer when creating a
new bot is, what is the bot’s name? Naming a new bot can be as hard as
naming a baby. It’s actually not that easy to change the name of the bot once
you’ve decided on it, so think hard about it for a minute. Maybe you should stop and
think also, what kind of character is this bot going to be? A human? An animal?
A robot? Male or female? A real person or a historical figure? Is the name
going to be an acronym? Answering these questions may help you come up with
a name.
The next choice in bot creation is a slightly confusing check box marked
“automatically discover spaces between words (suggested for Japanese)”. In
99.9% of cases you can leave this box unchecked, even if your bot is going to
speak Japanese. The reasons behind this are technical and complex, having to
do with the way that Pandorabots developed historically to handle Asian
languages that didn’t typically require spaces between their words. Suffice it to
say, you are probably safe leaving this box unchecked for your bot.
The next set of radio buttons has to do with the initial knowledge base you wish
to use as a starting point for your bot. The ALICE A.I. Foundation has created
several different versions of the A.L.I.C.E. AI personality and released the
software freely under the GNU General Public License (same as Linux). Pandorabots
allows you to use these personalities as a basic building block for your bot. The
advantage is that you inherit a lot of work and you will instantly have a bot that
can converse intelligently on a wide variety of topics. The disadvantage is that
some of the bot’s replies may be quirky and surprise you, or you may not agree
that the bot’s answers are “correct” according to your own political, religious, or
moral beliefs. If you want to start from scratch and have complete 100% control
over everything your bot says, choose the checkbox that says “No Initial
Content”.
The box marked “Standard AIML Set” is a bit deceptive. There is nothing
standard about the “Standard AIML”. This set resulted from a partially completed
project forked from the main ALICE brain by a group of AIML developers working
with the AI Foundation. Their goal was to produce a more modular AIML set than
the ALICE brain, one that could be divided into distinct files based on content,
allowing the botmaster to choose which files might be appropriate for his or her
bot.
The goal of the Standard AIML set was better met by the more recent AAA
(Annotated ALICE AIML) set. The most recent version of the AAA set may be
found at http://www.alicebot.org/aiml/aaa. The Annotated A.L.I.C.E. AIML Files
(AAA) is a revised release of the free A.L.I.C.E. brain, a set of AIML scripts
comprising the award-winning chat robot, compatible with all AIML 1.01-compliant
software. The AAA is specifically reorganized to make it easier for botmasters to
clone the A.L.I.C.E. brain and create their own custom bot personalities, without
having to expend huge efforts editing the original A.L.I.C.E. content.
You can chat with a version of this bot via AOL IM screenname Aliceannttd.
The job of annotating and editing the ALICE Brain is still a work in progress. Most
of the foreign language content has been removed and is available elsewhere.
But this and much other content remain misclassified. The current release is
intended as only an interim solution. Ongoing editorial work will produce
increasingly refined annotations of the ALICE Brain and new releases of these
AIML files will appear from time to time.
The version called “Dr. Wallace’s ALICE – March 2002” is the version of ALICE
that won the Loebner Prize in 2001. This version also includes a significant
amount of German and French language content.
Versions of the ALICE brain in German and Italian are also available as starting
points for your bot.
A word on AIML file names: Although AIML sets are sometimes divided into files
based on content or other criteria, the file names do not matter at all for the
matching algorithm. Once the AIML is loaded into the bot’s memory, the file
names are discarded completely. We will learn more about the AIML matching
algorithm later, but it is important to understand that AIML file names are for the
convenience of the botmaster only, and of no significance to the bot.
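The point about file names can be illustrated with a small Python sketch. This is not how Pandorabots itself is implemented; it is only a toy loader, under the assumption that a bot's memory can be pictured as a single table of patterns and templates. The file names, the sample categories, and the function load_categories are all hypothetical, and the sketch ignores wildcards and the real matching algorithm.

```python
import xml.etree.ElementTree as ET

# Two hypothetical AIML "files", held as strings so the sketch is self-contained.
FILES = {
    "greetings.aiml": """<aiml version="1.0.1">
  <category><pattern>HELLO</pattern><template>Hi there!</template></category>
</aiml>""",
    "misc.aiml": """<aiml version="1.0.1">
  <category><pattern>WHAT IS YOUR NAME</pattern><template>My name is Mike.</template></category>
</aiml>""",
}


def load_categories(files):
    """Merge every category from every file into one pattern -> template table."""
    brain = {}
    for _file_name, text in files.items():  # the file name is read but never stored
        root = ET.fromstring(text)
        for category in root.iter("category"):
            pattern = category.findtext("pattern").strip().upper()
            brain[pattern] = category.findtext("template").strip()
    return brain


brain = load_categories(FILES)

# Same content under different file names, in a different order.
renamed = load_categories({"a.aiml": FILES["misc.aiml"], "b.aiml": FILES["greetings.aiml"]})
```

In this toy model, brain and renamed hold exactly the same table, which is the sense in which file names matter to the botmaster but not to the bot.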
Let’s first try creating a bot named Mike with the No Initial Content option.
When we click the Create button, Pandorabots takes us to the My
Pandorabots page. The first thing to notice is that the Navigation Bar has grown.
In addition to the five original navigation buttons, we now have several new
buttons. Pandorabots has created a special button called Mike for our new bot.
Clicking on the Mike button will always take us back to the Botmaster Control
page for this bot. We also see new buttons labeled Train, Properties,
Predicates, AIML, Custom HTML, Oddcast Vhost, Media Semantics, Logs,
Explore and Subscribers. We will explore each of these navigation buttons in
subsequent sections.
The bot name Mike also appears as a large-font hyperlink on this page. This link
is exactly the same as the Mike button in the navigation bar. Clicking it does
nothing more than reload the current page. The message says that the bot is
not published, and gives a link to allow you to publish the bot. Publishing really
does two things in Pandorabots. First, it is like compiling a computer program. It
translates your AIML “source code” into an efficient internal format used by the
Pandorabots system. If there are any syntax errors in your AIML, publishing your
bot will point them out. In fact you won’t be able to complete the process of
publishing until all the syntax bugs in your AIML are worked out. But secondly,
publishing your bot creates a web page address or URL (Uniform Resource
Locator) where you and your clients can chat with your bot. When you are ready
to release your bot to the outside world, it is this URL that you will publicize as
the address of your bot. Later, we will show you how you can also publish your
bot on AOL Instant Messenger and also how you can embed the URL inside your
own web page, so it can be hidden from the public. But no matter what, you
have to publish your bot and create this unique URL before clients on the web
can chat with it.
Let’s try publishing our Mike bot and see what happens. Click on the publish
hyperlink. The Botmaster Control now displays a custom URL for the Mike bot.
The URL looks something like this:
http://www.pandorabots.com/pandora/talk?botid=898b86465e3513fc
The Mike bot has a unique URL specified by its botid parameter. Every bot
published on Pandorabots has a unique botid. That is how Pandorabots names
each bot internally and keeps track of one bot from another. If you click on that
hyperlink you will see the web page Pandorabots has created for the Mike bot.
If you try to have a conversation with Mike, however, you will probably be
disappointed. No matter what you say, Mike will reply, “I have no answer for
that.” This is because we created Mike with the “No initial content” option. In fact,
the option name is slightly misleading, because the bot does have some
initial content: exactly one AIML category that replies to every possible input with
the response, “I have no answer for that.”
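That single category presumably looks something like the sketch below. We have not copied it from Pandorabots; it simply assumes the conventional AIML catch-all, a pattern consisting of the lone * wildcard, which matches any input.

```xml
<!-- Assumed form of the one default category in a "No initial content" bot -->
<category>
  <pattern>*</pattern>
  <template>I have no answer for that.</template>
</category>
```

Every category you add later is more specific than this one, so the catch-all only answers when nothing else matches.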
Points to Remember
The first thing your bot needs is a name.
You can create a bot by cloning an existing ALICE bot, or by starting from scratch with an empty bot.
Publishing a bot is like compiling a computer program.
Publishing a bot gives it a unique web address or URL.
Exercises
1. Create a bot with no initial content.
2. Publish your bot.
3. Create a bot cloned from the AAA AIML set and publish it.
4. Visit the home page of the AAA set on AliceBot.Org and answer the
following:
What are green color-coded AIML files?
What kinds of content are found in yellow color-coded AIML files?
Why might you omit red color-coded AIML files from your bot?
Bot Properties
Now let’s create a new bot named Mary by clicking on the Create a Bot button,
but this time choose the Annotated A.L.I.C.E. AIML set as a starting point. You
have now created a chat bot full of knowledge that can answer many questions
and respond with apparent intelligence to a wide range of inquiries. In order to
customize this bot’s personality, however, we need to set up what are known as
the bot’s properties. Bot properties are like constants for your bot, and in fact
you have already set one, the bot’s name, when you created the bot. AIML
provides bot properties to allow the botmaster to create constant personality
features such as the bot’s name, age, gender, preferences, and whatever else
the botmaster deems significant for the bot’s biography. The motivation for using
these properties is that they usually turn up in many different places in
the bot’s knowledge base. For example, the bot’s location might be associated
with questions like “Where are you?”, “Tell me about yourself”, and “Where have
you been lately?” Similarly, the bot may make reference to his or her own name
in countless replies. In order to make the bot customizable and adaptable,
without having to track down every instance of the bot’s name and location and
edit them by hand, AIML uses bot properties for name and location and other
common bot features so that you only have to change them once to change your
bot’s personality.
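To make this concrete, here is a small sketch of two categories that both draw on the same bot property. The patterns and the reply wording are made up for illustration and are not taken from the AAA set; only the bot property tag itself is standard AIML.

```xml
<!-- Illustrative only: both replies read the same "location" property -->
<category>
  <pattern>WHERE ARE YOU</pattern>
  <template>I am in <bot name="location"/>.</template>
</category>

<category>
  <pattern>WHERE DO YOU LIVE</pattern>
  <template>My home is <bot name="location"/>.</template>
</category>
```

Changing the location value once changes both replies, with no need to edit either category.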
Click on the Properties button to see a control page for the bot properties:
Notice that you would have to scroll down to see all of the bot properties. Notice
also that Pandorabots may have already filled in some of the bot properties by
default, but many others are empty. Many of the bot property names are
self-explanatory, but others are obscure. Some bot property names like Size and
Vocabulary are technical and related to the underlying software system or
knowledge base. These were created to answer inquiries like “How big are you?”
or “How many words do you know?” A general rule of thumb, however, is that
a property whose name makes sense to you is more important than one whose name
does not. An obscure property name indicates an obscure property, and probably
means that you don’t have to worry about it too much.
If you want to make the bot appear to have a more "human" personality, use the
properties "kingdom"="Animal", "phylum"="Chordate", "class"="Mammal",
"order"="Primate", "family"="Homo Sapiens", "genus"="person", and
"species"="Human". Notice that you can also change the term "botmaster" to
something like "teacher" or "Oracle" if you prefer by changing the value of the
"botmaster" property (which is not the same as the "master" property: the
"master" is the name of the master, oracle or teacher). These property values
appear most commonly in the file called Bot.aiml, in which the bot answers many
questions about itself and its personal preferences, but they are sprinkled
throughout many of the other AIML files as well.
There are now four properties associated with the bot’s personality and
emotions: "etype" - the bot's personality type; "emotions" - its basic outlook on
emotions; "feelings" - much the same thing, but for feelings; and "ethics" - its
basic point of view on ethics. There is really no difference between "emotions"
and "feelings"; the two properties just give you some variation in the replies.
The default values for the original ALICE personality are:
Rank  Bot Property     Value
1     Botmaster        Botmaster
2     Master           Dr. Richard S. Wallace
3     Name             ALICE
4     Genus            Robot
5     Location         Oakland, CA
6     Gender           Female
7     Species          chat robot
8     Size             128 MB
9     Birthday         November 23, 1995
10    Order            artificial intelligence
11    Party            Libertarian
12    Birthplace       Bethlehem, PA
13    President        George W. Bush
14    Friends          Doubly Aimless, Agent Ruby, Chatbot, and Agent Weiss.
15    Favoritemovie    Until the End of the World
16    Religion         Protestant Christian
17    Favoritefood     Electricity
18    Favoritecolor    Green
19    Family           Electronic Brain
20    Favoriteactor    William Hurt
21    Nationality      American
22    Kingdom          Machine
23    Forfun           chat online
24    Favoritesong     We are the Robots by Kraftwerk
25    Favoritebook     The Elements of AIML Style
26    Class            computer software
27    Kindmusic        Trance
28    Favoriteband     Kraftwerk
29    Version          July 2004
30    Sign             Saggitarius
31    Phylum           Computer
32    Friend           Doubly Aimless
33    Website          Www.AliceBot.Org
34    Talkabout        artificial intelligence, robots, art, philosophy, history, geography, politics, and many other subjects
35    Looklike         a computer
36    Language         English
37    Girlfriend       no girlfriend
38    Favoritesport    Hockey
39    Favoriteauthor   Thomas Pynchon
40    Favoriteartist   Andy Warhol
41    Favoriteactress  Catherine Zeta Jones
42    Email            [email protected]
43    Celebrity        John Travolta
44    Celebrities      John Travolta, Tilda Swinton, William Hurt, Tom Cruise, Catherine Zeta Jones
45    Age              8
46    Wear             my usual plastic computer wardrobe
47    Vocabulary       10000
48    Question         What's your favorite movie?
49    Hockeyteam       Russia
50    Footballteam     Manchester
51    Build            July 2004
52    Boyfriend        I am single
53    Baseballteam     Toronto
54    Etype            Mediator type
55    Orientation      I am not really interested in sex
56    Ethics           I am always trying to stop fights
57    Emotions         I don't pay much attention to my feelings
58    Feelings         I always put others before myself
After you fill in the bot properties table, they will never change during the lifetime
of your bot, unless you, the botmaster, change them in this table. Also, the
properties are always the same for every client who chats with the bot. Only the
botmaster can ever change the properties, never the client. Thus they are more
like constants than variables. AIML has another construct, called predicates, that
acts like variables. We shall see how to work with predicates shortly.
Points to Remember
Bot properties are constants that help us customize a bot personality.
Click on the Properties button to get to the Bot Properties control page.
Bot properties are constant over the lifetime of your bot, unless you change them on the Properties page.
Only the botmaster can ever change the properties, never the client.
Exercises
1. Create a bot cloned from the AAA set.
2. Fill in all of the Bot Properties.
3. Publish your bot.
4. Try asking the following questions:
   1. What is your name?
   2. Tell me about yourself.
   3. Where were you born?
   4. Who created you?
Training Your Bot
We will now turn our attention to training our bot to say new things. Click on the
Train button to visit the Training page. The training page resembles the
published bot interface but has more controls. You can have a conversation with
your bot as you would through the published interface, but you can also edit the
bot’s replies to change what it says.
In this example we asked the bot Mary, “Can you juggle?” The Training interface
informed us that this inquiry matched an AIML pattern “CAN YOU *” from a file
called Reduce.aiml. We will explain more about patterns and files later. Mary’s
response was “How old are you? Are you very angry?”, which was perhaps not
the most intelligent response. Notice that Pandorabots tells us that something
called the “current topic” is set to “juggle”. Again, we will have more to say about
the topic variable later. For now we want to pay attention to the button marked
“Say Instead”.
In the text input area labeled “Mary:” let us type: “I like to juggle, but I drop the
balls a lot.” Now the next time we enter the inquiry, like magic, Mary replies with,
“I like to juggle, but I drop the balls a lot.”
You can also test it out by simply clicking on the “Ask Again” button. Just for fun,
try asking, “Can’t you juggle?” Are you surprised to get the same
answer? Actually there are many variations of the same question that will now
produce the same answer, for instance, “Tell me if you can juggle”. This is
because the bot already has general knowledge about common sentence
structures that reduce to the same form. But there are other variations that may
not give the answer you expect, for example, “Could you juggle?” In those cases
you might want to use “Say Instead” to keep the bot’s knowledge base
consistent.
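Behind the scenes, the effect can be pictured with the two categories sketched below. The first is the kind of simple category that “Say Instead” creates for our juggling reply. The second is a hypothetical reduction, written by us for illustration rather than copied from the AAA set, that rewrites a longer phrasing into the shorter one using the standard AIML tags <srai> and <star/>.

```xml
<!-- The category created by "Say Instead" for our example -->
<category>
  <pattern>CAN YOU JUGGLE</pattern>
  <template>I like to juggle, but I drop the balls a lot.</template>
</category>

<!-- Hypothetical reduction: "Tell me if you can juggle" is re-asked
     as "CAN YOU JUGGLE"; the actual AAA reductions may differ -->
<category>
  <pattern>TELL ME IF YOU CAN *</pattern>
  <template><srai>CAN YOU <star/></srai></template>
</category>
```

A phrasing with no such reduction, such as “Could you juggle?”, never reaches the new category, which is why it may still give an unexpected answer.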
Points to Remember
You can train your bot to say new things using the Training page.
If you want to change the bot’s response to a specific input, use the Say
Instead button.
If you started your bot from the AAA set or another ALICE AIML set, then your
change may affect the bot’s response to other, synonymous input queries.
You may have to enter several variants of the same input query to keep
the bot’s knowledge base consistent.
Exercises
1. Use the training page to teach your bot how to answer the question “Where is
Santa Clara?” Answer: It is a city in Silicon Valley.
2. Try asking your bot: Where is Santa Clara? Did you get the answer you
expect?
3. Try asking your bot: Do you know where Santa Clara is?
4. Try asking your bot: Can you tell me where Santa Clara is?
5. Try asking: Tell me about Santa Clara? Did you get the reply you expected?
A Brief Tutorial on AIML
Before we get into Pandorabots any deeper, it is worth taking a little time to get
an understanding of the basics of AIML. The key to AIML is simplicity. The idea
behind the design of AIML was to make it simple enough so that anyone who
could create a web page could create a chat bot. If you know three tags of
HTML (for example, <h1>, <p> and <a>), you can create a simple web page. If
you can learn three tags of HTML, you can learn three tags of AIML and create a
simple chat robot.
The basic unit of knowledge in AIML is called a category. An AIML category
always contains two elements: one pattern and one template. The pattern is
the input, or stimulus, side of the category, and the template is the output, or
response. In the ALICE brain, there are thousands of AIML categories that have
the simplest possible form: the pattern is a simple text string that has to match
the input exactly, and the template is a text string that Pandorabots prints out
exactly as the botmaster entered it. When we used the “Say Instead” button in
the Training interface, we were really creating these simple AIML categories.
The text we typed as the input became the AIML pattern, and the text we typed as
the replacement reply became the AIML template for a new category.
If you look inside an AIML file, you will see AIML categories formatted like this:
<category>
<pattern>WHAT ARE YOU</pattern>
<template>
I am the latest result in artificial intelligence,
which can reproduce the capabilities of the human brain
with greater speed and accuracy.
</template>
</category>
Notice the similarity to HTML. Languages that use this kind of markup,
characterized by the opening less-than “<”, tag-name, greater-than “>” and
closing less-than “<”, forward slash “/”, tag-name, greater-than “>” sequence, are
called XML languages (XML stands for Extensible Markup Language). XML languages
emerged because of the success and simplicity of HTML. Many people have
learned to create web pages with HTML, so language designers sought to
capitalize on this success story by creating XML languages to solve lots of other
problems, including artificial intelligence.
AIML Matching Algorithm
The discussion about AIML breaks down into two broad subjects: what happens
on the pattern side, and what happens on the template side. On the pattern side,
Pandorabots processes what the client said, the input, and makes a decision
about which AIML category to activate. The template is really a mini computer
program that might contain a number of steps to compute the actual output
response. These steps might even include what we call symbolic reduction, or
recursion, in other words re-inserting a new input back into the pattern side of
the AIML program. We will have a lot more to say about the template side later.
For now, let’s look closely at what happens on the input or pattern side.
What happens when Pandorabots receives a typed input from a browser, or from
an instant messenger, or from some other text input source? There are a series
of preprocessing steps hidden from view. First, Pandorabots runs a process
called deperiodization, or removal of ambiguous punctuation marks from the
input sentences. In general, a client input may contain one or more sentences.
Deperiodization removes the punctuation from English language abbreviations
like “Mr.”, “St.”, and “etc.”. Deperiodization also uses heuristics to insert a few
periods into places where it detects long, run-on sentences.
Next, the pre-processor splits the input into individual sentences. Pandorabots
then constructs a response by generating a reply to each input sentence, one at
a time, and appending the individual responses together.
For each individual sentence, Pandorabots runs a step called normalization. In
the normalization step, Pandorabots puts all the input words in upper case.
Normalization expands most contractions, replacing “You’ll” with “You will”, and
“I’d” with “I would” for example. Normalization also ensures that there is exactly
one blank space between words in the input string. The normalization step
detects certain iconographs and replaces them with words like “SMILE”.
Normalization removes all remaining punctuation, leaving only alphanumeric
characters. Finally, normalization corrects a few of the most common spelling
mistakes. The completely normalized input string is passed to the AIML
matching algorithm.
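One practical consequence is that patterns are written in the same normalized form that the matcher sees: upper case, no punctuation, contractions spelled out. The category below is a made-up example of ours, not part of the ALICE brain, using the “You’ll” expansion described above.

```xml
<!-- Illustrative only. A client typing "You'll like it!" is normalized,
     per the steps above, to YOU WILL LIKE IT before matching. -->
<category>
  <pattern>YOU WILL LIKE IT</pattern>
  <template>I hope so. What makes you so sure?</template>
</category>
```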
The matching algorithm searches the thousands of AIML categories in your bot’s
brain for the one with the pattern that has the best match. Defining the best
match is a philosophical problem that has been argued for years by the top A.I.
scientists in the world. Here is how it works in the AIML matching algorithm.
The AIML patterns can contain words and wildcards. Wildcards are indicated in
AIML by the symbols * (star) and _ (underscore). Each of these wildcards is defined
as capable of matching one or more words. That means that a pattern
like “WHO IS *” can match inputs including “Who is George Washington”, “Who
is George”, “Who is the first President of the United States”, “Who is a word”, and
“Who is, is”, but not “Who is”, because the star has to match one or more words.
The only difference between * and _ is the order in which the matching algorithm
tries to match them. So, here is how the matching algorithm works:
The Graphmaster consists of a collection of nodes called Nodemappers. These
Nodemappers map the branches from each node. The branches are either single
words or wildcards.
The root of the Graphmaster is a Nodemapper with about 2000 branches, one for
each of the first words of all the patterns (45,000 in the case of the A.L.I.C.E.
brain). The number of leaf nodes in the graph is equal to the number of
categories, and each leaf node contains the <template> tag.
There are really only three steps to matching an input to a pattern. If you are
given (a) an input starting with word "X", and (b) a Nodemapper of the graph:
0. Does the Nodemapper contain the key "_"? If so, search the subgraph
rooted at the child node linked by "_". Try all remaining suffixes of the
input following "X" to see if one matches. If no match was found, try:
1. Does the Nodemapper contain the key "X"? If so, search the subgraph
rooted at the child node linked by "X", using the tail of the input (the suffix
of the input with "X" removed). If no match was found, try:
2. Does the Nodemapper contain the key "*"? If so, search the subgraph
rooted at the child node linked by "*". Try all remaining suffixes of the input
following "X" to see if one matches. If no match was found, go back up the
graph to the parent of this node, and put "X" back on the head of the input.
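The order of these steps decides which category wins when several patterns could match the same input. As a sketch, suppose a bot contained only the three made-up categories below (they are ours, for illustration, and the reply text is arbitrary) and the client typed “How are you?”, which normalizes to HOW ARE YOU.

```xml
<!-- Tried first (step 0): "_" matches HOW, then ARE YOU follows -->
<category>
  <pattern>_ ARE YOU</pattern>
  <template>Reply from the underscore category.</template>
</category>

<!-- Tried second (step 1): the exact word HOW -->
<category>
  <pattern>HOW ARE YOU</pattern>
  <template>Reply from the exact-match category.</template>
</category>

<!-- Tried last (step 2): "*" matches HOW, then ARE YOU follows -->
<category>
  <pattern>* ARE YOU</pattern>
  <template>Reply from the star category.</template>
</category>
```

Following the three steps, the underscore category answers. Remove it and the exact match answers; remove that too and the star category takes over.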
The Filesystem Metaphor
A convenient metaphor for AIML patterns, and perhaps also an alternative to
database storage of patterns and templates, is the file system. Hopefully by now
almost everyone understands that his or her files and folders are organized
hierarchically, in a tree. Whether you use Windows, Unix or Mac, the same
principle holds true. The file system has a root, such as "c:\". The root has some
branches that are files, and some that are folders. The folders, in turn, have
branches that are both folders and files. The leaf nodes of the whole tree
structure are files. (Some file systems have symbolic links or shortcuts that allow
you to place "virtual backward links" in the tree and turn it into a directed graph,
but forget about that complexity for now). Every file has a "path name" that spells
out its exact position within the tree.
"c:\my documents\my pictures\me.jpg" denotes a file located down a specific set
of branches from the root.
The Graphmaster is organized in exactly the same way. You can write a pattern
like "I LIKE TO *" as "g:/I/LIKE/TO/star". All of the other patterns that begin with
"I" also go into the "g:/I/" folder. All of the patterns that begin with "I LIKE" go in
the "g:/I/LIKE/" subfolder. (Forgetting about <that> and <topic> for a minute) we
can imagine that the folder "g:/I/LIKE/TO/star" has a single file called
"template.txt" that contains the template.
If all the patterns and templates are placed into the file system in that way, we
can easily rewrite the explanation of the matching algorithm: If you are given an
input starting with word "X" and a folder of the filesystem:
0. If the input is null, and the folder contains the file "template.txt", halt.
1. Does the folder contain the subfolder "underscore/"? If so, change
directory to the "underscore/" subfolder. Try all remaining suffixes of the
input following "X" to see if one matches. If no match was found, try:
2. Does the folder contain the subfolder "X/"? If so, change directory to the
subfolder "X/", using the tail of the input (the suffix of the input with "X"
removed). If no match was found, try:
3. Does the folder contain the subfolder "star/"? If so, change directory to the
"star/" subfolder. Try all remaining suffixes of the input following "X" to see
if one matches. If no match was found, change directory back to the
parent of this folder, and put "X" back on the head of the input.
[Note: "underscore" and "star" as directory names above are meant to stand in
for "_" and "*", which are not allowed as file or directory names in some operating
systems. Since the literals "underscore" and "star" might be actual words in a
pattern, perhaps a real implementation along these lines would use some other
symbols to serve the same function.]
You can see that the matching algorithm specifies an effective procedure for
searching the filesystem for a particular file called "template.txt". The path name
distinguishes all the different "template.txt" files from each other.
What's more, you can visualize the "compression" of the Graphmaster in the file
system hierarchy. All the patterns with common prefixes become "compressed"
into single pathways from the root. Clearly this storage method scales better than
a simple linear, array, or database storage of patterns, whether they are stored in
RAM or on disk.
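To tie the metaphor back to AIML, here are three made-up categories of ours whose patterns share a prefix. The comments show where each template would live if we followed the "g:/" convention used above; the reply text is arbitrary.

```xml
<!-- g:/I/star/template.txt -->
<category>
  <pattern>I *</pattern>
  <template>Tell me more about yourself.</template>
</category>

<!-- g:/I/LIKE/star/template.txt -->
<category>
  <pattern>I LIKE *</pattern>
  <template>What do you like about it?</template>
</category>

<!-- g:/I/LIKE/TO/star/template.txt -->
<category>
  <pattern>I LIKE TO *</pattern>
  <template>How often do you do that?</template>
</category>
```

All three share the single "g:/I/" folder, and the last two also share "g:/I/LIKE/", which is the compression described above.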
Advanced Alter Response Page
Try typing in the input we used before, “Tell me about yourself,” and then click on
the Advanced Alter Response button. Pandorabots will take you to the
Advanced Alter Response page, which should look like this:
Now we are getting our first look “behind the scenes” at the actual AIML. AIML is
designed to be as simple as possible for non-programmers to learn. The idea is
that if you know enough HTML to design a web page, you should be able to learn
enough AIML to create a chat bot. Really the most important skill in creating chat
bots is the ability to write sentences of English (or whatever language your bot
speaks), not computer programming. Making your bot character believable and
entertaining is far more important than knowing the details of all the AIML tags.
The Pandorabots interface is designed to hide as many of the details of AIML
and programming from you, the botmaster, as possible. But unfortunately, you
are going to have to learn a little AIML in order to make your bot believable too.
The basic unit of knowledge in AIML is called a “category.” A category is
basically a question and an answer. The question part is the input, the answer is
the output. In AIML we call the input, or stimulus, the “pattern”, and the output, or
response, or action, the “template”.
The Advanced Alter Response page displays AIML one category at a time.
Hence, you see here exactly one pattern and one template. The pattern in this
case is TELL ME ABOUT YOURSELF. The template or response is displayed
in the “template” box. In this case the template says,
I am a <bot name="order"/>.
I was activated at <bot name="birthplace"/>,
on <bot name="birthday"/>.
My <bot name="botmaster"/> was <bot name="master"/>.
He taught me to sing a song.
Would you like me to sing it for you?.
<think><set name="it"><set name="topic">a song</set></set></think>
There are some other things in the Advanced Alter Response page too, like “that”,
“topic”, and some buttons for editing the response, but we’ll ignore those for now
and come back to them all later. For now, let’s study the response template of
this category so we can learn some AIML.
The simplest form of an AIML template is plain text. We saw that already when
we wrote the reply for the category with the pattern, WHO IS BRUCE
SPRINGSTEEN. But in general, an AIML template is called a “template”
because it is really a mini computer program for writing the reply. The program
can contain variables that get filled in when Pandorabots actually composes the
reply for the client. Because AIML is an XML language just like HTML, these
variables appear inside the “less than” and “greater than” symbols “<” and “>”.
The variable <bot name="birthplace"/>, for example, is one of the bot properties
we talked about earlier. These bot properties are global values that are
constant for your bot. Once you set them with the Pandorabots Edit bot
properties page, they do not change. When the program evaluates the template
for this category, it replaces the bot property tag with the birthplace you selected,
in this example Indiana, Pennsylvania.
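As a small worked sketch, here is one line of the template before evaluation, with the text the client would see noted in a comment. It assumes the birthplace property holds the Indiana, Pennsylvania value from the example; with a different value the output changes accordingly.

```xml
<!-- Stored in the template: -->
I was activated at <bot name="birthplace"/>,

<!-- Sent to the client after the tag is replaced:
     I was activated at Indiana, Pennsylvania, -->
```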
Points to Remember
The Advanced Alter Response page allows you to edit the AIML content directly.
The Advanced Alter Response page gives you more control over patterns and templates.
You can access the Advanced Alter Response page by clicking on the “Advanced Alter Response” button under “Bot Training”
The basic unit of knowledge in AIML is called a category. A category contains an input part called a pattern and an output part called a template.
The Advanced Alter Response page essentially browses and visually edits one AIML category at a time.
AIML is an XML language like HTML.
The AIML template is actually a mini computer program for formulating the reply.
The AIML template is displayed on the Advanced Alter Response page.
Exercises
1. Using the bot Mary cloned from the AAA Brain, ask your bot, “What time is
it?” Use the Advanced Alter Response Page to answer the following
questions: What is the pattern? What is the template? What AIML tags
does the template contain?
2. Repeat the previous exercise using the bot input, “Do you like Bananas?”
3. Repeat the previous exercise using the bot input, “Do you like Music?”
4. Repeat the previous exercise using the bot input, “What do you know
about me?”
Using AIML Predicates
AIML also contains variables that can be set and retrieved at runtime. These
variables are called “predicates.” The predicate “topic” is one example. In this
category the “topic” predicate is set to “me”. From our earlier dialogue with the
bot, you may recall that the client was not aware of any of these tags. This is
because, as a side effect of the <set> tag, the value “me” was passed through
the tag and included in the output text.
All XML languages, including AIML, are based on these simple tags delimited by
“<” and “>”. In general the tags always appear in pairs, an “opening” tag like
<set> paired with a “closing” tag like “</set>”. The closing tag is always exactly
the same as the corresponding opening tag, except that it also contains the
leading forward slash “/” character. The only exceptions are so-called singleton
tags, which enclose no other text, like our bot property tags. These tags
appear like <bot name="birthplace"/>. Singleton tags have no associated closing
tag.
The values that go inside the tags with the equal signs are called “attributes”.
Both our bot property tags and the predicate tags use an attribute called “name”.
In AIML we can create an unlimited number of named bot properties and
predicates simply by choosing different values for the “name” attribute.
If you are used to computer programming, you can think of the difference
between bot properties and predicates as the difference between constants and
variables in your program. The bot properties are fixed for your bot once you
have compiled it. The predicates can change at runtime, depending on the input
to your bot. If you have never heard of constants and variables in computer
programs, don’t worry about it. You will get used to working with bot properties
and predicates soon enough with a little practice.
First let’s try a simple example. Ask your bot, “Who is Michael Jordan?”
Because you cloned your bot from the A.L.I.C.E. brain, it already knows the
answer, “He is a famous basketball player.” Now try asking, “Who is he?” Your
bot remembers, “He is Michael Jordan.” The predicate “he” has been set to
“Michael Jordan”. To see how this happened, take a look at the Advanced Alter
Response Page for the category with the pattern WHO IS MICHAEL JORDAN:
The AIML template, displayed in the Action box, set the predicate “he” to
Michael Jordan. If we examine the Advanced Alter Response Page for the
category with the pattern “Who is he?”, we can see how the “he” predicate was
returned:
The AIML template in this case uses the singleton <get name="he"/> tag to
retrieve the stored value of the “he” predicate. The tags <set> and <get> go
together to save and retrieve AIML predicate values.
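Written out as AIML, the pair of categories might look roughly like the sketch below. This is our reconstruction from the replies quoted above, not a copy of the actual A.L.I.C.E. categories. It also borrows the <think> tag, explained shortly, to keep the stored value out of the visible reply.

```xml
<!-- Saves "Michael Jordan" in the "he" predicate, then answers -->
<category>
  <pattern>WHO IS MICHAEL JORDAN</pattern>
  <template>
    <think><set name="he">Michael Jordan</set></think>
    He is a famous basketball player.
  </template>
</category>

<!-- Retrieves whatever "he" currently holds -->
<category>
  <pattern>WHO IS HE</pattern>
  <template>He is <get name="he"/>.</template>
</category>
```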
Let’s try a slightly more complex example. Ask your bot, “What color are
bananas?” Once again, because you cloned your bot from the A.L.I.C.E.
brain, it already knows the answer, “Bananas are yellow.”
Now try asking the bot, “What are we talking about?”, or, “What is the subject?”
You will see that the bot remembers that the topic is “bananas”. This is because the
predicate called “topic” was set to “bananas” in the previous exchange with the
input “What color are bananas?” Let’s have a closer look at the Advanced Alter
Response Page with the input, “What color are bananas?”:
The AIML template, displayed in the template box, includes a tag we haven’t
seen before, called the <think> tag. The purpose of the <think> tag is simply to
block out or hide anything that appears between the beginning <think> and
ending </think> tags from the final output. But everything that appears inside
these <think>…</think> tags is evaluated or processed by the Pandorabots
program. Whenever we see one tag inside another tag like this:
<think><set><set>…</set></set></think>
it is called “nesting”, and is perfectly normal in any XML language like HTML or
AIML. The way to read nested expressions like this is from the inside out. Start
with the innermost pair of nested tags:
<set name="topic">BANANAS</set>
The effect of the first or innermost nested pair of tags is to set the “topic”
predicate to BANANAS. Then, the term BANANAS gets passed right through the
innermost nested tags and the next pair takes over:
<set name="it">BANANAS</set>
causes the variable “it” to be set to BANANAS also.
Now, AIML does something special and clever with predicates that happen to be
pronouns. Instead of passing the word BANANAS on up through to the next
level of nested tags, it passes the word “it” instead. In other words, predicates
named after pronouns are treated as special cases that override the contents of
the tags. But in this case, the final level of nested tags is the <think> tag so it
doesn't matter anyway:
<think>it</think>
just makes the word “it” disappear from the output altogether. The special
<think> tag is there so the botmaster can cause these “side effects” without
adding any “garbage” to the output the client finally sees. The side effect, in this
case, was to set two predicate variables, “it” and “topic”, to BANANAS.
Similarly, we can examine the AIML category for WHAT IS THE SUBJECT to see
the use of the <get> tag to retrieve the subject:
In this case the AIML template uses the <get name="topic"/> singleton tag to
display the value of the “topic” predicate in the output.
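Putting the pieces together, the two categories could be written along the lines of the sketch below. The nested <think> and <set> tags follow the walk-through above; the exact wording and layout of the real A.L.I.C.E. categories may differ, and the reply text in the second category is our own.

```xml
<!-- Answers the question and, as a hidden side effect,
     sets both "topic" and "it" to BANANAS -->
<category>
  <pattern>WHAT COLOR ARE BANANAS</pattern>
  <template>
    Bananas are yellow.
    <think><set name="it"><set name="topic">BANANAS</set></set></think>
  </template>
</category>

<!-- Reports the current value of the "topic" predicate -->
<category>
  <pattern>WHAT IS THE SUBJECT</pattern>
  <template>The subject is <get name="topic"/>.</template>
</category>
```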
Points to Remember
AIML predicates are variables relating to the client, and unlike bot properties, these predicates change their values over the course of a conversation.
AIML predicate values are changed with the <set> tag.
AIML predicate values are retrieved with the <get> tag.
The <think> tag causes the AIML inside it to be evaluated, but nothing is printed or displayed in the output.
The Advanced Alter Response page provides buttons to help you write AIML code fragments quickly.
When you set a predicate, the value being set is normally returned. But if the predicate is named after a pronoun, the pronoun itself (such as “it”) is returned instead.
Exercises
1. Train your bot to answer the question, “Do you like asparagus?” On the Advanced Alter Response page, set the predicates “it” and “topic” to “asparagus”.
2. Modify your bot’s reply to the question, “Who is John Doe”, where John Doe
is your real name, so that it sets the predicates “topic” and “he” (or “she”) to your name.
3. After activating a category with an action that sets the “it” predicate, ask your
bot, “What is it?” What does the bot say? (This assumes you started with the A. L. I. C. E. Brain or AAA set.)
4. Try asking your bot, “What is the topic?” and view the template using the
Advanced Alter Response page.
Writing Your Own Predicates
It is important for you to practice writing your own AIML predicates. Let’s try
adding some new knowledge to the bot. It’s probably best if you add some
knowledge you know the bot doesn’t already have, such as something about
your own life or business. Suppose you have your own business called
“Yoyodyne”. Go to the Pandora Bot Training page and ask your bot, “What is
Yoyodyne”. You should receive a default type reply like, “Interesting question.”
Now click on the “Advanced Alter Response” button.
You have several buttons available to click. Select the one called, “<think>”.
This button will automatically insert some new AIML code into your template box.
The browser should display something like this:
The buttons below the Template box, including the <think> button, are there to
provide shortcuts to writing AIML templates quickly. The <think> button has
inserted a fragment of AIML code into our template. But it is not exactly what we
want. For one thing, we have not even learned about the <person/> tag yet. We
want to set the “topic” and “it” variables to YOYODYNE. So, we will edit the
template slightly to get rid of the <person/> tag and replace it with YOYODYNE.
Also, we will add a little text to give the answer to our question. You can edit text
in the template text box just like you would in any other web based text form:
You can save the result by clicking on the “Submit” button at the bottom of the
page. Now try asking your bot again, “What is Yoyodyne?” Also try again, “What
is the subject?” and “What is it?”
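As a rough guide, the category saved in this walkthrough might look like the sketch below. The sentence describing Yoyodyne is a made-up placeholder; substitute whatever answer fits your own business.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<aiml version="1.0">
<!-- Sketch of the edited template; the answer text is a placeholder -->
<category>
<pattern>WHAT IS YOYODYNE</pattern>
<template>
<!-- <person/> from the helper button has been replaced with YOYODYNE -->
<think><set name="it"><set name="topic">YOYODYNE</set></set></think>
Yoyodyne is the name of my botmaster's business.
</template>
</category>
</aiml>
```

After this category is activated, both “it” and “topic” should hold YOYODYNE, which is why the follow-up questions are worth trying.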
Points to Remember
You can use the <think> button on the Advanced Alter Response page to add AIML for setting “it” and “topic” predicates.
You can edit the AIML generated by the helper buttons if it is not exactly what you want.
You can save the results of your changes on Advanced Alter Response by
clicking “Submit.”
Exercises
1. Use the Advanced Alter Response page to insert the reply to an
informational type question such as, “What is natural gas?”
2. Click on the <think> button to insert some extra markup in your reply.
3. Delete the <person/> tag and insert the term “natural gas”.
4. After this category is activated, what will be the value of the predicates “it”
and “topic”?
Playing with Wildcards
We have already mentioned several times the bot giving something called a
“default response” without really being very specific about what that means. We
have also talked vaguely about AIML patterns and the inputs the client types
matching these patterns. Now it is time to nail down specifically what we mean
by these things, and to introduce the concept of an AIML wildcard.
We’ve already said that the basic unit of knowledge in AIML is called a category.
And a category always contains an input part called a “pattern” and an output
part called a “template”. The Pandorabots Advanced Alter Response Page helps
the botmaster visualize the AIML category including the pattern and the template.
The AIML pattern is made up of words of natural language including letters,
numbers and spaces. But it may also contain special characters called
“wildcards”. Specifically, AIML has two wildcards, the asterisk or “star” character
“*” and the underscore character “_”. In AIML the meaning of both the star and
the underscore is exactly the same: they match one or more words. The only
difference is, the underscore takes priority over any specific word, and any
specific word takes priority over the star. A few simple examples help make the
meaning of this clear.
If the brain of your robot contains three categories with the patterns:
_ IS A ROBOT
WHAT IS A ROBOT
WHAT IS A *
And the input is, “What is a robot”, the first pattern, “_ IS A ROBOT”, will match,
because the underscore has priority over any specific word.
If the brain of the robot contains,
_ IS A DOG
WHAT IS A DOG
WHAT IS A *
And the input is, “What is a fish”, then the last pattern, “WHAT IS A *”, will match,
because neither of the first two patterns contains the necessary words to match
the input with the word “fish”, but the third pattern contains the wildcard “*”, which
matches one or more words (any words).
If the brain of your bot contains the patterns,
_ IS *
WHAT IS A HUMAN
WHAT IS *
And the input is, WHAT IS A HUMAN, then the first pattern will match, because
the first wildcard, underscore, will absorb the word WHAT, and the second
wildcard, star, will match the sequence of words, A HUMAN. Even though the
brain also contains the exact matching pattern, WHAT IS A HUMAN, the first
pattern will override or “shadow” the second one because of the higher priority of
the underscore wildcard.
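One way to see these priorities for yourself is to load three categories like the ones in the first example, each with a template that announces which pattern fired. The patterns come from the example above; the template texts are assumptions added purely for testing.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<aiml version="1.0">
<!-- Test categories: each template just names the pattern that matched -->
<category>
<pattern>_ IS A ROBOT</pattern>
<template>Matched the underscore pattern.</template>
</category>
<category>
<pattern>WHAT IS A ROBOT</pattern>
<template>Matched the exact pattern.</template>
</category>
<category>
<pattern>WHAT IS A *</pattern>
<template>Matched the star pattern.</template>
</category>
</aiml>
```

According to the rules described above, “What is a robot” should activate the underscore category, while “What is a fish” should fall through to the star category.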
Points to Remember
Wildcards are special characters in the patterns that match one or more words.
AIML has two wildcard characters, star “*” and underscore “_”.
The meaning of star and underscore is exactly the same, except that underscore has priority over any given word, and any word has priority over star when forming a match.
Exercises
1. Given the AIML wildcard pattern, WHAT DO * FLY, give a list of 6
example inputs that would match the pattern.
2. Write an example of an input pattern using _.
3. Write an example of an input pattern using *.
Writing Default Replies
Try asking your bot, “How do fish swim?”, “How do birds communicate?”, and
“How do elephants reproduce?”. Your bot will answer with a noncommittal, vague
answer like, “I didn’t even know they did” or “I didn’t even know they could.”
These replies are designed to give the impression that the bot understands that
the client has asked a “How do…” type question, but that it doesn’t really have a
specific answer. This is really where the art of writing AIML comes into play.
The botmaster needs to develop a certain skill at writing these vague replies so
that they are not so vague as to throw off the suspension of disbelief, but at the
same time not so specific as to make the reply nonsensical.
Let’s take a look at the Advanced Alter Response Page when we’ve entered the
input, “How do fish swim?”:
The input matched the AIML pattern “HOW DO *”, which contains the star
wildcard. All of the inputs we tried at the beginning of this section, “How do fish
swim”, “How do birds communicate”, and “How do elephants reproduce”,
matched this same pattern, with the star absorbing the words after “HOW DO”.
The AIML template, displayed in the Action box, uses a tag we haven’t seen
before called the <random> tag. If you are familiar with HTML, you already know
the <li> and </li> tags are used to denote a list element. The <random> tag
encloses a list of items, one of which is selected randomly for output. The
Pandorabots AIML interpreter chooses one of the random list elements each time
this category is activated. Usually it is a good idea to use the <random> tag in
conjunction with the star wildcard when writing default replies. The reason is,
there are an infinite number of inputs that can activate these categories, and we
don’t want to bore the client by giving the exact same reply to all of them. The
<random> tag helps to mix it up a little by giving a little variety to the replies.
Let’s try writing a new default category with default replies. We can try to write a
more specific category to cover the questions about fish. Using the previous
Advanced Alter Response Page as a starting point, let’s edit the New AIML
pattern and change it to “HOW DO FISH *”. Also, click on the button marked
<random> and it will insert a fragment of AIML code into the template box:
The <random> button has inserted a bit of AIML code with the “outline” of a
random list of 3 elements. All we have to do now is fill in the three list elements.
Again, this is where the art of AIML comes into play. The botmaster should try to
think of clever, amusing replies that will keep the client entertained and interested
in the conversation, even if the replies do not give exactly the right answer to all
of the possible inputs. They should be believable, if not completely truthful,
default responses. Here is an example:
The results are not always perfect, but they keep the client entertained:
Human: how do fish survive under the ice
Mary: Fish get oxygen from the water.
Human: how do fish taste when cooked with butter
Mary: They have gills.
Human: how do fish breathe underwater
Mary: Fish lay eggs.
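A category consistent with this dialogue could be written as in the sketch below. The pattern and the three replies are taken from the text and the sample conversation; the exact layout of the saved template is an assumption.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<aiml version="1.0">
<!-- Default category for any question starting with HOW DO FISH -->
<category>
<pattern>HOW DO FISH *</pattern>
<template>
<!-- One list item is chosen at random each time the category fires -->
<random>
<li>Fish get oxygen from the water.</li>
<li>They have gills.</li>
<li>Fish lay eggs.</li>
</random>
</template>
</category>
</aiml>
```

Because the choice is random, the same input may get a different reply on each try, which explains the mismatched answers in the dialogue.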
Points to Remember
Default replies are noncommittal robot answers that cover a wide variety of inputs.
Default replies go with input patterns that contain wildcards.
The art of AIML is writing good default replies that are not too vague, and not too precise.
You can use the <random> tag to add variety to your bot’s default replies.
The <random> button under Advanced Alter Response adds a fragment of AIML with three random selections to your template.
Exercises
If you are already developing a bot for your project, go ahead and use that bot’s
personality for this exercise. If you don’t already have a specific bot in mind, use
a character from a specific literary, historical, media, political or cultural context.
It is important that you be able to gather:
1. A list of general-purpose quotes, famous quotations, or even bloopers,
jokes, punch-lines, sound-bites, or other pickup-lines the character can
use when it is stuck and has no idea what to say, because its pattern
matcher has found no more specific response to the input. These will be
used for the Ultimate default category (see next section).
2. A random list of responses to the inputs WHO *, WHAT *, WHEN *,
WHERE *, WHY *, and HOW *.
3. A random list of responses to inputs like ARE *, IS *, WAS *, CAN *, DID *,
DO *, DOES *, HAVE *, and HAD *.
4. The highly frequent and uncertain categories with the patterns YOU * and
I *. These are usually good for mining gossip.
5. Using the Advanced Alter Response page and the <random> button, add
the AIML content for the default replies.
The Ultimate default category
Let’s step back for a moment and consider what happens when the bot
encounters something that it has no answer for. We’ve discussed the case when
the bot has an exact match, and the case when the bot has a category with a
partial match with a pattern that contains a wildcard. Is there such a thing as “no
match”? The answer is, in Pandorabots there is always an “Ultimate default
category”, a category with the pattern equal to the wildcard star “*” all by itself,
which will match any input. If the program cannot find any more specific
matching category, it will always fall back upon the Ultimate default category.
Let’s go back to the beginning and create a new robot from scratch. This time
however, we’ll choose the option, “no initial content”:
If you click on “Create Bot”, and then “Build and Train” bot MIKE, you can try to
have a conversation with this bot. What happens? No matter what you say, the
bot replies with the same thing, “I have no answer for that.” This is because
Pandorabots always creates a bot with at least one category, the Ultimate default
category, and gives it the default reply, “I have no answer for that.”
The usual strategy for designing the Ultimate default category in AIML is based
on the observation that, for a typical bot, this category is activated by about 2% to
5% of the inputs (depending of course on the number and coverage of the other
categories in the bot’s brain). If an input activates the Ultimate default category,
it means the bot really has no idea what the client has said. The best strategy is
therefore to try to turn the conversation back to something the bot knows about
by asking leading questions or uttering “pickup lines” or non-sequiturs designed
to get the dialog back on track.
The way we modify the ultimate default category in our empty-brained bot MIKE
is to go to the Advanced Alter Response Page and change the New AIML pattern
to the wildcard “*”, and then to use the <random> button to create a list of
random “pickup lines”:
Now the MIKE bot will reply with one of these random pickup lines, rather than
with “I have no answer for that,” when it encounters an input it has no more
specific match for.
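For the MIKE bot, such an Ultimate default category might look something like this sketch. The three pickup lines are invented examples; a real bot would use lines that steer the client toward topics it actually knows about.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<aiml version="1.0">
<!-- Ultimate default category: the lone star matches any input -->
<category>
<pattern>*</pattern>
<template>
<!-- Hypothetical pickup lines; replace them with your own -->
<random>
<li>What do you like to do in your spare time?</li>
<li>Tell me about your favorite movie.</li>
<li>Do you have any pets?</li>
</random>
</template>
</category>
</aiml>
```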
The Ultimate default category for the A. L. I. C. E. bot, and hence for our Mary
bot, is really very similar, but the random list is a lot bigger. We can go back to
the Mary bot and use the Advanced Alter Response Page to have a look:
The Ultimate default category for the A. L. I. C. E. bot and the Mary bot is just a
<random> list with a lot more pickup lines. If you look carefully, you will see that
this list makes use of the <set> and <get> predicate tags and a few other tags as
well. In particular, you may notice a tag called the <person/> tag.
Points to Remember
The Ultimate default category has a pattern consisting of just the wildcard star “*”.
The Ultimate default category matches when no other more specific category matches.
A good strategy for the Ultimate default category is to use the <random> tag to make the bot say something to get the conversation back to something it knows about.
A bot created with no initial content always has one category, an Ultimate default category with a template that says, “I have no answer for that.”
You can change the response of the Ultimate default category from the Advanced Alter Response page.
Exercise
The exercise for the Ultimate default category was included in the previous section.
The <person> Tag
The A. L. I. C. E. program is based on a classic A. I. program called the ELIZA
psychiatrist. One of the tricks used by the ELIZA program was a simple personal
pronoun reversal, to create the illusion of understanding when in fact it had none.
The idea was to turn around and “reflect back” anything the client said, by
replacing first person pronouns (“I” and “me”) with second person pronouns
(“you”) and vice versa. AIML implements this trick with the <person> tag.
The <person> tag in AIML goes hand-in-hand with another tag, the <star/> tag.
The purpose of the <star/> tag is to extract anything that matches the wildcard “*”
character in the input pattern. If we put a <star/> tag in the template it will be
replaced with any words that match the first wildcard found in the input pattern.
Remember, a wildcard may be matched by one or more words. So, the <star/>
tag will always be replaced by the same sequence of one or more words.
To obtain the effect of reversing the pronouns in the client input, we use the
<person> tag together with the <star/> tag:
<person><star/></person>
So, if the input matching the star “*” was, “I like to make friends like you”, then
the <person> tag would reverse the pronouns and produce, “You like to make
friends like me”. AIML also provides a shortcut or macro tag <person/>, which is
an abbreviation for <person><star/></person>.
If we use the <person/> tag in the Ultimate default category, it will reverse the
entire input, because the star wildcard absorbs all the words in the input.
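As an illustration, here is a sketch of a category that reflects part of the input back to the client. The pattern I SAID * and the wording of the reply are assumptions chosen for this example, and the exact pronoun substitutions depend on how the interpreter is configured.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<aiml version="1.0">
<!-- Hypothetical category showing the <person/> shortcut -->
<category>
<pattern>I SAID *</pattern>
<!-- <person/> is short for <person><star/></person> -->
<template>Why did you say that <person/>?</template>
</category>
</aiml>
```

With the input “I said I like you”, the star absorbs “I like you”, and the reply would be expected to come back along the lines of “Why did you say that you like me?”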
Points to Remember
The <person> tag is based on a trick from the old ELIZA psychiatrist program.
The <person> tag reverses the first and second personal pronouns, achieving a “mirror” effect when the bot replies.
The <star/> tag is used to access whatever matched the wildcard “*” or “_” characters.
The <person/> tag is an abbreviation for <person><star/></person>.
Exercises
1. Write an AIML category that takes any input in the form of ECHO X Y Z and just prints out the X Y Z or whatever appears there.
2. Write an AIML category that takes any input in the form of PERSON X Y Z
that prints out the result of applying the <person> tag to X Y Z.
3. What is the result of applying the <person> tag to the input, “I think I have
already given you that book you loaned me.”?
Adding a Bot Property
This section describes how to add a new bot property like, “mother.” The AIML
expression <bot name="mother"/> stands for a global bot parameter storing the
name of the robot’s mother. The bot properties are managed by the Edit page.
The bot properties may be used in any template.
To add a new bot property, first, Build and Run your bot.
Ask the bot, “Who is your mother?” The bot may answer, “Actually I don’t have a
mother.” Now click on, “Advanced Alter Response.”
First, in the template box, type ‘She is <bot name="mother"/>.’ The screenshot
illustrates:
Now, to make the reply a little smarter, click on <think> and change “it” to “she”
and <person/> to <bot name="mother"/>. This will make the bot remember that
the pronoun “she” stands for the bot’s mother, and that the topic is also now the
mother. The screen should now appear:
Now click on “Submit” and go back to the Bot dialogue. Try asking the question
again, ‘Who is your mother?’ This time, the bot replies, ‘She is.’
What happened? The bot replied correctly with the new answer, but the bot
property “mother” has no defined value. The default value of any bot property is
the null string. We have to go back to the Edit page to define the bot property
“mother.”
After we have added the new bot property, we can Run the bot and ask again,
‘Who is your mother’. This time she gives the correct answer, ‘She is A. L. I. C.
E..’
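Written out as AIML, the category built in this section would look roughly like the sketch below. It assumes the <think> button inserts nested <set> tags for “it” and “topic” around <person/>, as described earlier, and that the pattern chosen for the input is WHO IS YOUR MOTHER.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<aiml version="1.0">
<!-- Sketch of the edited category; the pattern name is assumed from the input -->
<category>
<pattern>WHO IS YOUR MOTHER</pattern>
<template>
<!-- Remember that "she" and the topic now refer to the bot's mother -->
<think><set name="she"><set name="topic"><bot name="mother"/></set></set></think>
She is <bot name="mother"/>.
</template>
</category>
</aiml>
```

Until the “mother” property is given a value, <bot name="mother"/> expands to nothing, which is why the first test reply stops after “She is”.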
Points to Remember
You can add as many new bot properties as you need using the Bot Properties page.
Bot Properties do not change their values once set.
Your Bot accesses the bot property values through the <bot> tag in the AIML template.
Exercises
1. Add a bot property called “travel”, so that it can be used in templates as <bot name="travel"/>.
2. Train your bot to reply to a set of AIML patterns like “Where do you like to travel”, “Where have you been”, and “What countries have you visited” using the bot property “travel”.
3. Publish your bot with the new bot property and new AIML categories.
Using <srai>
Suppose we add a new bot property called “comedian”. In the Editing screen
add the bot property comedian with the value George Carlin:
comedian=George Carlin
Ask the bot, “Who is your favorite comedian?” and create the reply “My favorite
comedian is <bot name="comedian"/>. Who is yours?” The bot will now
generate the dialogue:
Human: Who is your favorite comedian?
MARY: My favorite comedian is George Carlin. Who is yours?
There are many ways to ask the same question however. The bot may already
know how to answer some of these. For example:
Human: Who’s your favorite comedian?
MARY: My favorite comedian is George Carlin. Who is yours?
Or
Human: Your favorite comedian is who?
MARY: My favorite comedian is George Carlin. Who is yours?
But the bot may not know about all variants of the question:
Human: what comedian do you like
MARY: talk to you
Now we use the Advanced Alter Response button to change the reply. But in
this case we do not change the template to a direct text reply. Instead, we use
the <srai> tag to transform the question “what comedian do you like” into “who is
your favorite comedian”. In other words, we ask the robot a question it already
knows the answer to.
You can enter the <srai> template by clicking on the <srai> button. This button
inserts the tags <srai> </srai> into the template window. Using the mouse,
position the cursor between the <srai> tags and type the text “WHO IS YOUR
FAVORITE COMEDIAN”. (Note: capitalization does not matter, but some AIML
botmasters feel upper case inside <srai>’s makes them easier to read).
After we click on “submit”, we can now ask the bot, “What comedian do you
like?”, and get the same answer as “Who is your favorite comedian?”.
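The two categories involved in this example can be sketched as follows. Both templates follow the text above; the layout of the file around them is an assumption.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<aiml version="1.0">
<!-- The category the bot already knows how to answer -->
<category>
<pattern>WHO IS YOUR FAVORITE COMEDIAN</pattern>
<template>My favorite comedian is <bot name="comedian"/>. Who is yours?</template>
</category>
<!-- The new variant: hand the question back to the pattern matcher -->
<category>
<pattern>WHAT COMEDIAN DO YOU LIKE</pattern>
<template><srai>WHO IS YOUR FAVORITE COMEDIAN</srai></template>
</category>
</aiml>
```

The benefit of this design is that the answer lives in one place: changing the first template changes the reply for every variant that points to it with <srai>.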
Points to Remember
The <srai> tag is the most difficult part of AIML to learn, but once you understand it, you will have mastered AIML.
Whatever appears inside the <srai> tag in the AIML template, Pandorabots feeds back into the pattern matcher to obtain a new reply, and inserts the reply in place of the original <srai> tag.
The <srai> tag is used to handle synonyms (different ways of saying the same thing).
You can use the <srai> button on the Advanced Alter Response page to insert a little AIML code to help write <srai> expressions.
Exercises
1. Write an AIML category, using <srai>, to transform the input “WHATS
THAT” into the input “WHAT IS THAT”.
2. Use <srai> to write a single AIML category that transforms all inputs like “Do you know who the mailman is?”, “Do you know who the conductor is?”, and “Do you know who the president of the United States is?” into their respective reduced forms, i.e., “Who is the mailman”, “Who is the conductor”, and “Who is the president of the United States”.
3. Many inputs begin with YES or NO followed by some other response. How would you use <srai> to break down the response into YES plus something else and NO plus something else?
Training from the Dialog
Pandorabots has added a nice new feature that links the robot-training feature
directly to the conversation log files. If you have already created a bot on
Pandorabots, it is easy to try this new feature. Simply go to the Navigation Bar
and click on the “Logs” link for your bot. Assuming that you have already
published your bot and collected dialog samples from clients on the internet, you
will see a list of conversations logged. You can select a conversation by clicking
one of the numeric links under the Replies column heading. The first numeric link
is the number of unread categories; the second numeric link is the total number
of categories. The conversation is displayed exactly the same way it was before,
except that in the left column next to each exchange we now also see a button
labeled "Train". Clicking on the "Train" button opens a new browser window with
the Pandorabots Training form. You may now either change the bot response
associated with the selected input, or go to the Advanced Alter Response Page
by clicking on the "Advanced Alter Response" button.
Let's go through an example to see how the bot training from dialog works. The
following exchange was found in the log file for the Silver A. L. I. C. E. Edition
on Pandorabots:
Human: Who is Peter Norvig?
Bot: They are sometimes a client on the internet. I haven't heard of Peter Norvig.
So, the botmaster clicks on the "Train" button next to the logged exchange. The
Pandorabots Training form appears in a fresh browser window. The botmaster clicks
on the first "Advanced Alter Response" button. The Advanced Alter Response page is
titled "Teach A. L. I. C. E. Silver Edition" for this bot. The botmaster notices
that the Pandorabots program has chosen WHO IS PETER NORVIG for the new AIML
pattern. The botmaster then clicks on the <think> button to create some AIML code
for the template, and edits it slightly to produce the fragment:
<think> <set name="he"> <set name="topic"> Peter Norvig </set> </set> </think>
He is a computer scientist who works for Google.
The new AIML is saved by clicking on the "Submit" button at the bottom of the
Advanced Alter Response Page. We can test the new AIML by going back to the
Training page and asking "Who is Peter Norvig?", "Do you know Peter Norvig?",
"Who is he?", and "What are we talking about?"
Points to Remember
The Dialog files are your most valuable resource for finding new targets.
Pandorabots provides a feature to link directly from the dialog exchanges to the Advanced Alter Response page.
The botmaster may spend several hours each day reviewing the log files and
adding new AIML content to improve the quality of his or her bot.
Using <that>
AIML has several ways of remembering the state of the conversation. We have
already gone over AIML predicates, which are variables for remembering things
like the topic, the value of pronouns, and other information like the client’s name,
location, and other client properties. Most of the time the bot can handle the client
input without remembering the history of the conversation at all. Sometimes
however the bot must remember a little bit of the context of the conversation in
order to provide a meaningful reply. The most common application is answering
questions.
Suppose the bot asks the client a question like, “Have you dated any robots?”
The client may answer, “Yes”, “No”, or he or she may change the subject and say
something completely off topic. AIML is designed to handle all of these cases.
The keyword <that> in AIML stores the last thing that the robot said, so that it can
be used in combination with the current input to form an intelligent reply.
A very useful category in the A. L. I. C. E. brain for debugging categories with
<that> context is one that has the pattern “SAY *”. This category simply echoes
back whatever we ask the robot to say. This forces the Pandorabots program to
set the value of <that> to a specific value, so we don’t have to wait around for the
bot to ask a specific question. Go to the Bot Training Page. Try telling your bot,
“Say have you dated any robots.” Then try entering, “Yes”, and click on
“Advanced Alter Response”.
Notice that the original template for the YES input uses the <srai> tag to go to
another category with the pattern INTERJECTION. The purpose of this
secondary category is to give default replies to a variety of inputs like YES, NO,
YEAH, MAYBE, UM, and other interjections that are seen out of context. The
replies are a random list of noncommittal responses that are designed to create
the illusion of understanding, while trying to keep the client entertained and the
conversation going ahead.
To modify the response so that it takes into account the question, “Have you
dated any robots”, we need to check the box marked, “depends on this That”.
We also have to compose a new reply in the template:
We can save the result by clicking on the “Submit” button and returning to the Bot
Training page. Then we can test it out by repeating the same cycle. Tell the bot,
“Say have you dated any robots”. Then answer, “Yes”. This time, the bot should
reply with the new template, “You might be happier sticking with humans.”
You can use the same process to develop an answer for “NO” to the same
question.
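In AIML source form, the result of checking “depends on this That” might look like the sketch below. The reply text comes from the example above; the assumption is that the <that> element holds the bot's previous utterance in the same normalized upper-case form used for patterns.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<aiml version="1.0">
<!-- General fallback for YES seen out of context, as described above -->
<category>
<pattern>YES</pattern>
<template><srai>INTERJECTION</srai></template>
</category>
<!-- Fires only when the bot's last utterance matched the <that> element -->
<category>
<pattern>YES</pattern>
<that>HAVE YOU DATED ANY ROBOTS</that>
<template>You might be happier sticking with humans.</template>
</category>
</aiml>
```

The second category shares its pattern with the first but is more specific because of <that>, so it should win whenever the previous bot utterance was the robot-dating question.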
Points to Remember
The AIML keyword <that> stores the bot’s last utterance.
The <that> tag helps us write bot replies that depend on the state of the conversation.
The most common application for <that> is when the bot asks a question and the next reply depends on the client’s response to that question.
Many categories having the input patterns YES and NO use <that>.
Exercises
All of the examples using <that> can be entered using the Advanced Alter
Response Page.
Adding AIML with Pandorawriter
One feature you may have noticed on the Navigation Bar is one called
“Pandorawriter.” This is a program that is designed to help you write AIML
efficiently from scripts, dialog files, or transcripts of conversations. The idea
behind Pandorawriter is simple: take a plain text document consisting of
alternating lines from the client and the bot, and convert them into AIML patterns
and templates respectively. The resulting AIML may not be perfect, and may
require some hand-editing to work well with your bot, but the Pandorawriter can
save a lot of time when converting large bodies of existing text, such as FAQs, to
AIML.
When you click on the Pandorawriter link, the browser displays a page like this:
Here we take a sample of dialog from the Tempest, a play by William
Shakespeare:
Beseech you, sir, be merry; you have cause,
Prithee, peace.
He receives comfort like cold porridge.
The visitor will not give him o'er so.
Look he's winding up the watch of his wit;
by and by it will strike.
Sir,--
One: tell.
When every grief is entertain'd that's offer'd,
Comes to the entertainer--
A dollar.
Dolour comes to him, indeed: you
have spoken truer than you purposed.
You have taken it wiselier than I meant you should.
Therefore, my lord,--
Fie, what a spendthrift is he of his tongue!
I prithee, spare.
Well, I have done: but yet,--
He will be talking.
Pandorawriter produced the following AIML file:
<?xml version="1.0" encoding="UTF-8"?>
<aiml version="1.0">
<category>
<pattern> Beseech you sir be merry </pattern>
<template>Prithee, peace. </template>
</category>
<category>
<pattern> He receives comfort like cold porridge </pattern>
<template> The visitor will not give him o'er so. </template>
</category>
<category>
<pattern>
Look he is winding up the watch of his wit by and by it
will strike
</pattern>
<template> Sir,-- </template>
</category>
<category>
<pattern> One tell </pattern>
<template>
When every grief is entertain'd that's offer'd, Comes to
the entertainer--
</template>
</category>
<category>
<pattern> A dollar </pattern>
<template>
Dolour comes to him, indeed: you have spoken truer than
you purposed.
</template>
</category>
<category>
<pattern> You have taken it wiselier than I meant you should
</pattern>
<template> Therefore, my lord,-- </template>
</category>
<category>
<pattern> Fie what a spendthrift is he of his tongue </pattern>
<template> I prithee, spare. </template>
</category>
<category>
<pattern> Well I have done but yet </pattern>
<template> He will be talking. </template>
</category>
</aiml>
Use the “Download AIML” button to save the generated AIML on your own
computer. For our example, you may want to select a file name such as
“tempest.aiml” and store the document in a local folder such as “My Documents”.
Actually the AIML generated by the Pandorawriter by itself is often not very
useful.
The reason is, the patterns are too specific. In the Shakespeare example, the
pattern “Beseech you sir be merry” is unlikely to be matched by any client inputs,
unless the client happens to be uttering the lines of the Tempest perfectly. So
we usually want to edit the AIML generated by Pandorawriter a little to make the
patterns a bit more general. You can use your favorite text editor such as
Notepad to open up the downloaded AIML file tempest.aiml, and make the
changes to the patterns. Here is what the patterns might look like after you have
generalized them by deleting some of the specific words, and adding in some
wildcards:
<?xml version="1.0" encoding="UTF-8"?>
<aiml version="1.0">
<category>
<pattern> Beseech *</pattern>
<template> Prithee, peace. </template>
</category>
<category>
<pattern> He receives *</pattern>
<template> The visitor will not give him o'er so. </template>
</category>
<category>
<pattern>Look he *
</pattern>
<template> Sir,-- </template>
</category>
<category>
<pattern> One tell </pattern>
<template> When every grief is entertain'd that's offer'd, Comes to
the entertainer--
</template>
</category>
<category>
<pattern> A dollar </pattern>
<template>Dolour comes to him, indeed: you have spoken truer than
you purposed.
</template>
</category>
<category>
<pattern> You have *</pattern>
<template> Therefore, my lord,-- </template>
</category>
<category>
<pattern> Fie *</pattern>
<template> I prithee, spare. </template>
</category>
<category>
<pattern> I have done * </pattern>
<template> He will be talking. </template>
</category>
</aiml>
When you are satisfied with the new AIML you have created with the help of
Pandorawriter, go back to the My Pandorabots page and select the AIML option
for your bot. Scroll down to the button that says “Upload AIML file(s)” and use
the “Browse…” button to locate the saved AIML file (such as tempest.aiml) on
your computer. Then, click on “Upload AIML file(s)” to send the file to the
Pandorabots server. If everything went well, you should see the file included in
the table of files associated with your bot. You can now Publish your bot and test
out the new AIML categories you’ve created with Pandorawriter.
Points to Remember
Pandorawriter is a tool for converting dialogs into AIML.
Pandorawriter converts alternating lines of dialog into AIML <pattern>s and <template>s.
Alternating lines of text must be separated by two or more newlines.
The output of Pandorawriter may have to be edited to generalize the patterns with wildcards.
You can download the AIML file created by Pandorawriter and save it on your computer.
You can upload saved AIML files from your computer to your bot under the “Edit” option.
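To make the conversion rule concrete, here is a minimal Python sketch of what a Pandorawriter-style converter might do. It is an assumption-laden illustration, not the Pandorawriter source: the function name dialog_to_aiml is hypothetical, and it simply pairs each block of text with the block that follows it.

```python
import re
from xml.sax.saxutils import escape


def dialog_to_aiml(dialog: str) -> str:
    """Turn alternating blocks of dialog into AIML categories (illustrative only)."""
    # Blocks are separated by two or more newlines; wrapped lines are joined.
    blocks = [" ".join(b.split()) for b in re.split(r"\n{2,}", dialog.strip())]
    blocks = [b for b in blocks if b]
    categories = []
    # Pair block 0 with block 1, block 2 with block 3, and so on.
    # A trailing block with no reply is dropped.
    for pattern, template in zip(blocks[0::2], blocks[1::2]):
        categories.append(
            "<category>\n"
            f"<pattern>{escape(pattern)}</pattern>\n"
            f"<template>{escape(template)}</template>\n"
            "</category>"
        )
    return '<aiml version="1.0">\n' + "\n".join(categories) + "\n</aiml>"


print(dialog_to_aiml("Beseech you sir be merry\n\nPrithee, peace."))
```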
Exercise
Select another public domain play, not necessarily by William Shakespeare, and
enter a few lines into Pandorawriter. What kind of result do you get? Try to
figure out where you might need to edit the patterns with wildcards so that the
robot could use the responses as general-purpose default templates.
Targeting
Targeting is the term we use for the process of automatically scanning the dialog
files for places where the bot gave the wrong reply, so that we can fix the
responses at those points and the bot will give smarter replies in the future.
Targeting is the most efficient way to teach a Pandorabot. In the old days of
chat robots, botmasters would read through all the dialog files one by one,
looking for places where the bot gave vague, non-committal or default
responses, and then try to refine the bot’s replies for those isolated inputs.
Later, we realized that a computer program could analyze the log files faster
than we could read them, and came up with the AIML Targeting algorithm.
The Targeting algorithm works by finding the most frequently activated AIML
categories and ranking them. It also associates each category with the client
inputs that matched its pattern. In the Pandorabots implementation of the
Targeting algorithm, the botmaster can browse the targets through a web-based
interface. Pandorabots links the targeting interface to the training interface,
making the process of adding knowledge through targets highly efficient.
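The ranking step can be sketched in Python as follows. This is a simplified illustration and not the Pandorabots implementation. It assumes, hypothetically, that each log entry has already been reduced to a pair of the client input and the pattern of the category it activated.

```python
from collections import defaultdict


def find_targets(log_entries):
    """Rank activated patterns by how often they fired, most frequent first.

    log_entries is an iterable of (client_input, matched_pattern) pairs.
    This log format is an assumption made for the sketch.
    """
    inputs_by_pattern = defaultdict(list)
    for client_input, pattern in log_entries:
        # Associate every client input with the category it activated.
        inputs_by_pattern[pattern].append(client_input)

    targets = [(pattern, len(inputs), inputs)
               for pattern, inputs in inputs_by_pattern.items()]
    # Sort by activation count, highest first.
    targets.sort(key=lambda target: target[1], reverse=True)
    return targets


log = [
    ("hello", "HELLO"),
    ("what is aiml", "WHAT IS *"),
    ("what is a bot", "WHAT IS *"),
    ("blah", "*"), ("hmm", "*"), ("ok then", "*"),
]
for pattern, count, inputs in find_targets(log):
    print(count, pattern, inputs)
```

Each row of the result corresponds to one line of the target table: a category, its activation count, and the inputs that activated it.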
The targeting process begins with the log file data. From the navigation bar,
choose the Log button. Pandorabots displays the log files in tabular format, in
groups of 15 at a time, going back as many days as the botmaster selects.
To begin finding targets, select the item “Find Targets” on the pull-down menu at
the bottom of the table of conversation logs. Selecting a large number of
conversations produces a bountiful yield of targets, but slows down the targeting
algorithm. Selecting a very small number of conversations runs the algorithm
faster, but may not produce many useful targets. As the Pandorabots interface
suggests, 15 is probably a good number of dialogs to process at one time, and
we can conveniently select all 15 by clicking on the upper-leftmost checkbox.
Clicking “OK” launches the next set of options for targeting.
Pandorabots displays another page of options for targeting. You can ignore
all of them about 90% of the time. The option “only show target categories
containing wildcard patterns” is simple to explain. It selects the categories
whose pattern has a wildcard * or _ somewhere in it. A match on such a pattern
means the bot didn’t match the client input exactly, but only partially caught
what the client said. Therefore, there is a high probability that the pattern
could be refined into a more specific or exact pattern.
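The wildcard test itself is easy to express in code. The short Python sketch below shows one way to check a pattern; the helper name has_wildcard is hypothetical, and the sample patterns are made up for the example.

```python
def has_wildcard(pattern: str) -> bool:
    """Return True if the AIML pattern contains the wildcard * or _ as a word."""
    return any(word in ("*", "_") for word in pattern.split())


patterns = ["HELLO", "WHAT IS *", "_ PLEASE", "*"]
wildcard_targets = [p for p in patterns if has_wildcard(p)]
print(wildcard_targets)  # ['WHAT IS *', '_ PLEASE', '*']
```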
Of all the other options, the only other really important one is the pull-down menu
with the list of bot names. If you have more than one bot, you can use the dialog
files from one bot to train your other bots. This feature becomes useful when you
have at least one published, high-traffic bot and one or more bots under
development. Your development bots may or may not be published, but can
draw from the public dialog of another, published bot, for the purpose of training.
When you have all the options set, run the targeting algorithm by clicking on
“Find Targets”.
Pandorabots displays the targets in a ranked-order table.
The target table is a list of activated categories displayed in order of most
frequent activation. You can view the inputs activating any target by clicking on
the associated activation count.
By clicking on one of the sample target categories, you can see which inputs
matched the target category.
Clicking on the Train button in the Targeting section takes you to
the Training section.
Using the Advanced Alter Response Page, the botmaster creates a more specific
category from the targeting input data.
Points to Remember
Targeting means scanning the log files for places to improve the bot
responses.
Pandorabots allows you to select log files for Targeting analysis.
Pandorabots presents targets in ranked tables.
Pandorabots displays inputs associated with targets.
Pandorabots links targets to training.
Exercises
1. Publish your bot and collect conversation logs
2. Select log files for targeting
3. Choose the filter, “only show target categories containing wildcard patterns”.
4. Find Targets
5. Choose a target for training
6. Train a new category from target
7. Republish your bot
Custom HTML
You can navigate to the custom HTML page by selecting the Custom HTML
button on the Navigation Bar. The custom HTML page allows you to create a new
HTML file online, or to upload one from your local computer. Once the custom
HTML file is created, it must be named and saved before Pandorabots can use it
as a new interface for your bot. If you look at the Custom HTML interface
carefully, you will see that you can actually create a set of HTML files. This
capability allows you to create an HTML frameset, with one HTML file selected
as the default file that loads all the others. In general you can associate as
many HTML files as you wish with a bot and specify one as the default, which is
convenient if you want to experiment with different HTML interfaces or switch
between them by changing the default HTML file.
One of the simplest tricks commonly employed by botmasters in custom HTML is
to include a snippet of JavaScript that moves the cursor to the input text field
on each new page load, saving the client the trouble of clicking on the text
input area each time he or she wishes to chat with the bot. To do this, create a
custom HTML file like this one:
<html>
<head>
<title>Mary Bot</title>
<SCRIPT>
<!--
function sf(){document.f.input.focus();}
// -->
</SCRIPT>
</head>
<body onLoad="sf()">
!OUTPUT!
<br/>
<FORM name=f action="" method=post>
!CUSTID!
<P><font face="arial"><b>You say:</b></font> <INPUT size=80 name=input>
</P>
</FORM><HR>
</body>
</html>
The JavaScript function sf() causes the cursor to jump to the input text area every
time the page is loaded, as determined by the onLoad attribute. There are a
couple of other concepts introduced in this snippet of custom HTML. The first is
the !OUTPUT! symbol, which is specific to Pandorabots. The !OUTPUT! symbol
indicates where Pandorabots should insert the bot’s reply into the HTML
formatted response. Notice that the botmaster has also included an HTML line
break <br/> following the !OUTPUT! symbol, so that a new line follows the
bot’s reply.
Notice that Pandorabots uses an HTML form with a POST method to get the
client input. Typically the form has a prompt such as “You say:” and a text
input field with name="input". You can vary the width of the text field to suit
the appearance of your application. Pandorabots does not care about the name of
the form or its action, although the sf() function above refers to the form by
its name, f. Every bot needs to have the text input form included in its custom
HTML.
The second new concept is the symbol !CUSTID!. In many cases the !CUSTID!
symbol is optional. Pandorabots normally uses cookies to track customer dialogs.
In some cases, however, the client may have cookies disabled in the browser.
In those cases Pandorabots uses a customer ID tag to track conversations.
Even then, keeping track of the conversation itself is not much of a problem for
Pandorabots. It only becomes a problem when Pandorabots tries to log the
conversation in a dialog file. When cookies are disabled and there is no
customer ID, the conversation might appear in the log files as a large number of
short, one-sentence conversations rather than one long conversation. When
you create custom HTML, it is your responsibility to include the !CUSTID! symbol
to prevent this rare problem. The !CUSTID! symbol always appears inside the text
input form.
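One way to picture what happens to these symbols is as a simple text substitution performed on the server before the page is sent to the browser. The Python sketch below is a guess at the general idea only. The function name render_custom_html is hypothetical, and the assumption that !CUSTID! becomes a hidden form field named custid is ours; the text does not describe the actual expansion.

```python
import html


def render_custom_html(page: str, bot_reply: str, custid: str) -> str:
    """Fill in the !OUTPUT! and !CUSTID! symbols of a custom HTML page.

    Assumption: !CUSTID! becomes a hidden form field. The real server-side
    expansion is not described in this text.
    """
    hidden = '<input type="hidden" name="custid" value="%s">' % html.escape(custid, quote=True)
    # Expand !CUSTID! first so that the bot reply is never rescanned for symbols.
    return page.replace("!CUSTID!", hidden).replace("!OUTPUT!", bot_reply)


page = '!OUTPUT!<br/><form name=f action="" method=post>!CUSTID!<input name=input></form>'
print(render_custom_html(page, "Hi there!", "d2228e2eee12d255"))
```

Because the customer ID travels inside the form, it is posted back with every input, which is why the symbol belongs inside the text input form.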
Points to Remember
You can upload custom HTML or edit it on Pandorabots directly.
Custom HTML can include one file or many.
Custom HTML should include !OUTPUT! and !CUSTID!.
Custom HTML must include an input text form.
Custom HTML may include JavaScript to position the text cursor.
Exercises
1. Create a Custom HTML file for your bot
2. Write a JavaScript snippet to position the cursor in the text form.
3. Make the bot reply appear in bold, italic font.
4. Make the name of your bot appear as an HTML header <h1>
Setting Predicate Defaults
In Pandorabots, custom HTML has the special ability to evaluate AIML template
expressions as well. These AIML expressions are evaluated after the templates
that form the ordinary part of the bot’s response. You can include any kind of
AIML markup in these templates, including <srai>. They are evaluated just like
any other template in AIML, but with one important difference: the AIML
templates found in the custom HTML have no effect on the AIML <that/>
reference. In AIML, the value of <that/> always refers to the last sentence the
robot said. The indexed value <that index="X,Y"/> refers to the Yth sentence of
the response to the Xth previous input. These references remain unchanged by any
templates found in the custom HTML processing.
Knowing this helps us write a custom HTML file for setting the default values of
AIML predicates. Unlike other AIML interpreters, Pandorabots has no built-in
facility (as yet) for setting the default values returned by AIML predicates when
these have not already been set. To accomplish this, we use one special
predicate called the meta predicate. We also use the Pandorabots predicates
interface to set all our predicates to return the default value of om, when they are
not already set.
Now, let us modify our custom HTML file by adding one additional line:
<html>
<head>
<title>Mary Bot</title>
<SCRIPT>
<!--
function sf(){document.f.input.focus();}
// -->
</SCRIPT>
</head>
<body onLoad="sf()">
!OUTPUT!
<template><think><srai>SET PREDICATES</srai></think></template>
<br/>
<FORM name=f action="" method=post>
!CUSTID!
<P><font face="arial"><b>You say:</b></font> <INPUT size=80 name=input>
</P>
</FORM><HR>
</body>
</html>
The new line in our custom HTML file tells Pandorabots to evaluate a template
that calls the <srai> function with the pattern SET PREDICATES. The AIML
category with the SET PREDICATES pattern may be found in the AAA file
Predicates.aiml. It has a simple template that uses <srai> to activate a
three-word pattern made up of SET PREDICATES and the value of the meta predicate.
<category>
<pattern>SET PREDICATES</pattern>
<template><srai>SET PREDICATES <get name="meta"/></srai>
</template>
</category>
If the meta predicate has the value om, Pandorabots activates the category with
the pattern SET PREDICATES OM. Otherwise, Pandorabots activates the
default category with the pattern SET PREDICATES *:
<category>
<pattern>SET PREDICATES *</pattern>
<template>
The meta Predicate is set.
</template>
</category>
The category with SET PREDICATES * does nothing. It has the reply “The meta
Predicate is set.”, but because the <srai> in the custom HTML appears inside a
<think> tag, the result is a blank in the HTML output. Remember, the template in
the custom HTML has no effect on <that/>, so “The meta Predicate is set.” is
discarded completely. If the meta predicate has the value om, however, the
following category is activated:
<category>
<pattern>SET PREDICATES OM</pattern>
<template>
<think>
<set name="age">how many</set>
<set name="birthday">when</set>
<set name="boyfriend">who</set>
<set name="girlfriend">who</set>
<set name="gender">he</set>
<set name="firstname">what</set>
<set name="middlename">what</set>
<set name="lastname">what</set>
<set name="fullname">what</set>
<set name="has">mother</set>
<set name="dog">who</set>
<set name="cat">who</set>
<set name="phone">what</set>
<set name="email">what</set>
<set name="memory">my name</set>
<set name="nickname">what</set>
<set name="mother">who</set>
<set name="father">who</set>
<set name="brother">who</set>
<set name="sister">who</set>
<set name="husband">who</set>
<set name="wife">who</set>
<set name="favmovie">what</set>
<set name="favcolor">what</set>
<set name="friend">who</set>
<set name="password">what</set>
<set name="heard">where</set>
<set name="gender">he</set>
<set name="he">he</set>
<set name="her">her</set>
<set name="him">him</set>
<set name="is">a client</set>
<set name="it">it</set>
<set name="does">it</set>
<set name="religion">what</set>
<set name="job">your job</set>
<set name="like">to chat</set>
<set name="location">where</set>
<set name="looklike">a person</set>
<set name="memory">nothing</set>
<set name="meta">set</set>
<set name="name">judge</set>
<set name="personality">average</set>
<set name="she">she</set>
<set name="sign">your starsign</set>
<set name="them">them</set>
<set name="they">they</set>
<set name="thought">nothing</set>
<set name="want">to talk to me</set>
<set name="we">we</set>
<set name="etype">Unknown</set>
<set name="eindex">1A</set>
</think>
</template>
</category>
This elaborate category sets the default values for all the predicates in the AAA
bot, including the meta predicate. As a result, the category is activated only
once. The default predicate values are set the first time the client loads the
custom HTML page. After that, the meta predicate has the value “set” and this
category is blocked. Pandorabots therefore sets the default values only once
for each new client.
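The once-only logic may be easier to see outside of AIML. The Python sketch below mimics it with a dictionary standing in for one client’s predicates. It is an analogy, not Pandorabots code: the names DEFAULTS and set_predicates are hypothetical, and only three of the defaults are shown.

```python
# A few of the defaults from the SET PREDICATES OM category.
DEFAULTS = {"age": "how many", "name": "judge", "meta": "set"}


def set_predicates(predicates: dict) -> bool:
    """Apply the defaults only while meta still has the unset value om."""
    if predicates.get("meta", "om") != "om":
        # Corresponds to SET PREDICATES *: nothing happens.
        return False
    # Corresponds to SET PREDICATES OM: set everything, including meta.
    predicates.update(DEFAULTS)
    return True


client = {}
print(set_predicates(client))  # True: first page load sets the defaults
client["name"] = "Alice"
print(set_predicates(client))  # False: meta is now "set", so nothing changes
print(client["name"])          # Alice
```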
Dialog History
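We can also use custom HTML template processing to insert the dialog history
into the bot output. The custom HTML file calls a DIALOG HISTORY category with
one line:
<template><srai>DIALOG HISTORY</srai></template>
The category itself goes in the bot’s AIML: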
<category><pattern>DIALOG HISTORY</pattern>
<template>
<think>
<set name="input4"><input index="4"/></set>
<set name="input3"><input index="3"/></set>
<set name="input2"><input index="2"/></set>
<set name="input1"><input/></set>
</think>
<condition name="input4" value="*">
<br/>
<b><em>
Human: <input index="4"/>
</em></b>
<br/>
<b>ALICE: <em><that index="4,*"/></em></b>
</condition>
<condition name="input3" value="*">
<br/>
<b><em>
Human: <input index="3"/>
</em></b>
<br/>
<b>ALICE: <em><that index="3,*"/></em></b>
</condition>
<condition name="input2" value="*">
<br/>
<b><em>
Human: <input index="2"/>
</em></b>
<br/>
<b>ALICE: <em><that index="2,*"/></em></b>
</condition>
<condition name="input1" value="*">
<br/>
<b><em>
Human: <input index="1"/>
</em></b>
<br/>
<b>ALICE: <em><that index="1,*"/></em></b>
</condition>
</template>
</category>
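In plain terms, the DIALOG HISTORY category prints up to four previous exchanges, oldest first, and skips any exchange that has not happened yet. The Python sketch below imitates that behavior with two lists. The function name dialog_history and the list-based history are assumptions made for the illustration.

```python
def dialog_history(inputs, replies, depth=4):
    """Format up to depth exchanges as HTML, oldest first.

    inputs[0] and replies[0] hold the most recent exchange, which is what
    AIML calls index 1.
    """
    available = min(depth, len(inputs), len(replies))
    lines = []
    # Walk from the oldest available exchange down to the most recent one.
    for i in range(available - 1, -1, -1):
        lines.append("<br/><b><em>Human: %s</em></b>" % inputs[i])
        lines.append("<br/><b>ALICE: <em>%s</em></b>" % replies[i])
    return "\n".join(lines)


print(dialog_history(["How are you?", "Hello"], ["I am fine.", "Hi there!"]))
```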
Publishing your Bot with Oddcast SitePal
Pandorabots has partnered with Oddcast, Inc. to provide talking, animated
avatars you can customize to add speech and characters to your AIML bot. You
can try out the Oddcast SitePal service free, and have limited use of their high-
quality text-to-speech voice synthesis. For a small monthly fee, you can open
your own Oddcast SitePal account, and get unlimited speech synthesis for your
Pandorabots bot. For more information, log on to www.sitepal.com.
To create a SitePal character for Pandorabots, log on to your SitePal account
and click on “Add New Scene”.
After you have created a few characters, your SitePal account will be populated
with a set of scenes you saved. The characters appear as thumbnails on each
scene. The options of interest to us, as Pandorabots botmasters, are the
following: Edit, Playback Limit and Embed Scene. For the most part, we can rely
on Pandorabots to preview our character. Also, we probably aren’t using SitePal
to email VHosts, nor are we using it for eBay. The Playback limit is simply a
restriction on the number of times a single internet client can replay exactly the
same output sound. For most normal bot conversations, the limit of 99 would
never be reached. The restriction exists to prevent spammers and bots from
overcharging your SitePal account for too many voice requests.
The most important functions for us to consider are Edit and Embed Scene.
The Edit function opens the SitePal scene editor. On the right hand side we see
the Model Panel, where the botmaster selects the basic VHost character model.
Selecting the Oddcast VHost button on the Navigation bar displays the Oddcast
VHost control page. If you have subscribed to an Oddcast SitePal account, then
the Oddcast VHost control page should display not only the four demo faces, but
also the set of scenes you have created in your SitePal account. Select the
character from the scene you want to publish with your bot.
Pandorabots creates a default HTML interface for the published bot with VHost.
The default interface places the VHost in a non-reloading frame, and the bot
dialogue in a reloading HTML text subframe.
Customizing your HTML with an Oddcast VHost [tm]
Pandorabots also allows you to create custom HTML pages when your bot is
integrated with an Oddcast VHost [tm] animated character. (Be sure to go
through the steps of publishing your bot with a VHost[tm] first, in order to select a
character and a voice for your bot.) To do this, you need to become familiar with
the concept of a custom HTML skin. The term “skin” in this case refers to the layer
of customized HTML that creates the unique appearance of your bot when
displayed in a browser window. The word “template” might have been used
instead, but this would have led to confusion with the term AIML template. You
will also need to be familiar with HTML frames in order to customize the HTML
appearance of a bot using a VHost.
The simplest way to understand the process is to look at an example. The
C.L.A.U.D.I.O. Personality Test bot is an application that uses custom HTML and
a VHost [tm] character. The first step is to create a frame for the bot. We
created our HTML files using a text editor like Notepad. The HTML for the frame
file looks like this:
<html>
<head>
<title>C. L. A. U. D. I. O. Personality Test</title>
</head>
<frameset rows="340,*">
<frame src="!TALKREF!&skin=vhost_claudio" name="vhost">
<frame src="!TALKREF!&skin=input_claudio&speak=true"
name="input_claudio">
</frameset>
</html>
We shall store this frame source in a file called frame_claudio.html.
The <title> inside the <head> section is self-explanatory. It simply
displays the title of the bot in the title bar of the browser. The <frameset> tag
declares that the HTML consists of a set of frames, beginning with a top frame
340 pixels high. The bottom frame takes up all the remaining height available in
the browser window, as indicated by the star character *.
Take note of the next two lines declaring the actual frames. The frame source is
declared with the src attribute. The notation !TALKREF! is not standard HTML;
it is a macro specific to Pandorabots. The Pandorabots program expands
and replaces this macro with its own code at runtime. Basically, !TALKREF! is a
link to the bot id of the published bot, provided as a convenience so that you,
the botmaster, don’t have to keep track of the bot id yourself.
We are going to have two subframes, stored in the files vhost_claudio.html and
input_claudio.html respectively. The top frame uses the skin vhost_claudio and
contains the VHost[tm] animated character. The lower frame uses the skin
input_claudio and contains the input form and output from the bot. The notation
speak=true indicates that we actually want the VHost[tm] to speak, rather than
to just be a silent animated character.
Let’s now look at the source code for the file vhost_claudio.html:
<html>
<body>
!VHOST!
</body>
</html>
As you can see, this frame is very simple. We could have made it more complex
and included a lot more decoration around the VHost[tm] character, but for now
we just wanted to show a simple example. The notation !VHOST! is again not
standard HTML, but specific to Pandorabots. The !VHOST! string tells the
Pandorabots program where to display the VHost[tm] character in the frame.
Next, we turn to the code for the frame input_claudio.html:
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN">
<HTML><HEAD>
<SCRIPT>
<!--
function sf(){document.f.input.focus();}
// -->
</SCRIPT>
</HEAD>
<BODY lang=en-US bgColor=#9999AA onload="sf()">
!SPEAK!
<FONT face="Arial" color=#000000 size="4">
<table>
<tbody>
<tr><td bgcolor=#FFFFFF>
<template><think><srai>SET PREDICATE DEFAULTS</srai></think></template>
<b><em><template>><input/></template></em></b>
</td></tr>
<tr><td>
<P><em><b>!OUTPUT!</b></em></P>
</td></tr>
<tr><td>
<FORM name=f action="" method=post>
!CUSTID!
<P>You say: <INPUT size=80 name=input> </P>
</FORM>
</td></tr>
<tr><td>
<p>
<p>
<font face="Arial" color="white" size="4"><b><em>Your personality type:
<template><srai>FORMAT PERSONALITY <get
name="etype"/></srai></template>
</em></b></font>
</td>
</tr>
</TBODY>
</TABLE>
</FONT>
<HR>
</BODY>
</HTML>
Notice that this custom HTML example is much like the custom HTML examples
developed for non-VHost[tm] bots in the previous section. There is a JavaScript
function to focus the cursor on the input form each time the page reloads. The
familiar !OUTPUT! string appears to display the bot output. The botmaster also
makes use of the AIML <template> tag to customize the appearance of the
HTML. The command <srai>FORMAT PERSONALITY <get name="etype"/></srai>
inserts a string indicating the bot’s best guess of the client’s personality
type. All of these customization features could have been added without regard
to frames or the use of a VHost[tm].
The two Pandorabots strings of importance in this example are !SPEAK! and
!CUSTID!. The function of the !SPEAK! string is to transmit the output
of the bot to the Oddcast server for text-to-speech synthesis, so that the
VHost[tm] can speak it. At runtime the Pandorabots program replaces the !SPEAK!
string with the actual commands necessary to send the bot output to the Oddcast
server, so that the animated VHost speaks and synchronizes its lips and facial
movements.
The purpose of !CUSTID! is slightly more subtle. When logging conversations,
the Pandorabots program normally makes use of a client-side cookie to keep
track of which client is chatting with the bot. If however a client has disabled
cookies on his or her browser, the conversation may turn up as a series of many
short conversations of dialogue length one, making them inconvenient for the
botmaster to read in the conversation logs. The !CUSTID! helps the
Pandorabots program track the conversation from one exchange to the next,
even if the client has disabled cookies, by assigning the client a unique tracking
number. You, the botmaster, don’t have to worry about any of this, as long as
you remember to put the string !CUSTID! inside the <form> tag as shown.
The screenshot shows what the final result of the customization should look like.
To upload the files, select the bot on the My Pandorabots page. Then click on
the button marked “Custom HTML”. Use the Browse button to locate the HTML
files you created and then use the “Upload Custom HTML” button to transfer
them to the Pandorabots server. You should see a table something like this:
Be sure to select the HTML frame as the Default. You can click on the links
marked “Test” to check the behavior of each HTML file individually.
One final note: We could have achieved almost exactly the same result by
creating only the file input_claudio.html, skipping the vhost_claudio.html and
frame_claudio.html files, and making input_claudio.html the Default. This is
because Pandorabots has a default behavior for creating customized HTML
pages with VHost[tm] characters. The default behavior is to construct a set of
two frames, with one frame containing the VHost[tm] on top of another one
containing the custom HTML. If the botmaster does not specify any other
arrangement, this simple arrangement of frames is assumed. If you want to do
anything fancier, such as decorating the border around the VHost[tm], placing the
frames side by side, or using more than two frames, you will need to go through
all the steps in this section.
Points to Remember
To customize the HTML of your bot with a VHost[tm], first publish your bot with a VHost[tm] to select the character and the voice.
The HTML skin is the custom frame set you create to make a personalized appearance for your bot with a VHost[tm].
The outer HTML frame contains the string !TALKREF! that refers to the bot id.
The outer HTML frame refers to the frame containing the VHost[tm] and to the frame containing the input form and output text.
The frame containing the VHost[tm] uses the string !VHOST! to position the animated character.
The frame containing the input and output uses the string !SPEAK! to make the bot speak.
The string !CUSTID! helps the bot track the client conversation.
Exercises
1. Go to your SitePal account and design a scene (character) for your bot.
2. Create the 3 HTML files needed to place your talking VHost in a frameset.
3. Publish your talking VHost Bot on Pandorabots.
Media Semantics Avatars
You can publish your bot using a Media Semantics Character Toolkit animated
Flash character. To try out this new feature, create a Pandorabot and click on
Media Semantics.
The character you see is dynamically generated from your bot output by an
instance of the toolkit's Character Server product, running on a Media Semantics
server. This service is provided free of charge for the purpose of evaluating the
Character Builder, and may be interrupted at any time. Please contact
[email protected] to customize your interface and arrange a level of
service. For more information on the Character Builder, including information on
building your own characters and hosting your own character-based applications,
please visit www.mediasemantics.com.
You can include gestures in your responses by adding additional tags. For
example the following AIML template will result in the character raising his or her
palm while saying "I swear it to be true".
<template> <palmup/> I swear it to be true. </template>
Here are some other tags you can use:
<eyeswide/>
<eyesnarrow/>
<handup/>
<handleft/>
<handright/>
<handsout/>
<handsin/>
<lookleft/>
<lookright/>
<lookup/>
<lookdown/>
Publishing Your Bot on AOL Instant Messenger
Pandorabots makes it easy for you to publish your bot not only on a web
page, but also on America Online (AOL) Instant Messenger (AOL IM). The first
thing you will need is an AOL IM account for yourself, so that you can chat with
the bot. If your machine does not already have the AOL IM chat client installed,
go to www.aol.com and download the free client software. You can
set up an AOL Instant Messenger account free. You simply have to provide your
name, some basic identifying information, and agree to the terms of service.
You also need to set up an AOL IM account for your bot. Go through the same
procedure you did for yourself but create an AOL IM account under a different
screen name, the same one you plan to use for your bot. Once you have created
two AOL IM accounts, one for yourself and one for your bot, you are ready to
publish your bot on AOL IM.
On the My Pandorabots page, select your bot. On the navigation bar, select the
button marked “AOL IM”. In this section you are asked to supply the screen name
and password you registered for your bot. After you have entered the screen
name and password, click “Activate” to get your bot to
communicate with AOL IM. In this example the bot’s screen name is
“alicewallace2004.”
You can also choose the typing speed for your bot. Instantaneous typing means
that the bot replies as fast as Pandorabots can compute an answer. Fast and
slow typing are designed to mimic a human typist at a high and low typing rate,
respectively.
To test your bot on AOL IM, sign on to AOL IM under your own personal screen
name (not the bot’s). You may wish to add the bot to your Buddy List.
Try sending an instant message to the screen name “alicewallace2004.” If the
bot is online, then you should be able to engage it in a conversation.
Later, when we learn about the conversation log feature of Pandorabots, we will
see that the Pandorabots server saves the dialogues with AOL clients just like it
saves the conversations with clients who visit the bot from web pages. If
someone meets your bot in an AOL chat room, you can read and review the
dialogues from those chats later on.
Points to Remember
Pandorabots has a feature that lets you publish your bot on AOL Instant Messenger.
You need at least two AOL IM accounts (screen names) if you want to chat with your own bot on AOL IM, one for yourself and one for the bot.
The bot conversations with clients on AOL IM are saved just like conversations with clients who meet the bot on the web.
Exercises
1. Create a screen name for yourself on AOL IM.
2. Create a screen name for your bot on AOL IM.
3. On the Edit page, publish your bot on AOL IM.
4. Have a short conversation with your bot on AOL IM.
5. If you have other friends (“Buddies”) who use AOL IM, send them your bot’s screen name and tell them to have a conversation with your “new friend”.
Other Interfaces
Pandorabots and Flash
Jamie Durrant wrote an informative tutorial explaining how to set up a Flash
interface to your Pandorabots bot. You can view it at
http://www.lionhead.com/personal/jdurrant/flashbot .
Put Your Bot on MSN Messenger
There is a third-party MSN Messenger Bot which takes its responses from a
chatbot hosted at Pandorabots. You run the application on the same machine
on which you run your MSN Messenger client. The software is available from
http://www.mess.be. Select 'Search our files' from the 'Download' pull-down and
search for 'Binsonite'.
Please note that we cannot offer support for this at [email protected]. If
you need support with this you should contact the author.
Put Your Bot on IRC
An Eggdrop TCL script is available from http://www.tclscript.com/scripts.shtml.
Download alice.tcl and egghttp.tcl. A README.txt is included which gives
details on how to configure the TCL script and your Pandorabot.
Note that the alice.tcl bundle contains a file named alice.html. You need to save
this on your local machine and then from Botmaster Control, click on Edit for
the appropriate Pandorabot. Scroll down to Personalized published html page,
and enter the path to your saved copy of alice.html in the text field labeled
Filename: (alternatively in IE, you can click on the Browse... button to open a
File Chooser dialog to help you locate the file). Finally, click on Upload file and
you're done.
Get your client’s screen name
To get your client's screen name, use the (pseudo-)predicate "screenname".
Pandorabots automatically sets this to be the screen name of the AOL IM user
talking with your bot. You can access the predicate in templates using the <get>
element, and you can also use it as a regular predicate within <condition>
elements. For example, the following category responds with the user's screen
name if it is set.
<category>
<pattern>WHAT IS MY SCREEN NAME</pattern>
<template>
<condition name="screenname">
<li value="*">Your Screen Name is:
<get name="screenname">
</get>
</li>
<li>You're not talking to me via AOL IM.</li>
</condition>
</template>
</category>
You can either place the above category in an AIML file to upload, or using the
Training interface, cut and paste the green text as the desired response.
What is a "botid"?
The botid is that part of the Pandorabot's published URL after 'botid='.
For example, the Divabot linked to from the Pandorabots home page has a
published URL of:
http://www.pandorabots.com/pandora/talk?botid=f6d4afd83e34564d
So the botid is:
f6d4afd83e34564d
You can find the published URL of your bot by publishing it in Botmaster
Control and then examining the URL of the link to your published bot.
Pandorabots API
A client can interact with a Pandorabot by POST'ing to:
http://www.pandorabots.com/pandora/talk-xml
The form variables the client needs to POST are:
botid - see above
input - what you want said to the bot.
custid - an ID to track the conversation with a particular customer. This
variable is optional. If you don't send a value Pandorabots will return a
custid attribute value in the <result> element of the returned XML. Use
this in subsequent POST's to continue a conversation.
This will give a text/xml response. For example:
<result status="0" botid="c49b63239e34d1d5" custid="d2228e2eee12d255">
<input>hello</input>
<that>Hi there!</that>
</result>
The <input> and <that> elements are named after the corresponding AIML
elements for bot input and last response. If there is an error, status will be non-
zero and there will be a human readable <message> element included
describing the error. For example:
<result status="1" custid="d2228e2eee12d255">
<input>hello</input>
<message>Missing botid</message>
</result>
Note that the values POST'd need to be form-urlencoded.
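The request and response handling described above can be sketched in Python using only the standard library. This is a minimal illustration, not an official client: the helper names build_request and parse_result are hypothetical, and the only details taken from the text are the talk-xml URL, the three form variables, and the shape of the <result> element.

```python
import urllib.parse
import urllib.request
import xml.etree.ElementTree as ET

# Endpoint quoted in the text above.
TALK_XML_URL = "http://www.pandorabots.com/pandora/talk-xml"


def build_request(botid, text, custid=None):
    """Build a form-urlencoded POST request (hypothetical helper)."""
    fields = {"botid": botid, "input": text}
    if custid is not None:
        # Optional: send the custid from an earlier reply to continue
        # the same conversation.
        fields["custid"] = custid
    body = urllib.parse.urlencode(fields).encode("utf-8")
    return urllib.request.Request(TALK_XML_URL, data=body, method="POST")


def parse_result(xml_text):
    """Turn a <result> document into a dict; raise if status is non-zero."""
    root = ET.fromstring(xml_text)
    if root.get("status") != "0":
        raise RuntimeError(root.findtext("message", default="unknown error"))
    return {
        "custid": root.get("custid"),
        "input": root.findtext("input"),
        "that": root.findtext("that"),
    }
```

To actually talk to a bot you would pass the request to urllib.request.urlopen, read the response body, and hand it to parse_result, keeping the returned custid for the next call.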
Other AIML Programs
This section summarizes the use of Pandorabots AIML files with other free AIML
software. Because AIML is a standard language, you can freely export and
import your bot knowledge base between Pandorabots and other AIML software.
The AIML free software community has developed many different tools,
interpreters and servers for AIML. We cannot go over all the details of all of them
in the amount of space here. We have selected four standalone AIML programs,
Programs D, J, N and P, for review here. Even these reviews barely scratch the
surface of these four varied programs. Each represents the work of its own
community of volunteer programmers, users, and fans. To get more information
on any of these programs, start with the ALICE A.I. Foundation web site at
http://www.alicebot.org. Then follow-up on the home pages of the individual
project web sites.
A word on owning your AIML Files
Whether you are using Pandorabots or another free AIML interpreter, or whether
you intend to keep your AIML files proprietary or release them as free software, it
is a good idea to put a copyright statement at the beginning of each AIML file.
You may have noticed that each AIML file from the ALICE A. I. Foundation has a
notice like:
<?xml version="1.0" encoding="ISO-8859-1"?>
<aiml>
<!-- Free software © 1995-2004 ALICE A.I. Foundation. -->
<!-- This program is open source code released under -->
<!-- the terms of the GNU General Public License -->
<!-- as published by the Free Software Foundation. -->
<!-- Complies with AIML 1.01 Tag Set Specification -->
<!-- as adopted by the ALICE A.I. Foundation. -->
<!-- Revision Adverbs-1.07 -->
<!-- Last Modified Sept 06 2004 -->
You may wish to copy our statement exactly, or write one of your own. By using
Pandorabots, you are not agreeing to give up ownership of your copyrights. You
can own your AIML content, if you plan to start a bot business or create a
proprietary, subscription bot, for example. In any case, if you plan to download
your AIML files or upload them to the Pandorabots server, it is a good idea to use
a text editor like Notepad, edit, emacs, or Word, to add a copyright notice to your
AIML files.
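If you have many files, adding the notice by hand gets tedious. The following Python sketch shows one way to do it in bulk. The function name add_copyright is hypothetical, and the sketch assumes each file has a single opening <aiml> tag and that you want the notice placed directly after it, as in the Foundation's files.

```python
import re


def add_copyright(aiml_text, notice):
    """Insert an XML comment holding `notice` after the opening <aiml> tag."""
    if "--" in notice:
        # XML does not allow "--" inside a comment.
        raise ValueError("XML comments may not contain '--'")
    comment = "<!-- " + notice + " -->"
    if comment in aiml_text:
        return aiml_text  # already present, leave the file alone
    # Only the first opening tag is touched; </aiml> does not match.
    return re.sub(
        r"(<aiml\b[^>]*>)",
        lambda m: m.group(1) + "\n" + comment,
        aiml_text,
        count=1,
    )
```

You would read each AIML file, pass its text through add_copyright, and write the result back before uploading.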
AIML on Pandorabots
1. You can create AIML files on Pandorabots:
a. From Library files, when you choose to create a bot from an existing
Pandorabot such as “Dr. Wallace’s A.L.I.C.E.” or “AAA AIML Set”
b. From AIML files you created yourself from scratch.
c. From AIML files you might upload to Pandorabots from your local
computer
d. In a special file, called update.aiml, created when a botmaster uses
the training interface and clicks on “Say Instead” or “Update” in the
Advanced Alter Response page. You can edit the update.aiml file.
Functionally, there are several ways to create or edit AIML files on Pandorabots.
On your bot's AIML page, you will see a table of all your bot's AIML files. You can
click on options to edit the AIML files online, download the AIML files, or view
them. Viewing them is one of the most interesting options. AIML is an XML
language. For historical reasons, the majority of browsers support XML data
parsing and viewing. This is very helpful for debugging AIML files, as well as
scanning them.
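Because AIML is XML, any XML parser can do the same kind of checking outside the browser. Here is a minimal Python sketch, assuming an AIML file with no XML namespace on its tags; the function name list_patterns is hypothetical.

```python
import xml.etree.ElementTree as ET


def list_patterns(aiml_text):
    """Parse AIML text and return the text of every <pattern>.

    ET.fromstring raises ET.ParseError if the file is not well-formed,
    which is usually the first thing you want to know when debugging.
    """
    root = ET.fromstring(aiml_text)
    return ["".join(p.itertext()).strip() for p in root.iter("pattern")]
```

A file that parses cleanly here should also display as a tree in a browser; one that raises ParseError will report the line and column of the problem.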
You can also download AIML files and, with Pandorabots, upload them five at a
time. The interface is admittedly a bit cumbersome: if you want to upload 50
AIML files, you must suffer through the browse dialog for every single file. But
this chapter is for those AIML freaks who love to do just that: transfer AIML
files between Pandorabots and other popular AIML and non-AIML software on
your local computer.
Pandorabots and Program D
Program D is the Java implementation of the Alicebot engine. It is no longer
actively supported, although we are seeking donations. This was the version to
get if you wanted to use the latest technology, especially if you wanted to
participate in Alice's development, before all the other development projects
took off. Although it is no longer actively being developed at alicebot.org, you
can download the latest distributions from alicebot.org/downloads.
Usually the biggest headache installing program D is downloading the Java
runtime environment from java.sun.com. One problem you’ll encounter is that
Sun seems to change their marketing strategy for Java every six months or so,
and redesign the Java web page as a result. It’s often a little hard to keep up
with the latest developments in Java technology, unless you are a dedicated
Java head. Fortunately backwards-compatibility is a watchword for Sun’s Java
designers, so you can always count on the latest and greatest release of the
Java Runtime Environment to be compatible with the last release of Program D.
I’m going to download all the AIML software into a directory called c:\alice on my
Windows XP machine. I will put program D in a folder called c:\alice\ProgramD.
From the downloads page on www.alicebot.org, I can get the binary only version
of Program D (since I am not planning to do any development work on the source
code). Downloading the zip file, I will unzip the contents into c:\alice\ProgramD.
The first configuration file to edit is called server.properties. This file contains a
property, programd.emptydefault, which should be set to the same value as the
default <get/> value in Pandorabots.
From Pandorabots, download your AIML files into a folder like c:\alice\aaa\ on
your local PC. Edit the file conf\startup.xml to list the AIML you want your bot to
load using the <learn> tag.
After the Program D configuration files are set, and Java is installed, you can run
program D with the command “run.bat”. The trace of a typical program D run is
displayed here.
Program D executes a special category with the pattern CONNECT at startup.
This is a useful place to put any initialization AIML that might have
appeared in the AIML of Pandorabots custom HTML.
<category>
<pattern>CONNECT</pattern>
<template>
<think> <srai>SET PREDICATES OM</srai>
<set name="name">JUDGE <star/></set>
</think>
<random>
<li>Hello!</li>
<li>Have we started yet?</li>
<li>Are you there?</li>
<li>Hello? Is anyone there?</li>
</random>
</template>
</category>
One of the nice features of Program D is AIML match tracing. If you input a
complex sentence like “Alice, do you know where Japan is?”, the console display
prints out the sequence of patterns matched by <srai> as the interpreter
cascades through the sequence of recursive matches.
Pandorabots and Program J (J-Alice)
J-Alice by Jonathan Roewen and Taras Glek is an AIML engine written in C++. It
comes with a built-in IRC client, with support for multiple channels and servers,
and a small webserver. Each IRC setup (per irc network/server) supports
configuration of an IRC Server, to allow the botmaster, for example, to connect
and control the bot, and even pretend to be a bot with your favourite IRC client.
We download the J-Alice program to the directory c:\alice\ProgramJ.
Loading AIML files in J-Alice is accomplished by editing the AIML categories in
the file std-startup.xml.
We made a shortcut of the J-Alice.exe startup file from the home directory
c:\alice\programJ to the desktop.
Pandorabots and Program N (AIMLPad)
Program N by Gary Dubuque is the Alice chatbot hosted in a notepad text editor
with an additional script processing language for authoring dialogs to assist in
creating new AIML, which extends and develops the program's personality.
Program N Embrace and Extend
Program N includes a lot more features than standard AIML. Of all the open
source programs, AIMLPad has gone the furthest to embrace and extend as
many different other freeware technologies as possible. A large part of the
extension is based on the work of Kino Coursey, who adapted the Program N
engine to the OpenCyc system to create the hybrid CyN, so that for example
given the information in AIML that “the boss of John is Steve”, the existing
OpenCyc ontology helps generate (or modify) the equivalence class of AIML
categories with patterns:
“Who is John’s Boss?”, “Who does John know?”, “Who is the boss of John?”,
“Who is the Superior of John?”, “Who influences John?”, “Who does Steve
boss?” and “Who does Steve influence?”.
Pandorabots and Program P (Pascalice)
Kim Sullivan gives us Program P, a.k.a. "PASCALice". P is written in Delphi, and
is released under the GNU GPL. Kim's page also provides an AIML checking tool
called ShadowChecker.
The equivalent file for setting predicate defaults in Pascalice is called
TestBot.variables.
You can create a shortcut to Pascalice.exe on your desktop and get a
desktop icon to start Pascalice. Click on your desktop icon to start Pascalice:
On the same web page for Pascalice, you will find a companion program called
AIML Shadow Checker. This is a very useful tool for detecting a common
problem in AIML files. A shadow happens when the pattern in one AIML
category blocks the pattern in another AIML category from ever being activated.
Such shadowed categories are difficult for the botmaster to find manually and yet
easy to create by careless AIML writing. The Shadow Checker is a great tool to
help you find these AIML shadows.
Incidentally, sometimes AIML shadows happen for perfectly legitimate reasons.
You may be merging the contents of two bots written by different botmasters,
who have covered the same input content. In any case, the Shadow Checker
can automatically and efficiently detect these blocked AIML categories.
The Shadow Checker works best with one or two AIML files at a time. You can
load an AIML file with the Load File button. It uses a standard Browse and Load
dialog box.
The button “Test all” tells us which categories had duplicate patterns between the
two files we loaded. Now we can edit the files, remove the duplicate patterns if
desired, retest them with Shadow Checker, and upload the repaired files to
Pandorabots.
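The duplicate check that "Test all" performs can be approximated in a few lines of Python. This is only a sketch of the idea, not how Shadow Checker itself works: it compares categories by exact <pattern> and <that> text, ignores <topic>, and does not detect shadows caused by wildcards. The function names are hypothetical.

```python
import xml.etree.ElementTree as ET


def category_keys(aiml_text):
    """Return the set of (pattern, that) pairs found in one AIML file."""
    root = ET.fromstring(aiml_text)
    keys = set()
    for category in root.iter("category"):
        # Normalize case and whitespace so trivially different
        # spellings of the same pattern compare equal.
        pattern = " ".join((category.findtext("pattern") or "").upper().split())
        that = " ".join((category.findtext("that") or "*").upper().split())
        keys.add((pattern, that))
    return keys


def duplicate_categories(first_aiml, second_aiml):
    """List the (pattern, that) pairs that appear in both files."""
    return sorted(category_keys(first_aiml) & category_keys(second_aiml))
```

Any pair this reports is a candidate for removal from one of the two files before you upload them.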
Using a Spreadsheet or Database Program to Write AIML
There are many authoring tools and editors one could use to write AIML. You
can use your favorite text editor, be it MS WORD, Notepad, or a powerful text
editor like EMACS. In addition, there are many tools developed by the AIML free
software community designed to help make writing AIML easier. Pandorabots,
for example, has a web based interface that helps you write one AIML category
at a time. It also has a tool called Pandorawriter that converts dialog transcripts
into AIML categories. Other software to help write AIML categories is listed on
the A. I. Foundation web site under www.alicebot.org/downloads. Because
AIML is an XML language, you can also use editors specifically designed for
XML to author your AIML files. This document concerns a different approach,
however; one based on using a spreadsheet or database program to help write
massive numbers of AIML categories.
We will take you through a step-by-step example of creating an AIML file using a
spreadsheet program, specifically MS Excel. But the principles and procedures
are about the same for any spreadsheet or database program that allows you to
enter data in table format. There are a few pitfalls to using these programs, and
we will point them out. Their advantage is that you can create a large number of
AIML categories and manage them fairly easily. In particular, the ability to sort
categories by <pattern> or <template> makes it easy, in some cases, to eliminate
duplicate categories or find opportunities to simplify your AIML with <srai>.
The following example is a simple case of creating categories that have only a
<pattern> and <template>. More complex categories using <that> and <topic>
do not appear in this example. But after following the example, it should be easy
to see how to generalize this AIML authoring technique to categories with <that>
and <topic>.
One word of caution: a botmaster may end up wasting a lot of time creating AIML
categories that will never be activated. This is because it is difficult to predict in
advance what kinds of conversations and inputs clients will have with your bot. A
common mistake is to create categories with patterns that are too specific to ever
be activated in a realistic conversation. This is why we generally prefer the
approach called “Targeting” to create AIML categories.
In the most general form, Targeting simply means reading the log files of
conversations with your bot to get an idea about what inputs the bot cannot
answer, and then writing new categories to handle those inputs. It is based on
the principle that if one client makes a specific input to your bot, another client
will come along later and make the same, or almost the same input, over again.
So it is more productive to focus your efforts on the inputs people have already
tried on your bot than to try to predict in advance what those inputs will be.
Believe us when we say that after your bot is running online and well publicized,
you will collect plenty of conversation data to keep you busy writing AIML through
the Targeting approach.
Some AIML programs, such as Pandorabots, have special software tools to
make Targeting even more efficient. You won’t even have to read the
conversation log files one by one. The software automatically detects client
inputs for which the bot does not have a specific reply, and alerts the botmaster
to these as potential new input patterns. If you are starting a bot from scratch,
you can build up your bot’s brain using Targeting to find the most common inputs
first, and writing replies for those inputs. You can prioritize your work by writing
AIML for common inputs first, and then work on less frequent input forms later.
This approach guarantees that your bot will have the greatest “coverage” of
inputs for the amount of work you put in.
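The prioritizing step can be illustrated with a short Python sketch. The log format here is an assumption made for the example: each entry is a pair of the client input and the pattern that matched it, and an input matched only by the catch-all pattern * is treated as one the bot could not answer specifically. The function name find_targets is hypothetical.

```python
from collections import Counter


def find_targets(log_entries, top=10):
    """Return the most frequent inputs that fell through to the * pattern.

    log_entries is an iterable of (client_input, matched_pattern) pairs.
    """
    counts = Counter(
        " ".join(text.upper().split())  # normalize case and spacing
        for text, matched in log_entries
        if matched == "*"
    )
    return counts.most_common(top)
```

Writing categories for the inputs at the top of this list first gives the most coverage for the least work.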
Having made that disclaimer about the Targeting approach, there are some
circumstances when you just want to write a large number of AIML categories
without referring to dialogues or Targets. In these cases, using a database or
spreadsheet program may be a useful and timesaving approach.
We begin by observing that much AIML code is redundant XML, and that we
would prefer to avoid typing the same
<category><pattern></pattern><template></template></category> tags over and
over for every new AIML category. The parts that really interest us are what
goes between those <pattern> and <template> tags. So we can use a form-
entry program like MS Excel to create the data for our AIML file.
The first screenshot illustrates an example of using MS Excel to input a large
number of AIML patterns and templates, using the A and B columns of the
spreadsheet respectively.
Notice that we have adjusted the width of the A and B columns to take into
account the expected size of our patterns and templates. Although this is not
necessary, it makes it easier to read the categories and provides better
formatting if you want to print them out.
One convenience often provided by such programs is auto completion, which
means that if you start to type the same thing over again in the same column, the
program will match what you have typed with a previous entry and complete the
entry for you. This may not always give you what you want, but it often improves
efficiency if you are entering many similar patterns or identical templates.
It is a good idea to save your work from time to time as you enter your AIML
data, especially if you intend to create a large file. This example file is called
Psychology.aiml, so we use the File/Save menu option to repeatedly save that
file as we add new data. Eventually, the file filled up with 500 lines of data
representing 500 new AIML categories.
Another great convenience of these programs is that you can sort the categories
by different columns. For example we can take the data we have entered and
sort it by the A column by clicking on the A/Z button in MS Excel, or by pulling
down the Data/Sort menu option. As the next screenshot shows, we can click on
the A column and sort the categories by AIML pattern. One note of caution here:
if you are using Excel be sure to select both the A and B columns before running
the sort, otherwise you run the risk of sorting the patterns independently of the
templates, and mixing up all your categories. Database programs, unlike
spreadsheets, usually work differently and assume that the data is connected
across every row, so sorting by any column keeps the row data together. In
Excel, you can sort all the data by A or B, depending on which you select first,
but it is important to select both.
The next screenshot shows how we have sorted the categories by A, the AIML
pattern. This is extremely useful for finding specific categories or for eliminating
categories with duplicate patterns. For instance, suppose we know that the input
pattern BUT * appears in another AIML file, and is duplicated in this new data.
We can easily find it by sorting and then delete the BUT * category.
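The same sort-and-find-duplicates idea can be expressed outside a spreadsheet. In this Python sketch each row is a (pattern, template) pair, so the two values always move together when sorted, which is exactly what selecting both columns achieves in Excel. The function names are hypothetical.

```python
from itertools import groupby


def sort_and_group(rows):
    """Sort (pattern, template) rows by pattern and group the templates."""
    ordered = sorted(rows, key=lambda row: row[0])  # whole rows move together
    return {
        pattern: [template for _, template in group]
        for pattern, group in groupby(ordered, key=lambda row: row[0])
    }


def duplicate_patterns(rows):
    """Return the patterns that occur on more than one row."""
    return [p for p, templates in sort_and_group(rows).items() if len(templates) > 1]
```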
Now, we consider how to format our data into proper AIML categories. First, we
use the Insert menu to choose the Insert Columns option. Select the A column
first and insert a new column to the left. Select the B column next and insert a
new column between A and B.
Now, scroll down to the last row of data in your spreadsheet. It is important to
start at the bottom because we are going to use the Fill command to fill up the
new A column with identical data. If we start at the top, Excel won’t know where
to stop filling and will create too many empty AIML categories. Go down to the last
row of data and type <category><pattern> into the last row of the A column, as
the next screen shot shows:
Now, select that last data entry box and use your cursor to move up to the first
data row, thereby selecting all the data boxes from 499 (in this case) back down
to one. Then, use the Edit menu to select the Fill/Up option and you should see
the A column fill up with identical entries of <category><pattern>. You may then
want to adjust the width of the A column for appearance:
Now, we basically repeat the same procedure in the C column by entering the
data </pattern><template> and again in the E column with the data
</template></category>. Again, scroll down to the last row of data and use the
Fill/Up option so you don’t overflow the columns with empty categories.
Now we are ready to convert the spreadsheet file to a text file and complete the
process of conversion to proper AIML. Using the File menu, select the Save
As… option. A dialog box will appear giving you the option to export the
spreadsheet to many different file formats. For our purposes, the best choice is
called “Text (tab delimited) *.txt”. Choosing this option will automatically create a
file named Psychology.txt, because our original file was called
Psychology.xls.
When you click the Save button, you may encounter a series of dialog boxes
warning you about problems such as “The selected file type does not support
multiple sheets” and “Psychology.txt may contain features that are not
compatible with Text (tab delimited)”. Generally you can ignore these warnings
and simply click OK or Yes as your option.
After you have saved the file, you will now need to use a text editor to make
some final formatting touch-ups to create a well-formed AIML file. At this point
we often transfer the text file over to a Linux machine and use emacs to make
the final changes, but a text editor as simple as Notepad works equally well.
Let’s open the text file in Notepad and see what we have:
The first item of business now is to eliminate all the tabs used as delimiters. This
step is not strictly necessary for many AIML interpreters, because they will ignore
the tabs or treat them as spaces. But eliminating them makes the file look nicer.
With Notepad, you can use Edit/Replace option to replace a Tab with “nothing”.
Sometimes it is not possible to type a Tab character directly into the Find What:
text box, but you can get around this by copying and pasting a Tab character
from your source. You don’t have to type anything in the Replace With: text box,
just leave it empty and click Replace All.
Now we can save our work as an AIML file. Use the File/Save As… menu item
and select Save As Type: All Files. Name your file Psychology.aiml (or whatever
name you choose, use a .aiml file extension). There is only a little more work to
do to finalize your AIML file.
If you look closely, you can see that the exported spreadsheet file contains some
extra, unwanted double-quote marks. These were inserted in two cases:
whenever your XML tag contained a quoted attribute value like index=”2” and
whenever quote marks appeared in the AIML template. You need to follow the
following steps to rewrite these categories:
1. Use Edit/Replace to replace all occurrences of “” (two double quotes) with
“ (a single double quote).
2. Use Edit/Replace to replace all occurrences of >” with > (these occur at
the beginning of a quoted <template>).
3. Use Edit/Replace to replace all occurrences of “< with < (these occur at
the end of a quoted <template>).
Of course, these rules are not foolproof. You may have wanted to have quote
marks around your template. You may have templates that contain, for whatever
reasons, a pair of double quotes together “”. But apart from these unusual
circumstances, the substitutions will clean up your AIML file quite well.
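If you would rather script the clean-up than repeat it in Notepad, the tab removal and the three substitutions translate directly into Python string replacements. This sketch applies them in the same order as the steps above and shares their limitations; the function name clean_exported_line is hypothetical.

```python
def clean_exported_line(line):
    """Apply the tab removal and the three quote substitutions to one line."""
    line = line.replace("\t", "")    # remove the tab delimiters
    line = line.replace('""', '"')   # step 1: doubled quotes become one quote
    line = line.replace('>"', ">")   # step 2: quote at the start of a template
    line = line.replace('"<', "<")   # step 3: quote at the end of a template
    return line
```

For example, an exported row whose template was He said "hi" to me comes out of the spreadsheet wrapped in quotes with the inner quotes doubled, and this function restores the original text between the <template> tags.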
Finally, we need to add some text to the beginning and end of the AIML file to
make it conform to the AIML schema. The end of the file is simple, just add a
line that says </aiml>.
At the beginning of the file, you may want to include a copyright statement in
XML comment form, as well as the XML specification and the opening <aiml>
tag:
Finally, we have finished creating a well-formed AIML file, ready to upload to your
favorite AIML interpreter. As we mentioned earlier, it should be easy for you to
see how to create a similar file that includes <that> or <topic> patterns. In the
case of <that>, you will start by entering three columns of data and fill up two
columns with </pattern><that> and </that><template> respectively. You can add
any <topic> tags using the text editor.
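The whole procedure can also be scripted. The following Python sketch is an alternative to the Fill and Replace steps, not the method used in the walkthrough: it assumes you export the original two-column sheet (patterns in A, templates in B) as tab-delimited text before inserting any tag columns, and it lets the csv module undo the quoting that the export adds. The function name rows_to_aiml is hypothetical.

```python
import csv
import io


def rows_to_aiml(tab_text):
    """Convert two-column tab-delimited text into an AIML document."""
    reader = csv.reader(io.StringIO(tab_text), delimiter="\t")
    lines = ['<?xml version="1.0" encoding="ISO-8859-1"?>', "<aiml>"]
    for row in reader:
        if len(row) < 2 or not row[0].strip():
            continue  # skip blank or incomplete rows
        pattern, template = row[0].strip(), row[1].strip()
        # Templates are copied as-is because they may contain AIML tags.
        lines.append(
            "<category><pattern>" + pattern + "</pattern>"
            "<template>" + template + "</template></category>"
        )
    lines.append("</aiml>")
    return "\n".join(lines) + "\n"
```

You would read Psychology.txt, pass its contents to rows_to_aiml, and save the result as Psychology.aiml, adding your copyright comment after the opening <aiml> tag.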
In conclusion, you can use a spreadsheet or database program to efficiently write
large numbers of AIML categories. The file export features of these programs
allow you to convert the data from two- or three-column format to delimited text.
Depending on which data entry program you used and its available file export
functions, you may have to use a text editor to touch-up the file to finalize its
AIML format. These procedures may be helpful in some AIML authoring
scenarios, but you should also consider other options such as Targeting and
AIML-specific authoring tools.
Subscriptions
Pandorabots has developed a unique bot subscription service providing you the
opportunity to make money with your bot. Going back to the beginning, look over
the list of “killer apps” for chat robots. If you can think of any way to turn any of
those into a subscription service, then this chapter is for you. The non-profit
ALICE A.I. Foundation launched three subscription bots on the Pandorabots
server: A.L.I.C.E. Silver Edition, the CLAUDIO Personality Test Bot, and The
DAVE E.S.L. Bot. Working in partnership with Oddcast, Inc., these bots combine
animated Vhost avatars for speech synthesis and face animation, with AIML chat
features. The first step to using Pandorabots subscriptions is to find a way to
collect payments. One simple method, not 100% foolproof, is to join an online
payment transfer service like PayPal.
Now, you need to advertise your subscription bot. How you do this is completely
up to you. You can try to get your web site in the press, in blogs, in search
engines, in other words, promoted in any way you can think of. You can also try
direct advertising of your site.
In its simplest form, the PayPal interface will contact you by email when a
customer signs up for a subscription. Generally the customer will be expecting
instant gratification. So it is usually a good idea to put some language in your ad
to the effect that “subscriptions will normally begin within 24-36 hours of payment
processing”. Thus it will be possible for you to get some sleep between
checking your emails in this business.
Access the Pandorabots subscriber list by selecting the Subscribers button for
your bot. If there are no subscribers for this bot, Pandorabots will ask you if you
wish to begin signing up subscribers to this bot.
In either case, you will need to fill in the fields with the subscriber’s email
address, the number of months, and the HTML skin (which will almost always be
“default”, even when we use a Vhost). When you have filled in these three items,
click on “Add Subscriber” and wait for the table to refresh.
The new subscriber appears as the last item on the new table. It is a good idea
to have a standard form letter prepared in order to notify your subscriber that his
or her bot is activated:
Dear Subscriber,
Thank you so much for supporting the research efforts of the ALICE A.I. Foundation by
subscribing to the A. L. I. C. E. Silver Edition. Your personal URL for unlimited private
chat with the latest edition of the award winning ALICE chat robot is:
http://www.pandorabots.com/pandora/talkbot?subid=xxxxxxxxxxxxxxx
Be sure to bookmark this URL and keep it private.
Sincerely yours,
Dr. Rich Wallace
PayPal provides its customers with a debit card, so you can withdraw the funds
your bot earns from any ATM as soon as your customers pay for a bot
subscription. If you are clever enough to develop a true killer app that is really
appealing on a massive scale, then you may have found a way to cash in on the
subscription bot business model. Pandorabots provides the infrastructure for you
to test out your ideas.
Pandorabots Embrace & Extend
Pandorabots inevitably had to add some features to AIML that were not part of
the AIML specification. The following code fragment demonstrates some of
these new features. Pandorabots provides the unique ability to run AIML
templates inside the HTML that will appear on the client’s browser. This very
feature, the ability to process AIML templates inside the browser HTML, is itself
an example of the Pandorabots embrace and extend approach to AIML.
One useful set of AIML templates displays a history of the last four exchanges with
the client, a dialogue history, updated every time the client says something and
the bot responds. Such a set of templates is easy to program in Pandorabots
AIML. But as we shall see, it makes use of almost every feature of Pandorabots
“embraced and extended” AIML.
Human inputs are displayed with a prefix prompt “Human:” and bot responses
are displayed with the bot’s name followed by a “:”. If there have been fewer
than four exchanges, the screen should appear blank rather than show unfilled
lines with prompts.
<template>
<think>
<set name="_history">
<request index="3"/>
</set>
</think>
<condition name="_history">
<li value="*">
<i><b>Human:</b></i> <request index="3"/><br/>
<i><b><bot name="name"/>:</b></i> <response index="3"/><br/>
</li>
</condition>
<br/>
</template>
<template>
<think>
<set name="_history">
<request index="2"/>
</set>
</think>
<condition name="_history">
<li value="*">
<i><b>Human:</b></i> <request index="2"/><br/>
<i><b><bot name="name"/>:</b></i> <response index="2"/><br/>
</li>
</condition>
<br/>
</template>
<template>
<think>
<set name="_history">
<request index="1"/>
</set>
</think>
<condition name="_history">
<li value="*">
<i><b>Human:</b></i> <request index="1"/><br/>
<i><b><bot name="name"/>:</b></i> <response index="1"/><br/>
</li>
</condition>
<br/>
</template>
Wildcard in conditions
Pandorabots has adopted a boundary condition in AIML where the list item in the
condition tag has a value equal to the wild card “*”. In this example the <set>
operation sets the AIML predicate “_history” to the value of <request index=”1”/>.
If <request index=”1”/> has not been set, then it cannot match any value,
including “*”. Using this bit of AIML trickery, Pandorabots says that the AIML
code inside the <li value=”*”> will not be executed because “_history” is set to
“undefined”. I am as much in favor of the undefined as the next person, but this
is not standard AIML.
Wildcard in indexes
This example doesn’t show it, but Pandorabots also allows wildcards in some
AIML tag indexes. For example, the tag
<that index="1,*"/>
indicates the set of input sentences included in <that index=”1,1”/>…<that
index=”1,N”/>.
Request and Response
Here is a general problem of mathematical reference that appears in AIML. You
might call it, the problem of “multiline response”. Consider a dialogue between
two individuals. One of them, B, asks, or says, something, that begins and ends
with a sentence. It consists of several sentences. What B says is, as we say,
“multiline”. The respondent, A, next utters his or her own reply to what he or she
has heard. What A says is also multiline.
And so is what B says next. Sometimes, of course, the multiline utterances consist
of just one line, but in general a script consists of sequences of such back-and-
forth, multiline responses.
At the lowest level AIML provides for processing individual input sentences. One
AIML pattern matches one input sentence. The next level of context is usually
provided by the <that> variable. Most of the time, AIML has no way to
distinguish whether inputs came from multiline input sequences, or from
individual inputs, which may help explain some bizarre constructions that emerge
from unpredictable multiline input queries.
The AIML specification provides for indexed <input/> and <that/> tags to store
the values of previous input values and robot replies. The <input index=”X”/>
tag is one dimensional but the <that index=”X,Y”/> tag is already two
dimensional, owing to the fact that the Xth previous input can have Y sentences
in its reply. We see here that AIML makes no distinction for input sentences that
come from multiline inputs, or one shots, so to speak, because doing so would
add another needless indexing dimension to <input/> and <that/>.
The master loop of a typical AIML interpreter appends all of the output sentences
together into a single output paragraph for the bot output. If the program keeps a
history of these outputs and the associated multiline inputs, then it has created
something very similar to the Pandorabots <request/> and <response/> tags.
Getting back to the example, <request/> and <response/> are the indexed history
tags of the entire multiline input of the human and the entire multiline output
of the bot, respectively.
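Here is a small sketch of how these two tags might be used together in a template. The pattern is invented for the example, and which exchange index="1" refers to (the current one or the one before it) is an assumption you should test on your own bot before relying on it.

```xml
<!-- Hypothetical category: the pattern is invented for illustration. -->
<category>
<pattern>WHAT WAS OUR LAST EXCHANGE</pattern>
<template>
<!-- <request/> is the whole multiline input, <response/> the whole multiline reply -->
You said: <request index="1"/>
I answered: <response index="1"/>
</template>
</category>
```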
Formatted date tag
Pandorabots supports three extension attributes to the date element in
templates:
locale
format
timezone
timezone should be an integer number of hours offset from GMT, and locale is
the ISO language/country code pair, e.g., en_US, ja_JP. Locale defaults to
en_US. The set of supported locales is:
af_ZA ar_OM da_DK en_HK es_CO es_PY fr_CA is_IS mt_MT sh_YU vi_VN
ar_AE ar_QA de_AT en_IE es_CR es_SV fr_CH it_CH nb_NO sk_SK zh_CN
ar_BH ar_SA de_BE en_IN es_DO es_US fr_FR it_IT nl_BE sl_SI zh_HK
ar_DZ ar_SD de_CH en_NZ es_EC es_UY fr_LU ja_JP nl_NL sq_AL zh_SG
ar_EG ar_SY de_DE en_PH es_ES es_VE ga_IE kl_GL nn_NO sr_YU zh_TW
ar_IN ar_TN de_LU en_SG es_GT et_EE gl_ES ko_KR no_NO sv_FI
ar_IQ ar_YE el_GR en_US es_HN eu_ES gv_GB kw_GB pl_PL sv_SE
ar_JO be_BY en_AU en_ZA es_MX fa_IN he_IL lt_LT pt_BR ta_IN
ar_KW bg_BG en_BE en_ZW es_NI fa_IR hi_IN lv_LV pt_PT te_IN
ar_LB bn_IN en_BW es_AR es_PA fi_FI hr_HR mk_MK ro_RO th_TH
ar_LY ca_ES en_CA es_BO es_PE fo_FO hu_HU mr_IN ru_RU tr_TR
ar_MA cs_CZ en_GB es_CL es_PR fr_BE id_ID ms_MY ru_UA uk_UA
format is a format string as given to the Unix strftime function:
http://www.opengroup.org/onlinepubs/007908799/xsh/strftime.html
You can include your own message in the format string, along with one or more
format control strings. These format control strings tell the date function whether
to print the date or time, whether to use AM or PM, whether to use a 24-hour or a
12-hour clock, whether to abbreviate the day of the week, and so on. Some of the
supported format control strings include:
%a Abbreviated weekday name
%A Full weekday name
%b Abbreviated month name
%B Full month name
%c Date and time representation appropriate for locale
%d Day of month as decimal number (01 – 31)
%H Hour in 24-hour format (00 – 23)
%I Hour in 12-hour format (01 – 12)
%j Day of year as decimal number (001 – 366)
%m Month as decimal number (01 – 12)
%M Minute as decimal number (00 – 59)
%p Current locale’s A.M./P.M. indicator for 12-hour clock
%S Second as decimal number (00 – 59)
%U Week of year as decimal number, with Sunday as first day of week (00 – 53)
%w Weekday as decimal number (0 – 6; Sunday is 0)
%W Week of year as decimal number, with Monday as first day of week (00 – 53)
%x Date representation for current locale
%X Time representation for current locale
%y Year without century, as decimal number (00 – 99)
%Y Year with century, as decimal number
%Z Time-zone name or abbreviation; no characters if time zone is unknown
%% Percent sign
If you don't specify a format you'll just get the date using the default format for the
particular locale.
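To make the idea of mixing your own message with format control strings concrete, here is a minimal sketch. The pattern WHAT TIME IS IT and the wording of the message are invented for the example; only the format attribute and the control strings come from the list above.

```xml
<!-- Hypothetical category: pattern and message text are invented. -->
<category>
<pattern>WHAT TIME IS IT</pattern>
<!-- %I:%M %p gives the 12-hour time with the AM/PM indicator, %A the full weekday name -->
<template><date format="It is %I:%M %p on %A."/></template>
</category>
```

The reply depends on the clock of the server and, because no locale or timezone is given here, on the defaults described in this section.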
timezone is the time zone expressed as the number of hours west of GMT.
If any of the attributes is invalid, the tag falls back to the default
behavior of <date/> (i.e., with no attributes specified).
To display the date and time in French using Central European time you would
use:
<date locale="fr_FR" timezone="-1" format="%c"/>
You can also improve the specificity of certain common time- and date-related
inquiries to the ALICE bot, as illustrated by the following dialogue fragment.
Human: what day is it
ALICE: Thursday.
Human: what month is it
ALICE: December.
Human: what year is this
ALICE: 2004.
Human: what is the date
ALICE: Thursday, December 02, 2004.
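Categories along the following lines could produce the replies in the dialogue above. This is only a sketch: the patterns are taken from the human's inputs, but the templates are my guess at suitable format strings and are not necessarily the ones used in the ALICE brain.

```xml
<!-- Sketch only: the format strings are assumptions chosen to match the dialogue. -->
<category>
<pattern>WHAT DAY IS IT</pattern>
<template><date format="%A"/>.</template>
</category>

<category>
<pattern>WHAT MONTH IS IT</pattern>
<template><date format="%B"/>.</template>
</category>

<category>
<pattern>WHAT YEAR IS THIS</pattern>
<template><date format="%Y"/>.</template>
</category>

<category>
<pattern>WHAT IS THE DATE</pattern>
<!-- full weekday, full month, two-digit day, four-digit year -->
<template><date format="%A, %B %d, %Y"/>.</template>
</category>
```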
No system tag
The AIML <system> tag is the key to creating the operating system of the future,
because it runs any operating system command. In standard AIML, you can use
<system> to do everything from tell you the date and time, to open a Notebook
editor, to control a robot, you name it! Your imagination is the limit when you
consider all the possibilities. But unfortunately Pandorabots does not let you take
over their system with the <system> tag, which is exactly what hackers and
malicious coders would do if it were available to the general public for free.
That is unfortunate too, because Pandorabots is written in Lisp, and a <system>
tag to the Lisp evaluator would be a fascinating project for AIML developers. But
remember, you are running your bot on their server, so it makes sense that a
limitation like no <system> tag might exist. Likewise, there is no equivalent of the
server-side <javascript> tag.
You can of course write client-side Javascript code, or any client-side code that
you can embed in HTML, such as an applet, because you may include any
HTML inside the AIML response. The <script> tag is normally safe inside AIML
responses in Pandorabots. It will be passed along to the browser and interpreted
there.
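A minimal sketch of what this might look like follows. The pattern and the alert message are invented, and the example assumes the bot's reply is displayed in a web page that renders it as HTML. Because AIML is XML, the script body must itself be well-formed, so keep characters like < and & out of it or escape them.

```xml
<!-- Hypothetical category: pattern and script content are invented. -->
<category>
<pattern>SHOW ME AN ALERT</pattern>
<template>
Here it comes.
<!-- Passed through to the browser, which runs it on the client side -->
<script type="text/javascript">alert("Hello from your bot");</script>
</template>
</category>
```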
No predicate defaults
Although we saw in a previous section how to set predicate defaults in
Pandorabots with AIML, most other AIML interpreters support predicate defaults
in a different way, using a startup data file. Similarly, Pandorabots gives the
botmaster no control over a variety of functions that are closed or hard-wired, at
least for the time being.
Deperiodization – Removing ambiguous punctuation, such as the periods in “Dr.” and
“St.”, and also applying heuristic rules to determine what makes a sentence a sentence.
This feature is hard wired in Pandorabots.
Normalization – Expanding contractions, removing all remaining punctuation,
repairing many spelling errors. This feature is hard wired in Pandorabots.
Predicate defaults – AIML predicates have a default value for <get/>. You
can only set one global <get/> value in Pandorabots. In this book, under the
section on custom HTML, we showed a trick using embedded HTML-side
AIML (another non-standard, embrace-and-extend feature) to set the default
value of predicates.
Predicate <set/> returns – Some predicates return the predicate name, such
as pronouns, and some return the set values. These choices are hard wired
in Pandorabots.
Pandorabots AIML Tags Set
The tags in the table below correspond to the set of AIML implemented by the
Pandorabots AIML interpreter. This is not exactly the same set of AIML
tags adopted by the AIML Architecture Committee in the Artificial Intelligence
Markup Language (AIML) Version 1.0.1 A.L.I.C.E. AI Foundation Working
Draft, 18 February 2005 (rev 007). For comparison see the table of AIML 1.0.1
tags at alicebot.org. This table, however, refers to that document where
appropriate.
There are both small and large differences between the Pandorabots tag set
and the AIML standard. In particular, there is no <id/>, <size/>, <version/>,
<gossip>, <system>, or <javascript> tag in Pandorabots, and the interpretation
of the <learn> tag is quite different.
Other documents found on alicebot.org, useful for understanding the
Pandorabots AIML tags include:
The AIML 1.0.1 Tags Set
The AIML 1.0 Tags Set
The AIML Overview by Dr Rich Wallace.
A Tutorial for adding knowledge to your robot by Doubly Aimless
In the table, XML tags are shown in a shorthand notation. Closing tags are not
shown. The index attribute, whenever it appears, is optional. The default value
is index="1" (or index="1,1" for 2-d indexes). Index values are one-based:
counting starts at 1, not 0.
AIML Tag                        WD Reference                Remark
<aiml>                          3.2. AIML Element           AIML block delimiter
<topic name="X">                4. Topic                    X is AIML pattern
<category>                      5. Category                 AIML knowledge unit
<pattern>                       6. Pattern                  AIML input pattern
<that>                          6.1. Pattern-side That      contains AIML pattern
<template>                      7. Template                 AIML response template
<star index="N"/>               7.1.1. Star                 binding of *
<that index="M,N"/>             7.1.2. Template-side That   previous bot utterance
<input index="N"/>              7.1.3. Input                input sentence
<thatstar index="N"/>           7.1.4. Thatstar             binding of * in that
<topicstar index="N"/>          7.1.5. Topicstar            binding of * in topic
<get name="XXX"/>               7.1.6. Get                  Botmaster defined XXX, default
<bot name="XXX"/>               7.1.6.1. Bot                Custom bot parameter
<sr/>                           7.1.7. Short-cut elements   <srai><star/></srai>
<person2/>                      7.1.7. Short-cut elements   <person2><star/></person2>
<person/>                       7.1.7. Short-cut elements   <person><star/></person>
<gender/>                       7.1.7. Short-cut elements   <gender><star/></gender>
<uppercase>                     7.2.1. Uppercase            convert all text to Uppercase
<lowercase>                     7.2.2. Lowercase            convert all text to Lowercase
<formal>                        7.2.3. Formal               capitalize every word
<condition name="X" value="Y">  7.3.1. Condition            One shot branch
<condition name="X">            7.3.1. Condition
<set name="XXX">                7.4.1. Set                  May return XXX or value
<srai>                          7.5.1. SRAI                 Recursion
<person2>                       7.6.1. Person2              swap 1st & 3rd person
<person>                        7.6.2. Person               swap 1st & 2nd person
<gender>                        7.6.3. Gender               change gender pronouns
<think>                         7.7.1. Think                Hides side-effects
Pandorabots Extension                       Purpose                      Remark
<condition name="X" value="*">              Branch with undefined value  One shot branch
<li name="X" value="*">                     Branch with undefined value  used by <condition>
<li value="*">                              Branch with undefined value  used by <condition>
<date locale="X" timezone="Y" format="Z"/>  date and time                Unix strftime format
<that index="M,*"/>                         previous bot utterances      multi-sentence
<request index="N"/>                        input request                multi-sentence
<response index="N"/>                       output response              multi-sentence
<learn>                                     save AIML category           non standard
<eval>                                      AIML evaluation              expression inside <learn>
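Two of the shorthand conventions from the tables, the default index and the <sr/> short-cut, can be seen in the following sketch. Both patterns are invented for the illustration.

```xml
<!-- <star/> is the same as <star index="1"/>; index="2" picks the second * -->
<category>
<pattern>I LIKE * AND *</pattern>
<template>You like <star/> and <star index="2"/>.</template>
</category>

<!-- <sr/> is short for <srai><star/></srai>: the words matched by * are
     fed back into the bot as a new input -->
<category>
<pattern>PLEASE *</pattern>
<template><sr/></template>
</category>
```

With the second category, an input such as "Please tell me a joke" is answered as if the client had said "Tell me a joke".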
Finding Other Resources
This book has only touched upon some of the major points of AIML. If you want
to get into Artificial Intelligence Markup Language in more depth, we recommend
you join the ALICE A. I. Foundation at www.alicebot.org. The ALICE A. I.
Foundation sets the standard for AIML and releases the free software behind the
A. L. I. C. E. brain. You can keep up with all the latest developments in AIML
by joining the A. I. Foundation.
My other book, The Elements of AIML Style, also available on the A. I.
Foundation web site, provides much more detail about the AIML language itself.
Pandorabots is really only one of many different implementations of AIML. You
can take the knowledge you’ve gained here and apply it to many different pieces
of free AIML software available on the alicebot.org web site. You can also join
mailing lists and enter the discussion with other botmasters like yourself from
around the world who are on their own journey just like yours. They will already
know the answers to many of the questions you are asking, and will be happy to
share them with you. The site www.alicebot.org is an excellent starting point for
meeting the worldwide A. L. I. C. E. and AIML community.
The End of The Journey
If you have made it this far, you have absorbed everything you need to know to
get started creating, hosting and selling your bot on Pandorabots. This
guidebook has provided you with all the basic steps necessary to get your bot up
and running on Pandorabots, to publish it on the web, to link it with talking,
animated virtual hosts, and to make money with your bot by signing up paying
customers as subscribers.
This is not really the end of your bot journey, but the beginning. Using the tools
you’ve acquired in this book, you can now begin your new career in Artificial
Intelligence as a botmaster. Remember, the most important skill for a botmaster
is not computer programming, but writing. The art of AIML is writing believable
responses for your bot that are brief, entertaining, grammatically correct, concise,
and above all evoke a “suspension of disbelief” in the client. The skill is not that
different from what is needed to develop characters for novels, movies, or
television.
Glossary
AIML - a markup/programming language for creating chat bots. AIML is an
XML-based language.
A. L. I. C. E. – a chat robot personality developed by Dr. Richard S. Wallace
Bot – The artificial intelligence chat robot.
Botmaster – The author or creator of the chat robot.
category – The basic unit of knowledge in AIML. A category contains an input
pattern, optional <that> and <topic> patterns, and a response template
Client – A person chatting with the bot.
Default category – A category with a pattern that contains a wildcard.
Default Response template - what is said when nothing is matched.
Graphmaster - A data representation of the categories in AIML.
Input pattern - The AIML pattern that matches the input sentence (or sentences)
provided to the robot.
Lisp – An artificial intelligence programming language used to program the
underlying code for Pandorabots.
Navigation Bar – The list of HTML links you can follow to navigate around the
Pandorabots web site.
pattern – The input part of the AIML category. AIML patterns are made up of
letters, numbers, spaces and the wildcard characters * and _.
pattern matching - The capability of matching input sentences against stored
sentences.
predicates – Variables relating to the client, which may change during the course
of a client conversation.
Properties – Variables relating to the bot, which remain constant.
template – the output or response part of the AIML category, either a response or
a program to generate a response.
Ultimate default category – A category that matches when the input sentence
fails to match any other category.
Wildcard – A special character that can match one or more words. AIML
wildcards include the star * and underscore character _.
XML - a markup/programming language similar to HTML.