
Managing Uncertainty in Text-To-Sketch Tracking Problems

Matthew D. Schmill and Tim Oates

Computer Science and Electrical Engineering Department

University of Maryland Baltimore County, Baltimore, Maryland


Abstract: Text-to-Sketch (T2S) is a class of problems in which geolocation is performed using natural language descriptions of a location or locations as input. This is a challenging problem due to the many sources of uncertainty inherent to the task: there is often syntactic and semantic ambiguity present in the input observations, as well as referential ambiguity when the language used to describe the scene may refer to many possible objects or locations in the world. Tracking problems, in which the Text-to-Sketch paradigm is extended to incorporate multiple locations and movements over a temporal dimension, introduce additional uncertainty. We describe a tool for managing the uncertainty in Text-to-Sketch problems called MUTTS. The MUTTS system combines traditional natural language processing (NLP) tools with algorithms used to manage uncertainty in mobile robot navigation to allow the temporal and geographical constraints in the text to incrementally reduce the overall uncertainty of a subject's location and produce high quality sketches of the subject's location and movements over time.

Keywords: interactive systems; particle filters; uncertainty; natural language processing; text to sketch

I. INTRODUCTION

The goal of Text to Sketch (T2S) systems is to produce sketches from natural language descriptions. Exactly what constitutes a sketch varies from system to system. Existing approaches generate 2D topological maps based on textual descriptions of physical features (buildings, etc.) [1]. Those maps can then be matched against satellite imagery to provide geolocation services; one might imagine an agent trying to orient herself in a foreign environment, and utilizing an intelligent T2S system to provide geolocation details based on a spoken description of her surroundings. Another application of text to sketch is robot navigation based on qualitative specifications [2].

An extension to the 2D version of T2S that is of particular interest to the intelligence community introduces a temporal dimension, to allow temporally extended sketches that can represent not just location but movement and routes [3]. This extension allows us to consider not just geolocation or map building, but tracking. But extending the T2S paradigm has a significant impact on how T2S is executed due to how it affects the handling of uncertainty.

A key issue in T2S systems is how to manage and represent uncertainty. Natural language is often imprecise and ambiguous, especially the terms we use to refer to space, locations, time, and duration. Among the sources of uncertainty in T2S:

• Explicit imprecision refers to the use of language that explicitly represents uncertainty, such as "near his house" or "around 3 o'clock."
• Syntactic ambiguity refers to sentence structure in which more than one parse is possible given the language.
• Semantic ambiguity arises when there are multiple legal word senses given the syntactic interpretation.
• Referential ambiguity is possible when a word or phrase could refer to more than one known physical location, object, or person.
• Spatial imprecision results when words are used that are geographically non-specific, such as "the park" or "downtown."

Text to sketch systems, in order to be useful, must represent these uncertainties, and when possible, use constraints present in the data and background knowledge to reduce them. Furthermore, a T2S system must be able to present sketches and uncertainty in a manner that is useful to the human user, and in the best case, allow the user to supply background knowledge that will improve the results.

In our work, we consider the task of tracking a subject as he moves around an urban environment. The goal is to produce a sketch of the subject's locations, movements, and the routes he has taken based on natural language observations, either in real-time or post-hoc. The textual descriptions may include eye-witness accounts, overheard conversations (possibly from the subject himself), police reports, and so on, and may refer to the movements of the subject as well as landmarks and locations that he has encountered. Some examples of the types of accounts we might expect:

• "We saw him near the pizza restaurant." (eye-witness account)
• "subject walking north for one half mile. subject turns east and continues for 5 minutes." (police report)
• "I am meeting Jerry at the hospital." (overheard)

The introduction of multiple speakers adds an additional level of uncertainty to the task: there may be irrelevant information in the text stream and some text may refer to the subject in the first person, third person, or not at all. Speakers may use colloquialisms, be non-native speakers of the target language, and there even exists the possibility of adversarial intelligence: false observations intended to make the tracking task more difficult.

2011 23rd IEEE International Conference on Tools with Artificial Intelligence
1082-3409/11 $26.00 2011 IEEE
DOI 10.1109/ICTAI.2011.70

Figure 1. An overview of the MUTTS pipeline.

In this paper, we present a system we have developed called MUTTS, which combines a mix of off-the-shelf and in-house NLP tools with a probabilistic framework called a particle filter [4] to tackle the problem of Text-to-Sketch for subject tracking. In the sections that follow, we describe the MUTTS system and its components, including the particle filter, a variant of which we have adapted to T2S, and show how it can be used to represent, reduce, and visualize uncertainty. We present the text processing elements of MUTTS, and show in a usage example how it processes and displays information in a manner useful for intelligence analysis. We conclude with a discussion of our ongoing and future efforts to improve the tool and its underlying AI components.

II. SYSTEM

Our system for generating sketches for tracking analysis is called MUTTS: Managing Uncertainty in Text To Sketch. MUTTS is a web-based application, written using the Google Web Toolkit (GWT), which compiles pure Java down to a combination of JavaScript and external libraries. The GWT offers access to a suite of Google functionality that includes Google Maps, Local Search, and Directions, all of which are used at various stages of processing and visualization. Supplementary road data is also available using the Census Bureau's TIGER/Line® data files.

MUTTS takes natural language textual¹ accounts of a subject's locations and movements as input, and provides as output visualizations of the most likely waypoints and routes that the subject took during the time period being tracked. A rough overview of the MUTTS system is shown in figure 1.

The T2S process is treated as a pipeline in MUTTS. First, natural language processing tools are utilized to produce representations that encode syntactic structure and roles in the text. Those representations are then queried to infer semantics, producing text meaning representations (TMRs), and finally, those representations are passed to an adapted particle filter, which updates its own internal models of where the subject might be and how he might have gotten there. In the remainder of this section, we start by describing the particle filter and how it is adapted for text-to-sketch. We follow with details of the natural language processing that MUTTS performs when there is input to be processed. Finally, we conclude with a discussion of domain knowledge and how, given its ubiquity in computer systems, it can be used to enhance text to sketch.

¹ Adding automated speech recognition would be a straightforward extension that would introduce an additional source of uncertainty.

A. Particle Filters

One of the key insights that makes the work described here possible is that text to sketch shares properties with a well-studied problem in robotics called localization. The goal of localization in mobile robotics is to integrate noisy, time series sensor and odometry data with a map to produce a probability distribution over possible robot locations. Sensing decreases uncertainty about the robot's location because it enables reasoning about where on the map such readings might be produced. Movement increases uncertainty about the robot's location because typical dead reckoning algorithms that estimate change in location are inaccurate; it is impossible to know exactly how far the robot has moved due to effector noise and environmental factors. However, repeatedly sensing and moving can dramatically decrease uncertainty about the robot's location. For example, a robot whose sonar detects a doorway on the left could be just inside or just outside any office in an office building, but it becomes clear that the robot is in a hallway when it observes a second doorway on the left while moving in a straight line.

To better understand the relationship between mobile robot localization and text-to-sketch, consider Mr. Jones, who is known to be in Washington DC. Being told that "Jones is near the memorial" enables reasoning about Jones' location. We don't know which memorial Jones is near, nor do we know his precise location relative to the memorial due to the use of the word "near," but we can represent his location as a probability density with values that increase the closer you get to anything on a map of Washington labeled as a memorial. This is very much like the robot above that knows its approximate distance (due to sensor noise) to a door, but has no idea which door. Next, suppose we're told "Jones walked north for 20 minutes." To account for this information, the probability density describing possible locations is shifted north by the average distance that a person can walk in 20 minutes, but it is also spread out (reflecting an increase in uncertainty about Jones' location) to account for the fact that Jones could have been walking faster or slower than average or he could have diverged from a due north trajectory. Again, this uncertainty about the distance traveled is precisely the problem faced by mobile robots with noisy effectors.
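The shift-and-spread behavior described above can be illustrated with a minimal sketch. This is not the MUTTS implementation: the flat (east, north) coordinates in miles, the 3 mph mean walking speed, and both noise levels are assumptions chosen only to make the example concrete.

```python
import math
import random

def move_particles(particles, heading_deg, minutes, rng,
                   speed_mph=3.0, speed_sd=0.6, heading_sd_deg=15.0):
    """Shift (east, north) particles along a compass heading, adding noise.

    All default parameter values are illustrative assumptions.
    """
    moved = []
    for x, y in particles:
        # Sample speed and heading per particle so the cloud spreads out.
        speed = max(0.0, rng.gauss(speed_mph, speed_sd))
        heading = math.radians(rng.gauss(heading_deg, heading_sd_deg))
        dist = speed * minutes / 60.0
        # Compass convention: 0 degrees is north, 90 degrees is east.
        moved.append((x + dist * math.sin(heading),
                      y + dist * math.cos(heading)))
    return moved

rng = random.Random(0)
start = [(0.0, 0.0)] * 1000
after = move_particles(start, heading_deg=0.0, minutes=20, rng=rng)
mean_north = sum(y for _, y in after) / len(after)
```

After the update, the particles sit on average roughly one mile north of where they started, but they no longer coincide: the cloud has widened both along and across the direction of travel.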

A number of algorithms exist that solve the localization problem efficiently and, in some cases, optimally from the standpoint of using available information to maximally reduce uncertainty about location. We use an approach known as particle filtering [5]. A particle is, roughly, a point on the map that carries a quantum of probability mass. The more particles there are in an area of the map the higher the probability that the person of interest is there. The particle filter algorithm updates the positions of the particles in response to new information (e.g., the fact that Jones walked north for 20 minutes). The number of particles can be chosen to trade off computational cost and resolution of the probability density, but since computation is linear in the number of particles (i.e., constant per particle per update) and is easily made parallel, it is not unusual to have tens or even hundreds of thousands of particles.

After each update, particles are redistributed by sampling (importance sampling [6]) from the density they approximate, so that particles will die out in low probability (unimportant) areas and become more concentrated in high probability (important) areas. In this way, particles initially allocated to, say, parses or reference resolutions that make subsequent observations improbable will be reallocated near particles based on parses or reference resolutions that are supported by subsequent observations.

The MUTTS system implements an adapted particle filter as a probabilistic framework for representing uncertainty about a subject's location on a map. Next, we consider how analogs for sensing and moving are generated in the system.
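The redistribution step described above can be sketched as follows. The source does not say which resampling scheme MUTTS uses; plain multinomial resampling is assumed here as one common choice, and the particle labels are hypothetical.

```python
import random

def resample(particles, weights, rng):
    """Draw a new particle set of the same size, in proportion to the weights.

    Low-weight particles tend to disappear; high-weight ones are duplicated.
    """
    if sum(weights) <= 0:
        raise ValueError("at least one particle must have positive weight")
    return rng.choices(particles, weights=weights, k=len(particles))

rng = random.Random(1)
particles = ["near restaurant A", "near restaurant B", "in the bay"]
survivors = resample(particles, [0.5, 0.5, 0.0], rng)
```

A particle with zero weight can never be drawn, so hypotheses that later observations rule out are dropped from the set.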

B. Text Processing in MUTTS

Incoming text is first processed by the Stanford Parser [7], [8], which we use to produce structured syntactic representations from raw text. The representations used by MUTTS are parse trees and typed dependency lists. A parse tree is a tree structure that represents the syntax of a sentence. Words are grouped into phrases and their roles in the sentence can be determined by examining the path back to the root. The typed dependency list is generated from a parse tree and expresses how words relate to one another in a sentence. Consider the following sentence:

The subject was seen near a Popeye's Fried Chicken.

The most probable parse for this sentence follows:

(ROOT
  (S
    (NP (DT the) (NN subject))
    (VP (VBD was)
      (VP (VBN seen)
        (PP (IN near)
          (NP
            (NP (DT a) (NNP Popeye) (POS 's))
            (NNP Fried) (NNP Chicken)))))))

Note that with this representation, the trained eye (or computer) can quickly identify important parts of the sentence such as the verb phrase "was seen." The typed dependency list for the parse tree above is as follows:

det(subject-2, the-1)
nsubjpass(seen-4, subject-2)
auxpass(seen-4, was-3)
det(Popeye-7, a-6)
poss(Chicken-10, Popeye-7)
nn(Chicken-10, Fried-9)
prep_near(seen-4, Chicken-10)

Note that the typed dependency list provides a convenient representation for locating words related to one another; for instance, that "Fried" is a compound noun modifier for "Chicken."
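To make the lookup described above concrete, the following sketch reads dependency lines in the textual form shown above and retrieves the modifiers of a given word. The function names and the tuple layout are our own illustrative choices, not part of the Stanford Parser or of MUTTS.

```python
import re

# Matches lines such as "nn(Chicken-10, Fried-9)".
DEP_PATTERN = re.compile(r"^(\w+)\((\S+)-(\d+), (\S+)-(\d+)\)$")

def parse_dependency(line):
    """Return (relation, (governor, index), (dependent, index))."""
    match = DEP_PATTERN.match(line.strip())
    if match is None:
        raise ValueError(f"not a typed dependency: {line!r}")
    rel, gov, gov_i, dep, dep_i = match.groups()
    return rel, (gov, int(gov_i)), (dep, int(dep_i))

def modifiers_of(deps, head, relations=("nn", "poss")):
    """List the dependents attached to `head` by one of `relations`."""
    return [dep for rel, gov, dep in deps if gov == head and rel in relations]

lines = [
    "det(subject-2, the-1)",
    "nsubjpass(seen-4, subject-2)",
    "auxpass(seen-4, was-3)",
    "det(Popeye-7, a-6)",
    "poss(Chicken-10, Popeye-7)",
    "nn(Chicken-10, Fried-9)",
    "prep_near(seen-4, Chicken-10)",
]
deps = [parse_dependency(line) for line in lines]
chicken_modifiers = modifiers_of(deps, ("Chicken", 10))
```

Keeping the word index alongside each word lets later stages restore sentence order when assembling a multi-word landmark name.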

Together, the parse tree and typed dependency list provide enough information for the next phase of processing to begin. In this phase, the sentence structure is examined to produce a text meaning representation that can be used to update the particle filter in downstream processing. The module responsible for this process is called the semantic interpretation engine (SIE), as shown in figure 1. We have designed and developed the MUTTS SIE as a rule-based template matching system. The SIE comprises a lexicon of English words organized into an ontology to allow for generalizing across word classes, and a set of rules. On the left hand side of the rules are mechanisms for matching parse trees and typed dependency lists, and on the right hand side is code for extracting semantics to generate a useful TMR. For example, suppose we wanted to generate a rule to catch the phrase above. A rule is constructed first to look for the passive voice (a passive verb is used as the root of the verb phrase), next to check for a verb that is the descendant of "observation" in the lexical ontology (by extracting "seen" from the auxpass relation in the typed dependency list), and finally to require that there is a prepositional phrase beginning with a spatial preposition (again, using a combination of the parse tree, typed dependency list, and information in the lexical ontology).

The pattern of the rule described above would match our sample sentence, and the right hand side of the rule would be used to generate a text meaning representation. The first step is to decide whether the sentence represents a sensing event or a movement event, in the sense described in section II-A. Does the sentence refer to movement, in which case the TMR will be used to update particle locations in a dead-reckoning type update, or does the sentence reference a landmark or location, in which case the sentence is analogous to sensing in particle filter localization? Currently, classification of a sentence as a sense or movement event is hardwired into the rule. In this case, the combination of an observation verb and a spatial preposition indicates a sense event.

Sense and movement events have parameters that are filled out during the processing of the right hand side of a rule. In the case of sense events, the primary objective of the rule is to extract a landmark or location reference, and the secondary objective is to determine the specificity of the reference. The primary objective is achieved by first extracting the object of the spatial preposition ("Chicken"), then pulling all modifiers of the object that indicate they belong together ("Popeye's" and "Fried"). The secondary objective, in this case, is achieved by looking up a specificity level for the preposition being used, which is part of the lexical ontology. In this case, the use of "near" implies some uncertainty of the actual proximity to the landmark, whereas "at" would indicate relative certainty that the subject is actually at the landmark. In the case of our sentence, the TMR might look like this:²

(SENSE :specificity moderate
       :landmark "Popeyes Fried Chicken")

This representation is almost actionable by the particle filter. There remains the question of where, exactly, is Popeye's Fried Chicken? Particles are represented by latitude and longitude, not by their common name. To resolve this mapping between landmark or location names and points of latitude and longitude, we use Google's Local Search API. The functionality Local Search offers is to provide points that match a keyword search. In the case of "Popeye's," and if our area of interest (AOI) is Baltimore, Local Search will return 13 Popeye's locations, complete with latitude, longitude, and a variety of other information in hypertext format. The landmark field of the sense event can then be replaced by the corresponding points in the search results. MUTTS allows the user to define an area of interest outside of which search results will be ignored.
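The sense rule described earlier in this section (passive observation verb plus spatial preposition, then landmark and specificity extraction) can be sketched roughly as below. The verb set and the specificity table stand in for the lexical ontology and are hypothetical, as is the dictionary form of the TMR; the real SIE also consults the parse tree, which this sketch omits.

```python
# Hypothetical stand-ins for entries in the lexical ontology.
OBSERVATION_VERBS = {"seen", "spotted", "observed"}
PREP_SPECIFICITY = {"near": "moderate", "at": "high"}

def sense_rule(deps):
    """Return a SENSE TMR as a dict, or None if the rule does not match.

    deps: list of (relation, (governor, index), (dependent, index)).
    """
    passive_verbs = {gov for rel, gov, _ in deps if rel == "auxpass"}
    for rel, gov, obj in deps:
        if not rel.startswith("prep_"):
            continue
        prep = rel[len("prep_"):]
        if (gov in passive_verbs and gov[0] in OBSERVATION_VERBS
                and prep in PREP_SPECIFICITY):
            # Gather the object's modifiers and restore sentence order.
            words = [obj] + [d for r, g, d in deps
                             if g == obj and r in ("nn", "poss")]
            landmark = " ".join(w for w, _ in sorted(words, key=lambda t: t[1]))
            return {"type": "SENSE",
                    "specificity": PREP_SPECIFICITY[prep],
                    "landmark": landmark}
    return None

deps = [
    ("det", ("subject", 2), ("the", 1)),
    ("nsubjpass", ("seen", 4), ("subject", 2)),
    ("auxpass", ("seen", 4), ("was", 3)),
    ("det", ("Popeye", 7), ("a", 6)),
    ("poss", ("Chicken", 10), ("Popeye", 7)),
    ("nn", ("Chicken", 10), ("Fried", 9)),
    ("prep_near", ("seen", 4), ("Chicken", 10)),
]
tmr = sense_rule(deps)
```

Because the possessive marker is not a node in the dependency list, this sketch yields the landmark string "Popeye Fried Chicken"; a fuller rule would reattach the marker before the string is used as a search query.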

Generating movement events is a somewhat simpler process. Consider the following text:

He walked east on Reisterstown Road, for maybe 15 minutes.

The TMR for a movement action includes direction, distance and duration (any of which may be extracted from the text, derived by computation, or set to defaults), any known road references, and uncertainties associated with the direction, distance, and duration fields. Rules and templates are written to identify movement events and the typed dependency list is interrogated to fill in the TMR.

(MOVE :specificity approximate
      :direction (0.0 0.1)
      :duration (15 3.0)
      :distance (0.75 0.15)
      :onroad Reisterstown Rd.)

Note the introduction of a list notation to represent normal distributions. In the above TMR, the duration is expressed as a normal distribution with mean 15 (minutes) and a standard deviation of 3. The mean here is drawn directly from the text ("15 minutes"), while the standard deviation is derived from the combination of what is the typical inaccuracy of a human observer and any uncertainty modifiers present in the text (in this case, "maybe").

² Those familiar with LISP will find this symbology familiar, even though MUTTS is not implemented in LISP.

These text meaning representations are ready for the particle filter to process sense and movement events, as described in section II-A.
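The (mean, standard deviation) pairs in the MOVE TMR above can be reproduced with simple arithmetic. The sketch below is one way to arrive at those numbers, not a description of how MUTTS computes them: the 10% base observer error, the factor applied for "maybe," and the constant 3 mph walking speed are all assumptions chosen to match the example.

```python
# Hypothetical model parameters, chosen to reproduce the example TMR.
BASE_RELATIVE_SD = 0.10                          # assumed observer inaccuracy
MODIFIER_FACTOR = {"maybe": 2.0, "about": 1.5}   # assumed widening per hedge word

def duration_distribution(minutes, modifiers=()):
    """Return (mean, sd) in minutes for a stated duration."""
    rel_sd = BASE_RELATIVE_SD
    for word in modifiers:
        rel_sd *= MODIFIER_FACTOR.get(word, 1.0)
    return (float(minutes), minutes * rel_sd)

def distance_from_duration(duration, speed_mph=3.0):
    """Scale a duration (mean, sd) in minutes to miles at a constant speed."""
    mean_min, sd_min = duration
    miles_per_minute = speed_mph / 60.0
    return (mean_min * miles_per_minute, sd_min * miles_per_minute)

duration = duration_distribution(15, ["maybe"])
distance = distance_from_duration(duration)
```

Under these assumptions the duration comes out at approximately (15, 3.0) minutes and the distance at approximately (0.75, 0.15) miles, matching the :duration and :distance fields shown above. Treating speed as a constant keeps the example short; letting speed vary as well would widen the distance distribution further.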

C. Applying Domain Knowledge

Using Google Maps, Local Search, and TIGER/Line® allows MUTTS to bring a great deal of domain knowledge to bear on managing the uncertainty in T2S, a much broader range than any human analyst could be expected to have. The strength of automated text to sketch is the amount of domain knowledge available, encoded in search engines and databases, and the challenge is to exploit this knowledge while performing adequately where human intelligence excels: in natural language processing, commonsense reasoning, and so on. In this section we will describe a tool called path verification that is made possible by Google Directions and augments the utility of MUTTS in just such a manner.

The complex geometry of high resolution maps, coupled with the surface features that go along with these maps (transportation networks, waterways, green space, and so on), creates a conundrum for the particle filter when performing a dead reckoning update. If an observation comes in that has an agent driving or walking to the northeast for half a mile, then the particles must all be translated roughly a half mile, roughly to the northeast. A most basic update would simply move the particles, regardless of road networks or geographical features, and then the sketch might involve the agent driving over the Chesapeake Bay. On the other end of the spectrum, the particle filter could be tasked with incorporating all the various map features, conducting a search over the road network (and incorporating footpaths in the case of walking directions), and producing only legal particle updates that respect the rules of the road and the lay of the land.

While the former approach is obviously too naive, the latter approach appears quite daunting. Fortunately, Google Directions essentially accomplishes exactly that task. To produce particle updates for movement events, MUTTS processes the simplistic dead reckoning update, and uses Google Directions to verify whether or not the updated particle location is realistic given the features of the map. This is path verification. If utterance $u_i$ moves particle $p$ from location $p_i$ to $p_{i+1}$, we conclude that the update is verifiable if the distance and time Google Directions derives for $p_i \to p_{i+1}$ are probable given the duration and distance distributions derived from the processing of $u_i$. Said differently, if the source text says "10 minutes," but Google Directions returns a best route that takes 20 minutes, then the particle path is not verifiable, and it should be resampled.
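The verification test described above can be sketched as a comparison between a routed time and distance and the normal distributions stored in the movement TMR. The two-standard-deviation threshold is an assumption (the source says only that the routed values must be "probable"), and the routed figures are passed in as plain numbers rather than fetched from any routing service.

```python
def z_score(value, mean, sd):
    """Distance of `value` from `mean`, in standard deviations."""
    return abs(value - mean) / sd

def is_verifiable(route_minutes, route_miles, duration, distance, max_z=2.0):
    """Check a routed path against the TMR's (mean, sd) pairs.

    The path counts as verifiable only if both the routed time and the
    routed distance fall within `max_z` standard deviations (an assumed cutoff).
    """
    return (z_score(route_minutes, *duration) <= max_z
            and z_score(route_miles, *distance) <= max_z)

# Text said about 15 minutes and 0.75 miles; the route takes 16 minutes.
plausible = is_verifiable(16.0, 0.8, duration=(15, 3.0), distance=(0.75, 0.15))

# Text said about 10 minutes, but the best route takes 20.
implausible = is_verifiable(20.0, 0.6, duration=(10, 2.0), distance=(0.5, 0.1))
```

A hard cutoff is the simplest choice; returning the likelihood itself would instead give a graded score that could feed into particle weights.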

III. USAGE CASE

In this section we detail a typical usage of the system and describe some of the investigative features and visualizations that exist in MUTTS. Recall that MUTTS is a web application built using Google's GWT framework, and incorporates a suite of online tools to support the operations necessary for geolocation and visualization. A screenshot of the full MUTTS application can be seen in figure 2; it contains a map view, a tree view for breaking down the text input, an interaction panel for visualizing search and sketch results, and text areas to input data and otherwise interact with MUTTS. The discussion here is based on a tutorial developed for users of the system.

Figure 2. A screenshot of the MUTTS application.

The use case here is that someone (who we will refer to as the analyst) has received a collection of textual observations that refer to the locations and movements of a subject. What the analyst would like is to provide an automated system with the text, and get back a detailed map of the subject's locations at all times throughout the observation period; ideally, this would be a path through the map, annotated with all the subject's stops. Due to the various sources of uncertainty in the text stream, a single, true, accurate sketch cannot generally be known. Therefore, MUTTS generates sketches probabilistically, and allows the analyst to consider and visualize the possibilities.

Analysis of a tracking problem begins with the analyst constraining the area of interest. In this case, the AOI is the Arlington/Mount Washington area of Baltimore, Maryland. The initial configuration of the particle filter places the particles uniformly distributed over the AOI. Particles are rendered to the MUTTS map view as triangles representing the hypothesized location and direction of movement. The analyst begins the process by collecting the textual accounts and entering them as input to MUTTS. Consider the following collection of descriptions of a subject's whereabouts:

• (9:30pm): "We saw him near the pizza restaurant."
• (9:41pm): "subject walking north for one half mile. subject turns east and continues for 5 minutes."
• (9:52pm): "I am meeting Jerry at the hospital."

Figure 3. Particles distributed around annotated search results for "pizza restaurant" in the Arlington area of Baltimore.

Figure 4. A movement event has introduced uncertainty in the location of the subject.

This is what one might expect in a typical tracking scenario (though typically one would have more data). We have three textual accounts, from different sources, with approximate timestamps. In this example, the observations come from an eye witness, from a police report, and from an overheard conversation of the subject. MUTTS will begin by processing the first observation, which it will classify as a sense event, with search query "pizza restaurant." The query returns 6 hits that are labeled A through G in figure 3 (D is off the screen). Note that the particle filter has processed the sense event and those particles consistent with the locations of the pizza restaurants are given more weight, while those inconsistent are given lower weighting or resampled to locations consistent with a 2 dimensional normal distribution, centered at the nearest pizza restaurant, and consistent with models of the term "near."
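The weighting step described above can be sketched with an isotropic two-dimensional Gaussian centered on each particle's nearest search result. Flat (x, y) coordinates in miles and a 0.25 mile standard deviation for "near" are assumptions made for this example; the source does not give the parameters of its model.

```python
import math

def sense_weights(particles, landmarks, sd_miles=0.25):
    """Return normalized weights, one per (x, y) particle.

    Each particle is scored by an isotropic 2D Gaussian centered on its
    nearest landmark; `sd_miles` is an assumed model of the word "near".
    """
    raw = []
    for px, py in particles:
        d2 = min((px - lx) ** 2 + (py - ly) ** 2 for lx, ly in landmarks)
        raw.append(math.exp(-d2 / (2.0 * sd_miles ** 2)))
    total = sum(raw)
    if total == 0.0:
        # Every particle is far from every landmark: fall back to uniform.
        return [1.0 / len(particles)] * len(particles)
    return [w / total for w in raw]

restaurants = [(0.1, 0.0), (3.0, 3.0)]
particles = [(0.0, 0.0), (5.0, 5.0), (3.1, 2.9)]
weights = sense_weights(particles, restaurants)
```

Particles close to any of the candidate restaurants keep most of the probability mass, which is how a referentially ambiguous landmark produces a multi-modal distribution rather than a single cluster.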


Figure 5. A sensing event that removes uncertainty.

The second record is then processed. MUTTS will generate two observations, both movement events, for the second report. Movements are processed by the particle filter as described in sections II-A and II-C. Essentially, a dead reckoning update is performed and path verification is used to quantify the likelihood that it may have happened. Movement events either contain explicit distance information or it can be derived from duration language and models of movement. In this case, the subject was observed walking, and MUTTS can model the translation described in the observations by a normal distribution consistent with a model of walking. The resulting particle distribution is shown in figure 4. Note the spreading effect that a movement event has on the particles, expressing the uncertainty associated when a subject begins moving. Not only may "one half mile" be a rough estimate, but the subject may have taken a number of different routes and side streets in traversing that distance.

The third record is spoken in the first person and is processed as a sense event. The search query, "hospital," is highly specific, as is evidenced by the updated particle filter shown in figure 5. There is only one hospital, and all particles that are not in the vicinity of the hospital after the prior update are resampled to reflect the relative certainty that at 9:52pm, the subject is at that particular hospital.

At this point, having incorporated four events into the tracking problem, it is reasonable to start considering what a sketch looks like, along with how it is generated, visualized, and evaluated. A sketch is generated by iterating over particles in the particle filter, retrieving each particle's history as its position and orientation have changed in response to processing the text, and generating routes with the help of Google Directions. Thus, each particle tracked by the filter has a corresponding sketch, and each such sketch can be scored and ranked according to total distance traveled, duration, or by a believability ranking, which incorporates the particle's weight over its history as well as external measures, such as the path verification score for the various segments of the sketch. The analyst sees a ranked list of particle sketches, along with direction, duration, and believability, and begins viewing the sketches in order to envision the possible scenarios.

Figure 6. A sketch that is consistent with the text.
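The ranking of particle sketches described above might be organized as in the following sketch. The source says that believability incorporates the particle's weight history and the path verification scores but does not give a formula; the product of the two averages used here is purely an assumption, as are the field names and the sample values.

```python
from dataclasses import dataclass

@dataclass
class Sketch:
    name: str
    miles: float          # total distance traveled
    minutes: float        # total duration
    weights: list         # particle weight after each event
    path_scores: list     # path verification score per segment, in [0, 1]

def believability(sketch):
    """Assumed scoring: mean particle weight times mean path score."""
    mean_weight = sum(sketch.weights) / len(sketch.weights)
    mean_path = sum(sketch.path_scores) / len(sketch.path_scores)
    return mean_weight * mean_path

def rank(sketches):
    """Order sketches from most to least believable."""
    return sorted(sketches, key=believability, reverse=True)

candidates = [
    Sketch("detour", miles=2.4, minutes=41, weights=[0.2, 0.1, 0.1, 0.05],
           path_scores=[0.4, 0.3]),
    Sketch("direct", miles=1.1, minutes=19, weights=[0.9, 0.8, 0.8, 0.9],
           path_scores=[0.9, 0.95]),
]
ranked = rank(candidates)
```

Sorting on a different key, such as `miles` or `minutes`, gives the alternative orderings by distance or duration that the analyst can also request.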

Two sketches are shown in figures 6 and 7. The former figure contains a sketch with low duration and distance traveled, and high believability. The high believability score is derived from two main factors. First, the particle weight remains high over the duration of the sketch, indicating that when sense events were processed (in particular, the meeting at the hospital), the particles were already in close proximity to where the subject was suspected to be. Second, the path verifier found the duration and distance traveled in all segments of the sketch could be reasonably expected given the corresponding movement events.

The sketch shown in figure 7, in contrast, has a longer overall duration and distance traveled, and a lower believability score. This is in large part due to the particle's initial position at the pizza restaurant labeled "F" in figure 3. It is unlikely that this is the restaurant referred to by the eye witness given subsequent movements and the eventual meeting at the unambiguously located hospital. The particle weights are correspondingly low in this sketch. In addition, the paths required to arrive at the hospital are unlikely. The location of Woodberry Woods and the Jones Falls Expressway prevent the subject from having a clear and timely route to the hospital, and this is precisely the role of the path verifier: to flag routes as unlikely given the movements described in the text.

The cycle of adding observations, visualizing the sketches, and evolving a picture of the most likely tracking scenarios can continue as long as there is additional data. We view the MUTTS system as increasingly mixed-initiative, allowing the analyst to participate in the process by manually ruling out or adding landmarks and routes, as well as by providing input to the language processing pipeline. Improving the interactivity between the analyst and MUTTS is an ongoing area of development.

Figure 7. A sketch that is unlikely given the input and background knowledge about travel times.

IV. FUTURE WORK

There is still much that can be done to improve MUTTS. Future work falls into two categories: refining the tool and basic research. MUTTS is currently in Alpha and initial usability testing and evaluations are being performed by intelligence analysts. The feedback is still in its preliminary stages as of this writing, but adding to the mixed-initiative capabilities as well as improving the rule base of the semantic interpretation engine (to cover more constructions) are obvious ways to improve performance and enhance the utility of the tool. We feel that the semantic interpretation engine is also an obvious area that would benefit from transitioning from a home-grown ontology to a larger scale, established product such as WordNet³, and an opportunity exists for analysts to teach MUTTS new semantic templates when new language constructs are observed by the system. Indeed, the goal is automated data acquisition from internal reports as well as the field, and we must expect to receive unusual linguistic constructions from a variety of speakers with various backgrounds.

Basic research goals include those areas where good, working solutions to AI aspects of T2S are not established. Here, we are not looking to make incremental improvements to the parser, for example, but to explore new avenues where advances to the field in general may be made. While we are always trying to incorporate new methods for managing uncertainty, we are particularly encouraged about a novel learning paradigm that is well-suited to the problem of T2S for tracking and MUTTS in particular.

Recall the pipeline diagram in figure 1. In actuality, since the parsing process can also be viewed as a pipeline, the pipeline is somewhat longer, consisting of: a part-of-speech tagger, a named-entity recognizer, a k-best parser, a typed dependency generator, the semantic interpretation engine, and finally, the particle filter and path verifier. Many of these processes are trainable components, based on supervised or semi-supervised learning from labeled examples.

³ http://wordnet.princeton.edu/

Figure 8. Advice-giving in the MUTTS pipeline.

Consider the following scenario. Rules and their corresponding templates have been generated to cover a variety of possible textual constructions. In the course of processing a large stream of text, MUTTS encounters the following two sentences:

He walked north for 8 minutes ...
... then, he walked east for 8 minutes.

These two sentences, in a prior release of the parser, were treated differently.⁴ Here is the typed dependency list for the north observation:

nsubj(walked-1, north-2)

num(minutes-5, 8-4)

prep_for(north-2, minutes-5)

The SIE contains a rule that matches on a movement verb and a duration specification (walked and minutes, respectively), and creates a movement event that can be filled out by searching for the num dependency of the TDL. The east instance, however, was processed differently:

advmod(walked-1, east-2)

prep_for(walked-1, 8-4)

nsubj(walked-1, minutes-5)

The absence of the num dependency prevents the rule from completing the movement event in a way that is most useful to the particle filter. Though this particular pathology no longer occurs in the parser, it is illustrative of a general condition. The parser is a large, complex system that may not always parse sentences in a manner most convenient for our semantic interpretation engine, especially when dealing with unorthodox constructions found in casual speech. In these cases, we would like to invoke the learning components opportunistically to improve performance.
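To make the failure concrete, the following is a minimal sketch of how a rule of this kind might behave on the two dependency lists above. It is not the MUTTS implementation: the names `Dep`, `parse_tdl`, `movement_rule`, and the two small lexicons are hypothetical, and the real SIE consults its ontology and templates rather than hard-coded word sets.

```python
import re
from typing import NamedTuple, Optional


class Dep(NamedTuple):
    rel: str  # relation name, e.g. "num"
    gov: str  # governor token, e.g. "minutes-5"
    dep: str  # dependent token, e.g. "8-4"


# Hypothetical lexicons standing in for the SIE's ontology.
MOVEMENT_VERBS = {"walked", "drove", "ran"}
DURATION_UNITS = {"seconds", "minutes", "hours"}

DEP_RE = re.compile(r"(\w+)\(([^,\s]+),\s*([^)\s]+)\)")


def parse_tdl(text: str) -> list[Dep]:
    """Read dependencies written as rel(gov-i, dep-j)."""
    return [Dep(*m.groups()) for m in DEP_RE.finditer(text)]


def word(token: str) -> str:
    """Strip the position suffix: 'minutes-5' -> 'minutes'."""
    return token.rsplit("-", 1)[0]


def movement_rule(tdl: list[Dep]) -> Optional[dict]:
    """Match a movement verb plus a duration unit, then look for num."""
    tokens = sorted({t for d in tdl for t in (d.gov, d.dep)})
    verb = next((t for t in tokens if word(t) in MOVEMENT_VERBS), None)
    unit = next((t for t in tokens if word(t) in DURATION_UNITS), None)
    if verb is None or unit is None:
        return None  # the rule does not apply to this sentence
    amount = next(
        (int(word(d.dep)) for d in tdl if d.rel == "num" and d.gov == unit),
        None,  # no num dependency: the event stays incomplete
    )
    return {"verb": word(verb), "unit": word(unit), "amount": amount}


NORTH_TDL = parse_tdl(
    "nsubj(walked-1, north-2) num(minutes-5, 8-4) prep_for(north-2, minutes-5)"
)
EAST_TDL = parse_tdl(
    "advmod(walked-1, east-2) prep_for(walked-1, 8-4) nsubj(walked-1, minutes-5)"
)

north_event = movement_rule(NORTH_TDL)  # amount is filled from num
east_event = movement_rule(EAST_TDL)    # amount is None
```

In this sketch the rule matches both sentences, but only the north observation yields a duration. The east observation produces an event with `amount` left empty, which is the "almost fires completely" situation that motivates passing advice upstream.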

⁴ This particular anomaly no longer occurs.

Since the SIE and the parser are coupled in the MUTTS pipeline, and since the SIE has an existing rule that almost fires completely, it is possible for the SIE to express its ideal input as a training instance and pass it back in the pipeline as advice for upstream components to learn from. Ideally, the upstream component would then generate new output closer to the SIE's target. This process is shown diagrammatically in figure 8. Each process in the pipeline that receives advice may take the advice itself, retrain, and emend its output, or, upon examining the advice, may decide to pass it upstream for other components to consider. In this particular example, the parser may be able to consider lower-ranked parse trees in the k-best set of trees, compute their TDLs, and determine whether more usable output could be provided to the SIE. If a preferable TDL were found, the parser could then update its own scoring metric to better reflect the preference of the SIE. Here, the proper trees were available in the k-best set, the correct output could be provided, and adjustments could be made. We are enthusiastic that this approach will improve the robustness not only of the MUTTS system but also of other pipelined machine learning systems with supervised and semi-supervised learning components.
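One way to picture the advice step for the parser is sketched here, under loudly stated assumptions: `Parse`, `KBestParser`, and `take_advice` are hypothetical names, the scores and the alternative parse are invented for illustration, and the additive per-relation bonus is only a stand-in for whatever scoring update a real parser would learn. The paper does not specify the update rule.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Parse:
    score: float    # parser's own score; higher is better
    tdl: frozenset  # (relation, governor, dependent) triples for this tree


@dataclass
class KBestParser:
    # Per-relation score adjustments learned from advice (illustrative only).
    bonus: dict = field(default_factory=dict)

    def adjusted(self, parse: Parse) -> float:
        """Parser score plus any learned bonus for relations in the TDL."""
        return parse.score + sum(self.bonus.get(rel, 0.0) for rel, _, _ in parse.tdl)

    def rank(self, k_best: list) -> list:
        return sorted(k_best, key=self.adjusted, reverse=True)

    def take_advice(self, k_best: list, required: frozenset,
                    step: float = 0.5) -> Optional[Parse]:
        """Return the best-ranked candidate containing the advised dependencies.

        Returns None when no candidate qualifies, signalling that the advice
        should be passed further upstream.
        """
        for parse in self.rank(k_best):
            if required <= parse.tdl:
                # Nudge the scoring toward relations the SIE asked for.
                for rel, _, _ in required:
                    self.bonus[rel] = self.bonus.get(rel, 0.0) + step
                return parse
        return None


# Top-ranked parse of the east sentence, as produced by the earlier parser.
top = Parse(-10.0, frozenset({
    ("advmod", "walked-1", "east-2"),
    ("prep_for", "walked-1", "8-4"),
    ("nsubj", "walked-1", "minutes-5"),
}))
# Hypothetical lower-ranked alternative that contains the num dependency.
alt = Parse(-10.3, frozenset({
    ("advmod", "walked-1", "east-2"),
    ("num", "minutes-5", "8-4"),
    ("prep_for", "walked-1", "minutes-5"),
}))

# Advice from the SIE: the dependency its movement rule was missing.
advice = frozenset({("num", "minutes-5", "8-4")})

parser = KBestParser()
before = parser.rank([top, alt])[0]              # original preference
chosen = parser.take_advice([top, alt], advice)  # parse that satisfies the SIE
after = parser.rank([top, alt])[0]               # preference after the update
```

Returning `None` when no candidate in the k-best set satisfies the advice mirrors the option, described above, of handing the advice to a component further upstream instead of retraining locally.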

V. CONCLUSIONS

We have presented MUTTS: a web application that performs automated text-to-sketch for tracking problems. This tool combines state-of-the-art natural language processing algorithms with an adaptation of a mobile robot localization algorithm called a particle filter to manage the many sources of uncertainty in tracking from textual descriptions. We have demonstrated how the use of off-the-shelf syntactic processing, coupled with a special purpose semantic ontology and template-matching system, can generate sensing and movement events that correspond to sensing and acting in mobile robot navigation and localization. The system also leverages vast amounts of existing spatial knowledge in the form of Google Maps, Local Search, and Directions, as well as the TIGER/Line® road data to bridge the gap between textual observations and geolocation and tracking.

By presenting a use case, we have illustrated the utility of MUTTS as an analyst's assistant. It provides the ability to iteratively reduce uncertainty about the sketch by adding observations and providing mixed-initiative constraints, and provides visualizations and scoring metrics for assessing the likelihood of individual sketches. We finished by outlining directions for future development and areas in which progress can be made on the intelligence aspects of the tool. The MUTTS tool is currently being alpha tested by the intelligence community, and we are enthusiastic about its potential as both an analyst's tool and a platform for machine learning and natural language research.

ACKNOWLEDGMENT

This project was supported by a grant from the Intelligence Community Postdoctoral Research Fellowship Program through funding from the Office of the Director of National Intelligence.

REFERENCES

[1] I. Sledge and J. Keller, "Mapping natural language to imagery: Placing objects intelligently," in Fuzzy Systems, 2009. FUZZ-IEEE 2009. IEEE International Conference on, Aug. 2009, pp. 518–523.

[2] T. S. Levitt and D. T. Lawton, "Qualitative navigation for mobile robots," Artificial Intelligence, vol. 44, no. 3, pp. 305–360, 1990. [Online]. Available: http://www.sciencedirect.com/science/article/pii/000437029090027W

[3] B. Tversky and P. U. Lee, "Pictorial and verbal tools for conveying routes," in Spatial information theory: cognitive and computational foundations of geographic information science, C. Freksa and D. Mark, Eds. Springer, 1999, pp. 51–64.

[4] N. Metropolis and S. Ulam, "The Monte Carlo method," Journal of the American Statistical Association, vol. 44, no. 247, pp. 335–341, September 1949.

[5] D. Fox, S. Thrun, F. Dellaert, and W. Burgard, "Particle filters for mobile robot localization," in Sequential Monte Carlo Methods in Practice, A. Doucet, N. de Freitas, and N. Gordon, Eds. New York: Springer Verlag, 2000.

[6] D. B. Rubin, "Using the SIR algorithm to simulate posterior distributions," in Bayesian Statistics 3: Proceedings of the Third Valencia International Meeting, J. Bernardo, M. Degroot, D. Lindley, and A. Smith, Eds. Oxford: Oxford University Press, 1987, pp. 385–402.

[7] D. Klein and C. D. Manning, "Accurate unlexicalized parsing," in Proceedings of the 41st Meeting of the Association for Computational Linguistics, 2003, pp. 423–430.

[8] M.-C. de Marneffe, B. MacCartney, and C. D. Manning, "Generating typed dependency parses from phrase structure parses," in LREC 2006, 2006.
