Managing Uncertainty in Text-To-Sketch Tracking Problems
Matthew D. Schmill and Tim Oates
Computer Science and Electrical Engineering Department
University of Maryland Baltimore County
Baltimore, Maryland
Abstract: Text-to-Sketch (T2S) is a class of problems in which geolocation is performed using natural language descriptions of a location or locations as input. This is a challenging problem due to the many sources of uncertainty inherent to the task: there is often syntactic and semantic ambiguity present in the input observations, as well as referential ambiguity when the language used to describe the scene may refer to many possible objects or locations in the world. Tracking problems, in which the Text-to-Sketch paradigm is extended to incorporate multiple locations and movements over a temporal dimension, introduce additional uncertainty. We describe a tool for managing the uncertainty in Text-to-Sketch problems called MUTTS. The MUTTS system combines traditional natural language processing (NLP) tools with algorithms used to manage uncertainty in mobile robot navigation to allow the temporal and geographical constraints in the text to incrementally reduce the overall uncertainty of a subject's location and produce high quality sketches of the subject's location and movements over time.
Keywords: interactive systems; particle filters; uncertainty; natural language processing; text to sketch
I. INTRODUCTION
The goal of Text to Sketch (T2S) systems is to produce
sketches from natural language descriptions. Exactly what
constitutes a sketch varies from system to system. Existing
approaches generate 2d topological maps based on textual
descriptions of physical features (buildings, etc.) [1]. Those
maps can then be matched against satellite imagery to pro-
vide geolocation services; one might imagine an agent trying
to orient herself in a foreign environment, and utilizing
an intelligent T2S system to provide geolocation details
based on a spoken description of her surroundings. Another
application of text to sketch is robot navigation based on
qualitative specifications [2].
An extension to the 2d version of T2S that is of particular
interest to the intelligence community introduces a temporal dimension, to allow temporally extended sketches that can
represent not just location but movement and routes [3]. This
extension allows us to consider not just geolocation or map
building, but tracking. But extending the T2S paradigm has
a significant impact on how T2S is executed due to how it
affects the handling of uncertainty.
A key issue in T2S systems is how to manage and
represent uncertainty. Natural language is often imprecise
and ambiguous, especially the terms we use to refer to
space, locations, time, and duration. Among the sources of
uncertainty in T2S:
• Explicit imprecision refers to the use of language that
explicitly represents uncertainty, such as "near his
house" or "around 3 o'clock."
• Syntactic ambiguity refers to sentence structure in
which more than one parse is possible given the language.
• Semantic ambiguity arises when there are multiple legal
word senses given the syntactic interpretation.
• Referential ambiguity is possible when a word or phrase
could refer to more than one known physical location,
object, or person.
• Spatial imprecision results when words are used that
are geographically non-specific, such as "the park" or
"downtown."
Text to sketch systems, in order to be useful, must repre-
sent these uncertainties, and when possible, use constraints
present in the data and background knowledge to reduce
them. Furthermore, a T2S system must be able to present
sketches and uncertainty in a manner that is useful to the human user, and in the best case, allow the user to supply
background knowledge that will improve the results.
In our work, we consider the task of tracking a subject
as he moves around an urban environment. The goal is
to produce a sketch of the subject's locations, movements,
and the routes he has taken based on natural language
observations, either in real-time or post-hoc. The textual
descriptions may include eye-witness accounts, overheard
conversations (possibly from the subject himself), police
reports, and so on, and may refer to the movements of
the subject as well as landmarks and locations that he has
encountered. Some examples of the types of accounts we
might expect:
• "We saw him near the pizza restaurant." (eye-witness
account)
• "subject walking north for one half mile. subject turns
east and continues for 5 minutes." (police report)
• "I am meeting Jerry at the hospital." (overheard)
The introduction of multiple speakers adds an additional
level of uncertainty to the task: there may be irrelevant
information in the text stream and some text may refer to
2011 23rd IEEE International Conference on Tools with Artificial Intelligence
1082-3409/11 $26.00 2011 IEEE
DOI 10.1109/ICTAI.2011.70
Figure 1. An overview of the MUTTS pipeline.
the subject in the first person, third person, or not at all.
Speakers may use colloquialisms, be non-native speakers
of the target language, and there even exists the possibility
of adversarial intelligence: false observations intended to
make the tracking task more difficult.
In this paper, we present a system we have developed
called MUTTS, which combines a mix of off-the-shelf and
in-house NLP tools with a probabilistic framework called a
particle filter [4] to tackle the problem of Text-to-Sketch for
subject tracking. In the sections that follow, we describe the
MUTTS system and its components, including the particle
filter, a variant of which we have adapted to T2S, and
show how it can be used to represent, reduce, and visualize
uncertainty. We present the text processing elements of
MUTTS, and how it processes and displays information
in a manner useful for intelligence analysis in a usage
example. We conclude with a discussion of our ongoing
and future efforts to improve the tool and its underlying
AI components.
II. SYSTEM
Our system for generating sketches for tracking analy-
sis is called MUTTS: Managing Uncertainty in Text To
Sketch. MUTTS is a web-based application, written using
the Google Web Toolkit (GWT), which compiles pure Java
down to a combination of JavaScript and external libraries.
The GWT offers access to a suite of Google functionality
that includes Google Maps, Local Search, and Directions,
all of which are used at various stages of processing and
visualization. Supplementary road data is also available
using the Census Bureau's TIGER/Line® data files.
MUTTS takes natural language textual¹ accounts of a
subject's locations and movements as input, and provides
as output visualizations of the most likely waypoints and
routes that the subject took during the time period being
tracked. A rough overview of the MUTTS system is shown
in figure 1.
The T2S process is treated as a pipeline in MUTTS. First,
natural language processing tools are utilized to produce
representations that encode syntactic structure and roles in
the text. Those representations are then queried to infer
semantics, producing text meaning representations (TMRs),
and finally, those representations are passed to an adapted
particle filter, which updates its own internal models of
where the subject might be and how he might have gotten
there. In the remainder of this section, we start by describing
¹ Adding automated speech recognition would be a straightforward extension that would introduce an additional source of uncertainty.
the particle filter and how it is adapted for text-to-sketch.
We follow with details of the natural language processing
that MUTTS performs when there is input to be processed.
Finally, we conclude with a discussion of domain knowledge
and how its ubiquity in computer systems can be used to
enhance text to sketch.
A. Particle Filters
One of the key insights that makes the work described
here possible is that text to sketch shares properties with
a well-studied problem in robotics called localization. The
goal of localization in mobile robotics is to integrate noisy,
time series sensor and odometry data with a map to produce
a probability distribution over possible robot locations. Sensing
decreases uncertainty about the robot's location because
it enables reasoning about where on the map such readings
might be produced. Movement increases uncertainty about
the robot's location because typical dead reckoning
algorithms that estimate change in location are inaccurate; it
is impossible to know exactly how far the robot has moved
due to effector noise and environmental factors. However,
repeatedly sensing and moving can dramatically decrease
uncertainty about the robot's location. For example, a robot
whose sonar detects a doorway on the left could be just
inside or just outside any office in an office building, but it
becomes clear that the robot is in a hallway when it observes
a second doorway on the left while moving in a straight line.
To better understand the relationship between mobile
robot localization and text-to-sketch, consider Mr. Jones,
who is known to be in Washington DC. Being told that
"Jones is near the memorial" enables reasoning about Jones'
location. We don't know which memorial Jones is near, nor
do we know his precise location relative to the memorial
due to the use of the word "near," but we can represent his
location as a probability density with values that increase the
closer you get to anything on a map of Washington labeled
as a monument. This is very much like the robot above
that knows its approximate distance (due to sensor noise)
to a door, but has no idea which door. Next, suppose we're
told "Jones walked north for 20 minutes." To account for
this information, the probability density describing possible
locations is shifted north by the average distance that a
person can walk in 20 minutes, but it is also spread out
(reflecting an increase in uncertainty about Jones' location)
to account for the fact that Jones could have been walking
faster or slower than average or he could have diverged
from a due north trajectory. Again, this uncertainty about the
distance traveled is precisely the problem faced by mobile
robots with noisy effectors.
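The shift-and-spread update in the Jones example can be sketched in a few lines. This is a minimal illustration, not the MUTTS implementation: particles live on a flat plane measured in miles rather than in latitude and longitude, and the walking speed (3 mph) and the noise levels are assumed values chosen only for the example.

```python
import math
import random

def move_particles(particles, heading_rad, minutes, rng,
                   speed_mph=3.0, speed_sd=0.6, heading_sd=0.2):
    """Dead-reckoning update: shift every particle and add noise.

    particles   -- list of (x, y) positions in miles (x east, y north)
    heading_rad -- reported heading, 0 = east, pi/2 = north
    All default parameters are illustrative assumptions.
    """
    moved = []
    for x, y in particles:
        # Each particle draws its own speed and heading, which spreads the cloud.
        speed = max(0.0, rng.gauss(speed_mph, speed_sd))
        heading = rng.gauss(heading_rad, heading_sd)
        dist = speed * minutes / 60.0
        moved.append((x + dist * math.cos(heading), y + dist * math.sin(heading)))
    return moved

def mean_and_spread(particles):
    """Return the centroid and the RMS distance of the particles from it."""
    n = len(particles)
    cx = sum(p[0] for p in particles) / n
    cy = sum(p[1] for p in particles) / n
    rms = math.sqrt(sum((p[0] - cx) ** 2 + (p[1] - cy) ** 2 for p in particles) / n)
    return (cx, cy), rms

rng = random.Random(0)
# Start with a tight cloud around the origin ("near the memorial").
start = [(rng.gauss(0, 0.05), rng.gauss(0, 0.05)) for _ in range(5000)]
# "Jones walked north for 20 minutes."
after = move_particles(start, math.pi / 2, 20, rng)
```

Comparing `mean_and_spread(start)` with `mean_and_spread(after)` shows the centroid moving roughly one mile north while the spread of the cloud grows, which is the increase in uncertainty the text describes.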
A number of algorithms exist that solve the localization
problem efficiently and, in some cases, optimally from the
standpoint of using available information to maximally re-
duce uncertainty about location. We use an approach known
as particle filtering [5]. A particle is, roughly, a point on
the map that carries a quantum of probability mass. The
more particles there are in an area of the map the higher
the probability that the person of interest is there. The
particle filter algorithm updates the positions of the particles
in response to new information (e.g., the fact that Jones
walked north for 20 minutes). The number of particles can
be chosen to trade off computational cost and resolution of
the probability density, but since computation is linear in the
number of particles (i.e., constant per particle per update)
and is easily made parallel, it is not unusual to have tens or
even hundreds of thousands of particles.
After each update, particles are redistributed by sampling
(importance sampling [6]) from the density they approx-
imate, so that particles will die out in low probability
(unimportant) areas and become more concentrated in high
probability (important) areas. In this way, particles initially
allocated to, say, parses or reference resolutions that make
subsequent observations improbable will be reallocated near
particles based on parses or reference resolutions that are
supported by subsequent observations.
The MUTTS system implements an adapted particle filter
as a probabilistic framework for representing uncertainty
about a subject's location on a map. Next, we consider how
analogs for sensing and moving are generated in the system.
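The redistribution step can be illustrated with weighted sampling with replacement, a common way to realize the resampling described above. The sketch below is an assumption-laden toy: the three competing interpretations and their likelihood values are invented for the example and do not come from MUTTS.

```python
import random

def resample(particles, weights, rng):
    """Draw a new particle set with probability proportional to weight."""
    return rng.choices(particles, weights=weights, k=len(particles))

rng = random.Random(1)
# Three hypothetical interpretations, each initially holding a third of the particles.
particles = ["parse_A"] * 100 + ["parse_B"] * 100 + ["parse_C"] * 100
# Hypothetical likelihoods of a later observation under each interpretation.
likelihood = {"parse_A": 0.90, "parse_B": 0.09, "parse_C": 0.01}
weights = [likelihood[p] for p in particles]

survivors = resample(particles, weights, rng)
counts = {name: survivors.count(name) for name in likelihood}
```

After resampling, most particles sit on the interpretation that best explains the later observation, while the poorly supported interpretations keep only a few particles or die out.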
B. Text Processing in MUTTS
Incoming text is first processed by the Stanford Parser [7],
[8], which we use to produce structured syntactic represen-
tations from raw text. The representations used by MUTTS
are parse trees and typed dependency lists. A parse tree is a
tree structure that represents the syntax of a sentence. Words
are grouped into phrases and their roles in the sentence can
be determined by examining the path back to the root. The
typed dependency list is generated from a parse tree and
expresses how words relate to one another in a sentence.
Consider the following sentence:
The subject was seen near a Popeye's Fried
Chicken.
The most probable parse for this sentence follows:
(ROOT
  (S
    (NP (DT the) (NN subject))
    (VP (VBD was)
      (VP (VBN seen)
        (PP (IN near)
          (NP (NP (DT a) (NNP Popeye) (POS 's))
            (NNP Fried) (NNP Chicken)))))))
Note that with this representation, the trained eye (or computer)
can quickly identify important parts of the sentence
such as the verb phrase "was seen." The typed dependency
list for the parse tree above is as follows:
det(subject-2, the-1)
nsubjpass(seen-4, subject-2)
auxpass(seen-4, was-3)
det(Popeye-7, a-6)
poss(Chicken-10, Popeye-7)
nn(Chicken-10, Fried-9)
prep_near(seen-4, Chicken-10)
Note that the typed dependency list provides a convenient
representation for locating words related to one another;
for instance, that "Fried" is a compound noun modifier for
"Chicken."
Together, the parse tree and typed dependency list provide
enough information for the next phase of processing to
begin. In this phase, the sentence structure is examined
to produce a text meaning representation that can be used
to update the particle filter in downstream processing. The
module responsible for this process is called the semantic
interpretation engine (SIE) as shown in figure 1. We have
designed and developed the MUTTS SIE as a rule-based
template matching system. The SIE comprises a lexicon
of English words organized into an ontology to allow for
generalizing across word classes, and a set of rules. On the
left hand side of the rules are mechanisms for matching
parse trees and typed dependency lists, and on the right
hand side is code for extracting semantics to generate a
useful TMR. For example, suppose we wanted to generate
a rule to catch the phrase above. A rule is constructed first
to look for the passive voice (a passive verb is used as the
root of the verb phrase), next to check for a verb that is
a descendant of "observation" in the lexical ontology (by
extracting "seen" from the auxpass relation in the typed
dependency list), and finally to require that there is a
prepositional phrase beginning with a spatial preposition (again,
using a combination of the parse tree, typed dependency list,
and information in the lexical ontology).
The pattern of the rule described above would match
our sample sentence, and the right hand side of the rule
would be used to generate a text meaning representation.
The first step is to decide whether the sentence represents a
sensing event or a movement event, in the sense described in
section II-A. Does the sentence refer to movement, in which
case the TMR will be used to update particle locations in a
dead-reckoning type update, or does the sentence reference
a landmark or location, in which case the sentence is
analogous to sensing in particle filter localization? Currently,
classification of a sentence as a sense or movement event is
hardwired into the rule. In this case, the combination of an
observation verb and a spatial preposition indicates a sense
event.
Sense and movement events have parameters that are
filled out during the processing of the right hand side of
a rule. In the case of sense events, the primary objective
of the rule is to extract a landmark or location reference,
and the secondary objective is to determine the specificity
of the reference. The primary objective is achieved by first
extracting the object of the spatial preposition ("Chicken"),
then pulling all modifiers of the object that indicate they
belong together ("Popeye's" and "Fried"). The secondary
objective, in this case, is achieved by looking up a specificity
level for the preposition being used, which is part of the
lexical ontology. In this case, the use of "near" implies some
uncertainty of the actual proximity to the landmark, whereas
"at" would indicate relative certainty that the subject is
actually at the landmark. In the case of our sentence, the
TMR might look like this:²
(SENSE :specificity moderate
       :landmark "Popeye's Fried Chicken")
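A rule of this kind can be approximated over the typed dependency list alone. The sketch below is a simplified stand-in for the SIE, not its actual code: the two lookup tables play the role of the lexical ontology, their contents (including the "high" specificity for "at") are assumptions, and the parse tree is ignored.

```python
import re

DEP_RE = re.compile(r"(\w+)\((\S+)-(\d+), (\S+)-(\d+)\)")

# Tiny stand-ins for the lexical ontology; the entries are illustrative assumptions.
OBSERVATION_VERBS = {"seen", "spotted", "observed"}
PREPOSITION_SPECIFICITY = {"near": "moderate", "at": "high"}

def parse_dependencies(lines):
    """Turn 'rel(head-i, dep-j)' strings into (rel, head, i, dep, j) tuples."""
    deps = []
    for line in lines:
        m = DEP_RE.fullmatch(line.strip())
        if m:
            rel, head, h_i, dep, d_i = m.groups()
            deps.append((rel, head, int(h_i), dep, int(d_i)))
    return deps

def sense_tmr(deps):
    """Return a SENSE TMR dict if the passive-observation pattern matches, else None."""
    passive_verbs = {head for rel, head, _, _, _ in deps if rel == "auxpass"}
    for rel, head, _, obj, obj_i in deps:
        prep = rel[len("prep_"):] if rel.startswith("prep_") else None
        if (prep in PREPOSITION_SPECIFICITY and head in passive_verbs
                and head in OBSERVATION_VERBS):
            # Gather possessive and noun-compound modifiers, then restore word order.
            words = [(obj_i, obj)]
            for r, h, h_i, d, d_i in deps:
                if (h, h_i) == (obj, obj_i) and r in ("poss", "nn"):
                    words.append((d_i, d + "'s" if r == "poss" else d))
            landmark = " ".join(w for _, w in sorted(words))
            return {"type": "SENSE",
                    "specificity": PREPOSITION_SPECIFICITY[prep],
                    "landmark": landmark}
    return None

lines = [
    "det(subject-2, the-1)",
    "nsubjpass(seen-4, subject-2)",
    "auxpass(seen-4, was-3)",
    "det(Popeye-7, a-6)",
    "poss(Chicken-10, Popeye-7)",
    "nn(Chicken-10, Fried-9)",
    "prep_near(seen-4, Chicken-10)",
]
tmr = sense_tmr(parse_dependencies(lines))
```

For the sample sentence, `tmr` holds a SENSE event with moderate specificity whose landmark is the object of the preposition together with its modifiers.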
This representation is almost actionable by the particle
filter. There remains the question of where, exactly, is
Popeye's Fried Chicken? Particles are represented by latitude
and longitude, not by their common name. To resolve this
mapping between landmark or location names and points of
latitude and longitude, we use Google's Local Search API.
Local Search returns points that match a keyword search.
In the case of Popeye's, if our area of interest (AOI) is
Baltimore, Local Search will return 13 Popeye's locations,
complete with latitude,
longitude, and a variety of other information in hypertext
format. The landmark field of the sense event can then be
replaced by the corresponding points in the search results.
MUTTS allows the user to define an area of interest outside
of which search results will be ignored.
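The area-of-interest filter amounts to a containment test on each search hit. The following sketch assumes a rectangular AOI and uses made-up coordinates in place of a real search response; the class and function names are hypothetical and no search API is called.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class BoundingBox:
    """Rectangular area of interest in decimal degrees (an assumed AOI shape)."""
    south: float
    west: float
    north: float
    east: float

    def contains(self, lat, lon):
        return self.south <= lat <= self.north and self.west <= lon <= self.east

def resolve_landmark(hits, aoi):
    """Keep only the search hits whose coordinates fall inside the AOI."""
    return [(name, lat, lon) for name, lat, lon in hits if aoi.contains(lat, lon)]

# Made-up results standing in for a local-search response.
hits = [
    ("Popeye's (hypothetical branch 1)", 39.35, -76.67),
    ("Popeye's (hypothetical branch 2)", 39.29, -76.61),
    ("Popeye's (hypothetical branch 3)", 38.90, -77.03),  # well outside the AOI
]
aoi = BoundingBox(south=39.20, west=-76.72, north=39.38, east=-76.52)
candidates = resolve_landmark(hits, aoi)
```

Each surviving candidate is one possible referent for the landmark, so the sense event carries a set of points rather than a single location.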
Generating movement events is a somewhat simpler pro-
cess. Consider the following text:
He walked east on Reisterstown Road, for maybe
15 minutes.
The TMR for a movement action includes direction,
distance and duration (any of which may be extracted from
the text, derived by computation, or set to defaults), any
known road references, and uncertainties associated with the
direction, distance, and duration fields. Rules and templates
are written to identify movement events and the typed
dependency list is interrogated to fill in the TMR.
(MOVE :specificity approximate
:direction (0.0 0.1)
:duration (15 3.0)
:distance (0.75 0.15)
:onroad Reisterstown Rd.)
Note the introduction of a list notation to represent normal
distributions. In the above TMR, the duration is expressed as
a normal distribution with mean 15 (minutes) and a standard
deviation of 3. The mean here is drawn directly from the
text ("15 minutes"), while the standard deviation is derived
from the combination of the typical inaccuracy of
a human observer and any uncertainty modifiers present in
the text (in this case, "maybe").
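The distance field in the TMR above is not stated in the text, so it has to be derived. The values (0.75 0.15) are consistent with scaling the duration distribution by a constant walking speed of 3 miles per hour; that speed is inferred here from the numbers and is an assumption, as is the unit of miles.

```python
def distance_from_duration(duration_min, speed_mph=3.0):
    """Scale a (mean, sd) duration in minutes into a (mean, sd) distance in miles.

    Assumes a fixed speed, so the normal distribution is simply rescaled.
    """
    mean_min, sd_min = duration_min
    return (speed_mph * mean_min / 60.0, speed_mph * sd_min / 60.0)

duration = (15.0, 3.0)  # "maybe 15 minutes"
distance = distance_from_duration(duration)
```

A fuller model would also treat walking speed as uncertain, which would widen the distance distribution beyond this simple rescaling.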
² Those familiar with LISP will find this notation familiar, even though MUTTS is not implemented in LISP.
These text meaning representations are ready for the
particle filter to process sense and movement events, as
described in section II-A.
C. Applying Domain Knowledge
Using Google Maps, Local Search, and TIGER/Line®
allows MUTTS to bring a great deal of domain knowledge
to bear on managing the uncertainty in T2S, a much
broader range than any human analyst could be expected
to have. The strength of automated text to sketch is the
amount of domain knowledge available, encoded in search
engines and databases, and the challenge is to exploit this
knowledge while performing adequately where human intel-
ligence excels: in natural language processing, commonsense
reasoning, and so on. In this section we will describe a
tool called path verification that is made possible by Google
Directions and augments the utility of MUTTS in just such
a manner.
The complex geometry of high resolution maps, coupled
with the surface features that go along with these maps
(transportation networks, waterways, green space, and so on),
creates a conundrum for the particle filter when performing
a dead reckoning update. If an observation comes in that
has an agent driving or walking to the northeast for half a
mile, then the particles must all be translated roughly a half
mile, roughly to the northeast. A most basic update would
simply move the particles, regardless of road networks or
geographical features, and then the sketch might involve
the agent driving over the Chesapeake Bay. On the other
end of the spectrum, the particle filter could be tasked with
incorporating all the various map features, conducting a
search over the road network (and incorporating footpaths
in the case of walking directions), and producing only legal
particle updates that respect the rules of the road and the lay
of the land.
While the former approach is obviously too naive, the
latter approach appears quite daunting. Fortunately, Google
Directions essentially accomplishes exactly that task. To
produce particle updates for movement events, MUTTS
processes the simplistic dead reckoning update, and uses
Google Directions to verify whether or not the updated
particle location is realistic given the features of the map.
This is path verification. If utterance $u_i$ moves particle
$p$ from location $p_i$ to $p_{i+1}$, we conclude that the update
is verifiable if the distance and time Google Directions
derives for $p_i \to p_{i+1}$ is probable given the duration and
distance distributions derived from the processing of $u_i$. Said
differently, if the source text says "10 minutes," but Google
Directions returns a best route that takes 20 minutes, then
the particle path is not verifiable, and it should be resampled.
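One simple way to decide whether a routed time and distance are "probable" is to measure how many standard deviations they lie from the means in the movement TMR. The sketch below takes that approach; the two-standard-deviation cutoff, the example distributions, and the route figures (which stand in for values a directions service would return) are all assumptions.

```python
def is_verifiable(route_minutes, route_miles, duration, distance, max_z=2.0):
    """Accept a particle move if the routed time and distance are plausible.

    duration, distance -- (mean, sd) pairs taken from the movement TMR
    max_z              -- assumed cutoff in standard deviations
    """
    z_time = abs(route_minutes - duration[0]) / duration[1]
    z_dist = abs(route_miles - distance[0]) / distance[1]
    return z_time <= max_z and z_dist <= max_z

# The text says "10 minutes"; assume a 2 minute sd and a 0.5 +/- 0.1 mile walk.
claimed_duration = (10.0, 2.0)
claimed_distance = (0.5, 0.1)

ok = is_verifiable(11.0, 0.55, claimed_duration, claimed_distance)   # close to the claim
bad = is_verifiable(20.0, 1.0, claimed_duration, claimed_distance)   # route takes twice as long
```

Here `ok` is true and `bad` is false, so the second particle path would be flagged for resampling. A graded score, such as the likelihood under the two normal distributions, could replace the hard cutoff.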
III. USAGE CASE
In this section we detail a typical usage of the system and
describe some of the investigative features and visualizations
Figure 2. A screenshot of the MUTTS application.
that exist in MUTTS. Recall that MUTTS is a web application
built using Google's GWT framework, and incorporates
a suite of online tools to support the operations necessary
for geolocation and visualization. A screenshot of the full
MUTTS application can be seen in figure 2; it contains a
map view, a tree view for breaking down the text input,
and an interaction panel for visualizing search and sketch
results, and text areas to input data and otherwise interact
with MUTTS. The discussion here is based on a tutorial
developed for users of the system.
The use case here is that someone (whom we will refer to as
the analyst) has received a collection of textual observations
that refer to the locations and movements of a subject. What
the analyst would like is to provide an automated system
with the text, and get back a detailed map of the subject's
locations at all times throughout the observation period;
ideally, this would be a path through the map, annotated
with all the subject's stops. Due to the various sources of
uncertainty in the text stream, a single, true, accurate sketch
cannot generally be known. Therefore, MUTTS generates
sketches probabilistically, and allows the analyst to consider
and visualize the possibilities.
Analysis of a tracking problem begins with the analyst
constraining the area of interest. In this case, the AOI is the
Arington/Mount Washington area of Baltimore, Maryland.
The initial configuration of the particle filter places the
particles uniformly distributed over the AOI. Particles are
rendered to the MUTTS map view as triangles representing
the hypothesized location and direction of movement. The
analyst begins the process by collecting the textual accounts
and entering them as input to MUTTS. Consider the following
collection of descriptions of a subject's whereabouts:
• (9:30pm): "We saw him near the pizza restaurant."
• (9:41pm): "subject walking north for one half mile.
subject turns east and continues for 5 minutes."
• (9:52pm): "I am meeting Jerry at the hospital."
Figure 3. Particles distributed around annotated search results for "pizza restaurant" in the Arington area of Baltimore.
Figure 4. A movement event has introduced uncertainty in the location
of the subject.
This is what one might expect in a typical tracking
scenario (though typically one would have more data). We
have three textual accounts, from different sources, with
approximate timestamps. In this example, the observations
come from an eye witness, from a police report, and from an
overheard conversation of the subject. MUTTS will begin
by processing the first observation, which it will classify
as a sense event, with search query "pizza restaurant." The
query returns 6 hits that are labeled A-G in figure 3
(D is off the screen). Note that the particle filter has
processed the sense event and those particles consistent
with the locations of the pizza restaurants are given more
weight, while those inconsistent are given lower weighting
or resampled to locations consistent with a 2-dimensional
normal distribution, centered at the nearest pizza restaurant,
and consistent with models of the term "near."
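The weighting applied by a sense event can be sketched as an unnormalized two-dimensional normal density centered on the landmark nearest to each particle. The planar coordinates in miles and the spread assigned to the word "near" are assumptions made for the example.

```python
import math

def sense_weight(particle, landmarks, sigma_miles):
    """Weight a particle by an isotropic 2-D normal centered on the nearest landmark."""
    d2 = min((particle[0] - lx) ** 2 + (particle[1] - ly) ** 2 for lx, ly in landmarks)
    return math.exp(-d2 / (2.0 * sigma_miles ** 2))

# Hypothetical planar coordinates in miles for two candidate restaurants.
restaurants = [(0.0, 0.0), (2.0, 1.0)]
NEAR_SIGMA = 0.1  # assumed spread for the word "near"

w_close = sense_weight((0.05, 0.0), restaurants, NEAR_SIGMA)  # a short walk from one hit
w_far = sense_weight((1.0, 0.5), restaurants, NEAR_SIGMA)     # midway between the two hits
```

The particle beside a restaurant keeps most of its weight, while the one midway between the two hits receives a weight close to zero and is likely to be replaced at the next resampling step. A less specific term would map to a larger `sigma_miles`.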
Figure 5. A sensing event that removes uncertainty.
The second record is then processed. MUTTS will gener-
ate two observations, both movement events, for the second
report. Movements are processed by the particle filter as
described in sections II-A and II-C. Essentially, a dead
reckoning update is performed and path verification is used
to quantify the likelihood that it may have happened.
Movement events either contain explicit distance information, or
the distance can be derived from duration language and models of
movement. In this case, the subject was observed walking,
and MUTTS can model the translation described in the ob-
servations by a normal distribution consistent with a model
of walking. The resulting particle distribution is shown in
figure 4. Note the spreading effect that a movement event
has on the particles, expressing the uncertainty introduced
when a subject begins moving. Not only may "one half
mile" be a rough estimate, but the subject may have taken a
number of different routes and side streets in traversing that
distance.
The third record is spoken in the first person and is
processed as a sense event. The search query, "hospital,"
is highly specific, as is evidenced by the updated particle
filter shown in figure 5. There is only one hospital, and all
particles that are not in the vicinity of the hospital after the
prior update are resampled to reflect the relative certainty
that at 9:52pm, the subject is at that particular hospital.
At this point, having incorporated four events into the
tracking problem, it is reasonable to start considering what a
sketch looks like, along with how it is generated, visualized,
and evaluated. A sketch is generated by iterating over
particles in the particle filter, retrieving each particle's history
as its position and orientation have changed in response to
processing the text, and generating routes with the help
of Google Directions. Thus, each particle tracked by the
filter has a corresponding sketch, and each such sketch can
be scored and ranked according to total distance traveled,
duration, or by a believability ranking, which incorporates
the particle's weight over its history as well as external
Figure 6. A sketch that is consistent with the text.
measures, such as the path verification score for the various
segments of the sketch. The analyst sees a ranked list
of particle sketches, along with distance, duration, and
believability, and begins viewing the sketches in order to
envision the possible scenarios.
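The text does not give a formula for the believability ranking, so the following is only one plausible way to combine the two ingredients it names. The sketch multiplies the geometric mean of a particle's weight history by the geometric mean of its per-segment path verification scores; the formula, the sketch names, and all numbers are assumptions.

```python
import math

def believability(weights, path_scores):
    """Combine a particle's weight history with its per-segment path scores.

    Geometric means are used so that one implausible step drags the score down.
    This formula is an assumption made for illustration.
    """
    def geo_mean(values):
        return math.exp(sum(math.log(max(v, 1e-12)) for v in values) / len(values))
    return geo_mean(weights) * geo_mean(path_scores)

# Hypothetical per-particle histories, with scores in the range 0 to 1.
sketches = {
    "sketch_1": {"weights": [0.9, 0.8, 0.9], "path_scores": [0.9, 0.8]},
    "sketch_2": {"weights": [0.9, 0.2, 0.1], "path_scores": [0.3, 0.2]},
}
ranked = sorted(sketches, key=lambda name: believability(**sketches[name]), reverse=True)
```

With these inputs the sketch whose weights and path scores stay high throughout is ranked first, mirroring the contrast between a consistent sketch and an unlikely one.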
Two sketches are shown in figures 6 and 7. The former
figure contains a sketch with low duration and distance
traveled, and high believability. The high believability score
is derived from two main factors. First, the particle weight
remains high over the duration of the sketch, indicating
that when sense events were processed (in particular, the
meeting at the hospital), the particles were already in close
proximity to where the subject was suspected to be. Second,
the path verifier found the duration and distance traveled
in all segments of the sketch could be reasonably expected
given the corresponding movement events.
The sketch shown in figure 7, in contrast, has a longer
overall duration and distance traveled, and a lower
believability score. This is in large part due to the particle's initial
position at the pizza restaurant labeled F in figure 3. It
is unlikely that this is the restaurant referred to by the
eye witness given subsequent movements and the eventual
meeting at the unambiguously located hospital. The particle
weights are correspondingly low in this sketch. In addition,
the paths required to arrive at the hospital are unlikely.
The location of Woodberry Woods and the Jones Falls
Expressway prevent the subject from having a clear and
timely route to the hospital, and this is precisely the role
of the path verifier: to flag routes as unlikely given the
movements described in the text.
The cycle of adding observations, visualizing the sketches,
and evolving a picture of the most likely tracking scenarios
can continue as long as there is additional data. We view the
MUTTS system as increasingly mixed-initiative, allowing
the analyst to participate in the process by manually ruling
out or adding landmarks and routes, as well as providing
input to the language processing pipeline. Improving
Figure 7. A sketch that is unlikely given the input and background knowledge about travel times.
the interactivity between the analyst and MUTTS is an
ongoing area of development.
IV. FUTURE WORK
There is still much that can be done to improve MUTTS.
Future work falls into two categories: refining the tool and
basic research. MUTTS is currently in Alpha and initial
usability testing and evaluations are being performed by
intelligence analysts. The feedback is still in its preliminary
stages as of this writing, but adding to the mixed-initiative
capabilities as well as improving the rule base of the
semantic interpretation engine (to cover more constructions)
are obvious areas to improve on performance and enhance
the utility of the tool. We feel that the semantic interpretation
engine is also an obvious area that would benefit from
transitioning from a home-grown ontology to a larger scale,
established product such as WordNet,³ and an opportunity
exists for analysts to teach MUTTS new semantic templates
when new language constructs are observed by the
system. Indeed, the goal is automated data acquisition from
internal reports as well as the field, and we must expect
to receive unusual linguistic constructions from a variety of
speakers with various backgrounds.
Basic research goals include those areas where good,
working solutions to AI aspects of T2S are not established.
Here, we are not looking to make incremental improvements
to the parser, for example, but to explore new avenues where
advances to the field in general may be made. While we
are always trying to incorporate new methods for managing
uncertainty, we are particularly encouraged about a novel
learning paradigm that is well-suited to the problem of T2S
for tracking and MUTTS in particular.
Recall the pipeline diagram in figure 1. In actuality, since
the parsing process can also be viewed as a pipeline, the
pipeline is somewhat longer, consisting of: a part-of-speech
³ http://wordnet.princeton.edu/
Figure 8. Advice-giving in the MUTTS pipeline.
tagger, a named-entity recognizer, a k-best parser, a typed
dependency generator, the semantic interpretation engine,
and finally, the particle filter and path verifier. Many of these
processes are trainable components, based on supervised or
semi-supervised learning from labeled examples.
Consider the following scenario. Rules and their corre-
sponding templates have been generated to cover a variety
of possible textual constructions. In the course of processing
a large stream of text, MUTTS encounters the following two
sentences:
He walked north for 8 minutes...
... then, he walked east for 8 minutes.
These two sentences, in a prior release of the parser, were
treated differently. 4 Here is the typed dependency list for
the north observation:
nsubj(walked-1, north-2)
num(minutes-5, 8-4)
prep_for(north-2, minutes-5)
The SIE contains a rule that matches on a movement
verb and a duration specification (walked and minutes,
respectively), and creates a movement event that can be filled
out by searching for the num dependency in the TDL. The
east instance, however, was processed differently:
advmod(walked-1, east-2)
prep_for(walked-1, 8-4)
nsubj(walked-1, minutes-5)
The absence of the num dependency prevents the rule
from completing the movement event in a way that is most
useful to the particle filter. Though this particular pathology
no longer occurs in the parser, it is illustrative of a general
condition. The parser is a large, complex system that may not
always parse sentences in a manner most convenient for our
semantic interpretation engine, especially when dealing with
unorthodox constructions found in casual speech. In these
cases, we would like to invoke the learning components
opportunistically to improve performance.
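The behavior of the rule on the two typed dependency lists above can be
illustrated with a minimal Python sketch. This is not the MUTTS
implementation: the names Dep, parse_tdl, and movement_rule, the verb
and unit vocabularies, and the dictionary used for the movement event
are all hypothetical, chosen only to show why the missing num
dependency leaves the event incomplete.

```python
import re
from typing import NamedTuple, Optional


class Dep(NamedTuple):
    rel: str        # relation name, e.g. "num"
    head: str       # governing word, token index stripped
    dependent: str  # dependent word, token index stripped


# Matches lines such as "num(minutes-5, 8-4)".
DEP_RE = re.compile(r"(\w+)\((\S+)-\d+,\s*(\S+)-\d+\)")

# Hypothetical vocabularies; the real SIE uses an ontology instead.
MOVEMENT_VERBS = {"walked", "drove", "ran"}
DURATION_UNITS = {"minutes", "hours"}


def parse_tdl(lines):
    """Turn textual dependencies into Dep tuples, skipping malformed lines."""
    deps = []
    for line in lines:
        m = DEP_RE.fullmatch(line.strip())
        if m:
            deps.append(Dep(*m.groups()))
    return deps


def movement_rule(deps) -> Optional[dict]:
    """Match a movement verb plus a duration unit, then fill in the amount.

    Returns None if the rule does not match at all. Returns an event with
    amount=None if it matches but no num dependency supplies the number.
    """
    tokens = [t for d in deps for t in (d.head, d.dependent)]
    verb = next((t for t in tokens if t in MOVEMENT_VERBS), None)
    unit = next((t for t in tokens if t in DURATION_UNITS), None)
    if verb is None or unit is None:
        return None
    amount = next(
        (int(d.dependent) for d in deps
         if d.rel == "num" and d.head == unit and d.dependent.isdigit()),
        None,
    )
    return {"verb": verb, "unit": unit, "amount": amount}


north_tdl = ["nsubj(walked-1, north-2)",
             "num(minutes-5, 8-4)",
             "prep_for(north-2, minutes-5)"]
east_tdl = ["advmod(walked-1, east-2)",
            "prep_for(walked-1, 8-4)",
            "nsubj(walked-1, minutes-5)"]

north_event = movement_rule(parse_tdl(north_tdl))
east_event = movement_rule(parse_tdl(east_tdl))
print(north_event["amount"])  # 8
print(east_event["amount"])   # None
```

In this sketch the rule "almost fires" on the east sentence: the verb
and the unit are found, but the duration slot stays empty.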
Since the SIE and the parser are coupled in the MUTTS
pipeline, and since the SIE has an existing rule that almost
fires completely, it is possible for the SIE to express its
ideal input as a training instance, and pass it back in the
pipeline as advice for upstream components to learn from.
Ideally, the upstream component would then generate new
output closer to the SIE's target. This process is shown
diagrammatically in figure 8. In this case, each process
in the pipeline that receives advice may take the advice
itself, retrain, and emend its output, or, upon examining
the advice, may decide to pass it upstream for
other components to consider. In this particular example, the
parser may be able to consider lower-ranked parse trees in
the k-best set of trees, compute their TDLs, and determine
if more usable output could be provided to the SIE. If a
preferable TDL were found, the parser could then update its
own scoring metric to better reflect the preference of the SIE.
Here, the proper trees were available in the k-best set, the
correct output could be provided, and adjustments could be
made. We are enthusiastic that this approach will improve
the robustness not only of the MUTTS system, but also of
other pipelined machine learning systems with
supervised and semi-supervised learning components.
4 This particular anomaly no longer occurs.
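The k-best reranking step described above might be sketched as
follows. This is a simplified illustration under stated assumptions,
not the method used in MUTTS: the class KBestReranker, the
per-relation bonus used as a stand-in for the parser's scoring metric,
the base scores, and the second ("good") candidate TDL are all
hypothetical. The SIE's preference is reduced to a predicate that
checks whether a num dependency is present.

```python
from collections import defaultdict


def has_num(tdl):
    """Hypothetical SIE preference: the rule needs a num dependency."""
    return any(rel == "num" for rel, _, _ in tdl)


class KBestReranker:
    """Toy parser-side scorer: base parse score plus learned relation bonuses."""

    def __init__(self, step=1.0):
        self.step = step                  # size of each scoring adjustment
        self.bonus = defaultdict(float)   # relation name -> learned bonus

    def score(self, base_score, tdl):
        relations = {rel for rel, _, _ in tdl}
        return base_score + sum(self.bonus[rel] for rel in relations)

    def rank(self, candidates):
        """Sort (base_score, tdl) pairs from best to worst."""
        return sorted(candidates, key=lambda c: self.score(*c), reverse=True)

    def take_advice(self, candidates, is_usable):
        """Return a TDL the downstream stage can use, adjusting the scorer.

        Returns None when no candidate is usable, in which case the advice
        would be passed further upstream.
        """
        ranked = self.rank(candidates)
        top_tdl = ranked[0][1]
        if is_usable(top_tdl):
            return top_tdl
        top_relations = {rel for rel, _, _ in top_tdl}
        for _, tdl in ranked[1:]:
            if is_usable(tdl):
                # Reward relations the preferred parse has and the top one lacks.
                for rel in {r for r, _, _ in tdl} - top_relations:
                    self.bonus[rel] += self.step
                return tdl
        return None


# Two candidate TDLs for the east sentence; scores are made-up log-likelihoods.
east_bad = [("advmod", "walked", "east"),
            ("prep_for", "walked", "8"),
            ("nsubj", "walked", "minutes")]
east_good = [("advmod", "walked", "east"),
             ("num", "minutes", "8"),
             ("prep_for", "walked", "minutes")]
k_best = [(-10.0, east_bad), (-10.5, east_good)]

reranker = KBestReranker(step=1.0)
first_before = reranker.rank(k_best)[0][1]
chosen = reranker.take_advice(k_best, has_num)
first_after = reranker.rank(k_best)[0][1]
```

After the advice is taken, the candidate containing the num
dependency outranks the original top parse, so later sentences of the
same shape would reach the SIE in the preferred form.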
V. CONCLUSIONS
We have presented MUTTS: a web application that
performs automated text-to-sketch for tracking problems. This
tool combines state-of-the-art natural language processing
algorithms with an adaptation of a mobile robot localization
algorithm called a particle filter to manage the many sources
of uncertainty in tracking from textual descriptions. We have
demonstrated how the use of off-the-shelf syntactic
processing, coupled with a special-purpose semantic ontology
and template-matching system, can generate sensing and
movement events that correspond to sensing and acting in
mobile robot navigation and localization. The system also
leverages vast amounts of existing spatial knowledge in the
form of Google Maps, Local Search, and Directions, as well
as the TIGER/Line® road data to bridge the gap between
textual observations and geolocation and tracking.
By presenting a use case, we have illustrated the utility
of MUTTS as an analyst's assistant. It provides the ability
to iteratively reduce uncertainty about the sketch by adding
observations and providing mixed-initiative constraints, and
provides visualizations and scoring metrics for assessing
the likelihood of individual sketches. We finished by
outlining directions for future development and areas in which
progress can be made on the intelligence aspects of the tool.
The MUTTS tool is currently being alpha tested by the
intelligence community, and we are enthusiastic about its
potential as both an analyst's tool and a platform for machine
learning and natural language research.
ACKNOWLEDGMENT
This project was supported by a grant from the
Intelligence Community Postdoctoral Research Fellowship
Program through funding from the Office of the Director of
National Intelligence.
REFERENCES
[1] I. Sledge and J. Keller, "Mapping natural language to imagery: Placing objects intelligently," in Fuzzy Systems, 2009. FUZZ-IEEE 2009. IEEE International Conference on, Aug. 2009, pp. 518-523.
[2] T. S. Levitt and D. T. Lawton, "Qualitative navigation for mobile robots," Artificial Intelligence, vol. 44, no. 3, pp. 305-360, 1990. [Online]. Available: http://www.sciencedirect.com/science/article/pii/000437029090027W
[3] B. Tversky and P. U. Lee, "Pictorial and verbal tools for conveying routes," in Spatial information theory: cognitive and computational foundations of geographic information science, C. Freksa and D. Mark, Eds. Springer, 1999, pp. 51-64.
[4] N. Metropolis and S. Ulam, "The Monte Carlo method," Journal of the American Statistical Association, vol. 44, no. 247, pp. 335-341, September 1949.
[5] D. Fox, S. Thrun, F. Dellaert, and W. Burgard, "Particle filters for mobile robot localization," in Sequential Monte Carlo Methods in Practice, A. Doucet, N. de Freitas, and N. Gordon, Eds. New York: Springer Verlag, 2000.
[6] D. B. Rubin, "Using the SIR algorithm to simulate posterior distributions," in Bayesian Statistics 3: Proceedings of the Third Valencia International Meeting, J. Bernardo, M. DeGroot, D. Lindley, and A. Smith, Eds. Oxford: Oxford University Press, 1987, pp. 385-402.
[7] D. Klein and C. D. Manning, "Accurate unlexicalized parsing," in Proceedings of the 41st Meeting of the Association for Computational Linguistics, 2003, pp. 423-430.
[8] M.-C. de Marneffe, B. MacCartney, and C. D. Manning, "Generating typed dependency parses from phrase structure parses," in LREC 2006, 2006.