    Managing Uncertainty in Text-To-Sketch Tracking Problems

    Matthew D. Schmill and Tim Oates

    Computer Science and Electrical Engineering Department

    University of Maryland Baltimore County

    Baltimore, Maryland


    Abstract: Text-to-Sketch (T2S) is a class of problems in which geolocation is performed using natural language descriptions of a location or locations as input. This is a challenging problem due to the many sources of uncertainty inherent to the task: there is often syntactic and semantic ambiguity present in the input observations, as well as referential ambiguity when the language used to describe the scene may refer to many possible objects or locations in the world. Tracking problems, in which the Text-to-Sketch paradigm is extended to incorporate multiple locations and movements over a temporal dimension, introduce additional uncertainty. We describe a tool for managing the uncertainty in Text-to-Sketch problems called MUTTS. The MUTTS system combines traditional natural language processing (NLP) tools with algorithms used to manage uncertainty in mobile robot navigation, allowing the temporal and geographical constraints in the text to incrementally reduce the overall uncertainty about a subject's location and produce high-quality sketches of the subject's location and movements over time.

    Keywords: interactive systems; particle filters; uncertainty; natural language processing; text to sketch

    I. INTRODUCTION

    The goal of Text to Sketch (T2S) systems is to produce sketches from natural language descriptions. Exactly what constitutes a sketch varies from system to system. Existing approaches generate 2D topological maps based on textual descriptions of physical features (buildings, etc.) [1]. Those maps can then be matched against satellite imagery to provide geolocation services; one might imagine an agent trying to orient herself in a foreign environment and using an intelligent T2S system to provide geolocation details based on a spoken description of her surroundings. Another application of text to sketch is robot navigation based on qualitative specifications [2].

    An extension to the 2D version of T2S that is of particular interest to the intelligence community introduces a temporal dimension, allowing temporally extended sketches that can represent not just location but also movement and routes [3]. This extension allows us to consider not just geolocation or map building, but tracking. Extending the T2S paradigm in this way, however, has a significant impact on how T2S is executed because it changes how uncertainty must be handled.

    A key issue in T2S systems is how to manage and represent uncertainty. Natural language is often imprecise and ambiguous, especially the terms we use to refer to space, locations, time, and duration. Among the sources of uncertainty in T2S:

    • Explicit imprecision refers to the use of language that explicitly represents uncertainty, such as "near his house" or "around 3 o'clock".
    • Syntactic ambiguity refers to sentence structure in which more than one parse is possible given the language.
    • Semantic ambiguity arises when there are multiple legal word senses given the syntactic interpretation.
    • Referential ambiguity is possible when a word or phrase could refer to more than one known physical location, object, or person.
    • Spatial imprecision results when words are used that are geographically non-specific, such as "the park" or "downtown".

    Text to sketch systems, in order to be useful, must represent these uncertainties and, when possible, use constraints present in the data and background knowledge to reduce them. Furthermore, a T2S system must be able to present sketches and uncertainty in a manner that is useful to the human user and, in the best case, allow the user to supply background knowledge that will improve the results.

    In our work, we consider the task of tracking a subject as he moves around an urban environment. The goal is to produce a sketch of the subject's locations, movements, and the routes he has taken based on natural language observations, either in real time or post hoc. The textual descriptions may include eye-witness accounts, overheard conversations (possibly from the subject himself), police reports, and so on, and may refer to the movements of the subject as well as landmarks and locations that he has encountered. Some examples of the types of accounts we might expect:

    • "We saw him near the pizza restaurant." (eye-witness account)
    • "subject walking north for one half mile. subject turns east and continues for 5 minutes." (police report)
    • "I am meeting Jerry at the hospital." (overheard)

    The introduction of multiple speakers adds an additional level of uncertainty to the task: there may be irrelevant information in the text stream, and some text may refer to the subject in the first person, third person, or not at all. Speakers may use colloquialisms or be non-native speakers of the target language, and there is even the possibility of adversarial intelligence: false observations intended to make the tracking task more difficult.

    2011 23rd IEEE International Conference on Tools with Artificial Intelligence

    1082-3409/11 $26.00 © 2011 IEEE

    DOI 10.1109/ICTAI.2011.70

    Figure 1. An overview of the MUTTS pipeline.

    In this paper, we present a system we have developed called MUTTS, which combines a mix of off-the-shelf and in-house NLP tools with a probabilistic framework called a particle filter [4] to tackle the problem of Text-to-Sketch for subject tracking. In the sections that follow, we describe the MUTTS system and its components, including the particle filter, a variant of which we have adapted to T2S, and show how it can be used to represent, reduce, and visualize uncertainty. We present the text processing elements of MUTTS and, in a usage example, show how it processes and displays information in a manner useful for intelligence analysis. We conclude with a discussion of our ongoing and future efforts to improve the tool and its underlying AI components.

    II. SYSTEM

    Our system for generating sketches for tracking analysis is called MUTTS: Managing Uncertainty in Text To Sketch. MUTTS is a web-based application written using the Google Web Toolkit (GWT), which compiles pure Java down to a combination of JavaScript and external libraries. GWT offers access to a suite of Google functionality that includes Google Maps, Local Search, and Directions, all of which are used at various stages of processing and visualization. Supplementary road data is also available from the Census Bureau's TIGER/Line® data files.

    MUTTS takes natural language textual¹ accounts of a subject's locations and movements as input, and provides as output visualizations of the most likely waypoints and routes that the subject took during the time period being tracked. A rough overview of the MUTTS system is shown in figure 1.

    The T2S process is treated as a pipeline in MUTTS. First, natural language processing tools are used to produce representations that encode syntactic structure and roles in the text. Those representations are then queried to infer semantics, producing text meaning representations (TMRs). Finally, the TMRs are passed to an adapted particle filter, which updates its own internal models of where the subject might be and how he might have gotten there. In the remainder of this section, we start by describing the particle filter and how it is adapted for text-to-sketch. We follow with details of the natural language processing that MUTTS performs when there is input to be processed. Finally, we conclude with a discussion of domain knowledge and how its ubiquity in computer systems can be used to enhance text to sketch.

    ¹ Adding automated speech recognition would be a straightforward extension that would introduce an additional source of uncertainty.

    A. Particle Filters

    One of the key insights that makes the work described here possible is that text to sketch shares properties with a well-studied problem in robotics called localization. The goal of localization in mobile robotics is to integrate noisy, time series sensor and odometry data with a map to produce a probability distribution over possible robot locations. Sensing decreases uncertainty about the robot's location because it enables reasoning about where on the map such readings might be produced. Movement increases uncertainty about the robot's location because typical dead reckoning algorithms that estimate change in location are inaccurate; it is impossible to know exactly how far the robot has moved due to effector noise and environmental factors. However, repeatedly sensing and moving can dramatically decrease uncertainty about the robot's location. For example, a robot whose sonar detects a doorway on the left could be just inside or just outside any office in an office building, but it becomes clear that the robot is in a hallway when it observes a second doorway on the left while moving in a straight line.

    To better understand the relationship between mobile robot localization and text-to-sketch, consider Mr. Jones, who is known to be in Washington DC. Being told that "Jones is near the memorial" enables reasoning about Jones' location. We don't know which memorial Jones is near, nor do we know his precise location relative to the memorial due to the use of the word "near", but we can represent his location as a probability density with values that increase the closer you get to anything on a map of Washington labeled as a memorial. This is very much like the robot above that knows its approximate distance (due to sensor noise) to a door, but has no idea which door. Next, suppose we're told "Jones walked north for 20 minutes". To account for this information, the probability density describing possible locations is shifted north by the average distance that a person can walk in 20 minutes, but it is also spread out (reflecting an increase in uncertainty about Jones' location) to account for the fact that Jones could have been walking faster or slower than average, or he could have diverged from a due north trajectory. Again, this uncertainty about the distance traveled is precisely the problem faced by mobile robots with noisy effectors.
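    The shift-and-spread update described above can be sketched in a few lines of Python. This is a minimal illustration rather than the MUTTS implementation: the planar coordinates in miles, the 3 mph average walking speed, and both noise parameters are assumptions chosen for the example.

```python
import math
import random

WALK_SPEED_MPH = 3.0    # assumed average walking speed
SPEED_STD_MPH = 0.6     # assumed spread in walking speed
HEADING_STD_RAD = 0.2   # assumed spread around the stated heading


def move_north(points, minutes, rng):
    """Shift each (x, y) point north by a noisy walking distance, in miles."""
    moved = []
    for x, y in points:
        speed = max(0.0, rng.gauss(WALK_SPEED_MPH, SPEED_STD_MPH))
        heading = rng.gauss(math.pi / 2, HEADING_STD_RAD)  # pi/2 is due north
        dist = speed * minutes / 60.0
        moved.append((x + dist * math.cos(heading), y + dist * math.sin(heading)))
    return moved


def mean_and_std_y(points):
    """Mean and standard deviation of the north-south coordinate."""
    ys = [y for _, y in points]
    mean = sum(ys) / len(ys)
    var = sum((y - mean) ** 2 for y in ys) / len(ys)
    return mean, math.sqrt(var)


rng = random.Random(0)
# A tight cloud around a hypothetical memorial at the origin (x east, y north).
before = [(rng.gauss(0, 0.05), rng.gauss(0, 0.05)) for _ in range(5000)]
after = move_north(before, minutes=20, rng=rng)
```

    Comparing mean_and_std_y(before) with mean_and_std_y(after) shows both effects at once: the cloud moves roughly a mile north, and its north-south spread grows.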

    A number of algorithms exist that solve the localization problem efficiently and, in some cases, optimally from the standpoint of using available information to maximally reduce uncertainty about location. We use an approach known as particle filtering [5]. A particle is, roughly, a point on the map that carries a quantum of probability mass. The more particles there are in an area of the map, the higher the probability that the person of interest is there. The particle filter algorithm updates the positions of the particles in response to new information (e.g., the fact that Jones walked north for 20 minutes). The number of particles can be chosen to trade off computational cost against resolution of the probability density, but since computation is linear in the number of particles (i.e., constant per particle per update) and is easily made parallel, it is not unusual to have tens or even hundreds of thousands of particles.
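    To make the idea of particles as carriers of probability mass concrete, the following sketch scatters particles uniformly over a rectangular area and estimates the probability of a sub-region by summing the mass that falls inside it. The Particle class, the function names, and the bounding box coordinates are all hypothetical and used only for illustration.

```python
import random
from dataclasses import dataclass


@dataclass
class Particle:
    lat: float
    lon: float
    weight: float = 1.0


def init_uniform(n, south, west, north, east, rng):
    """Scatter n equally weighted particles uniformly over a bounding box."""
    return [Particle(rng.uniform(south, north), rng.uniform(west, east))
            for _ in range(n)]


def mass_in_box(particles, south, west, north, east):
    """Fraction of total probability mass that lies inside the box."""
    total = sum(p.weight for p in particles)
    inside = sum(p.weight for p in particles
                 if south <= p.lat <= north and west <= p.lon <= east)
    return inside / total


rng = random.Random(1)
# Hypothetical bounding box; the coordinates are placeholders.
particles = init_uniform(20000, 39.30, -76.70, 39.40, -76.60, rng)
# The south-west quadrant covers a quarter of the box's area.
quarter = mass_in_box(particles, 39.30, -76.70, 39.35, -76.65)
```

    With a uniform prior, the estimate for any sub-region is close to its share of the total area; later updates move mass toward regions supported by the text.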

    After each update, particles are redistributed by sampling (importance sampling [6]) from the density they approximate, so that particles will die out in low probability (unimportant) areas and become more concentrated in high probability (important) areas. In this way, particles initially allocated to, say, parses or reference resolutions that make subsequent observations improbable will be reallocated near particles based on parses or reference resolutions that are supported by subsequent observations.

    The MUTTS system implements an adapted particle filter as a probabilistic framework for representing uncertainty about a subject's location on a map. Next, we consider how analogs for sensing and moving are generated in the system.
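    The redistribution step can be illustrated with simple multinomial resampling, in which a new particle set is drawn in proportion to the current weights. This is one common resampling scheme and is shown here as an assumption; the paper does not say which variant MUTTS uses. The two labels stand in for competing reference resolutions.

```python
import random


def resample(positions, weights, rng):
    """Draw a new, equally weighted particle set in proportion to the weights."""
    return rng.choices(positions, weights=weights, k=len(positions))


rng = random.Random(2)
# Two competing hypotheses, each initially holding half of the particles.
positions = ["memorial_A"] * 500 + ["memorial_B"] * 500
# Hypothetical weights after an observation that favors memorial_A.
weights = [0.9] * 500 + [0.1] * 500

new_positions = resample(positions, weights, rng)
share_a = new_positions.count("memorial_A") / len(new_positions)
```

    After resampling, most particles sit on the hypothesis that the observation supports, while the total number of particles is unchanged.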

    B. Text Processing in MUTTS

    Incoming text is first processed by the Stanford Parser [7], [8], which we use to produce structured syntactic representations from raw text. The representations used by MUTTS are parse trees and typed dependency lists. A parse tree is a tree structure that represents the syntax of a sentence. Words are grouped into phrases, and their roles in the sentence can be determined by examining the path back to the root. The typed dependency list is generated from a parse tree and expresses how words relate to one another in a sentence. Consider the following sentence:

        The subject was seen near a Popeye's Fried Chicken.

    The most probable parse for this sentence follows:

        (ROOT
          (S
            (NP (DT the) (NN subject))
            (VP (VBD was)
              (VP (VBN seen)
                (PP (IN near)
                  (NP
                    (NP (DT a) (NNP Popeye) (POS 's))
                    (NNP Fried) (NNP Chicken)))))))

    Note that with this representation, the trained eye (or computer) can quickly identify important parts of the sentence, such as the verb phrase "was seen". The typed dependency list for the parse tree above is as follows:

        det(subject-2, the-1)
        nsubjpass(seen-4, subject-2)
        auxpass(seen-4, was-3)
        det(Popeye-7, a-6)
        poss(Chicken-10, Popeye-7)
        nn(Chicken-10, Fried-9)
        prep_near(seen-4, Chicken-10)

    Note that the typed dependency list provides a convenient representation for locating words related to one another; for instance, that "Fried" is a compound noun modifier for "Chicken".
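    As an illustration of why this representation is convenient, the sketch below parses the textual dependency lines from the example into tuples and looks up the words attached to a given head. The Dep type and both helper functions are hypothetical; they operate on the printed text only and are not part of the Stanford Parser's API.

```python
import re
from typing import NamedTuple


class Dep(NamedTuple):
    rel: str
    head: str
    head_idx: int
    dep: str
    dep_idx: int


# Matches lines such as "nn(Chicken-10, Fried-9)".
DEP_RE = re.compile(r"^(\w+)\((.+)-(\d+), (.+)-(\d+)\)$")


def parse_dep(line):
    """Turn one printed dependency line into a Dep tuple."""
    m = DEP_RE.match(line.strip())
    if m is None:
        raise ValueError(f"not a typed dependency: {line!r}")
    rel, head, head_idx, dep, dep_idx = m.groups()
    return Dep(rel, head, int(head_idx), dep, int(dep_idx))


def modifiers_of(word, deps):
    """Words attached directly to `word`, in sentence order."""
    found = [d for d in deps if d.head == word]
    return [d.dep for d in sorted(found, key=lambda d: d.dep_idx)]


LINES = [
    "det(subject-2, the-1)",
    "nsubjpass(seen-4, subject-2)",
    "auxpass(seen-4, was-3)",
    "det(Popeye-7, a-6)",
    "poss(Chicken-10, Popeye-7)",
    "nn(Chicken-10, Fried-9)",
    "prep_near(seen-4, Chicken-10)",
]
deps = [parse_dep(line) for line in LINES]
chicken_modifiers = modifiers_of("Chicken", deps)
```

    Here chicken_modifiers holds the two words that attach to "Chicken", which is the lookup needed to assemble a multi-word landmark name.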

    Together, the parse tree and typed dependency list provide enough information for the next phase of processing to begin. In this phase, the sentence structure is examined to produce a text meaning representation that can be used to update the particle filter in downstream processing. The module responsible for this process is called the semantic interpretation engine (SIE), as shown in figure 1. We have designed and developed the MUTTS SIE as a rule-based template matching system. The SIE comprises a lexicon of English words organized into an ontology, to allow for generalizing across word classes, and a set of rules. On the left hand side of the rules are mechanisms for matching parse trees and typed dependency lists, and on the right hand side is code for extracting semantics to generate a useful TMR. For example, suppose we wanted to write a rule to catch the sentence above. The rule first looks for the passive voice (a passive verb is used as the root of the verb phrase), next checks for a verb that is a descendant of "observation" in the lexical ontology (by extracting "seen" from the auxpass relation in the typed dependency list), and finally requires that there is a prepositional phrase beginning with a spatial preposition (again, using a combination of the parse tree, typed dependency list, and information in the lexical ontology).
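    The left hand side of such a rule might be sketched as follows. The miniature ontology, the preposition list, and the function names are invented for illustration, and the rule works on (relation, head, dependent) triples only; the actual SIE also consults the parse tree.

```python
# Hypothetical miniature lexical ontology: word -> parent class.
ONTOLOGY = {"seen": "observation", "spotted": "observation", "observation": "event"}
SPATIAL_PREPOSITIONS = {"near", "at", "behind", "outside"}


def is_a(word, ancestor):
    """Walk up the ontology from `word` looking for `ancestor`."""
    while word in ONTOLOGY:
        word = ONTOLOGY[word]
        if word == ancestor:
            return True
    return False


def match_passive_observation(deps):
    """Return (verb, preposition, object) if the rule fires, else None.

    deps is a list of (relation, head, dependent) triples.
    """
    for rel, verb, _ in deps:
        # Step 1 and 2: a passive auxiliary attached to an observation verb.
        if rel != "auxpass" or not is_a(verb, "observation"):
            continue
        # Step 3: a spatial preposition attached to the same verb.
        for rel2, head2, obj in deps:
            if head2 == verb and rel2.startswith("prep_"):
                prep = rel2[len("prep_"):]
                if prep in SPATIAL_PREPOSITIONS:
                    return verb, prep, obj
    return None


sample = [
    ("det", "subject", "the"),
    ("nsubjpass", "seen", "subject"),
    ("auxpass", "seen", "was"),
    ("det", "Popeye", "a"),
    ("poss", "Chicken", "Popeye"),
    ("nn", "Chicken", "Fried"),
    ("prep_near", "seen", "Chicken"),
]
match = match_passive_observation(sample)
```

    For the sample sentence the rule fires and hands the verb, the preposition, and the preposition's object to the right hand side; a sentence without a passive observation verb yields None.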

    The pattern of the rule described above would match our sample sentence, and the right hand side of the rule would be used to generate a text meaning representation. The first step is to decide whether the sentence represents a sensing event or a movement event, in the sense described in section II-A. Does the sentence refer to movement, in which case the TMR will be used to update particle locations in a dead-reckoning type update, or does the sentence reference a landmark or location, in which case the sentence is analogous to sensing in particle filter localization? Currently, classification of a sentence as a sense or movement event is hardwired into the rule. In this case, the combination of an observation verb and a spatial preposition indicates a sense event.

    Sense and movement events have parameters that are filled out during the processing of the right hand side of a rule. In the case of sense events, the primary objective of the rule is to extract a landmark or location reference, and the secondary objective is to determine the specificity of the reference. The primary objective is achieved by first extracting the object of the spatial preposition ("Chicken"), then pulling in all modifiers of the object that indicate they belong together ("Popeye's" and "Fried"). The secondary objective, in this case, is achieved by looking up a specificity level for the preposition being used, which is part of the lexical ontology. Here, the use of "near" implies some uncertainty about the actual proximity to the landmark, whereas "at" would indicate relative certainty that the subject is actually at the landmark. For our sentence, the TMR might look like this:²

        (SENSE :specificity moderate
               :landmark "Popeyes Fried Chicken")

    This representation is almost actionable by the particle filter. There remains the question of where, exactly, Popeyes Fried Chicken is. Particles are represented by latitude and longitude, not by a common name. To resolve this mapping between landmark or location names and points of latitude and longitude, we use Google's Local Search API. Local Search returns points that match a keyword search. In the case of "Popeyes", if our area of interest (AOI) is Baltimore, Local Search will return 13 Popeyes locations, complete with latitude, longitude, and a variety of other information in hypertext format. The landmark field of the sense event can then be replaced by the corresponding points in the search results. MUTTS allows the user to define an area of interest outside of which search results will be ignored.
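    Restricting search results to the area of interest amounts to a bounding box test on each hit. The sketch below assumes a rectangular AOI and hits that have already been retrieved as dictionaries; the BoundingBox class, the hit format, and all coordinates are placeholders, and no call to any search service is made.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BoundingBox:
    south: float
    west: float
    north: float
    east: float

    def contains(self, lat, lon):
        """True if the point lies inside the box (edges included)."""
        return self.south <= lat <= self.north and self.west <= lon <= self.east


def filter_to_aoi(hits, aoi):
    """Keep only the search hits that fall inside the area of interest."""
    return [h for h in hits if aoi.contains(h["lat"], h["lon"])]


# Placeholder AOI and hits; none of these coordinates come from the paper.
aoi = BoundingBox(south=39.30, west=-76.70, north=39.40, east=-76.60)
hits = [
    {"title": "hit A", "lat": 39.35, "lon": -76.65},
    {"title": "hit B", "lat": 39.31, "lon": -76.61},
    {"title": "hit C", "lat": 39.55, "lon": -76.65},  # north of the AOI
]
landmark_points = [(h["lat"], h["lon"]) for h in filter_to_aoi(hits, aoi)]
```

    The surviving points can then stand in for the landmark name in the sense event.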

    Generating movement events is a somewhat simpler process. Consider the following text:

        He walked east on Reisterstown Road, for maybe 15 minutes.

    The TMR for a movement action includes direction, distance, and duration (any of which may be extracted from the text, derived by computation, or set to defaults), any known road references, and uncertainties associated with the direction, distance, and duration fields. Rules and templates are written to identify movement events, and the typed dependency list is interrogated to fill in the TMR.

        (MOVE :specificity approximate
              :direction (0.0 0.1)
              :duration (15 3.0)
              :distance (0.75 0.15)
              :onroad "Reisterstown Rd.")

    Note the introduction of a list notation to represent normal distributions. In the above TMR, the duration is expressed as a normal distribution with a mean of 15 (minutes) and a standard deviation of 3. The mean here is drawn directly from the text ("15 minutes"), while the standard deviation is derived from a combination of the typical inaccuracy of a human observer and any uncertainty modifiers present in the text (in this case, "maybe").
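    The (mean, standard deviation) pairs can be modeled directly, and a distance distribution can be derived from a duration by scaling with a movement speed. The sketch below assumes a constant walking speed of 3 mph; that figure is our assumption, chosen because it is consistent with the duration and distance pairs in the example TMR, and the Normal class is hypothetical.

```python
import random
from dataclasses import dataclass

ASSUMED_WALK_MPH = 3.0  # assumption: consistent with the example TMR above


@dataclass
class Normal:
    mean: float
    std: float

    def sample(self, rng):
        """Draw one value from the distribution."""
        return rng.gauss(self.mean, self.std)


def distance_from_duration(duration_min, speed_mph=ASSUMED_WALK_MPH):
    """Scale a duration in minutes into a distance in miles at a fixed speed."""
    miles_per_minute = speed_mph / 60.0
    return Normal(duration_min.mean * miles_per_minute,
                  duration_min.std * miles_per_minute)


duration = Normal(15.0, 3.0)            # "maybe 15 minutes"
distance = distance_from_duration(duration)

rng = random.Random(3)
draws = [distance.sample(rng) for _ in range(1000)]
```

    Under this assumption the derived distance distribution is approximately (0.75 0.15), matching the :distance field, and each particle can draw its own travel distance from it.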

    ² Those familiar with LISP will find this symbology familiar, even though MUTTS is not implemented in LISP.

    These text meaning representations are now ready for the particle filter, which processes sense and movement events as described in section II-A.

    C. Applying Domain Knowledge

    Using Google Maps, Local Search, and TIGER/Line® allows MUTTS to bring a great deal of domain knowledge to bear on managing the uncertainty in T2S, a much broader range than any human analyst could be expected to have. The strength of automated text to sketch is the amount of domain knowledge available, encoded in search engines and databases, and the challenge is to exploit this knowledge while performing adequately where human intelligence excels: in natural language processing, commonsense reasoning, and so on. In this section we describe a tool called path verification that is made possible by Google Directions and augments the utility of MUTTS in just such a manner.

    The complex geometry of high resolution maps, coupled with the surface features that go along with these maps (transportation networks, waterways, green space, and so on), creates a conundrum for the particle filter when performing a dead reckoning update. If an observation comes in that has an agent driving or walking to the northeast for half a mile, then the particles must all be translated roughly half a mile, roughly to the northeast. The most basic update would simply move the particles, regardless of road networks or geographical features, and then the sketch might involve the agent driving over the Chesapeake Bay. At the other end of the spectrum, the particle filter could be tasked with incorporating all the various map features, conducting a search over the road network (and incorporating footpaths in the case of walking directions), and producing only legal particle updates that respect the rules of the road and the lay of the land.

    While the former approach is obviously too naive, the latter approach appears quite daunting. Fortunately, Google Directions essentially accomplishes exactly that task. To produce particle updates for movement events, MUTTS processes the simplistic dead reckoning update and uses Google Directions to verify whether or not the updated particle location is realistic given the features of the map. This is path verification. If utterance u_i moves particle p from location p_i to p_{i+1}, we conclude that the update is verifiable if the distance and time Google Directions derives for the route from p_i to p_{i+1} are probable given the duration and distance distributions derived from the processing of u_i. Said differently, if the source text says "10 minutes", but Google Directions returns a best route that takes 20 minutes, then the particle path is not verifiable, and it should be resampled.
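    One simple way to decide whether a routed time and distance are "probable" is to compare them against the TMR's normal distributions using a z-score threshold. The paper does not specify the test MUTTS uses, so the threshold rule, the function names, and every number other than the 10 and 20 minute figures are assumptions.

```python
def z_score(observed, mean, std):
    """Number of standard deviations between the observed value and the mean."""
    return abs(observed - mean) / std


def is_verifiable(route_minutes, route_miles, duration, distance, max_z=2.0):
    """Check a routed path against (mean, std) pairs from the movement TMR."""
    return (z_score(route_minutes, *duration) <= max_z
            and z_score(route_miles, *distance) <= max_z)


# The text says "10 minutes"; the standard deviations and the distance
# distribution below are assumed for illustration.
stated_duration = (10.0, 2.0)
stated_distance = (0.5, 0.1)

# A best route of 20 minutes is five standard deviations from the stated time.
slow_route_ok = is_verifiable(20.0, 1.0, stated_duration, stated_distance)
# A route of 11 minutes and 0.55 miles is well within tolerance.
plausible_route_ok = is_verifiable(11.0, 0.55, stated_duration, stated_distance)
```

    A particle whose path fails the check would be a candidate for resampling; a softer variant could turn the z-scores into a likelihood instead of a hard accept or reject.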

    III. USAGE CASE

    In this section we detail a typical usage of the system and describe some of the investigative features and visualizations that exist in MUTTS. Recall that MUTTS is a web application built using Google's GWT framework, and incorporates a suite of online tools to support the operations necessary for geolocation and visualization. A screenshot of the full MUTTS application can be seen in figure 2; it contains a map view, a tree view for breaking down the text input, an interaction panel for visualizing search and sketch results, and text areas to input data and otherwise interact with MUTTS. The discussion here is based on a tutorial developed for users of the system.

    Figure 2. A screenshot of the MUTTS application.

    The use case here is that someone (whom we will refer to as the analyst) has received a collection of textual observations that refer to the locations and movements of a subject. What the analyst would like is to provide an automated system with the text, and get back a detailed map of the subject's locations at all times throughout the observation period; ideally, this would be a path through the map, annotated with all the subject's stops. Due to the various sources of uncertainty in the text stream, a single, true, accurate sketch cannot generally be known. Therefore, MUTTS generates sketches probabilistically, and allows the analyst to consider and visualize the possibilities.

    Analysis of a tracking problem begins with the analyst constraining the area of interest. In this case, the AOI is the Arington/Mount Washington area of Baltimore, Maryland. The initial configuration of the particle filter distributes the particles uniformly over the AOI. Particles are rendered to the MUTTS map view as triangles representing the hypothesized location and direction of movement. The analyst begins the process by collecting the textual accounts and entering them as input to MUTTS. Consider the following collection of descriptions of a subject's whereabouts:

    (9:30pm): "We saw him near the pizza restaurant."
    (9:41pm): "subject walking north for one half mile. subject turns east and continues for 5 minutes."
    (9:52pm): "I am meeting Jerry at the hospital."

    Figure 3. Particles distributed around annotated search results for "pizza restaurant" in the Arington area of Baltimore.

    Figure 4. A movement event has introduced uncertainty in the location of the subject.

    This is what one might expect in a typical tracking scenario (though typically one would have more data). We have three textual accounts, from different sources, with approximate timestamps. In this example, the observations come from an eye witness, a police report, and an overheard conversation of the subject. MUTTS will begin by processing the first observation, which it will classify as a sense event with search query "pizza restaurant". The query returns 6 hits that are labeled A–G in figure 3 (D is off the screen). Note that the particle filter has processed the sense event: those particles consistent with the locations of the pizza restaurants are given more weight, while those that are inconsistent are given lower weight or resampled to locations consistent with a 2-dimensional normal distribution, centered at the nearest pizza restaurant, and consistent with models of the term "near".
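    The weighting step for a sense event can be sketched as a Gaussian falloff in the distance to the nearest matching landmark. The planar coordinates, the choice of sigma as a stand-in for a model of "near", and the function name are assumptions; the sketch covers only the reweighting, not the subsequent resampling.

```python
import math


def sense_weights(particles, landmarks, sigma):
    """Normalized weights that decay with distance to the nearest landmark."""
    raw = []
    for p in particles:
        nearest = min(math.dist(p, lm) for lm in landmarks)
        raw.append(math.exp(-0.5 * (nearest / sigma) ** 2))
    total = sum(raw)
    return [w / total for w in raw]


# Hypothetical planar coordinates in miles.
landmarks = [(0.0, 0.0), (2.0, 1.0)]      # two search hits
particles = [(0.1, 0.0), (2.0, 1.2), (1.0, 3.0)]

# sigma plays the role of a model of "near"; 0.25 miles is an assumption.
weights = sense_weights(particles, landmarks, sigma=0.25)
```

    The particle far from every landmark ends up with negligible weight, so it is very likely to be replaced when the filter resamples.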


    Figure 5. A sensing event that removes uncertainty.

    The second record is then processed. MUTTS will generate two observations, both movement events, for the second report. Movements are processed by the particle filter as described in sections II-A and II-C. Essentially, a dead reckoning update is performed and path verification is used to quantify the likelihood that it may have happened. Movement events either contain explicit distance information, or the distance can be derived from duration language and models of movement. In this case, the subject was observed walking, and MUTTS can model the translation described in the observations by a normal distribution consistent with a model of walking. The resulting particle distribution is shown in figure 4. Note the spreading effect that a movement event has on the particles, expressing the uncertainty that arises when a subject begins moving. Not only may "one half mile" be a rough estimate, but the subject may have taken a number of different routes and side streets in traversing that distance.

    The third record is spoken in the first person and is processed as a sense event. The search query, "hospital", is highly specific, as is evidenced by the updated particle filter shown in figure 5. There is only one hospital, and all particles that are not in the vicinity of the hospital after the prior update are resampled to reflect the relative certainty that at 9:52pm, the subject is at that particular hospital.

    At this point, having incorporated four events into the tracking problem, it is reasonable to start considering what a sketch looks like, along with how it is generated, visualized, and evaluated. A sketch is generated by iterating over particles in the particle filter, retrieving each particle's history as its position and orientation have changed in response to processing the text, and generating routes with the help of Google Directions. Thus, each particle tracked by the filter has a corresponding sketch, and each such sketch can be scored and ranked according to total distance traveled, duration, or a believability ranking, which incorporates the particle's weight over its history as well as external measures, such as the path verification score for the various segments of the sketch. The analyst sees a ranked list of particle sketches, along with direction, duration, and believability, and begins viewing the sketches in order to envision the possible scenarios.

    Figure 6. A sketch that is consistent with the text.

    Two sketches are shown in figures 6 and 7. The former figure contains a sketch with low duration and distance traveled, and high believability. The high believability score is derived from two main factors. First, the particle weight remains high over the duration of the sketch, indicating that when sense events were processed (in particular, the meeting at the hospital), the particles were already in close proximity to where the subject was suspected to be. Second, the path verifier found that the duration and distance traveled in all segments of the sketch could be reasonably expected given the corresponding movement events.

    The sketch shown in figure 7, in contrast, has a longer overall duration and distance traveled, and a lower believability score. This is in large part due to the particle's initial position at the pizza restaurant labeled "F" in figure 3. It is unlikely that this is the restaurant referred to by the eye witness given subsequent movements and the eventual meeting at the unambiguously located hospital. The particle weights are correspondingly low in this sketch. In addition, the paths required to arrive at the hospital are unlikely. The location of Woodberry Woods and the Jones Falls Expressway prevent the subject from having a clear and timely route to the hospital, and this is precisely the role of the path verifier: to flag routes as unlikely given the movements described in the text.

    The cycle of adding observations, visualizing the sketches,

    and evolving a picture of the most likely tracking scenarios

    can continue as long as there is additional data. We view the

    MUTTS system as an increasingly mixed-initiative, allowing

    the analyst to participate in the process by manually ruling

    out or adding landmarks and routes, as well as providing

    input to the language processing pipeline as well. Improving

    Figure 7. A sketch that is unlikely given the input and background knowledge about travel times.

    the interactivity between the analyst and MUTTS is an

    ongoing area of development.

    IV. FUTURE WORK

    There is still much that can be done to improve MUTTS.

    Future work falls into two categories: refining the tool and

    basic research. MUTTS is currently in alpha, and initial

    usability testing and evaluations are being performed by

    intelligence analysts. The feedback is still in its preliminary

    stages as of this writing, but adding to the mixed-initiative

    capabilities as well as improving the rule base of the

    semantic interpretation engine (to cover more constructions)

    are obvious areas in which to improve performance and enhance

    the utility of the tool. We feel that the semantic interpretation

    engine would also benefit from transitioning from a home-grown ontology to a larger-scale,

    established product such as WordNet 3, and an opportunity

    exists for analysts to teach MUTTS new semantic templates

    when new language constructs are observed by the

    system. Indeed, the goal is automated data acquisition from

    internal reports as well as the field, and we must expect

    to receive unusual linguistic constructions from a variety of

    speakers with various backgrounds.

    Basic research goals include those areas where good,

    working solutions to AI aspects of T2S are not established.

    Here, we are not looking to make incremental improvements

    to the parser, for example, but to explore new avenues where

    advances to the field in general may be made. While we are always trying to incorporate new methods for managing

    uncertainty, we are particularly encouraged by a novel

    learning paradigm that is well-suited to the problem of T2S

    for tracking and MUTTS in particular.

    Recall the pipeline diagram in figure 1. In actuality, since

    the parsing process can also be viewed as a pipeline, the

    pipeline is somewhat longer, consisting of: a part-of-speech

    3 http://wordnet.princeton.edu/

    Figure 8. Advice-giving in the MUTTS pipeline.

    tagger, a named-entity recognizer, a k-best parser, a typed

    dependency generator, the semantic interpretation engine,

    and finally, the particle filter and path verifier. Many of these

    processes are trainable components, based on supervised or

    semi-supervised learning from labeled examples.

    Consider the following scenario. Rules and their corre-

    sponding templates have been generated to cover a variety

    of possible textual constructions. In the course of processing

    a large stream of text, MUTTS encounters the following two

    sentences:

    He walked north for 8 minutes. . .

    . . . then, he walked east for 8 minutes.

    These two sentences, in a prior release of the parser, were

    treated differently. 4 Here is the typed dependency list for

    the north observation:

    nsubj(walked-1, north-2)

    num(minutes-5, 8-4)

    prep_for(north-2, minutes-5)

    The SIE contains a rule that matches on a movement

    verb and a duration specification (walked and minutes,

    respectively), and creates a movement event that can be filled

    out by searching for the num dependency of the TDL. The

    east instance, however, was processed differently:

    advmod(walked-1, east-2)

    prep_for(walked-1, 8-4)

    nsubj(walked-1, minutes-5)

    The absence of the num dependency prevents the rule

    from completing the movement event in a way that is most

    useful to the particle filter. Though this particular pathology

    no longer occurs in the parser, it is illustrative of a general

    condition. The parser is a large, complex system that may not

    always parse sentences in a manner most convenient for our

    semantic interpretation engine, especially when dealing with

    unorthodox constructions found in casual speech. In these

    cases, we would like to invoke the learning components opportunistically to improve performance.
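
The behavior of the rule on the two typed dependency lists can be made concrete with a minimal sketch. It is not the SIE itself: the regular expression, the `MOVEMENT_VERBS` and `DURATION_UNITS` vocabularies, and the shape of the event dictionary are assumptions made for illustration. The rule matches when a movement verb and a duration unit both appear, and it fills in the duration only if a num dependency is attached to the unit.

```python
import re

# Matches strings such as "num(minutes-5, 8-4)": relation(governor-i, dependent-j).
DEP_RE = re.compile(r"(\w+)\((\S+)-(\d+), (\S+)-(\d+)\)")

# Hypothetical vocabularies; the real ontology is not given in the text.
MOVEMENT_VERBS = {"walked", "drove", "ran"}
DURATION_UNITS = {"minutes", "hours"}


def parse_tdl(lines):
    """Turn dependency strings into (relation, governor, dependent) triples."""
    deps = []
    for line in lines:
        match = DEP_RE.fullmatch(line.strip())
        if match:
            rel, gov, _, dep, _ = match.groups()
            deps.append((rel, gov, dep))
    return deps


def movement_event(deps):
    """Apply the rule: movement verb plus duration unit, filled via num."""
    tokens = {tok for _, gov, dep in deps for tok in (gov, dep)}
    verb = next((t for t in tokens if t in MOVEMENT_VERBS), None)
    unit = next((t for t in tokens if t in DURATION_UNITS), None)
    if verb is None or unit is None:
        return None  # the rule does not match at all
    event = {"verb": verb, "unit": unit, "duration": None}
    for rel, gov, dep in deps:
        if rel == "num" and gov == unit and dep.isdigit():
            event["duration"] = int(dep)
    return event


NORTH_TDL = ["nsubj(walked-1, north-2)", "num(minutes-5, 8-4)",
             "prep_for(north-2, minutes-5)"]
EAST_TDL = ["advmod(walked-1, east-2)", "prep_for(walked-1, 8-4)",
            "nsubj(walked-1, minutes-5)"]

NORTH_EVENT = movement_event(parse_tdl(NORTH_TDL))
EAST_EVENT = movement_event(parse_tdl(EAST_TDL))
```

Under these assumptions the rule matches both lists, but only the north event receives a duration of 8. The east event is left with an empty duration, which is the partially fired state described above.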

    Since the SIE and the parser are coupled in the MUTTS

    pipeline, and since the SIE has an existing rule that almost

    fires completely, it is possible for the SIE to express its

    ideal input as a training instance, and pass it back in the

    pipeline as advice for upstream components to learn from.

    Ideally, the upstream component would then generate new

    4 This particular anomaly no longer occurs.

    output closer to the SIE's target. This process is shown

    diagrammatically in figure 8. In this case, each process

    in the pipeline that receives advice may take the advice

    itself, retrain, and emend its output, or upon examining

    the advice, may decide to pass the advice upstream for

    other components to consider. In this particular example, the

    parser may be able to consider lower-ranked parse trees in the k-best set of trees, compute their TDLs, and determine

    if more usable output could be provided to the SIE. If a

    preferable TDL was found, the parser could then update its

    own scoring metric to better reflect the preference of the SIE.

    Here, the proper trees were available in the k-best set, the

    correct output could be provided, and adjustments could be

    made. We are enthusiastic that this approach will improve

    the robustness not only of the MUTTS

    system, but also of other pipelined machine learning systems with

    supervised and semi-supervised learning components.
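
One way to picture the advice-taking step for the parser is the following sketch. It should be read as an assumption-laden illustration rather than the mechanism used in MUTTS: the candidate scores, the second candidate TDL, the `accepts_duration` predicate, and the `margin` parameter are all hypothetical. The parser scans its k-best candidates in score order, looks for the first one whose TDL the downstream stage would accept, and promotes it above the current top candidate. If no candidate qualifies, the advice is left untaken so that it can be passed further upstream.

```python
def accepts_duration(tdl):
    """Stand-in for the SIE's preference: a num(...) dependency is present."""
    return any(dep.startswith("num(") for dep in tdl)


def rerank_with_advice(k_best, accepts, margin=0.01):
    """Promote the best-scored candidate that the downstream stage accepts.

    k_best is a list of (score, tdl) pairs. Returns (ranked, taken), where
    taken is False when no candidate satisfies the advice.
    """
    ranked = sorted(k_best, key=lambda cand: cand[0], reverse=True)
    for i, (_, tdl) in enumerate(ranked):
        if accepts(tdl):
            if i == 0:
                return ranked, True  # the top parse already satisfies the advice
            # Give the preferred candidate a score just above the old top one.
            promoted = (ranked[0][0] + margin, tdl)
            return [promoted] + ranked[:i] + ranked[i + 1:], True
    return ranked, False  # nothing usable here; pass the advice upstream


# Hypothetical k-best output for the east sentence, with made-up scores.
K_BEST = [
    (-10.2, ["advmod(walked-1, east-2)", "prep_for(walked-1, 8-4)",
             "nsubj(walked-1, minutes-5)"]),
    (-11.0, ["advmod(walked-1, east-2)", "num(minutes-5, 8-4)",
             "prep_for(walked-1, minutes-5)"]),
]

RERANKED, TAKEN = rerank_with_advice(K_BEST, accepts_duration)
```

A trainable parser would fold the preference into its scoring model instead of adjusting a single score; the additive margin here only stands in for that update.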

    V. CONCLUSIONS

    We have presented MUTTS: a web application that performs automated text-to-sketch for tracking problems. This

    tool combines state-of-the-art natural language processing

    algorithms with an adaptation of a mobile robot localization

    algorithm called a particle filter to manage the many sources

    of uncertainty in tracking from textual descriptions. We have

    demonstrated how the use of off-the-shelf syntactic processing,

    coupled with a special-purpose semantic ontology

    and template-matching system, can generate sensing and

    movement events that correspond to sensing and acting in

    mobile robot navigation and localization. The system also

    leverages vast amounts of existing spatial knowledge in the

    form of Google Maps, Local Search, and Directions, as well

    as the TIGER/Line® road data, to bridge the gap between textual observations and geolocation and tracking.

    By presenting a use case, we have illustrated the utility

    of MUTTS as an analyst's assistant. It provides the ability

    to iteratively reduce uncertainty about the sketch by adding

    observations and providing mixed-initiative constraints, and

    provides visualizations and scoring metrics for assessing

    the likelihood of individual sketches. We finished by out-

    lining directions for future development and areas in which

    progress can be made on the intelligence aspects of the tool.

    The MUTTS tool is currently being alpha tested by the

    intelligence community and we are enthusiastic about its

    potential as both an analyst's tool and a platform for machine

    learning and natural language research.

    ACKNOWLEDGMENT

    This project was supported by a grant from the Intelligence Community Postdoctoral Research Fellowship Program

    through funding from the Office of the Director of

    National Intelligence.

    REFERENCES

    [1] I. Sledge and J. Keller, "Mapping natural language to imagery: Placing objects intelligently," in Fuzzy Systems, 2009. FUZZ-IEEE 2009. IEEE International Conference on, Aug. 2009, pp. 518–523.

    [2] T. S. Levitt and D. T. Lawton, "Qualitative navigation for mobile robots," Artificial Intelligence, vol. 44, no. 3, pp. 305–360, 1990. [Online]. Available: http://www.sciencedirect.com/science/article/pii/000437029090027W

    [3] B. Tversky and P. U. Lee, "Pictorial and verbal tools for conveying routes," in Spatial information theory: cognitive and computational foundations of geographic information science, C. Freksa and D. Mark, Eds. Springer, 1999, pp. 51–64.

    [4] N. Metropolis and S. Ulam, "The Monte Carlo method," Journal of the American Statistical Association, vol. 44, no. 247, pp. 335–341, September 1949.

    [5] D. Fox, S. Thrun, F. Dellaert, and W. Burgard, "Particle filters for mobile robot localization," in Sequential Monte Carlo Methods in Practice, A. Doucet, N. de Freitas, and N. Gordon, Eds. New York: Springer Verlag, 2000.

    [6] D. B. Rubin, "Using the SIR algorithm to simulate posterior distributions," in Bayesian Statistics 3: Proceedings of the Third Valencia International Meeting, J. Bernardo, M. Degroot, D. Lindley, and A. Smith, Eds. Oxford: Oxford University Press, 1987, pp. 385–402.

    [7] D. Klein and C. D. Manning, "Accurate unlexicalized parsing," in Proceedings of the 41st Meeting of the Association for Computational Linguistics, 2003, pp. 423–430.

    [8] M.-C. de Marneffe, B. MacCartney, and C. D. Manning, "Generating typed dependency parses from phrase structure parses," in LREC 2006, 2006.
