a project summarylanguagelog.ldc.upenn.edu/myl/ldc/igert-draft.doc  · web viewstudents will learn...


A Project Summary

Integrative Approaches to Communicative Interaction

John Trueswell
University of Pennsylvania

We propose to establish an integrative educational foundation for students in eight different Penn graduate groups interested in empirically-based scientific studies of communicative interaction: Anthropology, Biology, Computer Science, Electrical Engineering, Linguistics, Neurology, Philosophy, and Psychology. This program will include two new common courses in mathematical foundations, a new jointly-managed program of summer research projects, a year of educational and research ``cross-training'' for each student, and a series of outside visitors linked to a journal club for IGERT students.

The new Mathematical Foundations courses will be a two-semester sequence, based on practical computer exercises drawn from real problems in the associated disciplines. The cross-training year will allow students to take courses and conduct a research project in a relevant area outside of their core discipline. The summer research program will enable students to learn a variety of perspectives and methods in other disciplines via practical experience. In addition to adding depth, the outside visitor series will provide grounding in relevant research areas not covered at Penn.

Students will learn to do research based on observations of natural behavior (as in ethology, corpus linguistics and clinical observation), as well as research in a laboratory setting (using both behavioral and physiological measures), and research based on algorithmic models of interacting agents. Students' common mathematical foundation will enable them to perform sophisticated analyses of signals, and to develop rigorous, testable models of communicative interaction, whether the object of study is a conversation among friends, a negotiation with a computer to obtain services, or a baboon barking bout.

Our goals are to enable students to do better research within the cooperating disciplines, and also to encourage the development of genuinely new types of cross-disciplinary research.


B Table of Contents (auto-generated in Fastlane)


C Project Description

The project description section contains the following items a through k. Page limitations specified for each item are inclusive of tables, figures, or other graphical data, and must be adhered to.

a) List of Participants

Benjamin Backus, Psychology, http://www.psych.upenn.edu/~backus
Norman Badler, CIS*, http://www.cis.upenn.edu/~badler
Steven Bird, CIS*, http://morph.ldc.upenn.edu/
David Brainard, Psychology, http://color.psych.upenn.edu/brainard
Dorothy Cheney, Biology, http://www.sas.upenn.edu/biology/faculty/cheney/
Robin Clark, Linguistics, http://www.ling.upenn.edu/~rclark/home.html
Anjan Chatterjee, Neurology, http://ccn.upenn.edu/people/anjan.html
Branch Coslett, Neurology
John Crawford, Psychology, http://www.psych.upenn.edu/~jud/
David Embick, Linguistics, http://www.ling.upenn.edu/~embick/home.html
Martha Farah, Psychology, http://www.psych.upenn.edu/~mfarah
Lila Gleitman, Psychology, http://www.psych.upenn.edu/~gleitman/
Murray Grossman, Neurology, http://www.med.upenn.edu/~neuro/faculty/facprof/grossman.html
Amishi Jha, Psychology, http://www.amishi.com
Aravind Joshi, CIS*, http://www.cis.upenn.edu/~joshi
Mark Jung-Beeman, Psychology, http://ccn.upenn.edu/people/mjb.html
Adam Kendon, Linguistics
Michael Kearns, CIS*, http://www.cis.upenn.edu/~mkearns
Anthony Kroch, Linguistics, http://www.ling.upenn.edu/~kroch/index.html
William Labov, Linguistics, http://www.ling.upenn.edu/~labov/home.html
Daniel Lee, Electrical Engineering, http://www.ee.upenn.edu/~ddlee
Mark Liberman, Linguistics, http://www.ling.upenn.edu/~myl/
Mitch Marcus, CIS*, http://www.cis.upenn.edu/~mitch
Martha Palmer, CIS*, http://www.cis.upenn.edu/~mpalmer
Fernando Pereira, CIS*, http://www.cis.upenn.edu/~pereira
Ellen Prince, Linguistics, http://www.ling.upenn.edu/~ellen/home.html
Virginia Richards, Psychology, http://psych.upenn.edu/~richards/
John Sabini, Psychology, http://www.psych.upenn.edu/~sabini/
Gillian Sankoff, Linguistics, http://www.ling.upenn.edu/~gillian/home.html
Lawrence K. Saul, CIS*, http://www.cis.upenn.edu/~lsaul
P. Thomas Schoenemann, Anthropology, http://www.sas.upenn.edu/~ptschoen/
Marc Schmidt, Biology, http://www.sas.upenn.edu/biology/faculty/schmidt/
Robert Seyfarth, Psychology, http://www.psych.upenn.edu/~seyfarth/
Barry Silverman, Systems Engineering, http://www.seas.upenn.edu:8080/~barryg/index.html
Saul Sternberg, Psychology, http://www.psych.upenn.edu/~sternberg
Sharon Thompson-Schill, Psychology, http://www.psych.upenn.edu/~sschill
John Trueswell, Psychology, http://psych.upenn.edu/~trueswel/
Greg Urban, Anthropology, http://www.sas.upenn.edu/anthro/faculty/profiles/urban.html
Jan van der Spiegel, Electrical Engineering, http://www.ee.upenn.edu/~jan/
Scott Weinstein, Philosophy, http://www.cis.upenn.edu/~weinstein/

*Computer and Information Science


b) Vision, Goals, and Thematic Basis

We propose a new approach to integrative graduate education and research training on the topic of communicative interaction. Animals, humans and machines commonly interact so as to exchange information. The mechanisms of information exchange are often complex and vary exceedingly – across species, within the human species, and across the artificial systems that humans have invented. However, we believe that concepts and techniques developed for the study of one communications system can often be applied productively to another. Based on our own cross-disciplinary experience, we feel that researchers in this area will benefit from access to a range of mathematical and computational modeling techniques, a body of substantive knowledge and experimental paradigms from biology, psychology and linguistics, and a set of overarching concepts such as information, learning and evolution. With this background, researchers can analyze and synthesize communicative signals, and explore the psychophysics and physiology of signal perception and production. They can monitor and model the flow of information between individuals and within groups. They can investigate the role of grammatical organization and logical interpretation of symbol sequences in mediating communicative interaction; and they can model the interplay among genetics, culture and individual experience in the development of communications capabilities, whether in the history of an individual, a population or a species.

We recognize that the various disciplines concerned with communicative interaction, at Penn and elsewhere, are quite different from one another, and will remain so. Researchers in these different areas cannot jettison their separate histories and goals, or their distinct sets of external connections – nor would we want them to. However, we strongly believe that greater integration of graduate education and research training across these disciplines, at least for a certain class of students, will lead to faster progress in the existing disciplinary frameworks, and also to the emergence of genuinely new types of research.

Specifically, we propose to establish an integrative educational foundation for students in eight different Penn graduate groups interested in empirically-based scientific studies of communicative interaction. This will include two new common courses in mathematical foundations, a new jointly-managed program of summer research experiences, a year of educational and research “cross-training” for each student, and a set of student-run activities, including a “journal club” with outside speakers, topical workshops, and an annual graduate student conference. The new Mathematical Foundations courses will be a two-semester sequence, based on practical computer exercises connected to real problems in the associated disciplines. The cross-training year will allow students to take courses and do a research project in an area outside of their core discipline. The summer research program will enable students to learn a variety of perspectives and methods in other disciplines via practical experience. Students will participate in research based on observations of natural behavior (as in ethology, corpus linguistics and clinical observation), as well as research in a laboratory setting (using both behavioral and physiological measures), and research based on algorithmic models of interacting agents. Students' common mathematical foundation will enable them to perform sophisticated analyses of signals, and to model the form and information flow of behavioral sequences, whether the object of study is a conversation among friends, a negotiation with a computer to obtain services, or a baboon barking bout.


c) Major Research Themes

Introduction

Training for empirically-based scientific study of communicative interaction now takes place in a large number of graduate programs at Penn, including Anthropology, Biology, Computer Science, Electrical Engineering, Linguistics, Neurology, Philosophy and Psychology. Relevant research areas include animal communication ([1], [2]), linguistic and non-linguistic communication among humans ([3], [4], [5], [6]), human-computer interaction ([7], [8], [9]), and human communication disorders ([10], [11]).

This distribution of research across disciplines is not unique to Penn, but rather reflects a world-wide pattern, developed over a period as long as the history of the disciplines themselves. In each discipline, researchers have focused on certain scientific or technical problems related to interactive communication, have selected certain aspects of the phenomena for intensive study, and have developed and applied methods and tools that are characteristic of their discipline but may be unfamiliar or inaccessible to outsiders. The boundaries between disciplines are not impermeable, but rather pass ideas and people to varying degrees, like other boundaries between cultural groups. However, despite these patterns of migration and trade, many local differences remain. There are especially large between-discipline differences in the degree and type of computational and mathematical training, the kinds of data deemed relevant, and the relationship between formal models and data.

Data in these disciplines may come from ethological field observation, ethnographic participant-observation, clinical or sociolinguistic interviews, introspection, corpora of transcribed conversations ([12], [13]), controlled behavioral or physiological experiments, or computer simulations. Noteworthy recent developments in experimental tools include eye-tracking machinery for observing attentional direction, and functional neuro-imaging techniques for observing the distribution of brain activity in time and space. Relevant mathematical and computational tools include quantitative analysis of recorded signals, synthesis of signals for perceptual experiments, inferential and exploratory statistics, models of communicating populations, and hand-built or statistically trained grammatical models ([14], [15]). Although these different methods are largely complementary rather than contradictory, researchers often find themselves adapting or re-inventing techniques that have been perfected in other disciplines. Psychologists find themselves doing signal processing, computer scientists wind up doing ethnological observations on task-oriented dialogues, linguists start experimenting with machine learning, neurologists begin doing discourse analysis, and so on. This long-standing cross-disciplinary borrowing reflects the fact that none of the traditional academic disciplines is a good overall fit to the study of communicative interaction. As a result, researchers at all levels struggle as outsiders to understand or re-develop concepts, methods and techniques whose natural home is elsewhere. The effort is needlessly difficult, and there is often an initial period of failure or low-quality progress. What is worse, some interesting strands of research are neglected, because they are cross-disciplinary in their conceptual foundations, not just in their methods.

Building on Penn’s strong research groups within the relevant disciplines, and on a well-established tradition of cooperation among them, we propose to establish a new integrative program for interested students of communicative interaction, across all the participating disciplines. Students in this program, whatever their home discipline, will get a shared mathematical foundation, a cross-disciplinary introduction to key concepts and methods, and participatory training in research across at least two of the cooperating disciplines. This program includes common courses in the mathematical foundations of communication, and a year in educational and research training outside the home discipline (See detailed training program proposal in section C.d below).


Penn is an ideal home for such a graduate training program, with internationally-renowned faculty in all the relevant disciplines, and an excellent record of turning out students who become intellectual leaders themselves. There are also strong intellectual and personal ties among the associated faculty across disciplinary boundaries, with a history of joint advising of students, joint research and publication, joint grant support, and joint development of courses at both undergraduate and graduate levels. The detailed discussion of major research themes will highlight some of these ties. However, this history of cooperation means that the proposed program of integrative graduate education is feasible, not that such a program already exists. For graduate students interested in communicative interaction, the necessary connections across departments and disciplines are now informal, incomplete and sometimes difficult to find and exploit. The purpose of this proposal is to create formal structures, including new courses and a formal system of laboratory cross-training, that will allow us to bring better integrative training to more students more efficiently.

This document has emerged from discussions among forty faculty members in nine departments at Penn, whose research relates to the general theme of communicative interaction, and who want to participate in the program of integrative graduate education and research training here proposed. In the following sections, we describe the relevant parts of the research programs of these forty individuals under five thematic headings in a total of about a dozen pages. As a result of this compression, many interesting and relevant research projects are necessarily neglected, while others are described only briefly. Our goal is to give a picture of the range of research projects connected to this proposal, with enough detail to enable the reader to judge the nature and quality of the work in each area, and with an emphasis on the human and intellectual connections that make the proposed training program both desirable and feasible.

The five thematic headings, and the key participating faculty in each, are:

1. Neurobiological and Field Study of Animal Communication: Cheney, Crawford, Liberman, Seyfarth, Schmidt, Schoenemann

2. Experimental and Clinical Study of Human Communication: Cheney, Clark, Chatterjee, Coslett, Embick, Farah, Gleitman, Grossman, Jha, Joshi, Jung-Beeman, Kearns, Labov, Liberman, Marcus, Sabini, Seyfarth, Trueswell, Thompson-Schill

3. Analysis and Modeling of Language Structure and Use: Bird, Clark, Embick, Joshi, Jung-Beeman, Kendon, Kearns, Kroch, Labov, Liberman, Marcus, Palmer, Pereira, Prince, Sankoff, Saul, Schoenemann, Seyfarth, Silverman, Thompson-Schill, Trueswell, Urban, Weinstein

4. Enhanced Communication Environments and Systems: Badler, Joshi, Kearns, Lee, Palmer, Pereira, Saul, Silverman

5. Foundations of Communicative Interaction: Production, Perception, and Information Processing: Backus, Brainard, Crawford, Lee, Richards, Saul, Schmidt, Silverman, Sternberg, van der Spiegel.

c.1 Neurobiological and Field Study of Animal Communication

IGERT-related research on animal communication occurs mainly in three laboratories. John Crawford studies the neural basis of electric and auditory communication in fish; Marc Schmidt studies the neural mechanisms that underlie song learning in birds; and Robert Seyfarth and Dorothy Cheney study vocal communication and social behavior in nonhuman primates. In addition to the phylogenetic diversity of their subjects, these three research groups use a variety of different techniques to study communication from a number of different perspectives. Nonetheless, they are united by a common approach to the naturalistic study of communication, and by intellectual links to IGERT colleagues in psycholinguistics, neuroscience, and computational biology.

All three groups are committed to an ethological approach that studies communication in its natural, social context wherever possible. In addition, all three investigators make explicit contact with research on human language. This commitment links their research with that of Gleitman, Trueswell, and Liberman, among others. Indeed, the simultaneous study of communication in animals and language in humans is one feature that will make our proposed IGERT group unique among those studying communicative interaction. Finally, all investigators (Crawford and Schmidt in particular) strive to incorporate quantitative, mathematical analyses and computational modeling into their research. The use of such mathematical techniques, now well established in computational linguistics and computer science, is a relatively new and creative addition to the study of animal communication. A focus on computation links Crawford’s and Schmidt’s research with that of Joshi and Marcus, among others, and with the Mathematical Foundations course that forms a central part of our training program. An important goal of IGERT training at Penn will be to produce a new generation of scientists in animal behavior, armed with the necessary mathematical tools to solve complex problems in the neural encoding of communicative interactions.

In Crawford’s laboratory, weakly electric Pollimyrus fish live in tanks where males defend territories and court females with sonic signals, and where females communicate with electric organ discharges (EODs). Complementing work in this naturalistic setting, Crawford uses operant behavioral methods to study sensory performance, and neurophysiological methods to study the mechanisms that underlie communication. This integration of rich ethological analysis with quantitative studies of perception and neurophysiology is unusual in any species. Three examples illustrate Crawford’s approach.

In their natural habitat, Pollimyrus males are nocturnal, communicating in pitch black water. Females appear to choose mates on the basis of their sounds (Crawford et al. 1997a, b). Under these circumstances, evolutionary theory predicts that male sound production will be energetically expensive (compared, for example, with female EODs); that the acoustic structure and/or rate of male sonic signals will correlate with a male’s physical condition; and that female auditory discrimination will be sensitive enough to distinguish among individual males. In one study, traditional ethological methods are used to test the first two hypotheses (Crawford et al. 1997a, b), and conditioning experiments to test the third. In conditioning experiments, animals respond to synthesized sounds with a burst of EODs, and are trained to change the EOD rate when they detect a sound or perceive a difference between two sounds. Results indicate that subjects’ audiograms closely match the energy in sounds produced by conspecifics (Marvit and Crawford 2000). The best sensitivity is close to 500 Hz, where there is also a prominent peak in the spectrum of conspecific sounds. Pollimyrus are also very sensitive to small differences in tone frequency and in the inter-click intervals of click trains (Marvit and Crawford 2001). These results have led to further work on the underlying computational mechanisms (see below).

How do fish distinguish small differences in auditory signals? In a second area of investigation, anatomical and neurophysiological methods are used to investigate the brain’s processing of auditory stimuli. Thus far, results indicate that the fish’s ear creates a neural code of the time structure of sounds (Fletcher & Crawford 2001). This encoded signal ascends from the ear into the medulla and on to the auditory midbrain, where the representation of acoustic information undergoes a major transformation (Crawford 1993, 1997a, b; Kozloski & Crawford 1998, 2000; Suzuki et al. in press). The transformation is revealed by the emergence of midbrain neurons that are highly selective, firing only in response to particular frequencies or to specific inter-click intervals. Thus, an initial temporal code is used to create a place-code for the period of repetitive sounds (for example, the period of a tone or the fundamental frequency of a complex sound).


How is one neural representation of an auditory signal transformed into another? One computational model (Sullivan 1982; Crawford, 1997a) assumes that temporally synchronized spikes are relayed from the medulla to the midbrain, where input from the medulla branches and excites both a selective and an inhibitory neuron. The inhibitory neuron then produces relatively long-lasting inhibition in the selective neuron followed by rebound depolarization. The inhibitory input acts as a temporal gate, with the preferred interval determined by the time between the initial EPSP and the rebound depolarization. In a third study, Large & Crawford (in press) explore this hypothetical circuit using a dynamic computational model for temporal feature extraction. This model fits the midbrain physiological data closely, and is particularly valuable for generating predictions for new experiments.
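The gating idea can be illustrated with a deliberately minimal sketch in Python. It is not the Sullivan, Crawford, or Large & Crawford model, and it makes several assumptions of our own: each click opens a fixed period of inhibition followed by a fixed rebound window, an EPSP produces a spike only if it arrives inside the rebound window left by the immediately preceding click, and every new click resets the gate. The names `GatedUnit`, `click_train` and `best_unit`, and all parameter values, are hypothetical. A bank of such units with different inhibition durations shows how a temporal code (inter-click interval) could in principle be converted into a place code (which unit responds).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GatedUnit:
    """Toy interval-selective unit; parameters are illustrative assumptions."""
    inhibition_ms: float  # how long inhibition lasts after each click
    rebound_ms: float     # width of the rebound window that follows inhibition

    def spike_count(self, click_times_ms):
        """Count clicks whose EPSP lands in the rebound window of the previous click."""
        count = 0
        for prev, cur in zip(click_times_ms, click_times_ms[1:]):
            interval = cur - prev
            # An EPSP during inhibition is suppressed; one after the rebound
            # window is assumed to be subthreshold on its own.
            if self.inhibition_ms <= interval < self.inhibition_ms + self.rebound_ms:
                count += 1
        return count


def click_train(interval_ms, n_clicks):
    """Return click times (ms) for a regular train with the given inter-click interval."""
    return [i * interval_ms for i in range(n_clicks)]


def best_unit(bank, click_times_ms):
    """Index of the unit with the largest response: a crude 'place' readout."""
    return max(range(len(bank)), key=lambda i: bank[i].spike_count(click_times_ms))


if __name__ == "__main__":
    # Hypothetical bank tuned to intervals of 2-4, 4-6, 6-8 and 8-10 ms.
    bank = [GatedUnit(d, 2.0) for d in (2.0, 4.0, 6.0, 8.0)]
    train = click_train(5.0, 10)
    print([unit.spike_count(train) for unit in bank])  # [0, 9, 0, 0]
    print(best_unit(bank, train))                      # 1
```

In this sketch only the unit whose rebound window spans the 5 ms interval responds, so the identity of the active unit stands in for the period of the stimulus. A dynamic model of the kind described above would replace the hard window with continuous membrane dynamics.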

In Marc Schmidt’s laboratory, neurophysiological techniques are used to explore song perception, learning, and performance in the zebra finch (Taeniopygia guttata), a species with a single, stereotyped, adult song. Using methods developed by Schmidt and Konishi (1998) and Tchernichovski et al. (1999, 2000), electrodes are implanted in the HVc of young birds; HVc is an area of the telencephalon known to be involved in both song learning and song production. Birds are then housed in cages with a plastic model of a male zebra finch and keys which, if pecked, induce song playback from a small speaker inside the model. This novel operant paradigm allows one to monitor neural activity during each presentation of the tutor song. Such experiments aim to understand the mechanism by which the song template is stored. In addition, because neural activity can be recorded continuously during this accelerated vocal imitation process, observers can track and correlate moment-to-moment changes in patterns of sound and neural activation. Neural recordings are combined with sophisticated spectral analyses of spontaneous vocalizations, allowing detailed comparisons of the relationship between neural activity and motor output. In further studies of song production, electrical stimulation through electrodes implanted in HVc is used to explore the mechanisms involved in the generation of complex song motor patterns. Stimulation in HVc, but not in structures downstream of it, modifies the structural patterning of song, suggesting that HVc encodes part of the song pattern generating network.

An additional, comparative study uses song sparrows (Melospiza melodia), whose males possess repertoires of 5-13 different song types (Searcy 1984). Males sing different song types when courting females and competing with other males.
Several hypotheses have been offered to explain the evolution of song repertoires and the mechanisms that underlie their adaptive value (Searcy 1992; Horn & Falls 1996). Work in other species (e.g., cowbirds; King & West 1989) suggests that different components of a male’s song exert different effects on the behavior of mates and/or rivals in different social contexts. In Schmidt’s laboratory, field observations and experiments are used to test the behavioral responsiveness of territorial males to natural stimuli. Focal individuals from these field studies are then subject to neurophysiological recordings in the laboratory. In comparison to the zebra finch, several striking differences are immediately apparent. First, song sparrow males are highly responsive to song playback (e.g., Stoddard et al. 1988), while zebra finch males are not. Second, song sparrow HVc neurons exhibit robust auditory responses to conspecific playback, while zebra finch HVc neurons do not (Schmidt and Konishi 1998), a difference consistent with song audition as a mediator of territorial behavior. Third, while zebra finch HVc neurons exhibit auditory responsiveness to playback of the bird's own song under anesthesia, anesthetized song sparrow HVc neurons are responsive only to subsets of the bird's song repertoire, suggesting different strategies for auditory tuning of individual HVc neurons in species with single-song and multiple-song repertoires (Nealen and Schmidt 2001). As with Crawford’s work, Schmidt’s research offers an unusual combination of ethology, state-of-the-art neurophysiology, and computational analysis, providing IGERT students with particularly rich opportunities for research in animal communication.

Robert Seyfarth’s and Dorothy Cheney’s research examines the vocal communication and social behavior of baboons (Papio cynocephalus ursinus) living in the Moremi Game Reserve of the Okavango Delta, Botswana. Two groups of over 100 known individuals have been studied continuously for ten years, using observational sampling, tape recording, and field playback experiments. Acoustic analysis is conducted in collaboration with Mark Liberman at the Linguistic Data Consortium, and with Julia Fischer at the Max Planck Institute for Evolutionary Anthropology, Leipzig. The overall goal of this work is to understand the function of vocalizations during social interactions. The investigators hope to clarify both the mechanisms that underlie call production and the information gained by listeners.

Recent studies have shown that listeners obtain rich, even “semantic” information from other individuals’ vocalizations. For example, playback experiments indicate that the low amplitude, tonal grunts given by baboons function at least in part to reconcile females following aggression. After hearing a dominant opponent’s grunt, subordinate victims are more likely to approach their former opponents and tolerate their opponent’s approach than in the absence of such a grunt (Cheney & Seyfarth 1997). Grunts therefore function to mediate and facilitate social interactions. Other experiments have shown that listeners discriminate between calls that are acoustically very similar. For example, grunts are not only given during social interactions but also when the group initiates a move into a new area of its range. Although grunts in these two contexts show some acoustic differences (Owren et al. 1997), the calls are graded and difficult for humans to distinguish by ear. Despite their acoustic similarity, however, listeners not only recognize signalers’ individual identities but also discriminate between the different grunt types (Rendall et al. 1999). Similarly, adult females give loud bark-like calls that grade from tonal, harmonically rich calls given when individuals are lost or separated from the group (contact barks) to harsh, noisier calls given to predators (alarm barks) (Fischer et al. 2000a). Each of these call types elicits measurably different responses (Fischer et al. 2001). Infants require experience to learn the distinction (Fischer et al. 2000b). Current experiments examine responses to the acoustically similar wahoo calls given by adult male baboons to predators and when displaying to conspecifics (Fischer et al. in press).

Although vocalizations provide listeners with information about both signaler identity and external events, there is little evidence that monkeys vocalize with the goal or intention of communicating to others. For example, baboons give loud contact barks when moving through wooded areas. Because barks are often clumped in time, animals appear to be exchanging information (calling and answering) about their location. Playback experiments, however, indicate that a baboon gives barks primarily when she herself is peripheral or at risk of becoming separated. She rarely answers the contact barks even of close kin when she is in the center of the group and surrounded by others (Cheney et al. 1996; Rendall et al. 2000). Thus the meaning of any vocalization differs fundamentally between the signaler’s and the recipient’s perspectives. Calls may be informative (or functionally referential) even if the signaler is unaware of the sound-meaning relation and does not intend to provide information to others. This asymmetry between signaler and recipient may constitute an important functional distinction between animal communication and human language (Cheney & Seyfarth 1998).

While Seyfarth’s and Cheney’s research focuses on baboons, they have also directed graduate and post-doctoral research by students working on diana monkeys and mangabeys in Ivory Coast, cebus monkeys in Costa Rica, spider monkeys in Mexico, and suricates (a communal mongoose) in South Africa. IGERT students who elect to work in this laboratory will therefore have a wide variety of opportunities to conduct research on communication in primates or non-primate mammals.

c.2 Experimental and Clinical Study of Human Communication

Research in the experimental and clinical study of human communication is a large and diverse area, both in general and specifically at Penn. For expository purposes, we will divide it into three topical categories, although in fact individual researchers, centers and institutes often cross these boundaries. First, a well-established group of researchers in psycholinguistics, housed at the Institute for Research in Cognitive Science (IRCS) and led by Trueswell and Gleitman, focuses on the study of language learning, on-line word and sentence processing, and related topics, using experimental techniques such as eye gaze tracking. Second, the recently founded Center for Cognitive Neuroscience includes several researchers, especially Farah, Thompson-Schill and Jha, who study the neural mechanisms of lexical processing, semantic memory and similar topics, using techniques such as fMRI and ERP. Finally, several researchers in the Medical School, including Chatterjee, Coslett and Grossman, focus on speech and language processing deficits in various clinical populations, including aphasia produced by focal lesions, neurodegenerative conditions, and normal aging.

The Trueswell and Gleitman labs at IRCS study human language learning, production and comprehension. Infants, children and adults regularly face the formidable task of discovering the communicative intent behind a rapid flow of speech sounds, or of producing speech sounds to convey their intent to others. Speech perception is affected by many factors, including not only word recognition and sentence parsing but also, simultaneously, the surrounding discourse and referential setting, and the background knowledge of the communicating pair. A similar range of factors applies in speech production. Infants must infer the details of the language(s) spoken around them, while also learning to use their linguistic knowledge in communication.

John Trueswell’s lab focuses on human sentence processing: how readers and listeners recover the intended relations among words and phrases (who did what to whom), and the referential connections between these expressions and the world (e.g., the interpretation of contextually-dependent expressions like she, them, and the frog that is to the left of the napkin). The research is based on real-time measures of language comprehension, such as records of listeners’ eye gaze or word-by-word reading times. For example, head-mounted eye-tracking technology is used to record listeners’ shifting gaze as they listen to spoken instructions or descriptions of the surrounding visual world (Trueswell, Sekerina, Hill & Logrip, 1999). Subjects typically shift their gaze to possible referents as a spoken word or phrase is understood, so that this technique can be used to study the time course of the interpretation of potentially ambiguous pronouns (Donald saw Mickey while he…) and ambiguous phrases (e.g., Tap the frog with the stick). Such studies have revealed how multiple probabilistic cues are combined in real time, for instance verb-specific syntactic preferences (Trueswell & Kim, 1998), relevant visual scene information (Snedeker, Thorpe & Trueswell, 2001) and the structure of the prior discourse (Arnold et al., 2000). This work has led to the development of probabilistic models of human parsing and interpretation, with direct links to engineering research on statistical natural language processing [refs], grammatical representation (Srinivas & Joshi, 1998; Kim, Srinivas & Trueswell, 2002), and discourse analysis [refs on Centering Theory; Prince, Joshi and colleagues].

Gaze tracking techniques allow real-time language processing to be studied in extemporaneous human-to-human conversation, as well as in more carefully controlled experimental settings. For example, Trueswell’s group has tracked real-time ambiguity resolution during free discussion between naïve speakers and listeners (Snedeker & Trueswell, in press). In this work, speakers provided prosodic cues to structure only when other relevant sources of information in the referential setting were misleading or uninformative, although when prosodic cues were provided, they had an immediate impact, as the speech unfolded, on the listeners’ interpretation of otherwise ambiguous utterances.

Such methods open up many opportunities for collaborative study of real-time language processing in special populations, linking both to developmental and neurolinguistic research on communication. For instance, Trueswell and Gleitman have used gaze tracking to investigate the development of real-time language processing abilities in children (Arnold et al., 2001; Hurewitz et al., 2000; Snedeker et al., 2001). One finding is that young children rarely revise their initial interpretation of ambiguous phrases, wandering down the garden path and never returning. Hypothesizing that this may result from the development of executive function in prefrontal cortex, Thompson-Schill and Trueswell are using gaze tracking techniques to discover whether adult frontal-lobe patients show similar failures to revise. Interestingly, this kind of “garden path” misinterpretation of ambiguities is one of the main difficulties in the design of effective computerized systems for interacting with humans using speech recognition, an active research topic discussed in further detail in Section c.4.

Recent work in Lila Gleitman’s lab focuses on the relationship between vocabulary growth and the information available to the listener, in both child and adult learners. Probabilistic, multiple-cue models of word learning and of real-time comprehension have been developed, in which the performance distinctions between child and adult populations can be attributed not to differences in learning machinery but to differences in available information. For instance, adults have access to a great deal of syntactic information, while infants have little or none. Syntactic-usage differences are especially valuable for identifying certain kinds of words, such as abstract verbs (e.g., know), and these words thus pose learning difficulties for young children, and even for adults in laboratory conditions where their grammatical properties are not available (Gillette et al. 2000; Snedeker and Gleitman 2001; Kako and Gleitman 2001). As a result, the qualitative as well as quantitative differences between child and adult vocabularies can be explained. Gleitman’s group collaborates with computational linguists in corpus studies of the linguistic experience of children, based on the CHILDES corpus [ref]. Another relevant collaboration joins with neuroscientists in investigating relations among age, proficiency level, and specific linguistic function in second language learning, by varying the linguistic information available to subjects and examining the differential localization of brain activity using fMRI techniques.

The recently founded Center for Cognitive Neuroscience (CCN) includes several researchers, especially Martha Farah, Sharon Thompson-Schill and Amishi Jha, who study the neural mechanisms of lexical processing, semantic memory and similar topics, using techniques such as fMRI and ERP. For example, Thompson-Schill’s lab has focused on the nature of semantic knowledge representations and the cognitive control functions of the frontal lobes (Thompson-Schill et al., 1999, Neuropsychologia; Thompson-Schill et al., 1997, 1998, PNAS), which links directly to the studies cited earlier on retrieval and selection in language processing. Current collaborations between cognitive neuroscientists and psycholinguists focus on issues in semantic memory, executive function and lexical processing, along with work on sentence processing and interpretation in clinical populations, discussed below. This general area presents enormous opportunities for progress, and an important benefit of the proposed program will be to strengthen cross-training between neuroscientists and other researchers in our thematic area. This effort builds on the well-attended meetings of the joint IRCS/CCN Brain and Language Group, now in its fourth year, which provides a forum for tutorial presentations, discussions of background reading, and presentation and discussion of research results and work in progress.

Several researchers in the School of Medicine work on clinical aspects of interactive communication, including Grossman, Coslett and Chatterjee. Many collaborative links already exist between these researchers and those mentioned elsewhere in this proposal. For example, Embick, Joshi, and Grossman used the corpus resources described in section c.3 to develop high- and low-frequency sentence frames for high- and low-frequency verbs. Functional magnetic resonance imaging (fMRI) was used to monitor patterns of brain activation during processing of the four different verb/frame combinations, in normal controls as well as patients with Progressive Non-fluent Aphasia and Fronto-Temporal Dementia (FTD). The results suggest that information-processing speed for lemma retrieval is mediated in part by left inferior frontal cortex during sentence comprehension. In another example, Clark, Weinstein, Pereira, and Grossman are studying the neural basis for the meaning of generalized quantifiers, comparing controls with FTD patients. Computational models emphasize the distinction between first-order quantifiers (such as “at least 3 of the cars are red”) that depend only on number knowledge, and higher-order quantifiers (such as “more than half…”) that also require working memory to support analysis relative to a larger context. Their research found parietal cortex involved in number processing, while dorsolateral prefrontal cortex was associated with the working memory component of higher-order quantifiers. These findings imply that lexical semantic comprehension requires the integrated functioning of several brain regions.
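The computational contrast between the two quantifier types can be illustrated with a minimal sketch. The function names and the toy list of cars are hypothetical and are not drawn from the models used in the research described above; the sketch only shows that "at least 3" can be verified with a single running count, whereas "more than half" requires holding two quantities and comparing them.

```python
def at_least(n, items, predicate):
    """First-order style check: a single running count suffices."""
    count = 0
    for item in items:
        if predicate(item):
            count += 1
            if count >= n:
                return True  # the threshold is met, so we can stop early
    return False


def more_than_half(items, predicate):
    """Higher-order style check: matches and total must be held together."""
    matches = 0
    total = 0
    for item in items:
        total += 1
        if predicate(item):
            matches += 1
    # The answer depends on comparing the two counts, not on either alone.
    return 2 * matches > total


# Hypothetical example data: three red cars out of seven.
cars = ["red", "blue", "red", "red", "green", "blue", "white"]
is_red = lambda colour: colour == "red"

print(at_least(3, cars, is_red))     # True
print(more_than_half(cars, is_red))  # False
```

The second function cannot return early in general, since its answer depends on the size of the whole set; this is one simple way to picture the additional memory demand attributed to higher-order quantifiers.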

c.3 Analysis and Modeling of Language Structure and Language Use

Research in the analysis and modeling of language structure and language use at Penn can be divided into two areas. The first, corresponding roughly to the traditional discipline of linguistics, investigates the ways in which the structures of human language map sounds to meanings, and the patterns of historical, geographical and social variation in this mapping. Among the faculty associated with this proposal, Clark, Embick, Kroch, Joshi, Labov, Liberman, Pereira, Prince, Sankoff, and Weinstein do research of this kind. The second area, corresponding to computational linguistics and allied engineering sub-disciplines, works on techniques for computer analysis, generation and manipulation of linguistic structures and relations. Relevant technologies include speech recognition and synthesis, parsing and sentence interpretation, discourse modeling and machine translation. Among the faculty associated with this proposal, Joshi, Liberman, Marcus, Palmer, Pereira, van der Spiegel and Saul do research in these areas.

The faculty associated with these two areas overlap to a significant extent, and graduate student training and research at Penn have also often crossed these boundaries. This history of interdisciplinary collaboration, one of the foundations on which the current proposal is built, has seen strong intellectual and practical influences flowing in both directions. This interchange has been facilitated by a shared interest in the use of corpus and experimental data for exploration, hypothesis-testing and model-building, which has been a key theme of language-related research at Penn for many years. Corpus-based methods have played a central role in the development of models of language variation [refs] and in the study of historical change [Kroch & Taylor 2000] and discourse structure [refs], and have been the basis for a revolution in research approaches and practices in language and speech engineering, based on annotated corpora such as the Penn Treebank [ref] and many others published by Penn’s Linguistic Data Consortium (LDC).

There is an obvious affinity between these approaches and the ethological and experimental studies of animal communication discussed in section c.1 above, as well as the experimental study of human communication discussed in section c.2. Indeed, a collaboration is now underway between the LDC and Seyfarth, Cheney and colleagues, who are building annotated corpora of animal vocalizations similar to those used for human languages. Overall, a central goal of the proposed training program is to explore and deepen the use of such corpus and experimental data in elucidating the structure and function of communicative behavior. Furthermore, such corpora can provide important constraints on, and design principles for, automated systems for human-machine interaction (Section c.4).

Corpus-based approaches have been most visible in engineering practical text and speech processing systems through the use of machine learning, and the core statistical, algorithmic, and experimental techniques created in that research are an important component of the proposed program. The conceptual foundations and practical use of such methods will be a major part of the proposed Mathematical Foundations courses, and such methods also play a central role in the exploration of enhanced communication environments and systems discussed in section c.4. But corpus-based approaches also have value in scientific investigations of language. In particular, they have extraordinary potential to bring together models of language structure, typically symbolic, and models of language use, often statistical.

Work at Penn on natural language parsing is a paradigmatic example of the interplay that the proposed program seeks to foster. Research in formal and computational linguistics by Joshi, Kroch, and their students created a formal grammatical framework (mildly context-sensitive formalisms exemplified by tree-adjoining grammar) and corresponding hypotheses (lexicalization, extended domains of locality) that advanced our understanding of the interplay between syntactic representation and computationally feasible language processing. Through the development and use of corpus resources such as the Penn Treebank, Marcus, Joshi and their students have compellingly demonstrated that grammars of this type can be learned effectively from data, leading to the most accurate automated parsers in existence today. This example illustrates an approach that we plan to foster throughout our educational program. Structural representations such as those provided by formal grammars form the core of hypotheses about information flow in language use. Suitably designed corpora allow us to test such hypotheses, and also to “train” or “learn” models of how general and local information are combined in interpreting or producing language. Probabilistic graphical models, which generalize latent-variable models well-known from the social sciences, have been especially effective in providing this linkage between structure and use. Ongoing research on such models by Kearns, Pereira, and Saul dovetails with previous research in statistically-trained parsing techniques, and is already the subject of graduate classes at Penn that include students from computer science, linguistics and psychology.
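As a simplified illustration of what it means to learn a grammar from an annotated corpus, the following sketch estimates rule probabilities for a plain probabilistic context-free grammar by relative frequency. This is a deliberately reduced stand-in: the parsers described above rely on lexicalized, tree-adjoining-style grammars and far larger corpora, and the two-tree treebank, the tuple encoding of trees, and the function names here are all hypothetical.

```python
from collections import Counter, defaultdict

# Hypothetical toy treebank. Each tree is (label, child, child, ...);
# a leaf is a plain string (a word).
TOY_TREEBANK = [
    ("S",
     ("NP", "she"),
     ("VP", ("V", "tapped"), ("NP", ("D", "the"), ("N", "frog")))),
    ("S",
     ("NP", ("D", "the"), ("N", "frog")),
     ("VP", ("V", "jumped"))),
]


def rules(tree):
    """Yield one (lhs, rhs) rule for every internal node of a tree."""
    label, *children = tree
    rhs = tuple(c if isinstance(c, str) else c[0] for c in children)
    yield label, rhs
    for child in children:
        if not isinstance(child, str):
            yield from rules(child)


def estimate_pcfg(treebank):
    """Relative-frequency estimate: count(lhs -> rhs) / count(lhs)."""
    counts = Counter(rule for tree in treebank for rule in rules(tree))
    lhs_totals = defaultdict(int)
    for (lhs, _), n in counts.items():
        lhs_totals[lhs] += n
    return {rule: n / lhs_totals[rule[0]] for rule, n in counts.items()}


pcfg = estimate_pcfg(TOY_TREEBANK)
for (lhs, rhs), p in sorted(pcfg.items()):
    print(f"{lhs} -> {' '.join(rhs)}  {p:.3f}")
```

In this toy corpus the rule NP -> D N receives probability 2/3 because two of the three NP nodes expand that way. The same counting principle, applied to richer structural units, is what links a symbolic representation to a statistical model of use.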

In line with this pattern, there are many opportunities to use corpora to study the interplay between form and function in other aspects of language and other types of communication. For instance, information-processing models of discourse developed by Joshi, Prince, and Weinstein [refs] hypothesize cognitive states involving structural constraints on accessibility of referential information. Referential processing in language production and perception can be modeled in terms of latent state variables and the optimization of communicative effectiveness, through ideas from Markov decision processes being developed by Kearns in his dialogue systems research, described in section c.4. Corpus-based work on reference is closely related to Trueswell’s use of gaze-tracking paradigms in experimental research on sentence processing in discourse context. In principle, gaze-tracking techniques can also be applied to track the time-course of attention to visible referents in animal interactions.

The Penn Treebank, first published in 1992 and updated in 1995 and 1999, annotates several million words of English text with information about the categories of words and their syntactic relationships. It sparked a revolution in the theory of automatic parsing and in the development of practical parsers, and eventually had strong effects on theories of human sentence processing as well. Later efforts annotated production disfluencies and discourse functions in conversational portions of the Treebank. Following the same model, two historical corpora of English texts have recently been created: the Penn-Helsinki Parsed Corpus of Middle English, and the Penn-Helsinki Parsed Corpus of Early Modern English. Based on these resources, Kroch and others have been able to trace the detailed time course of several syntactic changes. These results provide empirical motivation and constraint for models of the dynamics of long-time-scale language change, where it may take as long as a millennium for a new pattern to gradually replace an old one.

An important new development in corpus annotation is the on-going “PropositionBank” project, led by Palmer, Marcus and others at Penn, which adds information about predicate-argument structure (“who did what to whom”), along with annotation of co-reference among nouns and pronouns. This corpus will, we predict, create a new generation of machine-learning-based models of language interpretation. Students and post-doctoral fellows in the proposed program will have access to and be able to contribute refinements to this new resource, as well as being able to work with Penn faculty in exploring the linguistic and machine-learning aspects of the research it enables. Indeed, pilot work along these lines is already taking place, exploring new approaches to reference resolution and word-sense disambiguation.

Other corpus publications by the Linguistic Data Consortium (LDC) include telephone conversations in sixteen languages and dialects, transcribed broadcast recordings in four languages, and wideband conversational recordings in two. Now underway is a systematic collection of recordings of face-to-face meetings, in some cases including video as well as audio recording. LDC researchers are also preparing several large corpora of animal vocalizations for publication, in association with the NSF-funded TalkBank project [ref]. LDC corpora are always published for general research use, but co-location at Penn with the LDC gives researchers in the proposed interdisciplinary group an easy route to collaboration in the design and creation of new corpora or new types of annotation.

c.4 Enhanced Communication Environments and Systems

Four groups of participating faculty work on systems and artifacts that promote new ways of communicating. Kearns and his colleagues have been building and studying speech-based systems that are adaptive, and provide multi-modal, multi-user social interaction. Saul and Lee have been working on speech systems that provide instantaneous feedback and reaction. Badler’s lab works on the design and application of graphical human modeling and simulation, and Barry Silverman’s group aims to improve behavioral realism in computer agents and personas.

This work simultaneously connects and contrasts with the research efforts we have discussed so far, which emphasized human and animal communication in natural and experimental settings. The connections arise from several sources. First, the design of any effective system for human-machine communication must obviously take into account the preferences, limitations and habits of natural human communication. Second, there is often deep interplay between the descriptive modeling of natural communicative behavior and the generation of artifacts for automation of this communication. Classical examples include the use of grammars and statistical language models in artificial text and spoken dialogue systems for human-machine interaction. Finally, research on artificial systems provides useful tools for the study of natural communication, including synthesis of artificial stimuli for perceptual studies, systems for controlling experimental interaction, and instrumentation for acquiring and processing data on naturally-occurring interactions.

On the other hand, the contrasts between artificial systems and more natural communication may be of equal or greater importance. The increasingly rapid adoption of new computing and networking technologies means that today’s communication artifacts may often be tomorrow’s natural modalities. Thus, instant messaging, chat rooms, and networked gaming systems have become common communication mechanisms for many users. In addition, systems engineered for novel communication situations may highlight important issues not revealed in more traditional modalities. Examples include the differences and similarities between human-human and human-machine interaction; the blurring of distinctions between “work” and “play” types of communication, as often happens in chat and instant messaging; and the ability of communication technologies to adapt to the habits and preferences of individual users or populations.

Kearns and his collaborators have been investigating the use of machine learning methods to improve the performance, flexibility and naturalness of the interactions provided by spoken dialogue systems. Spoken dialogue systems combine speech recognition and text-to-speech technologies to provide users with telephone access to interesting or useful data. While the goal is to make human-machine communication closer to normal spoken language, such systems are far from providing natural interaction, primarily because of the carefully scripted dialogues that must be employed due to the imperfections of current automatic speech recognition. They also are typically applied to limited communicative scenarios, such as the access of a relatively static database of information (such as stock quotes or airline arrival times). However, spoken dialogue systems are growing in functionality and number, driven at least in part by the advent of a number of commercial vendors building such systems for use by the general public.

A crucial issue in such systems is the ability to detect salient features of an ongoing dialogue with a user that might allow real-time modification of the interaction strategy. For example, a system should adopt a more conservative strategy (such as using more restrictive prompts or speech recognizer grammars) if a user is struggling, while it should adopt a more efficient strategy (such as abbreviating prompts or even skipping some steps) if a stereotyped pattern of interaction is being repeated without errors [citations]. It is very hard to write such rules by hand, because they may depend on properties of the user population and the communications setting that are unknown a priori. A natural approach is thus to apply machine learning methods to sample dialogue data, as Kearns and others have done. Promising early results include a demonstrable and significant improvement in a spoken dialogue system using reinforcement learning, a well-known Markovian machine learning methodology [cite]. This work provides a compelling example of technology that provides flexible and adaptive communication, something that we routinely expect from other humans, but rarely from machines.
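The idea of learning an interaction strategy rather than scripting it can be sketched with the standard tabular Q-learning update. Everything in this example is an assumption made for illustration: the two dialogue states, the two prompt styles, the rewards, and the deterministic toy simulator are hypothetical and do not describe the system cited above. To keep the result reproducible, the sketch sweeps over every state-action pair instead of exploring at random.

```python
# Hypothetical deterministic simulator:
# (state, action) -> (next_state, reward).
TRANSITIONS = {
    ("smooth", "short_prompt"): ("done", 2.0),          # fast completion
    ("smooth", "restrictive_prompt"): ("done", 1.0),    # slower completion
    ("struggling", "short_prompt"): ("struggling", -1.0),   # repeated error
    ("struggling", "restrictive_prompt"): ("smooth", 0.0),  # user recovers
}
STATES = ("smooth", "struggling")
ACTIONS = ("short_prompt", "restrictive_prompt")
GAMMA = 0.9  # discount factor (assumed)
ALPHA = 0.5  # learning rate (assumed)


def learn_q(sweeps=200):
    """Apply the Q-learning update repeatedly to every sampled transition."""
    q = {(s, a): 0.0 for s in STATES for a in ACTIONS}
    for _ in range(sweeps):
        for (state, action), (next_state, reward) in TRANSITIONS.items():
            if next_state == "done":
                future = 0.0  # terminal state has no future value
            else:
                future = max(q[(next_state, a)] for a in ACTIONS)
            target = reward + GAMMA * future
            q[(state, action)] += ALPHA * (target - q[(state, action)])
    return q


def greedy_policy(q):
    """Pick the highest-valued action in each state."""
    return {s: max(ACTIONS, key=lambda a: q[(s, a)]) for s in STATES}


q_values = learn_q()
policy = greedy_policy(q_values)
print(policy)
```

Under these assumed rewards the learned policy uses the short prompt when the dialogue is going smoothly and the restrictive prompt when the user is struggling, which matches the conservative-versus-efficient contrast described above without any hand-written rule.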

In related work, Kearns and colleagues created an artificial software agent [cite] that resides in a long-standing internet chat environment, and a spoken dialogue system that provides spoken telephone access to the agent and the chat system. The software agent in the chat environment learns from feedback provided by a large population of users, while balancing their perhaps competing preferences, and also provides users with “social statistics” on their human-human interactions [ref].

Also in the speech domain, Saul and Lee have begun developing robust speech-processing systems that can interact with users in real time, beginning with agents that attend to the pitch contours of human speech. Their pitch tracker is based on a novel algorithm that exploits some recent ideas in neural computation, updating its pitch estimate on a sample-by-sample basis, in real time, without the need for post-processing to produce smooth contours. Two demonstration applications are a voice-to-MIDI converter that synthesizes instrumental music from vocalized melodies, and an enhanced karaoke machine with audiovisual feedback, key adaptation, performance scoring and feedback about wrong notes.
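For readers unfamiliar with pitch estimation, the basic task can be sketched with a textbook frame-based autocorrelation method. This is not the sample-by-sample algorithm of Saul and Lee, whose details are not given here; the function name, the search range of 80 to 400 Hz, and the synthetic test tone are assumptions chosen only to show what a pitch estimate is.

```python
import math


def estimate_pitch(frame, sample_rate, fmin=80.0, fmax=400.0):
    """Return the pitch (Hz) whose lag maximises the frame's autocorrelation."""
    min_lag = int(sample_rate / fmax)  # shortest period considered
    max_lag = int(sample_rate / fmin)  # longest period considered
    best_lag, best_score = min_lag, float("-inf")
    for lag in range(min_lag, max_lag + 1):
        # Unnormalised autocorrelation of the frame at this lag.
        score = sum(frame[i] * frame[i + lag] for i in range(len(frame) - lag))
        if score > best_score:
            best_lag, best_score = lag, score
    return sample_rate / best_lag


# Hypothetical test signal: 0.1 s of a pure 200 Hz tone sampled at 8 kHz.
SAMPLE_RATE = 8000
tone = [math.sin(2 * math.pi * 200.0 * n / SAMPLE_RATE) for n in range(800)]
print(estimate_pitch(tone, SAMPLE_RATE))  # 200.0
```

A frame-based method like this must wait for a whole frame before reporting a value, and on real speech it typically needs smoothing afterwards. Those are the limitations that a sample-by-sample tracker of the kind described above is meant to avoid.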

Although this research is simple and preliminary, we would like to draw attention to an important fact: the resulting systems are fun. Unlike dictation programs or dialog managers, these more primitive interactive agents are not designed to replace human operators, but to entertain and amuse. The effect is to enhance the medium of voice, as opposed to highlighting the gap between human and machine performance. As more aspects of speech processing can be handled in this robust and real-time way, we can hope to see more sophisticated vocal agents that remain delightful. Continuations of this work also provide a platform for addressing core problems at the intersection of auditory computation and artificial intelligence [Bregman94, Cooke01], including attending to one out of several simultaneous voices, and tracking the locations of moving sound sources.

In the visual domain, Badler's group works on graphical human modeling and simulation. They are developing fast, parameterized control methods for a broad range of communicative agent technologies, including the portrayal of natural language instructions as animated actions; the analysis of gestures, faces, and eye movements in order to build generative models for graphical synthesis; the creation of detailed and realistic simulation environments for person-to-person training; and the development of visual avatars and simulated physical environments to augment purely textual communication media [Embodied-agents-book-chapter].

This group studies various aspects of real-time embodied agent synthesis. Their primary approaches center on parameterizations for gesture and facial actions using movement observation principles, and on the representational basis for character believability, personality, and affect. Source material comes from movement observation professionals and directly sensed movement data. By grounding parameterized actions in observed behaviors they attempt to escape manual animation interfaces, and to provide a basis for modeling the links between internal agent states and external behavior. The EMOTE system offers such a parameterization based on Laban Movement Analysis [EMOTE]. This parameterization applies to arm gestures and torso shape and is being extended to facial expressions, eye movements, and gait.

In recent work, Zhao [Zhao2001] has shown that significant EMOTE parameters can be learned from human movements captured with electromagnetic sensors or video cameras, even with a single camera. By observing human behaviors and correlating them with observational ground truth, one may begin to build a mapping from inner state to outward manifestation. Experiments will test the hypothesis that consistency across body communication channels is a prerequisite for believable behaviors.

In order to map between linguistic expressions and animated behaviors, Badler, Palmer and Joshi have developed Parameterized Action Representations, an intermediary level of representation for both visual and verbal components of action, and created a natural-language interface that allows a speaker to give instructions to a virtual agent in a simulated 3-D environment. Schuler has used this simulation’s world model to resolve ambiguities in users’ instructions [schuler], a process with many analogies to Trueswell’s gaze-tracking studies of human sentence processing discussed in section c.2. PARs may also be recognized from motion-captured input, yielding a transformation from performance to descriptive utterances [rama].

In a related vein, Barry Silverman’s lab focuses on improving the behavioral realism of computer-generated agents and personas, reflecting the state of the art in human behavior modeling, including physiological and stress factors, value ontologies and emotional construal, and stress- and emotion-bounds on rationality in decision-making. The goal is to help those interested in constructing more realistic software agents for use in simulations, in virtual reality environments, in training video games, and so on. The goal is not to create new behavioral models, but to demonstrate how models may effectively be constructed out of results already published in the behavioral literatures.

A sub-goal is to create a common mathematical framework (CMF) and an open agent architecture that allows others to research and explore alternative behavior models for adding realism to software agents. With this approach, a new model can be added and evaluated quickly. CMF is based on a dynamical, game-theoretic approach to evolution and equilibria in Markov chains representing states of the world that the agents can act upon. In these worlds the agents' utilities (payoffs) are derived by a deep model of cognitive appraisal of intention achievement including assessment of emotional activation/decay relative to value hierarchies, and subject to (integrated) stress and related constraints. The framework is based on widely available ontologies of world values and how these and physiological factors might be construed emotively into subjective expected utilities to guide the reactions and deliberations of agents. For example, what makes one set of opponent groups differ from another? This framework serves as an extension of decision processes appropriate for iterative play in game-theoretic settings, with particular emphasis on agent capabilities for redefining drama and for finding meta-games to counter the human player.

The vast majority of psychology, sociology, and other social-science literature describing human behavior and performance does not reach the eyes of workers in the modeling and simulation community (software engineers, operations researchers, game developers, etc.). The recent effort of Silverman's lab has been concerned with extracting Human Behavior Models / Performance Moderator Functions (PMFs) from this literature and illustrating how to implement them within a robust agent framework. Currently the lab is conducting a number of such behavior model implementation and correspondence studies surrounding the topic of crowd behavior at protest and riot scenes.

c.5 Foundations: Production, Perception, and Information Processing

The production and perception of signals are fundamental components of any communicative system. A sender must be able to produce an information-bearing signal, and the receiver’s sensory and perceptual systems must transduce and decode it. In addition, non-trivial information processing is required to make sense of complex messages, generally requiring the ability to compare received signals, at various levels of analysis, to information stored in memory. Because production, perception, and information processing are so intimately intertwined with the process of communication, we describe here several labs at Penn that conduct research concerned with these foundational issues. Schmidt’s lab studies production of birdsong; Richards, Brainard and Backus study human auditory and visual perception; Saul and Lee approach sound and image analysis from a computational point of view; and Sternberg and Jha study cognitive information processing in humans.

Marc Schmidt’s research in motor control is concerned with understanding the organization of motor networks that control complex vocal behaviors. The model system used in his laboratory is vocal production in songbirds. Birdsong is a powerful model system for the study of complex learned behaviors because song output is highly stereotyped and individual elements can be quantified with great precision. In addition, because song is learned and shares many similarities with language acquisition, the study of birdsong is likely to provide insights into the mechanisms underlying vocal acquisition in humans (Doupe and Kuhl, 1999). This lab’s work involves three general experimental approaches. In the first, chronic neural recordings are made of adult singers and the acoustic features of their songs are analyzed quantitatively. Neural activity is recorded bilaterally in a forebrain structure (nucleus HVc) at the top of a hierarchically organized motor pathway. Analysis of spike rate synchrony between multiple simultaneously recorded electrodes in these structures is used to predict song feature encoding in the song motor pathway (Schmidt and Konishi, 1999; Schmidt, 2002). In the second approach, neural recordings and stimulation during singing are used to perturb the song motor pattern and study the nature of the song pattern generating networks that underlie song production. These studies are also aimed at understanding the highly coordinated nature with which hemispheres interact during vocalization (Vu et al., 1998; Schmidt and Konishi, 1999). The third type of research examines the relationship between neural activity patterns in the song motor pathway and underlying changes in acoustic output during song learning. 
Using a novel song tutoring system (Tchernichovski et al., 2001a), a broad collaborative effort with several other laboratories aims to combine recordings at multiple levels of the song motor pathway (both central and peripheral) with moment-to-moment analysis of the acoustic changes that occur during the vocal imitation process (Tchernichovski et al., 2001b).

The primary topic of Virginia Richards' research is across-frequency integration of information in wide-band sounds. Roughly described, the auditory periphery breaks incoming sounds into different frequency regions yet our percepts are of sound sources emitting wideband sounds. Sounds from multiple sources are summed in the air, and separating the result into two or more perceptually distinct “acoustic objects” requires the integration of information across both time and frequency, and often across the two ears. A basic question is how acoustical information present in different frequency channels at the auditory periphery is re-allocated to form auditory objects. Any one frequency channel may have information from many sound sources, while other frequency channels may have information from just one of the sound sources. Research in Richards' lab addresses these issues in two stages: first, by studying how the basic dimensions of sounds are represented within a single frequency channel, and second, by studying the ways in which level and temporal information can be integrated across frequency. Psychophysical results are evaluated against the predictions of probabilistic models applied to the same stimuli.

The main topic of David Brainard’s research is human color constancy: the question of how color appearance provides information about object identity. Color constancy is an excellent model system for a common type of ambiguity that makes perception difficult in general. In the case of color constancy, the retinal image confounds the spectral properties of objects with those of the illumination. Similar confounding occurs for other visual problems and in other sensory modalities. In audition, for example, the spectrum of a sound depends on room acoustics and other transmission-channel properties as well as on the properties of the sound source itself.


Work in Brainard’s lab seeks to characterize how well human vision achieves color constancy under natural viewing conditions, and to understand what perceptual computations lead to the observed (often very good) levels of performance (Brainard and Freeman, 1997; Brainard, 1998; Kraft and Brainard, 1999). Noteworthy features of the research include a push to extend the stimulus conditions away from very simple laboratory stimuli towards images more representative of natural viewing (e.g. Brainard, 1998; Kraft and Brainard, 1999), the use of a Bayesian approach to develop the modeling computations (e.g. Brainard and Freeman, 1997), and the elaboration of experimental methods that allow one to connect predictions derived from the Bayesian analysis quite closely to the experimental data (e.g. Brainard et al., to appear).

The main research focus in Ben Backus’ lab is the integration of visual information in depth perception. How does the visual system make use of multiple sources of information, in trying to interpret a visual scene or event? Because the sensory measurements are noisy (and inherently ambiguous even in the absence of noise), the visual system must continually make decisions about the most likely causes of observed sense data. Sensory measurements always contain error, so any two methods of estimating a scene parameter will, in general, disagree with one another. Therefore the visual system must be in the business of reconciling discrepant information. Using depth perception as a model system, cue-conflict experiments have shown that the visual system often constructs a percept in which a scene parameter is the weighted average of two or more cues (e.g. Backus et al., 1999; Backus & Banks, 2000). Yet this cannot be the whole story: sometimes a single cue is ignored (as when wearing binocular disparity-reversing goggles); sometimes cues are used together to constrain the percept (as with a rotating rigid object seen through a device that enhances disparity, that appears to deform as it rotates). Current work aims to determine the optimal rules for building reliable percepts under such conditions, and whether these are the rules used by the visual system. In addition to work in stereo vision, recent work in the Backus Lab includes general theory on the confusability of visual stimuli (Backus, 2001), and the identification of conditions under which physically different stimuli come to have identical representations in the visual system (Backus, 2002).

Daniel Lee and Lawrence Saul are interested in the general problem of extracting features from the visual and auditory environments. A central focus of this research is the problem of dimensionality reduction: the mapping of complex, distributed stimuli to simple meaningful percepts. Recent work has led to unsupervised learning algorithms for computing low-dimensional, neighborhood-preserving embeddings of high-dimensional input (Roweis & Saul, 2000), and parts-based descriptions of complex objects (Lee & Seung, 1999b). In algorithmic work that complements the experiments in Richards' lab, Saul has also investigated how unreliable features from many narrowband frequency channels can be combined in a probabilistic model to achieve reliable recognition of phonetic features (L. K. Saul, M. G. Rahim, and J. B. Allen (2001). A statistical model for robust integration of narrowband cues in speech. Computer Speech and Language 15: 175-194).

One of the most important approaches to the understanding of complex systems or processes, such as mental processes, is to analyze them into functionally distinct parts or "modules" – to decompose them – and then to discover what each part does, which other parts influence it, and what influence it has on other parts. Saul Sternberg's lab is concerned with the development and analysis of methods for the identification of such modules, and with the application of such methods to the mental processes that underlie the encoding of visual displays of alphanumeric characters, the production of fluent action sequences (as in speech production or typing), and the perception and production of temporal patterns. Such methods include the widely used "additive factor method" (Sternberg, 1969, 1998) for the analysis and interpretation of reaction-time data, and a far more general class of methods for discovering the parts of a mental or neural process, methods unified by the idea of separate modifiability (Sternberg, 2001).


In application of such methods to human processing of displays of multiple familiar items, for example, Sternberg has found in several experiments with T. Pantzer that the process of identifying a displayed character is composed of at least two encoding processes with different properties. The first process (alpha), perhaps contour extraction, is slowed by a superimposed grid, and can occur in parallel for different elements. The second process (beta) is slowed by a geometric transformation, and occurs serially for different elements. Although the beta processes for different items cannot overlap, the beta process for one item can start before the alpha processes for other items have finished.

In application of such methods to the production of temporal patterns, Sternberg and Pantzer (1997) found that the widely accepted account of “beating time” provided by the Wing-Kristofferson (1973) model has to be reconsidered. Evidence emerging from current experiments supports the view that in some circumstances, human timing behavior depends on a biomechanical oscillator as well as a central control mechanism.

Amishi Jha's research focuses on elucidating the components of working memory and their corresponding functional neural architecture. She has used several techniques to understand the brain-basis of working memory maintenance (behavioral signal detection measures, event-related potentials (ERPs), and event-related functional MRI). Working memory maintenance is the process by which information is kept active over short delay intervals. One cognitive model, based on single-unit recordings in nonhuman primates, is that information is actively maintained by the sustained firing of populations of neurons in the brain. Despite this compelling animal model, there remains an active debate regarding the presence of such activity and its role in working memory maintenance in the human brain. Jha's research has suggested that activity that is sustained while humans hold information in working memory may not be related to maintenance operations (Jha and McCarthy 2000). Through several lines of research she has suggested that this activity may reflect response preparation, prevention of distraction (Jha, Miller et al. 2001), or attentive rehearsal mechanisms (Mangun, Jha et al. 2000).


d) Education and Training:

The administrative structure of graduate studies at the University of Pennsylvania is designed to facilitate programs that span departments and schools. There is no centralized formal organizational unit; instead the University supports 62 PhD-granting Graduate Groups spread over its 12 Schools. The Graduate Groups are organized principally along intellectual discipline lines and consist of standing faculty in one or more departments, as well as distinguished Adjunct faculty from outside the University. This organization makes for agile, changeable, cross-disciplinary links, and bridges the more static departmental and administrative structures of the University. Because the binding between individuals in a Graduate Group is intellectual rather than artificial, joint advising and course offerings are simple collaborative enterprises. As an example, for many years the Computer and Information Science Graduate Group and the Linguistics Graduate Group have each offered convenient opportunities for Master's degrees to PhD students in the other Graduate Group.

The IGERT program outlined in this proposal is greatly facilitated by this administrative structure. While a graduate student must be admitted into and graduated from a specific Graduate Group, coursework may readily be taken across any reasonable spectrum of offerings. This lack of administrative and departmental barriers means that the cross-disciplinary education that is the hallmark of the IGERT program is both historically straightforward and academically legitimatized at Penn. The lack of barriers will give IGERT students the flexibility to study and learn the diverse topics needed for interdisciplinary research.

The proposed graduate training program will provide students in any of eight existing Graduate Groups with an educational foundation for integrative research on communicative interaction. This will include: (1) two new common courses on the mathematical foundations of communication, (2) a jointly-managed program of summer research projects, (3) a year of educational and research “cross-training” for each student, and (4) opportunities for interactions with researchers from other institutions. An IGERT student will typically be funded for two years by the IGERT program, with funding for the other three years provided by the relevant school, department or faculty advisor according to the norms of each graduate group. In the first year, we expect to start with seven newly matriculating students and seven students who are already enrolled in one of the affiliated graduate groups, and subsequently to add seven or more newly matriculating students each year, so as to maintain a level of 14 IGERT-funded students.

The various educational aspects of the IGERT program, described in detail below, would be required for IGERT students, but would also be open to other Penn graduate and undergraduate students, where space and funding permit.

The overall IGERT program will be managed by a Steering Committee, chaired by John Trueswell and including as initial members David Brainard, Lila Gleitman, Michael Kearns, Mark Liberman, Martha Palmer, Fernando Pereira, Marc Schmidt, Robert Seyfarth, and Sharon Thompson-Schill.

Mathematical Foundations. The backgrounds and mathematical sophistication of the students entering the proposed program will vary widely. We have planned a two-semester Mathematical Foundations sequence that will provide all IGERT students with basic mathematical modeling and algorithmic tools, while still providing sufficient challenges for the most advanced. These two courses, taught in a proposed computer/media lab setting, will cover relevant aspects of a wide range of mathematical topics that are directly relevant to animal, human or machine communication, or that provide prerequisites for these topics. Examples of topics directly relevant to communication include information theory, game theory, and formal language theory. Examples of important prerequisite topics include signal processing, machine learning, and probabilistic models.

Previous experience of several of the participants in teaching similar material to a diverse audience influences both our optimism and the course design. For instance, Liberman (with Eero Simoncelli, now at NYU) designed and taught a graduate course on Computer Analysis and Modeling of Biological Signals and Systems, which started with an introduction to the required concepts from linear algebra, shift-invariant systems, and frequency-domain representations, and developed their application through MATLAB to such topics as color matching, tonotopic mapping in the auditory system, pitch detection, sampling and multi-cellular representation of signal properties, multi-scale image representations, perceptual distortions, and brain imaging. A related graduate course, taught by Simoncelli, is now a requirement for all incoming graduate students in neuroscience at NYU (see footnote 1). Another source of inspiration and examples is provided by David Brainard’s Psychophysics Toolbox for MATLAB (see footnote 2).

An illuminating result emerged the first time that the Liberman/Simoncelli course was given. Among the participants were a Linguistics postdoc who had not taken any mathematics after high-school algebra, but was able to do a term project for which she wrote MATLAB code to estimate dipole locations from the raw data in an MEG experiment, and also a Computer Science grad student whose term project turned into a patented algorithm for image compression. These two had the weakest and the strongest mathematical backgrounds of the dozen or so course participants, and yet the course helped each of them to take a qualitative step forwards in mathematical sophistication. The point is that such material can be taught in a way that makes it accessible to bright and motivated students whose math background is weak, and simultaneously challenges students with a strong math background to acquire and apply a deeper level of understanding.

These two semesters obviously cannot substitute entirely for the dozen or more semesters that normally would be required to cover a similar range of topics. However, they can give students the ability to understand and implement algorithms from published descriptions, especially given appropriate libraries of basic functions, and to discuss alternative approaches with experts in a well-informed way. It is clearly not the case that every IGERT student will use every mathematical or algorithmic topic from this course in his or her research. However, applications are often unexpected, and fortune favors the prepared. In addition, this background will enable students to make sense of a wide range of courses and readings that might otherwise be inaccessible. Finally, the shared experience of this course will help IGERT students to get to know one another and to establish a personal as well as conceptual basis for future collaborations.

The detailed design and initial teaching of this course will be carried out by a subgroup of the IGERT steering committee, led by Michael Kearns, and including Brainard, Liberman and Pereira. Our plan is for each semester to be co-taught by two of these faculty members.

Because of the diversity of topics and of the students’ backgrounds, we plan to organize this course in a novel format. The two-semester course sequence will be organized into a series of “modules,” each designed to explicate a core mathematical and algorithmic topic. Each module will deal with specific problems of the type that IGERT students need to solve, and will be as self-contained as possible, although of course one module will often require understanding of concepts and techniques taught in another. Each module will be defined by the following key elements:

1 http://www.cns.nyu.edu/~eero/math-tools01/
2 http://www.psychtoolbox.org


A specific communication modeling problem to be solved. For example, the central question of one module might be how to model the time-varying acoustic patterns of human speech for the purpose of automatic speech recognition by computer.

A mathematical model that approximately captures the salient features of the problem. In the current example, hidden Markov models (HMMs) and their variants would be introduced.

The central computational problems which the model is intended to address. Once these problems are identified, algorithms are introduced for their solution. In the example, this would include the Viterbi algorithm for speech decoding with HMMs, and the Baum-Welch algorithm for HMM parameter learning.

In this manner, mathematical abstractions such as HMMs are always introduced in service of a specific communication modeling problem, understandable to the widest possible audience. Formal analyses and algorithms are introduced as needed to understand and solve the modeling problem, and extensive computer exercises give further grounding for the abstract tools.
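As an illustration of the kind of computer exercise such a module might contain, the following is a minimal Python sketch of Viterbi decoding for a toy two-state HMM. The state names ("sil", "speech"), the observation symbols ("low", "high" frame energy), and all probabilities are invented for the example and are not taken from any course material; Python is used here only for convenience, in place of the MATLAB environment mentioned above.

```python
import math

def viterbi(obs, states, start_p, trans_p, emit_p):
    """Most probable hidden-state path for obs, computed in log space.

    Assumes every probability that is looked up is strictly positive.
    """
    # best[t][s]: log-probability of the best path ending in state s at time t
    best = [{s: math.log(start_p[s]) + math.log(emit_p[s][obs[0]]) for s in states}]
    back = [{}]  # back[t][s]: predecessor of s on that best path
    for t in range(1, len(obs)):
        best.append({})
        back.append({})
        for s in states:
            prev = max(states, key=lambda p: best[t - 1][p] + math.log(trans_p[p][s]))
            best[t][s] = (best[t - 1][prev] + math.log(trans_p[prev][s])
                          + math.log(emit_p[s][obs[t]]))
            back[t][s] = prev
    # Trace back from the most probable final state
    last = max(states, key=lambda s: best[-1][s])
    path = [last]
    for t in range(len(obs) - 1, 0, -1):
        path.append(back[t][path[-1]])
    path.reverse()
    return path, best[-1][last]

# Hypothetical toy model: silence vs. speech, observed through frame energy
states = ("sil", "speech")
start_p = {"sil": 0.8, "speech": 0.2}
trans_p = {"sil": {"sil": 0.7, "speech": 0.3},
           "speech": {"sil": 0.2, "speech": 0.8}}
emit_p = {"sil": {"low": 0.9, "high": 0.1},
          "speech": {"low": 0.3, "high": 0.7}}

path, log_prob = viterbi(["low", "high", "high", "low"], states, start_p, trans_p, emit_p)
print(path, math.exp(log_prob))
```

Working in log space avoids numerical underflow on long observation sequences, which matters once the toy symbols are replaced by real acoustic frames.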

We expect that modules will vary in length from one to four weeks, depending on the importance of the topic, giving a total of ten to fourteen modules across the two semesters. To give a better idea of the topics and format we have in mind, we now provide brief descriptions for a number of potential modules. These examples are meant to be illustrative, and not definitive or exhaustive.

Module: Game Theory. Game theory has long been used to model communication in settings of strategic conflict or competition. While originally formulated to examine political and economic conflict, it has become increasingly common as a foundation for the study of animal communication. The development of evolutionary game theory, by Maynard Smith and others, has been especially rapid and successful in this regard [Maynard Smith, J. Evolution and the Theory of Games. Cambridge: Cambridge University Press, 1982]. For these and other reasons, we view the basics of game theory and evolutionary game theory as important modeling and analysis tools for IGERT students.

To anchor the study of such a vast and complex topic, the game theory module will focus its simulation and computer exercises on a relatively simple dyadic interaction, for example a competitive interaction between two male birds competing for a territory, or a cycle of head-bobbing displays in lizards. Students will first be introduced to the concept of matrix games in terms of payoffs to the participants, and to the role of Nash equilibria in determining the outcomes to such conflicts. They will be asked to model the conflict numerically in computer exercises, and to compute equilibria for different payoff matrices. In this concrete fashion, even students with the most limited mathematical backgrounds will be able to grasp the fundamental notion of the Nash equilibrium. On the other hand, more advanced students can easily be provided with challenging generalizations of the basic exercises, including the simulation of repeated play of one-round matrix games.
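A first exercise of this kind could be as small as the following Python sketch, which enumerates the pure-strategy Nash equilibria of a two-player matrix game. The Hawk-Dove payoffs (resource value V = 2, cost of fighting C = 6) are assumptions chosen for illustration, not values from the proposal, and the function name is hypothetical.

```python
def pure_nash_equilibria(payoff_row, payoff_col):
    """Return all cells (i, j) where neither player gains by deviating alone."""
    n_rows, n_cols = len(payoff_row), len(payoff_row[0])
    equilibria = []
    for i in range(n_rows):
        for j in range(n_cols):
            # Row player cannot do better by switching rows against column j
            row_best = all(payoff_row[i][j] >= payoff_row[k][j] for k in range(n_rows))
            # Column player cannot do better by switching columns against row i
            col_best = all(payoff_col[i][j] >= payoff_col[i][k] for k in range(n_cols))
            if row_best and col_best:
                equilibria.append((i, j))
    return equilibria

# Hypothetical Hawk-Dove contest over a territory: strategy 0 = Hawk, 1 = Dove
V, C = 2.0, 6.0
payoff_row = [[(V - C) / 2, V],
              [0.0, V / 2]]
# The game is symmetric, so the column player's matrix is the transpose
payoff_col = [[payoff_row[j][i] for j in range(2)] for i in range(2)]

equilibria = pure_nash_equilibria(payoff_row, payoff_col)
print(equilibria)
```

With these payoffs the only pure equilibria are the two asymmetric outcomes in which one bird escalates and the other retreats. Changing the payoff matrix and rerunning the search is the sort of variation students could explore.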

The same simple biological examples naturally lead to the topic of evolutionary game theory, and its equilibrium notion, the evolutionarily stable strategy (ESS). In the same matrix games, students will be given computer exercises to compute the ESS and contrast it with the classical Nash solution. More advanced students can be asked to simulate common evolutionary dynamics to see convergence to the ESS. All topics in the module can be grounded in specific, two-player matrix games, yet the exercises highlight the power of the Nash and ESS notions and the differences between them.
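The convergence exercise could be sketched as follows, assuming a Hawk-Dove game with illustrative payoffs (V = 2, C = 6) and a simple Euler discretization of the replicator dynamics. The step size, step count, and function name are arbitrary choices for this sketch. For V < C, the mixed ESS of Hawk-Dove plays Hawk with probability V/C.

```python
def replicator_hawk_share(payoff, x0, dt=0.1, steps=2000):
    """Euler-integrate the two-strategy replicator equation.

    x is the population share playing strategy 0 (Hawk); payoff[a][b] is the
    payoff to strategy a when it meets strategy b.
    """
    x = x0
    for _ in range(steps):
        fit_hawk = x * payoff[0][0] + (1 - x) * payoff[0][1]
        fit_dove = x * payoff[1][0] + (1 - x) * payoff[1][1]
        # A strategy grows when its fitness exceeds that of the alternative
        x += dt * x * (1 - x) * (fit_hawk - fit_dove)
    return x

V, C = 2.0, 6.0  # hypothetical resource value and cost of fighting
payoff = [[(V - C) / 2, V],
          [0.0, V / 2]]

ess_hawk_share = V / C
from_mostly_hawks = replicator_hawk_share(payoff, x0=0.9)
from_mostly_doves = replicator_hawk_share(payoff, x0=0.05)
print(ess_hawk_share, from_mostly_hawks, from_mostly_doves)
```

Both starting populations settle at the same Hawk share, which is the contrast with the pure Nash equilibria of the same matrix that the module aims to draw out.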

Module: Formal Language Theory. Formal language theory provides explicit mathematical models for the structure of natural and artificial languages. It is a cornerstone of contemporary theories of the syntax of natural languages, provides basic information relevant to the study of both human and human-machine communication, and offers mathematical tools for modeling a variety of language-related learning problems. Moreover, formal language theory, especially in its statistical variants, offers a set of tools for modeling complex behavior sequences in animals. Finally, the study of automata and formal languages provides an apt context in which to treat the notion of computational constraint – a notion as fundamental to the study of bounded rationality in game theory as it is to modeling limitations on the language learning capacity of resource-bounded agents. Thus, the study of formal languages and automata will form a crucial component in the education of IGERT students.

This module will focus on the most fundamental concepts and techniques of automata and formal language theory, applicable across a wide range of topics in communication. Students will be introduced to the class of regular languages via finite automata and regular grammars, and to finite state transducers, which are extensively used in natural language processing. Exercises will be given to show that regular grammars are insufficient to model the full range of syntactic phenomena present in natural languages, and projects will be assigned to illustrate that finite state transducers and some of their linguistically motivated extensions can be used for efficient processing of natural languages. The class of context free languages will be introduced, and the power and limits of context free grammars as models for natural language syntax, semantics, and discourse structure will be explored. Computational exercises will introduce students to the implementation of algorithms for parsing regular and context free languages. More advanced students will be encouraged to investigate some of the learning problems for formal grammars, and the connections between formal grammars and probabilistic models. These questions offer an excellent point of entry into basic computational learning theory and grammar learning questions. Connections between automata theory and game theory will be investigated, with emphasis on the use of finite automata to model aspects of bounded rationality in the context of repeated play of two person games. The module will emphasize throughout that limits on the information processing capacities of organisms are fundamental to explaining various features of communication.
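A parsing exercise of the kind described might start from a sketch like the one below, which contrasts a finite automaton for the regular language a*b* with a CYK recognizer for the context-free language a^n b^n (n ≥ 1). The grammar, the automaton, and all function names are illustrative assumptions. The nested pattern a^n b^n is a standard example of a language that no finite automaton can recognize.

```python
def dfa_accepts(word, transitions, start, accepting):
    """Run a deterministic finite automaton; a missing transition rejects."""
    state = start
    for ch in word:
        if (state, ch) not in transitions:
            return False
        state = transitions[(state, ch)]
    return state in accepting

def cyk_accepts(word, unary, binary, start="S"):
    """CYK recognition for a grammar in Chomsky normal form."""
    n = len(word)
    if n == 0:
        return False
    # table[i][l - 1]: nonterminals that derive the substring word[i:i + l]
    table = [[set() for _ in range(n)] for _ in range(n)]
    for i, ch in enumerate(word):
        table[i][0] = {lhs for lhs, term in unary if term == ch}
    for length in range(2, n + 1):
        for i in range(n - length + 1):
            for split in range(1, length):
                left = table[i][split - 1]
                right = table[i + split][length - split - 1]
                for lhs, (b, c) in binary:
                    if b in left and c in right:
                        table[i][length - 1].add(lhs)
    return start in table[0][n - 1]

# Regular language a*b*: state 0 reads a's, state 1 reads b's
ab_star = {(0, "a"): 0, (0, "b"): 1, (1, "b"): 1}

# CNF grammar for a^n b^n: S -> A B | A T, T -> S B, A -> a, B -> b
unary = [("A", "a"), ("B", "b")]
binary = [("S", ("A", "B")), ("S", ("A", "T")), ("T", ("S", "B"))]

for w in ["aabb", "aab", "abab"]:
    print(w, dfa_accepts(w, ab_star, 0, {0, 1}), cyk_accepts(w, unary, binary))
```

The automaton accepts "aab" because it can check only the order of symbols, while the context-free recognizer rejects it because the counts do not match.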

Module: Statistical Decision Theory. Statistical thinking is central to a number of the topics that will be covered in the foundations sequence; examples include both information theory and Bayesian networks/hidden variable models. Statistical decision theory provides an excellent vehicle for introducing students to statistical concepts: it is useful in its own right and a number of important results may be introduced rapidly once a few basic ideas are in hand. Specifically, once the basic idea of a normally distributed random variable has been covered, one can move rapidly into the theory of signal detection and apply this theory both to characterize human performance and to develop machine classifiers.

As an extended example for this module, we will consider a very simple communication system: Morse code. A receiver attempting to understand Morse code must classify each transmitted element as either a dot or a dash, and we can model this process as a signal detection problem where the two states of the world are characterized by underlying normal distributions with different means that specify the duration probabilities for dots and for dashes. For our purposes, we will consider the case where the physical separation of the dot and dash durations is imperfect, so that the discrimination problem is non-trivial. This will allow us to introduce likelihood-based decision theory and the analysis of observer performance using ROC curves. Indeed, each student should be able to participate in a simple psychophysical experiment where (s)he tries to classify computer-generated dots and dashes presented in isolation and then compute the index of discrimination (d-prime) from the data collected. In addition, the analysis should also allow students to implement a computer classifier that performs the same task. Once the case of discriminating dots and dashes presented in isolation is finished, the next exercise is to consider recognition of multi-element sequences of dots or dashes, where the effect of preceding context can be used to improve classification performance. Here one need not consider the full-blown problem of decoding transmitted English words but rather can consider sequences drawn from a toy ‘language’ where ‘words’ have fixed length and are drawn at random from a finite and specified ‘dictionary.’ Even in this toy context the exercise of seeing how using the base rates defined by the dictionary can improve performance, and how this improvement depends on the separation of the dot and dash duration distributions, is quite instructive.
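
As a minimal sketch of what the isolated-element exercise might look like, the following Python fragment simulates dot and dash durations, classifies them by comparing prior-weighted likelihoods, and estimates d-prime from the resulting hit and false-alarm rates. The distribution parameters (means of 100 and 180 ms, a common standard deviation of 40 ms) and the names classify, d_prime and simulate are illustrative assumptions, not part of the planned course materials.

```python
import random
from statistics import NormalDist

# Hypothetical duration distributions in milliseconds (illustrative values only).
DOT = NormalDist(mu=100.0, sigma=40.0)
DASH = NormalDist(mu=180.0, sigma=40.0)


def classify(duration, p_dash=0.5):
    """Label a duration 'dash' if its prior-weighted likelihood exceeds that of 'dot'."""
    weighted_dash = DASH.pdf(duration) * p_dash
    weighted_dot = DOT.pdf(duration) * (1.0 - p_dash)
    return "dash" if weighted_dash > weighted_dot else "dot"


def d_prime(hit_rate, false_alarm_rate):
    """Index of discrimination: z(hit rate) minus z(false-alarm rate)."""
    z = NormalDist().inv_cdf
    return z(hit_rate) - z(false_alarm_rate)


def simulate(n_trials=20000, p_dash=0.5, seed=0):
    """Run the classifier on simulated elements; return (hit rate, false-alarm rate)."""
    rng = random.Random(seed)
    hits = false_alarms = n_dash = n_dot = 0
    for _ in range(n_trials):
        is_dash = rng.random() < p_dash
        source = DASH if is_dash else DOT
        duration = rng.gauss(source.mean, source.stdev)
        said_dash = classify(duration, p_dash) == "dash"
        if is_dash:
            n_dash += 1
            hits += said_dash
        else:
            n_dot += 1
            false_alarms += said_dash
    return hits / n_dash, false_alarms / n_dot


hit_rate, fa_rate = simulate()
estimated = d_prime(hit_rate, fa_rate)
# With equal variances, the true d-prime is the mean separation in sd units.
true_value = (DASH.mean - DOT.mean) / DOT.stdev
print(f"estimated d' = {estimated:.2f}, true d' = {true_value:.2f}")
```

Passing a smaller p_dash to classify shifts the decision criterion toward longer durations, which is one simple way to see how base rates of the kind defined by a toy dictionary enter the decision rule.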

A nice feature of the Morse code example is that students with more advanced backgrounds can push beyond the toy example and confront either the problem of extracting element lengths from a sound signal in the presence of transmission noise or of extending the analysis towards real transmission of English words and sentences.

Module: Latent Variable Models. Many issues in communication can be fruitfully analyzed through the introduction of appropriate hidden or latent variables. Examples include the categorization of text documents, in which the author’s intended topic may be treated as a hidden variable that generates the observed words; human speech, in which the true word sequence can be viewed as a hidden variable generating the observed acoustic signal; and neural modeling, in which underlying brain states or activity can be viewed as generating the observed behavior of neuronal populations. For these and many other examples, there are rich and recently developed probabilistic models of complex joint distributions of observed and unobserved variables (such as Bayesian networks), along with powerful algorithms for conditional inference problems.

In this module, students would be introduced to such models and algorithms through the application of Hidden Markov Models (HMMs) to speech recognition. Computer exercises would involve the creation of simple HMMs for a small number of isolated words on the basis of acoustic data, and the application of maximum likelihood methods to the resulting HMMs for isolated word recognition. The exercise might begin with a trivial one-state model for distinguishing “yes” from “no”, and continue with the design and application of successively less trivial multi-state models to this same problem, culminating with a slightly harder small-vocabulary problem such as isolated digits. More advanced students can take on challenges such as continuous speech recognition with the integration of a language model, or the use of EM to estimate Gaussian mixture models for a sample of short-time amplitude spectra. These specific projects will ground a more general discussion of Bayesian networks, conditional inference, phenomena such as “explaining away” in probabilistic reasoning, and advanced algorithms such as belief propagation.
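
To make the maximum likelihood step concrete, here is a small Python sketch of isolated word recognition with discrete-observation HMMs: each word has its own model, the forward algorithm scores an observation sequence under each model, and the highest-scoring word is chosen. Everything specific here is an assumption for illustration: the three-symbol acoustic alphabet, the hand-set (not trained) two-state parameters in WORD_MODELS, and the function names. A real exercise would estimate the parameters from acoustic data.

```python
import math


def forward_log_likelihood(obs, start, trans, emit):
    """Log P(obs | model) for a discrete HMM, using the scaled forward algorithm."""
    n_states = len(start)
    alpha = [start[s] * emit[s][obs[0]] for s in range(n_states)]
    log_lik = 0.0
    for t in range(len(obs)):
        if t > 0:
            # Propagate the forward probabilities one step, then weight by the emission.
            alpha = [
                sum(alpha[r] * trans[r][s] for r in range(n_states)) * emit[s][obs[t]]
                for s in range(n_states)
            ]
        scale = sum(alpha)
        if scale == 0.0:
            return float("-inf")
        # Rescale to avoid underflow; the log of the scales sums to the log-likelihood.
        log_lik += math.log(scale)
        alpha = [a / scale for a in alpha]
    return log_lik


# Hypothetical quantized acoustic symbols: 0 = fricative-like, 1 = vowel-like, 2 = nasal-like.
# Hand-set, left-to-right two-state models (illustrative values, not trained).
WORD_MODELS = {
    "yes": {
        "start": [1.0, 0.0],
        "trans": [[0.7, 0.3], [0.0, 1.0]],
        "emit": [[0.1, 0.8, 0.1], [0.8, 0.1, 0.1]],  # vowel-like, then fricative-like
    },
    "no": {
        "start": [1.0, 0.0],
        "trans": [[0.7, 0.3], [0.0, 1.0]],
        "emit": [[0.1, 0.1, 0.8], [0.1, 0.8, 0.1]],  # nasal-like, then vowel-like
    },
}


def recognize(obs):
    """Return the word whose HMM assigns the observation sequence the highest likelihood."""
    return max(WORD_MODELS, key=lambda w: forward_log_likelihood(obs, **WORD_MODELS[w]))


print(recognize([1, 1, 1, 0, 0]))
print(recognize([2, 2, 1, 1, 1]))
```

The same scoring function works unchanged for models with more states, so students could extend the dictionary of models toward a small-vocabulary task without rewriting the recognizer.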

Module: Information Theory. Classical information theory is perhaps the most general and influential theory of abstract communication ever formulated, and as such must be an important component of the foundations course. In this module, we will ground the study of mathematical concepts such as Shannon information, noisy channels, and entropy in their application to neural spike train data. Students will be given single- and multi-cell spike train data sets, and compute simple but important quantities, such as the raw information content in a spike train, and the mutual information between different cells or sets of cells. Examination of basic coding theory issues can directly lead to discussion of possible neural coding mechanisms, including pulse and rate coding. The computer exercises can also highlight the predictive uses of information theory, such as the calculation of the information about an external stimulus contained in a collection of correlated neurons.
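
As a rough illustration of the quantities involved, the following Python sketch computes plug-in (empirical frequency) estimates of the entropy of a binned spike train and of the mutual information between two cells. The tiny binary sequences and the function names are made up for the example; real spike train data would need far more samples, and plug-in estimates of this kind are known to be biased when samples are scarce.

```python
import math
from collections import Counter


def entropy(symbols):
    """Plug-in estimate of Shannon entropy, in bits, from a sequence of symbols."""
    counts = Counter(symbols)
    n = len(symbols)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())


def mutual_information(x, y):
    """I(X;Y) = H(X) + H(Y) - H(X,Y), estimated from paired samples."""
    return entropy(x) + entropy(y) - entropy(list(zip(x, y)))


# Hypothetical binned spike trains: 1 = at least one spike in the time bin, 0 = none.
cell_a = [0, 1, 0, 1, 1, 0, 0, 1]
cell_b = list(cell_a)               # a perfectly correlated cell
cell_c = [0, 0, 1, 1, 0, 0, 1, 1]   # empirically independent of cell_a

print(entropy(cell_a))
print(mutual_information(cell_a, cell_b))
print(mutual_information(cell_a, cell_c))
```

Replacing one of the two sequences with a discretized stimulus would give the analogous estimate of the information that a cell carries about an external stimulus.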

Other Modules. The above modules cover topics we consider directly related to the study of communication of one form or another. As examples of modules covering more basic material likely to arise in many communication studies, we give the following:

Topic | Problem | Models and algorithms | Exercises
Matrix algebra | Color matching in vision | Matrix representation of linear systems; matrix multiplication, systems of linear equations, eigenvectors | Implementation of SVD and PCA
Linear classification | Discriminate vowel sounds | Linear discriminant analysis; logistic regression; perceptron algorithm; optimal separator | Compare vowel classification methods
Shift-invariant linear systems and frequency-domain representations | Signal coding | Shift-invariance; frequency domain reasoning; convolution theorem; sampling theorem; filtering; 2-D Fourier transforms; FFT | Audio compression via subband coding
Logical concepts and techniques | Deductive reasoning | Symbolic logics, automated deduction, models of knowledge and belief | Propositional theorem proving and games of imperfect information

Summer Research Program. Funds from the grant will be used to support a summer research program in which students will learn the perspectives and methods of other disciplines by practical experience, somewhat as in the case of medical school rotations. Students will experience research based on observations of natural behavior (as in animal field research, corpus linguistics and clinical observation), as well as research in a laboratory setting (using both behavioral and physiological measures), and research based on algorithmic models of interacting agents. Students will also be exposed to research on basic issues in production, perception and cognitive analysis of complex signals.

These experiences will be brief laboratory or field apprenticeships, in which students will normally help carry out an existing research project, rather than designing a new project of their own. In some cases, this experience will develop into a longer apprenticeship during a cross-training year. The detailed structure of the program, including which students will visit which labs during which periods of time, will vary from summer to summer, depending on the mix of students involved, the set of faculty available, and our experience with earlier summer programs. Each year, the IGERT steering committee will use these criteria to plan a set of rotations for the following summer.

A summer rotation is only possible in labs with an ongoing experimental or field observation program during the summer, or with some other sort of summertime work (such as data annotation and modeling) in which new arrivals can easily be included. Participation will also require a commitment on the part of the lab’s regular denizens to provide some special instruction and guidance to the visitors. The IGERT steering committee will plan each summer’s rotations with these considerations in mind, and detailed planning for each summer’s program will be carried out by a subgroup of the IGERT steering committee, led by John Trueswell.

The students involved in this summer program will meet regularly as a group, to get special training (such as in research ethics and IRB procedures), to hear research presentations, and to make presentations on the projects they are involved in.

Below we give some examples of how students might benefit from rotations into (and out of) various IGERT-associated research groups. We have not tried to provide a complex matrix of potential interconnections, but rather to give a few illustrative examples.

Rotation into and out of perception labs. IGERT fellows rotating into these labs will learn, in detail, that the physical signals impinging on our sensory systems do not unambiguously indicate the state of the world. They will also learn about techniques for resolving these ambiguities, and about ways to explore the techniques used in particular biological systems. A rotation in Richards' lab would give students practical experience in acoustics, digital signal processing, and the analysis of behavioral data derived from acoustic stimulation. A rotation into Brainard’s lab would similarly provide experience in the visual domain.

In such research, practical knowledge about experimental apparatus is often crucial. As an example, consider a student whose primary research is concerned with how the coloration of male stickleback fish affects female mate choice. Although this question can be studied in the field, parametric variation of male coloration is difficult using live male fish as experimental stimuli. An alternative would be to display movies of male fish to females in the laboratory, and to vary the male coloration under software control. A student who had worked in Brainard’s lab would know that standard video displays do not faithfully reproduce the spectrum of rendered objects. Rather, these displays emit light that has the same effect on the human cone photoreceptors as the rendered object. Understanding the theory used in standard rendering [Brainard, 1995], this student would know that it does not render properly for stickleback fish, whose photoreceptor spectral sensitivities differ from those of humans. The student would be able to use what she had learned about color stimulus presentation to design custom displays and rendering procedures to present stimuli appropriate for the fish. An analogous set of considerations applies to research involving parametrically varied acoustic stimuli, where the experience of working in Richards’ lab would give students access to the knowledge that they need in order to choose signal synthesis and transduction techniques that will not introduce artifacts.

A student whose primary research is on color constancy could also benefit from knowledge acquired on one of these rotations. A central unsolved problem in the computational analysis of color constancy is how to handle spatial non-uniformities in illumination. One approach is to find a way to segment the image into separate regions of approximately uniform illumination and then analyze each such region separately. Analogous problems arise in many engineering applications in the area of image processing, where (for example) research in Norm Badler’s group has focused on extracting information about the motion of a human figure from a sequence of video frames. A rotation in Badler’s lab would provide information about a range of image segmentation techniques that might be imported to the study of color constancy in complex scenes, and could help design a project and some course work for an IGERT cross-training year.

IGERT fellows rotating in Lee and Saul's labs would gain familiarity with mathematical methods in machine learning, information processing, and feature extraction, which are broadly useful across many areas of science and engineering. Students originating in Lee and Saul's labs would benefit from rotations to labs investigating the abilities of human subjects in psychoacoustic and visual experiments. Such studies often provide significant insights into the design of robust algorithms for signal processing (Saul, Rahim, & Allen, 2001; Saul & Allen, 2001).

Rotation into and out of animal behavior laboratories. A central goal of IGERT training in the animal behavior laboratories will be to provide exposure to ethology and neurobiology. Even within the group of students who come to Penn to work on animal communication, there is considerable scope for benefit from breadth provided by the IGERT program. Those with a primary interest in neural mechanisms can expand their training by learning about observational studies of behavior and naturalistic techniques of playback experiments, either with fish in Crawford’s aquaria, birds in nearby parks or controlled laboratory settings with members of Schmidt’s laboratory, or nonhuman primates at any one of a number of field sites with members of Seyfarth’s and Cheney’s laboratory. Conversely, those whose primary research is behavioral will also learn about neuroethology, gaining laboratory experience with psychophysical studies of behavior, neuroanatomy, and various electrophysiological recording techniques, including methods for recording single neurons in behaving animals. For these same students, a rotation into Trueswell’s psycholinguistics gaze-tracking laboratory might offer IGERT students some new ideas about how to study cognitive processing in animals. A rotation at the LDC would offer training in the construction, maintenance and use of very large annotated corpora of biologically-significant signals.

For students whose primary interests lie in Linguistics, Computer Science, and Neuroscience, Penn’s IGERT training program will offer a unique opportunity to gain formal experience studying the communication and behavior of animals. For example, students interested in modeling human-human or human-computer dialogues will acquire a solid empirical basis for evaluating continuities and contrasts with the social interactions of other primates. Some students may benefit from learning to look at human interactions as an ethologist would, while others may find intriguing results by applying techniques from their home discipline to new kinds of data, such as modeling bird song sequences using recent methods for training stochastic grammars. Field and laboratory research on animal communication is rarely available to students in Linguistics, Computer Science, or Neurobiology. By contrast, in our IGERT training plan such opportunities will not only be available but will be administratively integrated into the overall structure of graduate education.

Whatever the pairing of an IGERT student’s home discipline and a summer lab rotation may be, the goal is to impart a basic understanding of the issues, methods and results in a new area. In many cases, this will result in the development of a plan for course work and a research project in the IGERT cross-training year program. In other cases, a student’s cross-training plan will develop independent of the summer experience, which will nevertheless provide useful breadth of knowledge and experience.

Cross-Training Year. The cross-training year will allow students to take courses and to do a significant research project in an area outside of their core discipline. The research project will ideally be an aspect of the student's dissertation research, viewed from the perspective of another discipline. For example, a psycholinguistics student interested in the role of facial expressions in conversation might spend a year taking courses in computer graphics, and working on improved control of facial animation for creation of stimulus materials. A neuroscience student interested in clinical aprosodias might spend a year studying the form and function of normal prosody in the Phonetics Laboratory, and working on reliable quantitative measures of hypo- and hyper-prosodicity.

This aspect of the training will be planned on a case-by-case basis, taking into account the student’s research interests and the opportunities available at Penn. Planning and arrangements for the cross-training year will be the responsibility of a subgroup of the IGERT steering committee, led by David Brainard.

Program Benefits. To illustrate the benefits of the proposed cross-training program, we sketch below four actual or hypothetical interdisciplinary research projects, and show in each case how the student's participation in the program would affect the outcome. These examples are based loosely on recent research projects here at Penn, where partial solutions were found by somewhat more ad-hoc cross-disciplinary faculty cooperation, or by individual student cross-training.

1. An animal communication student has developed an annotated corpus of baboon vocalizations, and is particularly interested in one of the sounds produced by adult males, a loud two-phased bark known as a "wahoo." This call appears (along with other types of calls) during predator encounters, during inter-group encounters, and during aggressive interactions with other males. Its acoustic character also varies, in terms of the relative length and amplitude of the two parts, the relative sharpness of the onset ("wahoo" vs. "bwahoo"), the F0 contour, and in other ways. The student's questions include whether this is really a single type of call, and what relationship its acoustic variation has to its different contexts of use. To address these questions, she needs to design and implement some signal processing functions of a kind that are familiar to phoneticians and speech engineers. Having taken our mathematical foundations course, and done a research rotation in a phonetics lab, she is easily able to annotate her audio files at key time points, to try out various signal analysis functions, and to go over her results with an expert in acoustic pattern recognition.

2. A computer science student working on virtual human behavior within graphical user interfaces needs to simulate the natural eye movement patterns of an avatar during conversation. In particular, she needs to uncover the information content of eye movements in conversation, such as for expressing emotion and possibly for reference to objects in the world. She would like to design studies to collect such data, but as a computer science student, she would normally have no experience in experimental design, nor in meeting Institutional Review Board (IRB) guidelines for research with human subjects. However, as a result of the IGERT program she has taken relevant graduate coursework in psychology, including a recently developed graduate-level course in research methods. In addition, she spent a research rotation in the psycholinguistic eye movement lab at the Institute for Research in Cognitive Science (IRCS), where she helped run an experimental study of eye movements in face-to-face conversation. Therefore she is easily able to design and collect a conversational eye-movement database, from which she derives improved avatar eye-movement rules. It turns out that her work also leads to benefits for the psycholinguists, who start using virtual humans for controllable stimulus displays in research on the dynamic effects of eye movements in conversation.

3. A psychology student is interested in examining how the distributional properties of lexical items in child-directed speech could inform verb learning and/or modulate the development of language parsing procedures. However, he notes that the linguistic input to the child is riddled with disfluencies (ums, ahs, restarts, and frequent repairs). He wonders whether these disfluencies act as disruptive noise, or could instead be informative to a device that is learning to analyze the input structurally. As an IGERT graduate student, he has learned enough in the mathematical foundations course to investigate this question by comparing the behavior of statistical grammar-learning models provided with linguistic input with the disfluencies kept in or removed. In fact, he has already been exposed to some research from the speech recognition field that bears on these questions, and needs only to adapt the methods and apply them to published child language corpora. The speech engineers were interested in perplexity measures of language models for automatic speech recognition algorithms, but this student is able to frame, compare and test different hypotheses about human speech processing and its development.

The hypothetical examples just described are fairly close to actual experience, so that we can provide plausible and fairly detailed accounts of the cross-disciplinary interactions. Some of the most intriguing opportunities afforded by the proposed IGERT program will be in areas where we are convinced that fruitful connections are possible, but where we cannot present equally detailed accounts of the effects on research, because the connections have not yet been made, or have just begun to be made. For example, it will probably be interesting to make a rigorous and empirically well grounded comparison of the role of gesture, facial expressions and gaze in human conversation with the role of the same elements in the social interaction of other primates. The comparison is made more complex by the fact that the human system is culturally variable, and perhaps some aspects of the patterns in other primates may also be idiosyncratic or variable across groups. However, a Penn CIS grad student, advised by a Linguistics faculty member in working on a coding scheme for human gesticulation, has recently been trying to code videos of chimpanzee interaction. Through the proposed IGERT program, a Linguistics or CIS student studying human gesture could work with Seyfarth and Cheney to design a comparative gesture project, and spend a summer or a significant portion of a cross-training year at an external facility, like the Caribbean Primate Research Center’s macaque field site on Cayo Santiago or any one of the research sites used by Seyfarth and Cheney’s current students or post-docs in Africa, Central America, or South America. This is a unique opportunity for a kind of interaction that seems long overdue, and could lead to a significant new research effort.

We could tell a similar story about the potential for cross-fertilization between studies of the neural and genetic substrates of vocal learning, production and perception in humans and songbirds. Tantalizing analogies have been noted for decades. Increasing the bandwidth of interaction by means of cross-training projects involving various combinations of IGERT-affiliated biologists, phoneticians, psychologists and engineers will help to sort out which of these analogies might lead to new insights, whether due to deep similarities or deep contrasts between the two systems. Grad student exchanges are probably the best way to explore connections of this type, and the proposed IGERT program will offer the opportunity both for relatively low-impact exchanges, using the mechanism of the summer rotations, and for more consequential exchanges, using the mechanism of the cross-training year.

External Training Opportunities. Although the group of IGERT-associated researchers at Penn will be large and diverse, there remain some areas that are not well covered. For example, there are now no researchers at Penn working on communication in social insects. Therefore, we plan to devote some IGERT funding to promote opportunities for outside contact. First, IGERT students will run a “journal club,” in which papers will be distributed to students and discussed with visiting authors, either invited specifically for the purpose, or shared with other local speaker series. Second, distinguished researchers from other institutions will occasionally be invited to teach intensive one to two week seminars on relevant topics, either during the school year or as part of a summer program. This would offer perspectives different from those of Penn faculty, and would foster collaborations with researchers in other institutions. Third, we will encourage graduate students affiliated with the program to help organize interdisciplinary workshops on topics that they propose. This will expose students to a wider range of perspectives, and will also involve them in actively shaping the research agenda in developing areas. These workshops would take place at IRCS, with local arrangements managed by the IRCS staff. Finally, students in this program will form an annual regional graduate student conference, with participation by students from other universities as well as Penn.

In addition to bringing a wider range of ideas to IGERT students, these activities will also help these students to make their own work known to the outside world.

We plan to have IGERT students themselves play the leading role in planning and organizing these activities, under the guidance of the IGERT steering committee.

Diversity Promotion. Because the proposed program will involve students from up to eight different graduate groups, the first mode of diversity promotion is provided by the existing diversity efforts in these groups. In addition, we will recruit specifically for the IGERT program through outreach methods such as the undergraduate summer workshop run annually by the Institute for Research in Cognitive Science and the Center for Cognitive Neuroscience. Finally, the proposed program has an in-built propensity to promote diversity, because the different disciplines involved have very different traditional distributions of sex and ethnicity, and IGERT students and faculty will be able to offer one another cultural and personal support across disciplines.

Research Ethics Training. The question of research ethics is especially urgent in interdisciplinary work, because students in disciplines such as computer science may get little or no training in the ethical treatment of humans and animals in research. Yet as a result of the cross-disciplinary pressures that we have noted, these students may already launch themselves into observational and even experimental research with little understanding or awareness of IRB guidelines for such research. Students also need to learn where the ethical boundaries are in reporting experimental results. When a program doesn't work, one just fixes the bug without mentioning it. When an experiment doesn't work, it may not be appropriate simply to “fix” it, without also reporting the failure.

We will put several training mechanisms in place to address these issues. First, the summer research programs will include formal workshops on IRB guidelines for human and animal research, and on more general ethical issues. In addition, students involved in cross-training will be exposed to the ethical guidelines of the related discipline, simply by the close interactions established by working within an existing research group. Supervising faculty will take care to stress this aspect of the training.

e) Recruitment and Retention(2-page limit): Describe plans for recruitment, mentoring, and retention of trainees, including specific provisions for members of groups underrepresented in science and engineering. Identify the Ph.D. program(s) in which the IGERT graduate students may enroll. Discuss the phasing of new students into the IGERT program.

f) Organization and Management (2-page limit): Describe plans and procedures for the organization and management of the proposed activity. The plan should be specific and clear, and include use of a formal mechanism that assures the fair and effective allocation of group resources. Procedures for selecting students and others who will receive stipends or otherwise share in group funds must be described, as should methods for allocating use of shared equipment to be acquired with IGERT funds. Relationships to other faculty and equipment at the institution, and elsewhere if relevant, should be described as should the relationship to any existing grants that provide funds for related training and educational activities.

g) Performance Assessment (1-page limit): Describe a performance plan and methodology that relates the goals of your IGERT project to indicators and specific measurements for assessing progress toward goal achievement. This assessment should involve evaluators who are external to the project, who can render an objective evaluation and whose expertise spans the education and research objectives of the IGERT project. NSF will evaluate awarded IGERT projects, on an annual basis, through a Web-based questionnaire that standardizes the evaluation across all sites (see Section VII C).

The PI together with the co-PIs will be responsible for all aspects of the program, and will make decisions about the resource allocations, overall policy decisions, overall monitoring of the budget, and setting up of special committees as needed for the development of the educational programs. However, we expect and will encourage active involvement by other faculty participants, and will set up an advisory committee including at least one member from each associated subdiscipline, to set policy for the overall program and to review student cases.

Infrastructure and administrative resources will be provided by IRCS (Institute for Research in Cognitive Science), which has been funded since 1992 by an NSF STC grant. IRCS was founded by Penn in 1990, and will continue to exist after the expiration of the NSF STC grant in 2002. Liberman, who is a co-PI, is the Director of IRCS, whose staff consists of an Administrative Director and several administrative and technical support staff. The Administrative Director will be responsible for the detailed monitoring of the budget, assisting in evaluations, and in the preparation of relevant reports. The staff members will provide administrative support for the interdisciplinary seminars and the management of the IGERT educational programs described in this proposal.

IRCS has an external advisory board which meets roughly once a year. We will ask a subset of this committee to look specifically at the IGERT program. In addition, review of IGERT will form part of the regular internal review of IRCS, during which the opinions of participating departments and participating students are actively solicited.

h) Recruitment and Retention History (1 page per participating department/program): Explain your capacity to host an IGERT site, and your past performance and ability to attract well-qualified students in science and engineering, including those from underrepresented groups. Provide the following information regarding recruitment and retention of students in the participating departments/programs: (1) total number of applicants, (2) total number of applicants accepted, (3) total number of applicants who enrolled, (4) total number of students currently enrolled in the program indicating part-time and full-time status, (5) total number of Ph.D.s awarded, (6) average time to degree, and (7) other relevant measures of student success. Additionally, provide separate data for women, underrepresented minorities, and persons with disabilities for each of the above categories. A tabular format should be used with separate tables for each participating department/program.

The University of Pennsylvania offers an ideal setting for IGERT training in the study of communication and communicative interaction. IGERT students will be enrolled in one of nine participating graduate groups (for a description of the graduate group structure at Penn, see above, Section d, “Education and Training”). These are: Anthropology, Biology, Computer and Information Science (CIS), Electrical Engineering, Linguistics, Neurology, Philosophy, Psychology, and Systems Engineering. As the accompanying data indicate, each of these graduate groups has a successful, highly selective program that selects its students from a pool of outstanding applicants.

i) Admission and enrollment data tables (insert as pdf, hold two pages)

Recruitment. All of the participating graduate groups have vigorous recruitment campaigns designed to bring promising students to Penn. These include stylish, attractive, and informative brochures and websites. Each website provides prospective students with information about the graduate group in general and the research of individual faculty members in particular. Websites also provide prospective students with easy access to individual faculty via e-mail. Faculty members make a particular effort to update their websites with information about recent publications, or recently established collaborations with other faculty at Penn. The administrative material is also updated regularly.

As part of their recruitment efforts, many graduate groups pay for selected candidates to visit the University during one or two 3-day weekends in February and/or March. These visits offer prospective students an opportunity to meet one-on-one with relevant faculty, and to meet and socialize with current graduate students. Prospective applicants also meet with the Chair of the graduate group, hear research presentations from faculty, and are taken by current students on a tour of Philadelphia. These visits have proven to be effective tools for recruiting.

In a further attempt to recruit promising students in cognitive science, IRCS has for the past four years offered a 2½-week Summer Workshop in Cognitive Science and Cognitive Neuroscience for undergraduates. The workshop has now become widely known and is highly competitive, last year attracting 121 applicants from 9 countries. Thirty-two students were accepted, from institutions as close as Haverford College and as far away as Russia’s Moscow State University. The academic majors of participants included Cognitive Science, Linguistics, Molecular & Cellular Biology, Mathematics, Philosophy, and Neuroscience. Although the workshop’s stated focus is on cognitive science and cognitive neuroscience, in fact the list of topics and faculty participants overlaps almost perfectly with those contained in the present IGERT proposal. And while the summer workshop has a purely educational purpose, it also serves as a major recruiting tool. Thus the IRCS summer workshop will be an important part of our effort to recruit IGERT students.

Recruitment of women and under-represented minorities. The graduate groups listed in this proposal pursue active policies for recruiting and retaining students drawn from under-represented minorities. To cite just one example, CIS has been successful in obtaining national Patricia Harris Fellowships and Graduate Assistance in Areas of National Need (GAANN) fellowships for both women and minorities. The CIS graduate admissions office now has considerable expertise in identifying minority candidates, including GEM scholars, NPSC fellows, and those eligible for GAANN fellowships (the Harris program has been discontinued). Currently four CIS graduate students are GAANN fellows.

The University itself offers a number of Fontaine Fellowships to minority candidates each year. In addition, the University has long recognized that for many minority, financially disadvantaged, and educationally unprepared students, academic and university life present obstacles. Responding to this need, the University supports several organizations on campus that are specifically designed to highlight the intellectual contributions of minority students and to provide minority students with social and academic support during their years at Penn.

For example, the Penn Women’s Center, founded in 1973, has as its mission to “ensure that the University … is responsive to women’s needs on all levels and in all activities.” Two divisions of the Women’s Center are particularly relevant to the present IGERT proposal. The Graduate Women’s Group meets regularly with a professional convener to help women scholars develop self-confidence and strategies for coping with the multiple responsibilities of graduate school, family, and work. The Graduate and Professional Women’s Organization also provides support and social contacts for all graduate women at Penn.

In addition, women at Penn have a special voice in the Trustees’ Council of Penn Women, a group of alumnae whose goal is the advancement of women at Penn. This group, which has become a model for similar institutions around the country, has raised both awareness and funds to promote women’s interests at Penn. It has established a term professorship, scholarships and service awards, and a career mentoring program. The Women’s Alliance is a collective of diverse women who are concerned with women’s issues both at Penn and beyond. It serves as a support group for women in the community and is closely linked to similar organizations in the Philadelphia area.

The Black Graduate and Professional Students Association (B-GAPSA) provides opportunities for members to meet with their counterparts in all 12 graduate and professional schools and give voice to mutual concerns. B-GAPSA serves as the campus-wide student organization and political voice for minority graduate students. Other, related organizations include the African American Resource Center, Society of Black Graduate Students in Engineering, Society of Hispanic Professional Engineers, Society of Women Engineers, and the United Minorities Council.

The W.E.B. Du Bois College Residential Program was designed to promote academic, cultural, and social interaction between students interested in African-American culture at the University. Established in 1974, the program grew out of a need to establish African-American Studies as an integral part of the University’s academic structure. As the nucleus for most organized minority student/faculty activities, the program offers a series of cultural, academic, and social programs, plus other resources and facilities, dealing with African-American culture. These resources are used heavily by the 90 students who live in the residence, along with many other interested students, faculty, and community residents.

Other programs in which the University participates, and which have the explicit goal of recruiting promising minority students to our PhD programs, include the National Name Exchange, a consortium of 28 doctoral degree-granting institutions; the National Physical Science Consortium, which offers doctoral graduate fellowships in the physical sciences with special emphasis on underrepresented minority and female students; and the Leadership Alliance, a consortium of Ivy League and historically black colleges and universities that aims to increase the number of underrepresented minority students who choose careers in research.

Efforts by the University of Pennsylvania to recruit graduate students who identify themselves as African-American, Native American, or Latino have led to steady improvement in the diversity of graduate students on campus.

Year   Total (1)   Number (2)   Percentage
1990   4465        135          3.02%
1991   4534        163          3.60%
1992   4441        173          3.90%
1993   4363        186          4.26%
1994   4326        199          4.60%
1995   3961        183          4.62%
1996   3738        178          5.06%
1997   3623        182          5.02%
1998   3517        178          5.06%
1999   3521        167          4.74%

1. Includes both US and international students.
2. Includes only US citizens who identify themselves as African-American, Native American, or Latino.

Retention. All of the graduate groups involved in this proposal have taken explicit steps designed to maximize the retention of PhD students admitted to their programs. Typically, each graduate group has a chair who is a full professor with extensive experience in dealing with the financial, social, and academic issues confronting graduate students. The chair serves as the primary advisor for many first-year students and for those who have not yet selected their primary academic mentor. All graduate groups insist that a student form an advisory committee in his or her first year; this committee includes at least three full-time faculty members and thereafter becomes the student’s primary source of direction. Committees are required to meet at least once a semester, and to report to the chair of the graduate group after these meetings occur. These meetings provide regular updates on a student’s progress and offer an early warning to the committee and the chair if a student is falling behind.

In our view, these administrative procedures have proved successful in past years and provide excellent guidelines for the administration of students in the IGERT program. Retention rates are notoriously difficult to evaluate, since they can be affected by so many personal factors other than the administration of a graduate program. Nonetheless, the enclosed data indicate that all participating graduate groups have awarded a substantial number of PhDs during the period from 1997 to 2001, with an overall average time to degree of 6-7 years.

Other relevant measures of student success. The graduate groups that would be involved in IGERT training all have a strong record of graduate placement. This offers further evidence that the advising and mentoring structure already in place at Penn will serve as a good model for advising and mentoring IGERT students. To cite two examples, as part of its recent external review, the Psychology graduate group compiled data on all of its recent PhDs. It found that virtually all clinical students were accepted into an internship, non-clinical students typically obtained post-doctoral fellowships, and roughly one student each year accepted a university faculty position. Similarly, in recent years CIS has placed 20 graduates in universities, usually as post-doctoral fellows, and 18 graduates in industry (for example, 5 with AT&T, 4 with IBM, and 1 each with Boeing, SRI, and Siemens).

j) Recent Training Experience (1-page limit): Provide information about any recent experience with other traineeship programs, including a discussion of outcomes. If the IGERT program builds on a recent traineeship experience, discuss what would be the new value-added aspects of the IGERT project.

k) Collaborators (2-page limit): In order to identify potential conflicts of interest in the review process, provide a consolidated alphabetical list of current and past collaborators during the last four years, and their current institutional affiliation, for all participants listed in Section (C) a, above. This list must also include former graduate students and postdoctoral fellows who have been associated with the faculty participants over the last five years.

l) Existing Facilities and Equipment (1-page limit): Include a brief description of available facilities, including major instruments required for the research. If requested equipment or materials duplicate existing items, explain the need for the additional equipment.

D References Cited (3 page limit)

E Biographical Sketch and Current Support (2-page limit each for PI and co-PIs; 1-page limit each for other participants): For all participants listed in Section (C) a, above, provide a biographical sketch that includes a brief description of current research support. The sketch should include the individual's academic and professional history, and may include a list of the five most significant publications. Other activities or accomplishments may be listed. In choosing what to include, emphasize information that will be helpful for understanding the strengths, qualifications, and impact the individual will bring to the proposed IGERT project.

F Budget and Allowable Costs (NSF Form 1030): Provide a budget for each year of support requested, not to exceed $500,000 each year for up to 5 years, exclusive of first-year equipment funds discussed below. The FastLane system will automatically fill out the cumulative 5-year budget for the proposal. The major portion of awarded funds must be used for doctoral student stipends, training and educational activities, and for related expenditures, such as student travel, publication costs, and recruitment. Travel funds should be budgeted in each year for the Principal Investigator and for an additional person to attend annual meetings at the NSF. No funds for faculty research or faculty salaries may be requested, with the exception that up to one month per year of salary support for the Principal Investigator for management purposes may be requested. Support for short-term visitors and funding of a limited amount of administrative support may be requested. The NSF contribution to the graduate stipend is $18,000 per year per student, with a cost-of-education allowance (tuition and normal fees) of $10,500 per year per student. List funds requested for graduate students in Participant Support Costs: stipends in F.1, cost-of-education allowances in F.3, and travel in F.2. Undergraduate stipends should be consistent with those of NSF's Research Experiences for Undergraduates (REU) program (http://www.nsf.gov/cgi-bin/getpub?nsf00107), and postdoctoral stipends may be determined by the institution. If applicable, they should be listed separately on lines B.4 and B.1, respectively. All stipend recipients must be citizens or permanent residents of the U.S.

Funds for the purchase of shared, special-purpose research equipment may be requested. Personnel and shop costs may be requested for developing and constructing special instruments, and for purchasing computer software or other special purpose materials. The total funds requested for research equipment, software, and special purpose materials may not exceed $200,000; if awarded, these funds will be provided in the first year of the grant. Limited funds intended to partially defray the costs of research by students may also be requested. Funds for facility renovation or for equipment installation or maintenance are not allowed. Each award will carry an 8% overhead allowance based on the total direct cost, excluding equipment and cost-of-education allowances.

For multi-institution projects, the lead institution shall submit the proposal, with other participating institutions included under subcontracts. Budgets shall be provided for the overall project as well as individually for the lead institution and for each participating institution that receives a subcontract.

Budget Justification (3-page limit): A brief justification for funds in each budget category should be provided with Section (F), above. Indicate the number of graduate students to be supported and the duration of their support on IGERT funds. For shared equipment and special materials, specify the model, source, and current or expected cost whenever possible, with a brief explanation of the need for each requested item and a description of provisions for maintenance and operating expenses. Provide details of any commitments, including a listing of all relevant documentation, of other organizations expected to participate in the IGERT project, such as government, industry, or private foundations. If internships are planned, the willingness of the host organization, including foreign institutions, and of the individual mentors to participate should be included in the documentation. Do not list or include letters whose sole purpose is to endorse the project.

G Supplementary Documents: Up to six letters of commitment may be provided as part of the proposal. They should be scanned and placed in this Section.