Center for Computational Learning Systems


Upload: tannar

Post on 13-Jan-2016



TRANSCRIPT

Page 1: Center for Computational Learning Systems

Center for Computational Learning Systems

• Independent research center within the Engineering School

• NLP people at CCLS: Mona Diab, Nizar Habash, Martin Jansche, Rebecca Passonneau, Owen Rambow

• We are part of “The NLP Group” but not of the CS department

• What we do:
  o Researchers
  o Work with Kathy and Julia
  o Our own projects
  o Sometimes teach
  o Supervise students (PhD, Masters, independent studies)

• Some of us are in CEPSR, some in the Interchurch Building

• Some NLP Group meetings will take place in Interchurch Center

Page 2: Center for Computational Learning Systems

CLiMB 2: Computational Linguistics for Metadata Building, phase 2

• Becky Passonneau (with University of Maryland)

• Interactive workbench for image cataloguers/indexers: Use NLP to extract descriptive terms from scholarly text

• Mellon Foundation

• http://www.umiacs.umd.edu/~climb/

Page 3: Center for Computational Learning Systems

Automated Readers Advisor, Heiskell Talking Books and Braille Library (NYPL)

• Becky Passonneau

• Replace some of librarians’ tasks in current over-the-phone borrowing system with automated dialogue system

• Use Wizard-of-Oz paradigm for data collection

• Joint project with CCNY (Esther Levin)

• http://www.cs.columbia.edu/~becky/pubs/WozVariant.ppt

Page 4: Center for Computational Learning Systems

Tracking Emergent Narrative Skills (TENS)

• Becky Passonneau

• Current data set: ten-year-olds retelling silent movies

• Develop quantitative methods to compare semantic and pragmatic content (e.g., adapt Pyramid Method for evaluating summary content)

• Joint project with University of Connecticut (Elena Levy)

Page 5: Center for Computational Learning Systems

Arabic NLP

• CADIM Group: Mona Diab, Nizar Habash, Owen Rambow

• Focus on Standard Arabic AND the dialects

• NLP tools for Arabic:
  o Morphological analysis (exists)
  o Morphological tagging (exists, best-performing)
    - Tokenization
    - POS tagging (best-performing)
    - Diacritization (best-performing)
  o Word-sense disambiguation (in progress)
  o Sentence-boundary detection for ASR (in progress)
  o Parsing (initial research)
  o Named-entity recognition (joint with Fair Isaac, in progress)
  o …

Page 6: Center for Computational Learning Systems

Machine Translation

• Nizar Habash

• Focus: Arabic-English MT

• Different hybrid MT approaches explored
  o Linguistic preprocessing for Statistical MT
    - Morphological and Syntactic preprocessing
  o Adding statistical resources to rule-based MT systems
    - Automatically extracted phrase tables combined with Generation-Heavy MT

• Columbia's first participation in NIST MTEval (2006)

Page 7: Center for Computational Learning Systems

Word Sense Modeling and Disambiguation

• Mona Diab

• Using corpora (including multilingual parallel and similar) for unsupervised learning

• Arabic WordNet

• Arabic PropBank

Page 8: Center for Computational Learning Systems

Email Summarization: Social Networks

• Aaron Harnly (PhD student) and Owen Rambow, with Kathy McKeown

• Study interaction between:
  o Email-intrinsic factors
    - Language in email (lexicon, syntax, …)
    - Email genre
  o Structure of dialog
    - Threads
    - Speech acts
  o Relation among people
    - Roles in organization
    - Social networks

• Use to predict one factor from others

• Use in high-level summaries of large amounts of email communication

Page 9: Center for Computational Learning Systems

Multilingual Metagrammars

• Owen Rambow (with University of Pennsylvania)

• Goal: high-level abstract representation of syntax of (many/all) natural languages, from which we can automatically generate grammars that can be used for NLP

• Have: Universal Grammar component and language-specific modules for Korean, German, Yiddish

• Next: Icelandic, Mainland Scandinavian, English, Kashmiri, …