 BIONLP'09 Shared Task Farzaneh Sarafraz James Eales Reza Mohammadi Goran Nenadic 26 March 2009

BIONLP'09 Shared Task

Farzaneh Sarafraz, James Eales, Reza Mohammadi, Goran Nenadic

26 March 2009

BioNLP'09 Task 1

Events in abstracts
Given: gene and gene products (proteins)
Wanted: events
− type
− trigger
− participant(s)
− cause (if applicable)

Example

"I kappa B/MAD-3 masks the nuclear localization signal of NF-kappa B p65 and requires the transactivation domain to inhibit NF-kappa B p65 DNA binding."

Event: negative regulation

Trigger: masks

Theme1: the first p65

Cause: MAD-3
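
The annotation in this example can be held in a small record. The following is a minimal sketch in Python; the class name `Event` and its field names are illustrative assumptions, not the shared task's file format.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Event:
    """One event with the four parts listed on the task slide.

    The class and field names are hypothetical, chosen for illustration.
    """
    event_type: str                                   # e.g. "negative regulation"
    trigger: str                                      # trigger word in the sentence
    themes: List[str] = field(default_factory=list)   # participant(s)
    cause: Optional[str] = None                       # only if applicable


# The example from the slide, written as a record.
example = Event(
    event_type="negative regulation",
    trigger="masks",
    themes=["p65 (first mention)"],
    cause="MAD-3",
)
```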

Event Types

Gene expression
Transcription
Protein Catabolism
Localisation
Phosphorylation
Binding
Regulation
Positive regulation
Negative regulation

Training and Test Data

Training data: 800 abstracts
Development data: 150 abstracts
Test data: 260 abstracts

Our System

1) Finding trigger and type

2) Finding participants (themes)

3) Post processing

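1) Finding Triggers and Types - CRF

"I kappa B/MAD-3 masks the nuclear localization..."

I  kappa  B  MAD-3  masks  the  nuclear  localization
0  0      0  0      9      0    0        0

"The binding of I kappa B/MAD-3 to NF-kappa B p65 is sufficient to retarget NF-kappa B p65 from the nucleus to the cytoplasm."

The  binding  of  I  kappa  B  MAD-3  to  NF-kappa  B  p65  is
0    0        0   0  0      0  0      0   0         0  0    0

sufficient  to  retarget  NF-kappa  B  p65  from  the
0           0   4         0         0  0    0     0

nucleus  to  the  cytoplasm
0        0   0    0

9: negative regulation

4: localisation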
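
The label row under each sentence can be read back into trigger words. The sketch below assumes one integer label per token, with 0 meaning "not a trigger"; only the two label ids given on the slide (4 and 9) are named, and the token split in the usage lines is an assumption.

```python
# Only the two label ids shown on the slide are named here; the rest of
# the numbering is not given in the source, so it is deliberately left out.
LABEL_NAMES = {4: "localisation", 9: "negative regulation"}


def decode_triggers(tokens, labels):
    """Return (token, event type) pairs for every token with a non-zero label."""
    if len(tokens) != len(labels):
        raise ValueError("tokens and labels must have the same length")
    return [
        (tok, LABEL_NAMES.get(lab, f"type {lab}"))
        for tok, lab in zip(tokens, labels)
        if lab != 0
    ]


# Middle segment of the second example sentence (tokenisation assumed).
tokens = ["sufficient", "to", "retarget", "NF-kappa", "B", "p65", "from", "the"]
labels = [0, 0, 4, 0, 0, 0, 0, 0]
triggers = decode_triggers(tokens, labels)
```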

CRF features for each token

is-protein
is-PPI-word
generic POS tag
log-frequency of token being a trigger for each event type (10 features)
number of proteins in sentence (sentence-level)
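
A feature list like this is often built as one dictionary per token. The following is a rough sketch, not the authors' implementation: the function name, the argument names, the PPI word list and the exact form of the log-frequency feature (`log(1 + count)`) are all assumptions, and the counts in the usage lines are made up.

```python
import math


def sentence_features(tokens, pos_tags, protein_idx, ppi_words,
                      trigger_counts, event_types):
    """Build one feature dict per token (all names here are hypothetical).

    protein_idx    -- set of token positions that are given proteins
    ppi_words      -- set of lower-cased interaction words (assumed lexicon)
    trigger_counts -- {(token, event_type): count as trigger in training data}
    """
    n_proteins = len(protein_idx)          # sentence-level feature
    rows = []
    for i, (tok, pos) in enumerate(zip(tokens, pos_tags)):
        feats = {
            "is_protein": i in protein_idx,
            "is_ppi_word": tok.lower() in ppi_words,
            "pos": pos,
            "n_proteins": n_proteins,
        }
        # One log-frequency feature per event type; log(1 + count) is an
        # assumed smoothing, the slide does not give the exact formula.
        for etype in event_types:
            count = trigger_counts.get((tok.lower(), etype), 0)
            feats["logfreq_" + etype] = math.log(1 + count)
        rows.append(feats)
    return rows


# Toy input with invented counts, for illustration only.
rows = sentence_features(
    tokens=["MAD-3", "masks", "p65"],
    pos_tags=["NN", "VBZ", "NN"],
    protein_idx={0, 2},
    ppi_words={"masks", "binds"},
    trigger_counts={("masks", "negative regulation"): 3},
    event_types=["negative regulation", "localisation"],
)
```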

Trigger Detection Post Processing

Positive discrimination
− Manually looking at false negatives
− Adding recurring triggers

Negative discrimination
− Manually looking at false positives
− Filtering out common mistaken tokens
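
Both steps amount to applying two hand-made lists to the CRF output. A minimal sketch, assuming the lists are keyed on lower-cased tokens; the function name and the list contents are hypothetical.

```python
def post_process(tokens, labels, add_triggers, drop_tokens):
    """Apply manually curated lists to per-token trigger labels.

    add_triggers -- {token: label} for recurring triggers that were missed
    drop_tokens  -- set of tokens commonly mistaken for triggers
    """
    fixed = []
    for tok, lab in zip(tokens, labels):
        key = tok.lower()
        if lab == 0 and key in add_triggers:
            lab = add_triggers[key]      # positive discrimination
        elif lab != 0 and key in drop_tokens:
            lab = 0                      # negative discrimination
        fixed.append(lab)
    return fixed


# Invented example: one missed trigger is added, one false positive removed.
fixed = post_process(
    tokens=["p65", "is", "retargeted", "by", "activity"],
    labels=[0, 0, 0, 0, 9],
    add_triggers={"retargeted": 4},
    drop_tokens={"activity"},
)
```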

Trigger Detection Results

Event Class          #Gold  R      P      F-score
Localisation         40     77.5   47.69  59.05
Binding              180    33.33  54.55  41.38
Gene expression      282    76.6   58.54  66.36
Transcription        68     58.82  18.6   28.27
Protein catabolism   19     84.21  88.89  86.49
Phosphorylation      40     97.5   81.25  88.64
Non-reg total        629    63.91  48.73  55.3
Regulation           138    13.04  62.07  21.56
Positive regulation  462    13.85  54.24  22.07
Neg. regulation      153    29.41  45.92  35.86
All total            1382   38.28  49.44  43.15
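
The F-score column is consistent with the harmonic mean of recall (R) and precision (P). A short sketch that reproduces two of the reported values; because R and P are themselves rounded, the last digit can differ slightly.

```python
def f_score(recall, precision):
    """Harmonic mean of recall and precision (both given in percent)."""
    if recall + precision == 0:
        return 0.0
    return 2 * recall * precision / (recall + precision)


# Localisation and phosphorylation rows from the trigger detection results.
localisation_f = f_score(77.5, 47.69)
phosphorylation_f = f_score(97.5, 81.25)
```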

2) Finding Participants

Type and number of participants

− 1 theme (protein)
  Gene expression, Transcription, Protein Catabolism, Localisation, Phosphorylation

− 1 or more themes (protein)
  Binding

− 1 theme and 1 cause (proteins/other events)
  Regulation, Positive regulation, Negative regulation
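
These constraints can be written as a lookup table and used to reject malformed candidate events. A minimal sketch: the names `SCHEMA` and `is_well_formed` are hypothetical, and treating the cause as allowed but optional for regulation events is an assumption based on the "cause (if applicable)" wording of the task description.

```python
SINGLE_THEME = ["Gene expression", "Transcription", "Protein catabolism",
                "Localisation", "Phosphorylation"]
REGULATION = ["Regulation", "Positive regulation", "Negative regulation"]

# Value: (min themes, max themes or None for unbounded, cause allowed)
SCHEMA = {name: (1, 1, False) for name in SINGLE_THEME}
SCHEMA["Binding"] = (1, None, False)
SCHEMA.update({name: (1, 1, True) for name in REGULATION})


def is_well_formed(event_type, n_themes, has_cause=False):
    """Check a candidate event against the participant constraints."""
    lo, hi, cause_ok = SCHEMA[event_type]
    if n_themes < lo or (hi is not None and n_themes > hi):
        return False
    # A cause is only accepted for the regulation classes (assumed optional).
    return cause_ok or not has_cause
```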

Parse Tree Distance

Parse Tree Distance Analysis

Theme in Subtree

Single Theme events
− Theme in subtree = 0.7054
− Theme not in subtree = 0.2946

Binding event
− Any theme in subtree = 0.5435
− Any theme not in subtree = 0.4565

Regulation events
− Either theme or cause in subtree = 0.5919
− Either theme or cause not in subtree = 0.4081
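
Proportions of this kind can be computed by checking, for each annotated event, whether the theme lies under the trigger in the parse tree. The sketch below assumes a tree stored as a `{parent: [children]}` dictionary with unique node names and assumes "subtree" means the nodes under the trigger; the toy tree is invented and not taken from the data.

```python
def subtree(children, root):
    """All nodes at or below `root` in a tree given as {parent: [children]}."""
    seen, stack = set(), [root]
    while stack:
        node = stack.pop()
        if node not in seen:
            seen.add(node)
            stack.extend(children.get(node, []))
    return seen


def fraction_in_subtree(examples):
    """examples: list of (children, trigger, theme) triples."""
    if not examples:
        return 0.0
    hits = sum(theme in subtree(ch, trig) for ch, trig, theme in examples)
    return hits / len(examples)


# Invented toy tree: protein B is under the trigger "binding", protein A is not.
tree = {"inhibits": ["A", "binding"], "binding": ["B"]}
fraction = fraction_in_subtree([(tree, "binding", "B"), (tree, "binding", "A")])
```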

Distance in Trigger Subtree

Distances not in Trigger Subtree

Rules Concerning Parse Tree Analysis

For "binding", report as themes:
− up to the second closest protein in the subtree
− and the first closest protein in the rest of the tree

"In contrast, gp41 failed to stimulate NF-kappaB binding activity in as much as no NF-kappaB bound to the main NF-kappaB-binding site 2 of the IL-10 promoter after addition of gp41."

Successfully missing out the final gp41.
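
The binding rule can be sketched as follows, assuming "closest" means shortest path in the parse tree and the tree is a `{parent: [children]}` dictionary with unique node names. The function names and the toy tree are hypothetical; the tree is not a real parse of the gp41 sentence.

```python
from collections import deque


def tree_distances(children, start):
    """Path length from `start` to every node, walking edges in both directions."""
    neighbours = {}
    for parent, kids in children.items():
        for kid in kids:
            neighbours.setdefault(parent, []).append(kid)
            neighbours.setdefault(kid, []).append(parent)
    dist, queue = {start: 0}, deque([start])
    while queue:
        node = queue.popleft()
        for nxt in neighbours.get(node, []):
            if nxt not in dist:
                dist[nxt] = dist[node] + 1
                queue.append(nxt)
    return dist


def descendants(children, root):
    """All nodes strictly below `root`."""
    out, stack = set(), list(children.get(root, []))
    while stack:
        node = stack.pop()
        if node not in out:
            out.add(node)
            stack.extend(children.get(node, []))
    return out


def binding_themes(children, trigger, proteins):
    """Up to two closest proteins in the subtree plus the closest one outside."""
    dist = tree_distances(children, trigger)
    inside = descendants(children, trigger)
    by_distance = sorted((p for p in proteins if p in dist), key=lambda p: dist[p])
    in_subtree = [p for p in by_distance if p in inside][:2]
    outside = [p for p in by_distance if p not in inside][:1]
    return in_subtree + outside


# Invented toy tree with five proteins P1..P5 around the trigger "bound".
children = {
    "root": ["bound", "after"],
    "bound": ["P1", "to"],
    "to": ["P2", "of"],
    "of": ["P3"],
    "after": ["P4", "addition"],
    "addition": ["P5"],
}
themes = binding_themes(children, "bound", ["P1", "P2", "P3", "P4", "P5"])
```

In this toy tree the third protein in the subtree (P3) and the more distant outside protein (P5) are left out, in the same way the rule leaves out the final gp41 in the sentence above.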

Example of a Missed (FN) Theme

For gene expression
− All the proteins in the subtree are reported as themes

"The 15-lipoxygenase (lox) gene is expressed in a tissue-specific manner, predominantly in erythroid cells but also in airway epithelial cells and eosinophils."

                is
               /   \
           gene   expressed
             |
     15-lipoxygenase
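
Encoding the tree fragment from this slide shows why the rule misses the theme, assuming the trigger is "expressed": the protein sits under "gene", not under the trigger. The function names are hypothetical.

```python
def descendants(children, root):
    """All nodes strictly below `root` in a {parent: [children]} tree."""
    out, stack = set(), list(children.get(root, []))
    while stack:
        node = stack.pop()
        if node not in out:
            out.add(node)
            stack.extend(children.get(node, []))
    return out


def gene_expression_themes(children, trigger, proteins):
    """Report every protein found in the trigger's subtree as a theme."""
    inside = descendants(children, trigger)
    return [p for p in proteins if p in inside]


# The tree fragment drawn on the slide.
children = {"is": ["gene", "expressed"], "gene": ["15-lipoxygenase"]}
themes = gene_expression_themes(children, "expressed", ["15-lipoxygenase"])
```

Here `themes` is empty, so 15-lipoxygenase is a false negative under this rule.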

Evaluation on Development Data

Event Class          #Gold  R      P      F-score
Localisation         53     67.92  46.75  55.38
Binding              312    21.47  63.81  32.13
Gene expression      356    64.61  76.33  69.98
Transcription        82     53.66  89.8   67.18
Protein catabolism   21     90.48  67.86  77.55
Phosphorylation      47     91.49  53.09  67.19
Non-reg total        871    50.4   68.44  58.05
Regulation           172    5.23   33.33  9.05
Positive regulation  632    3.48   21.36  5.99
Neg. regulation      201    9.45   15.08  11.62
Regulatory total     1005   4.98   19.53  7.93
All total            1876   26.07  54.46  35.26

Evaluation on Test Data

Event Class          #Gold  R      P      F-score
Localisation         174    44.83  53.06  48.6
Binding              347    12.68  40.37  19.3
Gene expression      722    52.63  69.34  59.84
Transcription        137    15.33  67.74  25
Protein catabolism   14     42.86  50     46.15
Phosphorylation      135    78.52  53.81  63.86
Non-reg total        1529   41.53  60.82  49.36
Regulation           291    3.09   19.15  5.33
Positive regulation  983    1.12   8.87   1.99
Neg. regulation      379    12.4   20.52  15.46
Regulatory total     1653   4.05   16.75  6.53
All total            3182   22.06  48.61  30.35

Results: Ranked 12 out of 24 teams

Rank  R      P      F-Score
1     46.73  58.48  51.95
2     45.82  47.52  46.66
3     34.98  61.59  44.62
4     36.9   55.59  44.35
5     33.41  51.55  40.54
6     28.13  53.56  36.88
7     28.22  45.78  34.92
8     27.75  46.6   34.78
9     21.62  62.21  32.09
10    21.12  56.9   30.8
11    22.5   47.7   30.58
12    22.06  48.61  30.35
13    25.96  36.26  30.26
14    20.93  49.3   29.38
15    22.69  40.55  29.1
16    21.53  36.99  27.21
17    17.44  39.99  24.29
18    28.63  20.88  24.15
19    13.45  71.81  22.66
20    22.78  19.03  20.74
21    30.42  14.11  19.28
22    11.25  66.54  19.25
23    11.69  31.42  17.04
24    9.4    61.65  16.31

End.

Other Tasks

Event detection and characterization
Event argument recognition
Negations and speculations

Example

"I kappa B/MAD-3 masks the nuclear localization signal of NF-kappa B p65 and requires the transactivation domain to inhibit NF-kappa B p65 DNA binding."

Event: negative regulation

Trigger: masks

Theme1: the first p65

Cause: MAD-3

Site: nuclear localization signal

Example

"In contrast, NF-kappa B p50 alone fails to stimulate kappa B-directed transcription, and based on prior in vitro studies, is not directly regulated by I kappa B."

Event: regulation

Theme1: this p50

Trigger: regulated

Negation: true for this event

Speculation: none