Towards building effective computational sociopragmatics models of human cognition
Mona Diab, George Washington University

Upload: diannepatricia

Post on 03-Aug-2015

TRANSCRIPT

Page 1: Ibm cog institutetalk_diab

Towards building effective computational sociopragmatics models of human cognition

Mona Diab

George Washington University

Page 2: Ibm cog institutetalk_diab

Acknowledgement

• Many collaborators: Dragomir Radev, Amjad Abu Jbara, Pradeep Dasigi, Weiwei Guo, Owen Rambow, Julia Hirschberg, Kathy McKeown, Mustafa Mughazy, Heba Elfardy, Vinod Prabhakaran, Greg Werner, Muhammad Abdulmageed

• Research supported by the IARPA SCIL program, the DARPA DEFT & BOLT programs, and a Google Faculty award

• Slides adapted from several publication presentations

Page 3: Ibm cog institutetalk_diab

What is sociopragmatics?

• The aspect of language use that relates to everyday social practices. (http://www.wordsense.eu dictionary)
– What are social practices?

Well … from our language-focused prism :)
• Interactions, expressions of emotions/beliefs/opinions, etc.

Page 4: Ibm cog institutetalk_diab

Text and Social Relations

We can use linguistic analysis techniques to understand the implicit relations that develop in on-line communities

Image source: clair.si.umich.edu

Page 5: Ibm cog institutetalk_diab

Overarching Agenda

• Goal: Attempt to mine social media text for clues and cues on understanding human interactions

• How: Identify interesting sociolinguistic behaviors, correlate them with quantifiable linguistic devices, and build effective models in the process

• Compare these devices cross-linguistically

Page 6: Ibm cog institutetalk_diab

Many Different Forms of Social Media

• Communication

• Collaboration

• Multimedia

• Reviews & opinions

Page 7: Ibm cog institutetalk_diab

Social Media Explosion

source: www.internetworldstats.com

• >3 billion Internet users worldwide
• >42.3% population penetration (>48% in the MENA region)
• 75% of them used “Social Media”

Page 8: Ibm cog institutetalk_diab

Text in Social Media

Some social media applications are all about text

Page 9: Ibm cog institutetalk_diab

Text in Social Media

Even the ones based on photos, videos, etc. generate a lot of discussions

Page 10: Ibm cog institutetalk_diab

   Text  in  Social  Media  

Huge  amount  of  text  exchanged  in  discussions  

Page 11: Ibm cog institutetalk_diab

Do you still need convincing that text is important?

Yeah, I thought not! Just checking :)

Page 12: Ibm cog institutetalk_diab

Interesting Sociolinguistic Phenomena: Social Constructs

• Multiple Viewpoints (Subgroups)
• Influencers
• Pursuit of Power
• Disputed Topics

Page 13: Ibm cog institutetalk_diab

Approach to processing social construct phenomena

(Directive from the IARPA SCIL Program)

• Identify language uses (LU) pertinent to the different social constructs (SC)

• Correlate the LUs with Linguistic Constructions/Constituents (LC)

Page 14: Ibm cog institutetalk_diab

Social Construct: Influencer (Inf)

• Language Uses
– Attempt to Persuade
– Agreement/Disagreement
– Level of Committed Belief

Influencers  

Page 15: Ibm cog institutetalk_diab

Social Construct: Pursuit of Power (PoP)

• Language Uses
– Attempt to Persuade
– Agreement/Disagreement
– Level of Committed Belief
– Negative/Positive Attitude
– Who is talking about whom
– Dialog Patterns (non-linguistic)

Pursuit  of  Power  

Page 16: Ibm cog institutetalk_diab

Social Construct: Subgroup (Sub)

• Language Uses
– Agreement/Disagreement
– Negative/Positive Attitude
– Sarcasm
– Level of Committed Belief
– Signed Network (non-linguistic)

Multiple Viewpoints (Subgroups)

Page 17: Ibm cog institutetalk_diab

LUs in our approach

Rely on linguistic analysis:
• Attempt to Persuade (Inf, PoP)
• Agreement/Disagreement (Inf, PoP, Sub)
• Level of Committed Belief (Inf, PoP)
• Negative/positive attitude (Sub, PoP)
• Sarcasm (Sub)
• Who is talking about whom (PoP)

Do not depend on linguistic analysis:
• Dialog Patterns (PoP)
• Signed Network (Sub)

   

Page 18: Ibm cog institutetalk_diab

Cross-language comparison: Generalizations

• In general, similar LU-level devices cross-linguistically

• Attempt to persuade
– Claim: grounding in experience, commonly respected sources
– Argumentation: evidence and support from other discussants

• Agreement/Disagreement
– Shared opinion (explicit expression), shared perspective (implicit attitude)

• Level of Committed Belief
– Committed: The sun will rise tomorrow
– Non-committed: John may believe that the moon is made of cheese

Page 19: Ibm cog institutetalk_diab

Generalizations

• -ve/+ve attitude
– Negative language
– Sentiment/word polarity

• Who is talking about whom
– Use of mentions and their frequency

Page 20: Ibm cog institutetalk_diab

But how do they differ in their linguistic expression?

• Arabic and English social media use different linguistic constituents (LC) to exhibit the same language use

 

Page 21: Ibm cog institutetalk_diab

Focus  of  this  talk  

Influencers  

Pursuit  of  Power   Disputed  Topics  

Multiple Viewpoints (Subgroups)

Page 22: Ibm cog institutetalk_diab

Subgroup Detection Problem

Discussion    Thread   Subgroups  

Discussant  

Page 23: Ibm cog institutetalk_diab

Example

Peter: The new immigration law is good. Illegal immigration is bad.

Mary: I totally disagree with you. This law is blatant racism.

John: Have you read all what Peter wrote? He is correct. Illegal immigration is bad and must be stopped.

Alexander: You are clueless, Peter. Stop supporting racism.

Support the new law: Peter, John

Against the new law: Mary, Alexander

Page 24: Ibm cog institutetalk_diab

Sample  thread  

Page 25: Ibm cog institutetalk_diab

Subgroup Detection System Overview

Pipeline diagram: Discussion Thread → Thread Parsing → Opinion Expressions Identification → Candidate Target Identification → Opinion-Target Pairing → Discussant Attitude Profiles (DAPs) → Clustering → Subgroups. The Reply Structure from thread parsing is also used.

Examples shown in the diagram:
• Opinion expressions: disagree, like, bad
• Candidate targets: you, conservative ideologues, immigration law
• Opinion-target pairs: (disagree, You), (like, Conservative Ideologues), (bad, Immigration law)

Page 27: Ibm cog institutetalk_diab

1 - Thread Parsing

P1 (D1, Peter): The new immigration law is good. Illegal immigration is bad.

P2 (D2, Mary): I totally disagree with you. This law is blatant racism.

P3 (D3, John): Have you read all what Peter wrote? He is correct. Illegal immigration is bad and must be stopped.

P4 (D4, Alexander): You are clueless, Peter. Stop supporting racism.

Identify posts, discussants, and the reply structure of the discussion thread
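
The output of thread parsing can be sketched as a small data structure. This is a minimal illustration, not the system's actual code: the `Post` class, the `parse_thread` helper, and the reply links in the sample data (Mary and Alexander replying to Peter, John replying to Mary) are assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Post:
    post_id: str             # P1, P2, ...
    discussant_id: str       # D1, D2, ...
    author: str
    text: str
    reply_to: Optional[str]  # post_id of the parent post, None for the first post


def parse_thread(raw_posts):
    """Assign post IDs, discussant IDs, and reply links.

    raw_posts: list of (author, text, parent_index) tuples in posting order;
    parent_index is the position of the post being replied to, or None.
    """
    discussants = {}  # author name -> discussant ID
    posts = []
    for i, (author, text, parent) in enumerate(raw_posts):
        if author not in discussants:
            discussants[author] = f"D{len(discussants) + 1}"
        reply_to = None if parent is None else f"P{parent + 1}"
        posts.append(Post(f"P{i + 1}", discussants[author], author, text, reply_to))
    return posts, discussants


# Reply links below are assumed for illustration.
raw = [
    ("Peter", "The new immigration law is good. Illegal immigration is bad.", None),
    ("Mary", "I totally disagree with you. This law is blatant racism.", 0),
    ("John", "Have you read all what Peter wrote? He is correct.", 1),
    ("Alexander", "You are clueless, Peter. Stop supporting racism.", 0),
]
posts, discussants = parse_thread(raw)
```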

Page 29: Ibm cog institutetalk_diab

2 - Identify Opinion Words*

P1 (D1, Peter): The new immigration law is good(+). Illegal immigration is bad(-).

P2 (D2, Mary): I totally disagree(-) with you. This law is blatant(-) racism(-).

P3 (D3, John): Have you read all what Peter wrote? He is correct(+). Illegal immigration is bad(-) and must be stopped.

P4 (D4, Alexander): You are clueless(-), Peter. Stop supporting racism.

*Identifying opinion words using OpinionFinder with an extended lexicon (implemented using random walks – Hassan & Radev, 2011)
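
As a rough illustration of lexicon-based polarity tagging, the sketch below looks up each token in a tiny hand-made lexicon. It does not use OpinionFinder or the random-walk lexicon extension mentioned above; `LEXICON` and `tag_opinion_words` are hypothetical names, and the word list is an assumption chosen to cover the example thread.

```python
import re

# Toy polarity lexicon (an assumption for illustration only).
LEXICON = {
    "good": "+", "correct": "+",
    "bad": "-", "disagree": "-", "blatant": "-", "racism": "-", "clueless": "-",
}


def tag_opinion_words(text, lexicon=LEXICON):
    """Return (word, polarity, token_index) for every lexicon word in the text."""
    tokens = re.findall(r"[A-Za-z']+", text.lower())
    return [(tok, lexicon[tok], i) for i, tok in enumerate(tokens) if tok in lexicon]


tagged = tag_opinion_words("I totally disagree with you. This law is blatant racism.")
```

A flat lookup like this ignores context, so it would also tag "racism" in Alexander's post, which the slide leaves untagged.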

Page 31: Ibm cog institutetalk_diab

3 - Identify Candidate Targets of Opinion

Target types:

• Discussant (e.g. you, Peter)

• Topic/Entity (e.g. The new immigration law, Illegal immigration)

Page 32: Ibm cog institutetalk_diab

3 - Identify Candidate Targets of Opinion

Candidate targets: all discussants (D1 to D4) are candidate targets

Page 33: Ibm cog institutetalk_diab

3 - Identify Candidate Targets of Opinion

Candidate targets: identify discussant mentions (second-person pronoun or name) in the discussion and tag each with the discussant it refers to (in the example thread, three mentions of D1 and one of D2)

Page 34: Ibm cog institutetalk_diab

3 - Identify Candidate Targets of Opinion

Candidate targets: identify anaphoric mentions of discussants (e.g. "He" in John's post resolves to D1, Peter)

Page 35: Ibm cog institutetalk_diab

3 - Identify Candidate Targets of Opinion

Candidate targets: identify topical targets in the thread. "The new immigration law" and "This law" are tagged Topic 1; "Illegal immigration" is tagged Topic 2.

Page 36: Ibm cog institutetalk_diab

3 - Identify Candidate Targets of Opinion

• Techniques used to identify topical targets

– Named Entity Recognition

– Noun phrase chunking

Page 38: Ibm cog institutetalk_diab

4 - Opinion-Target Pairing

P2 (Mary): I totally disagree(-) with you [D1]. The new immigration law [Topic1] is blatant(-) racism(-).

nsubj(disagree-3, I-1)
advmod(disagree-3, totally-2)
root(ROOT-0, disagree-3)
prep_with(disagree-3, you-5)
Rule

nsubj(racism-4, Topic1-1)
cop(racism-4, is-2)
amod(racism-4, blatant-3)
root(ROOT-0, racism-4)
Rule
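
How a rule over dependency relations might pair an opinion word with its target can be sketched as follows. The two rules used here (`prep_with` and `nsubj` linking an opinion word directly to a candidate target) are assumptions for illustration; the talk does not spell out the actual rule set. `parse_deps` and `pair_opinions` are hypothetical helper names.

```python
import re

# Matches Stanford-style triples such as "nsubj(disagree-3, I-1)".
DEP_RE = re.compile(r"(\w+)\s*\((\w+)-(\d+),\s*(\w+)-(\d+)\)")


def parse_deps(dep_string):
    """Parse 'rel(head-i, dep-j)' triples into (rel, head, dep) tuples."""
    return [(rel, head, dep) for rel, head, _, dep, _ in DEP_RE.findall(dep_string)]


def pair_opinions(deps, opinion_words, targets):
    """Pair opinion words with candidate targets they govern directly.

    Illustrative rules (assumptions):
      prep_with(opinion, target) -> the opinion is about the target
      nsubj(opinion, target)     -> the opinion is about the target
    """
    pairs = []
    for rel, head, dep in deps:
        if rel in {"prep_with", "nsubj"} and head in opinion_words and dep in targets:
            pairs.append((head, dep))
    return pairs


deps1 = parse_deps(
    "nsubj(disagree-3, I-1) advmod(disagree-3, totally-2) "
    "root(ROOT-0, disagree-3) prep_with(disagree-3, you-5)"
)
deps2 = parse_deps(
    "nsubj(racism-4, Topic1-1) cop(racism-4, is-2) "
    "amod(racism-4, blatant-3) root(ROOT-0, racism-4)"
)
pairs1 = pair_opinions(deps1, {"disagree"}, {"you"})
pairs2 = pair_opinions(deps2, {"blatant", "racism"}, {"Topic1"})
```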

Page 39: Ibm cog institutetalk_diab

Named entity rules

Page 40: Ibm cog institutetalk_diab

4 - Opinion-Target Pairing

P1 (D1, Peter): The new immigration law is good(+). Illegal immigration is bad(-).

P2 (D2, Mary): I totally disagree(-) with you. This law is blatant(-) racism(-).

P3 (D3, John): Read all what Peter wrote. He is correct(+). Illegal immigration is bad(-) and must be stopped.

P4 (D4, Alexander): You are clueless(-), Peter. Stop supporting racism.

Candidate targets tagged in the thread: discussant mentions (D1, Peter) and topics (Topic 1, Topic 2)

Page 41: Ibm cog institutetalk_diab

4 - Opinion-Target Pairing

• Language Uses (LUs) present in this step:

– Targeted sentiment toward other discussants (2nd person): "I totally disagree(-) with you."

– Targeted sentiment toward topic mentions (3rd person): "This law is blatant(-) racism(-)."

Page 42: Ibm cog institutetalk_diab

4 - Opinion-Target Pairing

• LU details

– Rule-based detection of sentiment targets (we've also been experimenting with supervised target detection methods)

– Discussant targets are identified by 2nd person pronouns (you, your, yourself, etc.) and by username mentions (casper3912, etc.)
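
The discussant-target rule above can be sketched as a simple scan over a post. This is a minimal illustration under one loud assumption: a second-person pronoun is taken to refer to the author of the post being replied to. The function name and its parameters are hypothetical.

```python
import re

SECOND_PERSON = {"you", "your", "yours", "yourself"}


def find_discussant_targets(text, reply_to_author, usernames):
    """Return (mention, resolved_discussant) pairs found in a post.

    Assumption: a second-person pronoun refers to the author of the parent
    post; a username mention refers to that user.
    """
    lookup = {u.lower(): u for u in usernames}
    found = []
    for token in re.findall(r"[A-Za-z0-9_']+", text):
        low = token.lower()
        if low in SECOND_PERSON and reply_to_author is not None:
            found.append((token, reply_to_author))
        elif low in lookup:
            found.append((token, lookup[low]))
    return found


mentions = find_discussant_targets(
    "You are clueless, Peter. Stop supporting racism.",
    reply_to_author="Peter",
    usernames=["Peter", "Mary", "John", "Alexander"],
)
```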

Page 44: Ibm cog institutetalk_diab

5 - Discussant Attitude Profile

        Target1         ………             Targetn
        +   -   #IA     +   -   #IA     +   -   #IA
DAP1
DAP2

#IA is the number of interactions

Page 45: Ibm cog institutetalk_diab

5 - Discussant Attitude Profile

Rows: discussants. Columns: targets, each as (+, -, #IA).

             Alexander   John     Peter    Mary     Topic 1   Topic 2
Peter        0 0 0       0 0 0    1 0 1    0 0 0    1 0 1     0 1 1
Mary         0 0 0       0 0 0    0 1 1    1 0 1    0 2 2     0 0 0
John         0 0 0       1 0 1    1 0 2    0 0 0    0 0 0     0 1 1
Alexander    1 0 1       0 0 0    0 1 1    0 0 0    0 0 0     0 0 0

Page 46: Ibm cog institutetalk_diab

5 - Discussant Attitude Profile

Each discussant is implicitly positive toward himself
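
Building the profiles from opinion-target pairs can be sketched as below. This is an illustration only: the `build_daps` helper, the record format, and the choice to count a sentiment-free mention as an interaction are assumptions, and the records are hand-derived from the example thread.

```python
from collections import defaultdict


def build_daps(records, discussants, targets):
    """Build one attitude profile per discussant.

    records: (source, target, polarity) tuples, where polarity is '+', '-',
             or None (a mention with no opinion word attached).
    Returns {discussant: flat list of [pos, neg, interactions] per target}.
    """
    counts = defaultdict(lambda: [0, 0, 0])
    for source, target, polarity in records:
        cell = counts[(source, target)]
        if polarity == "+":
            cell[0] += 1
        elif polarity == "-":
            cell[1] += 1
        cell[2] += 1  # every mention counts as one interaction (assumption)
    for d in discussants:
        counts[(d, d)] = [1, 0, 1]  # implicit positivity toward oneself
    return {d: [v for t in targets for v in counts[(d, t)]] for d in discussants}


records = [
    ("Peter", "Topic 1", "+"), ("Peter", "Topic 2", "-"),
    ("Mary", "Peter", "-"), ("Mary", "Topic 1", "-"), ("Mary", "Topic 1", "-"),
    ("John", "Peter", None), ("John", "Peter", "+"), ("John", "Topic 2", "-"),
    ("Alexander", "Peter", "-"),
]
discussants = ["Peter", "Mary", "John", "Alexander"]
targets = discussants + ["Topic 1", "Topic 2"]
daps = build_daps(records, discussants, targets)
```

Each profile is a fixed-length vector, so the profiles can be passed directly to a clustering algorithm.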

Page 48: Ibm cog institutetalk_diab

Clustering

Subgroup: Peter (Topic 1+, Topic 2-), John (Peter+, Topic 2-)

Subgroup: Mary (Peter-, Topic 1-), Alexander (Peter-)

Page 49: Ibm cog institutetalk_diab

Evaluation (Abu-Jbara et al., ACL 2012) (Abu-Jbara et al., ACL 2013)

 

Page 50: Ibm cog institutetalk_diab

English Data

• 117 discussions
• Short threads
• Short posts
• Human annotation
• More formal

• 12 polls + discussions
• Long threads
• Long and short posts
• Data self-labeled
• Less formal

• 30 debates
• Long threads
• Long and short posts
• Data self-labeled
• Less formal

Page 51: Ibm cog institutetalk_diab

English Evaluation Datasets

Page 52: Ibm cog institutetalk_diab

Arabic Data

• Forum for two-sided, self-labeled political debates: www.naqeshny.com

• 36 debates comprising 711 posts by 326 users

• The average number of posts per discussion is 19.75 and the average number of discussants per topic is 13.08

Page 53: Ibm cog institutetalk_diab

Evaluation Metrics

1. Purity

Source: http://nlp.stanford.edu/IR-book/html/htmledition/evaluation-of-clustering-1.html
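
Purity, as defined in the source cited above, assigns each cluster to its most frequent gold class and reports the fraction of items that are then correctly assigned. A minimal sketch follows; the `purity` function and the toy labels are illustrative, not taken from the talk.

```python
from collections import Counter


def purity(clusters, gold):
    """clusters, gold: dicts mapping item -> cluster label / gold class label."""
    by_cluster = {}
    for item, c in clusters.items():
        by_cluster.setdefault(c, []).append(gold[item])
    # For each cluster, count the items belonging to its majority class.
    majority_total = sum(
        Counter(labels).most_common(1)[0][1] for labels in by_cluster.values()
    )
    return majority_total / len(clusters)


gold = {"Peter": "support", "John": "support", "Mary": "against", "Alexander": "against"}
predicted = {"Peter": 0, "John": 0, "Mary": 1, "Alexander": 0}
score = purity(predicted, gold)
```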

Page 54: Ibm cog institutetalk_diab

Evaluation Metrics

2. Entropy

3. F-Measure

where P(i, j) is the probability of finding an element from category i in cluster j, n_j is the number of items in cluster j, and n is the total number of items in the distribution.
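
The entropy formula itself did not survive on the slide. The sketch below assumes the standard size-weighted cluster entropy, which matches the symbols defined above: the entropy of the class distribution P(i, j) within each cluster j, weighted by n_j / n. Lower is better, and 0 means every cluster is pure. The function name is hypothetical.

```python
import math
from collections import Counter


def clustering_entropy(clusters, gold):
    """Size-weighted average of per-cluster class entropy (log base 2).

    clusters, gold: dicts mapping item -> cluster label / gold class label.
    """
    by_cluster = {}
    for item, c in clusters.items():
        by_cluster.setdefault(c, []).append(gold[item])
    n = len(clusters)
    total = 0.0
    for labels in by_cluster.values():
        n_j = len(labels)
        # P(i, j) = k / n_j for each class i present in cluster j.
        h_j = -sum((k / n_j) * math.log2(k / n_j) for k in Counter(labels).values())
        total += (n_j / n) * h_j
    return total


pure = clustering_entropy({"a": 0, "b": 1}, {"a": "x", "b": "y"})
mixed = clustering_entropy({"a": 0, "b": 0}, {"a": "x", "b": "y"})
```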

Page 55: Ibm cog institutetalk_diab

Baselines

• Interaction Graph Clustering (GC)
– Nodes: participants
– Edges: interactions (connect two participants if they exchange posts)

• Text Classification (TC)
– Build TF-IDF vectors for each participant (using all his/her posts)
– Cluster the vector space
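
The per-participant TF-IDF vectors of the text baseline can be sketched with the standard library alone. The weighting used here (raw term frequency times log(N / df)) is one common variant and is an assumption; the talk does not say which variant was used. `tfidf_vectors` is a hypothetical helper name.

```python
import math
import re
from collections import Counter


def tfidf_vectors(posts_by_participant):
    """Treat all posts by one participant as a single document.

    posts_by_participant: {participant: [post text, ...]}
    Returns {participant: {term: tf-idf weight}}.
    """
    docs = {
        p: Counter(re.findall(r"[a-z']+", " ".join(posts).lower()))
        for p, posts in posts_by_participant.items()
    }
    n_docs = len(docs)
    # Document frequency: number of participants who use each term.
    df = Counter(term for counts in docs.values() for term in counts)
    return {
        p: {t: tf * math.log(n_docs / df[t]) for t, tf in counts.items()}
        for p, counts in docs.items()
    }


vecs = tfidf_vectors({"A": ["law is good"], "B": ["law is racism"]})
```

Terms used by every participant get weight zero, so only distinguishing vocabulary contributes to the clustering.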

Page 56: Ibm cog institutetalk_diab

English Clustering Algorithm

• K-means
• Expectation Maximization (EM)
• Farthest First (FF)

Page 58: Ibm cog institutetalk_diab

Arabic Clustering Algorithm

• K-means
• Expectation Maximization (EM)
• Farthest First (FF)

Page 59: Ibm cog institutetalk_diab

Arabic Clustering Algorithm

• K-means
• Expectation Maximization (EM): Purity 0.67, Entropy 0.72 (best results)
• Farthest First (FF)

Page 60: Ibm cog institutetalk_diab

Comparison to baselines

English Results (chart): Our System

Arabic Results

Method            P      E
Signed Network    0.71   0.68
Our System        0.67   0.72

Page 61: Ibm cog institutetalk_diab

English Results

             Wikipedia   Political Forum   Create Debate
Purity       0.66        0.61              0.64
Entropy      0.55        0.80              0.68
F-measure    0.61        0.56              0.60

Page 62: Ibm cog institutetalk_diab

English Results

Best performing: Wikipedia (highest purity and F-measure, lowest entropy)

Page 63: Ibm cog institutetalk_diab

English Results

Best performing: Wikipedia. Worst performing: Political Forum (lowest purity and F-measure, highest entropy).

Page 64: Ibm cog institutetalk_diab

Component Evaluation

Chart comparing the full system with ablated variants:
• Our System
• No Topical Targets
• No Discussant Targets
• No Sentiment
• No Interaction
• No Anaphora Resolution
• No Named Entity Recog.
• No NP Chunking

Page 65: Ibm cog institutetalk_diab

Component Evaluation

Not really a linguistic feature

Page 66: Ibm cog institutetalk_diab

Component Evaluation

More of a linguistic feature!

Page 67: Ibm cog institutetalk_diab

Deeper look at Agreement/Disagreement

• So far we employed shared/divergent opinion in the form of explicit polarity indicators

– Sentiment polarity towards other discussants
  • A: So, no matter how much faith you have, one of you MUST be wrong! (negative)
  • B: You are a scientist?! May I ask in which field? (negative)

– Sentiment polarity towards an entity
  • A: Here is an excellent verse from the Bible.. (positive)
  • B: The Bible rightly says that... (positive)

Page 68: Ibm cog institutetalk_diab

Implicit Opinion/Perspective

• Observation: People sharing similar beliefs/perspective tend to use the same evidence to support their point
– Believers: faith, peace, love, citing verses from the Bible...
– Atheists: reason, science, attack on the “logical” flaws in the Bible...

• However, it is not always explicit (using similar words and similar attitudes)
  • Peter: God is the creator of mankind
  • Mary: The belief in an ultimate divine being has sustained me over the years
– Not necessarily positive/negative
– High dimensional similarity between both sentences is low!
– BUT we know Mary and Peter share the same perspective and will tend to be in agreement with each other

Page 69: Ibm cog institutetalk_diab

Modeling of implicit agreement/disagreement

• Implicit agreement or disagreement (perspective): using text similarity to help identify subgroups

• Perspective modeling is used to complement explicit attitude

• Perspective granularity has to be collected on the level of a thread rather than a single post

• Hence we summarize all the posts in the thread.

 

Page 70: Ibm cog institutetalk_diab

Our Model

• Explicit high dimensional attitude toward other discussants and entities

• Modeling shared perspective among discussants over threads using textual similarity on the post level in the latent space

Page 71: Ibm cog institutetalk_diab

Extracting implicit perspective

• Run Latent Dirichlet Allocation (LDA) on the thread

• Extract the topic distribution of each post

• Aggregate the distributions of all posts between each pair of discussants
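
One way to read the aggregation step is sketched below. It assumes the per-post topic distributions have already been produced by LDA (the numbers here are hand-written stand-ins), averages them per discussant, and compares discussants with cosine similarity. Both the averaging and the cosine measure are assumptions; the talk does not specify how the distributions are aggregated or compared.

```python
import math


def mean_distribution(distributions):
    """Average a list of equal-length topic distributions."""
    k = len(distributions[0])
    return [sum(d[i] for d in distributions) / len(distributions) for i in range(k)]


def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def perspective_similarity(post_topics):
    """post_topics: {discussant: [topic distribution of each post]}.

    Returns {(a, b): similarity} for every unordered pair of discussants.
    """
    profiles = {d: mean_distribution(ds) for d, ds in post_topics.items()}
    names = sorted(profiles)
    return {
        (a, b): cosine(profiles[a], profiles[b])
        for i, a in enumerate(names)
        for b in names[i + 1:]
    }


# Hypothetical two-topic distributions standing in for LDA output.
sims = perspective_similarity({
    "Peter": [[0.8, 0.2], [0.6, 0.4]],
    "Mary": [[0.7, 0.3]],
    "Alex": [[0.1, 0.9]],
})
```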

Page 72: Ibm cog institutetalk_diab

FEATURE REPRESENTATION: ATTITUDE PROFILES

• Vector representation

• Explicit attitude towards other discussants and entities

     A        B        C        E1       E2
A    0 0 0    1 1 2    0 1 1    1 0 1    0 0 0
B    …
C    --

Page 73: Ibm cog institutetalk_diab

FEATURE REPRESENTATION: ATTITUDE PROFILES

• Vector representation

• Implicit agreement with other discussants

     Explicit attitude                                Implicit agreement
     A        B        C        E1       E2       A        B          C
A    0 0 0    1 1 2    0 1 1    1 0 1    0 0 0    1 1 1    1 0 0.5    0.5 0 0
B    …
C    --                                                               1 1 1

Page 74: Ibm cog institutetalk_diab

Data

• English
– Create Debate (CD)
  • www.createdebate.com
  • Debating on a certain topic
  • Sides are explicitly indicated by discussants in a poll
  • Informal language
– Wikipedia Discussion Forum (WIKI)
  • en.wikipedia.org
  • Group labels are manually annotated
  • Formal language, not much negative polarity

• Arabic
– www.naqeshny.com
– Self-labeled political debates

Page 75: Ibm cog institutetalk_diab

Experimental Conditions

• Clustering algorithm
– S-Link
– Number of clusters by rule of thumb = √(n/2)

• Evaluation metrics
– Purity, Entropy, F-measure

• Baselines
– RAND-BASE: Assign discussants to clusters randomly
– SWD-BASE: Calculate surface word distribution, as a simpler form of perspective
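
The clustering setup can be sketched as follows, assuming that "S-Link" denotes single-link agglomerative clustering and that the rule of thumb is the common k ≈ √(n/2). Both readings are assumptions, as are the Euclidean distance and the toy vectors; `rule_of_thumb_k` and `single_link` are hypothetical names.

```python
import math


def rule_of_thumb_k(n):
    """k = sqrt(n / 2), rounded to the nearest integer and at least 1."""
    return max(1, round(math.sqrt(n / 2)))


def single_link(vectors, k):
    """Merge the two closest clusters until k remain (single-link distance).

    vectors: {name: list of floats}. Returns a list of sets of names.
    """
    clusters = [{name} for name in vectors]

    def dist(a, b):
        # Single link: distance between the closest pair of members.
        return min(math.dist(vectors[x], vectors[y]) for x in a for y in b)

    while len(clusters) > k:
        i, j = min(
            ((i, j) for i in range(len(clusters)) for j in range(i + 1, len(clusters))),
            key=lambda p: dist(clusters[p[0]], clusters[p[1]]),
        )
        clusters[i] |= clusters[j]
        del clusters[j]
    return clusters


# Toy two-dimensional profiles; k is fixed to 2 here for illustration.
toy = {"Peter": [1.0, 0.0], "John": [0.9, 0.1], "Mary": [0.0, 1.0], "Alexander": [0.1, 0.9]}
groups = single_link(toy, 2)
```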

Page 76: Ibm cog institutetalk_diab

English Results

Condition      Wiki                             CD
               Purity   Entropy   F-measure     Purity   Entropy   F-measure
RAND-BASE      0.675    0.563     0.652         0.399    0.966     0.41
SWD-BASE       0.772    0.475     0.646         0.452    0.932     0.432
SD             0.834    0.360     0.667         0.824    0.394     0.596
SE             0.827    0.383     0.655         0.793    0.422     0.582
SD+SE          0.835    0.362     0.665         0.82     0.385     0.604
PERS           0.853    0.321     0.699         0.787    0.399     0.589
SD+PERS        0.853    0.320     0.698         0.849    0.333     0.615
SE+PERS        0.853    0.321     0.702         0.789    0.399     0.591
SD+SE+PERS     0.857    0.310     0.703         0.861    0.315     0.625

Page 77: Ibm cog institutetalk_diab

Observations

Best performance is when we combine explicit attitude (SD: sentiment toward other discussants; SE: sentiment toward entities) with implicit perspective (PERS), regardless of genre

Page 78: Ibm cog institutetalk_diab

Observations

Wiki seems to gain more from implicit perspective compared to CD

Explicit attitude is a better feature for CD: people express their sentiments openly, while in Wiki people are more constrained and subtle in their expressions

Page 79: Ibm cog institutetalk_diab

Observations

Better results than the previous ones on the same data sets. Previous results: Wiki (P 0.66, E 0.55), CD (P 0.64, E 0.68)

Page 80: Ibm cog institutetalk_diab

Arabic Results Using EM

Condition                   Purity   Entropy   F-measure
Signed Network BASELINE     0.71     0.68      0.67
Explicit Attitude           0.67     0.72      0.65
Implicit/Perspective        0.64     0.74      0.65
Our System (combined)       0.77     0.50      0.76

Page 81: Ibm cog institutetalk_diab

Arabic Results Using EM

Significant improvement over baseline

Page 82: Ibm cog institutetalk_diab

Arabic Results Using EM

Significant improvement over baseline

Complementarity between explicit attitude and perspective

Page 83: Ibm cog institutetalk_diab

Conclusions

• We can successfully model sociopragmatic phenomena
– Golden rule of computer science (divide and conquer): form subgroups :)

• There is significant room for improvement

• It takes a large team of computer scientists and significant collaboration with the humanities to get this program going

Page 84: Ibm cog institutetalk_diab

Where are we now?

• Extensive work on Sentiment and Emotion Intensity characterization/detection

• Work on Rumor Detection

• Work on Level of Committed Belief Tagging (check us out at *SEM 2015 and EXPROM 2015)

• Work on Ideological Perspective Detection (check us out at *SEM 2015)

Page 85: Ibm cog institutetalk_diab

Thank you

Questions?