A"Twi&er)Based"Study"of""Newly"Formed"Clippings""in"American"English"!
!
Sravana!Reddy,!Joy!Zhong,!James!Stanford!Dartmouth!College!
Previous!Work!
Baclawski!(2012)!A!study!of!Fs!(‘adorbs’)!in!40!TwiLer!users!
Research!QuesNons!• Are!these!clippings!just!cyclical!“slang”?!(Eble!1996,!2004)!!
• Are!they!an!increasingly!producNve!process!with!new!social!meanings?!!
• Is!this!type!of!clipping!more!producNve!than!past!generaNons?!!
• What!is!the!role!of!the!–s!suffix!(adorbs,(awks,(totes)?!!
• Which!speakers!use!it!the!most?!Age,!gender,!ethnicity?!
This!Study!
Hypothesis:!!Women!are!leading!in!the!usage!of!these!new!clippings,!and!it!is!more!
urban/suburban"than!rural!
Labov!(1990,!2001),!Trudgill!(1972),!Coates!&!Pichler!(2011),!Holmes!&!Meyerhoff!(2003),!Wolfram!&!SchillingFEstes!(2006:155F6)!
Why!use!TwiLer!for!!American!Dialect!Research?!
Each!era!applied!contemporary!technology…!– Kurath!(1939)!– Hanley’s!recordings!(1931F1937)!(Purnell!2012)!– Chambers!&!Trudgill!(1998)!– Labov,!Ash!&!Boberg!(2006)!– Kretzschmar!(2009)!and!many!more !!
Now:!Social!Media!analysis,!computaNonal!modeling,!!
Mechanical!Turk!
TwiLer!for!SociolinguisNcs!– Eisenstein,!O’Connor,!Smith!&!Xing!(2010)!US!regional!variaNon!in!lexical!items!– Bamman,!Eisenstein!&!Schnoebelen!(2012)!Gendered!language!and!networks!– Maybaum!(2012)!!TwiLer!terms!– Zappavigna!(2013)!!TwiLer!discourse!and!variaNon!– Doyle!(2014)!Geographic!distribuNon!of!“needs!done”!
Methodology!• Collected!185!million!geoFtagged!tweets(originaNng!in!the!US!(JulFNov!2013)!!by!893,024!users!
• AutomaNcally!extracted!a!list!of!clippings!!• For!each!word,!created!demographic!profile!of!users!!– Gender!– PopulaNon,!median!age,!and!ethnic!distribuNon!at!the!user’s!locaNon!
• Compared!demographic!features!of!clipping!and!its!original!form!
ExtracNng!Clippings!
• Rather!than!manually!compiling!list!of!clippings,!automaNcally!learn!from!TwiLer!data!
• A!clipping!and!its!original!form!will!be!used!in!roughly!similar!contexts!
ExtracNng!Clippings!
• Represent!every!word!type!as!vector!of!its!leo!and!right!context!
• Rank!every!word!pair!by!context!vector!similarity!!
• Extract!top!ranked!pairs!where!first!three!characters!match!
totes!leo:!am,!are,!was,!were,!is..!right:!okay,!ok,!adorbs,!fine…!!adorable!leo:!is,!so,!these,!looks…!right:!omg,!with,!dork,…!!totally!leo:!am,!are,!was,!were,!is…!right:!okay,!fine,!insane,!not…!!!
• Survey!on!Mechanical!Turk!!• Demographic!quesNons:!age,!gender,!locaNon!!• Rate!familiarity!with!each!clipping!
o Unfamiliar!o Familiar,!but!I!do!not!use!it!o I!use!it!in!speech!only!o I!use!it!in!wriNng!only!o I!use!it!in!speech!and!wriNng!
• Same!survey!also!conducted!with!Dartmouth!undergraduate!students!
Old!vs.!New!Clippings!
Old!vs.!New!Clippings!• Split!survey!respondents!into!ages!18F29!and!30+!• For!each!clipping,!compute!average!familiarity!score!within!the!two!age!groups!
!0!=!Unfamiliar!3!=!Familiar,!but!I!do!not!use!it!4!=!I!use!it!in!speech!only!5!=!I!use!it!in!wriNng!only!6!=!I!use!it!in!speech!and!wriNng!
1.13 0.89 0.76
1.941.52
4.253.75 3.6
4.724.14
sesh obvi perf probs totes
Top858New8Words
Avg8Fam
iliarity8Score
Over830
Under830
Old!vs.!New!Clippings!
1.13 0.89 0.76
1.941.52
4.253.75 3.6
4.724.14
sesh obvi perf probs totes
Top858New8Words
Avg8Fam
iliarity8Score
Over830
Under830
• Newness!score!for!clipping!!=!below!30!familiarity!–!above!30!familiarity!
!
Old!vs.!New!Clippings!
• Newness!score!for!clipping!!=!below!30!familiarity!–!above!30!familiarity!
• Threshold!at!1.0!newness!score!
sesh!
-2 -1 0 1 2 3
perf!adorbs!ridic!vom!alc!vid!frat!perv!choc!cig!
New!clippings!(25)!Old!clippings!(55)!
doc!
Old!vs.!New!Clippings!sesh!perf!adorbs!ridic!vom!alc!vid!frat!perv!choc!cig!
New!clippings!(25)!Old!clippings!(55)!
doc!
obvi!
probs!
totes!
cray!
choreo!
presh!
craycray!
convo!def!
hilar!
gorg!
guac!
vocab!
obv!
fam!
adorb!
calc!
chem!esp!
prez!
merch!
vacay!
roomie!
mins!
diff!
parm!
prac!
pedi!
sched!
prob!
prolly!
sibs!
delish!
defs!
sus!
pic!
intro!
comfy!
prep!
secs!
approx!
fave!
milli!
collab!
combo!
apps!perv!
pregs!
champ!
info!
preggo!
breaky!
liq!
taL!
meds!
undies!
defly!
gents!
lolly!
anon!
fams!
ortho!
preg!num!
McD’s!
chiro!
cig!
Clippings!on!TwiLer!
0!100000!200000!300000!400000!500000!600000!700000!800000!900000!
Clipping! Original!
Old!New!
0!100000!200000!300000!400000!500000!600000!700000!800000!900000!
Clipping! Original!
Number!of!users!
Demographic!Analysis!• Gender!
(following!Bamman!et!al.)!– Most!TwiLer!users!report!a!name!in!addiNon!to!their!pseudonym!!!!
– Match!first!name!against!the!Social!Security!AdministraNon!list!of!baby!names!born!in!1995!
– About!2/3!of!users!have!names!in!the!SSA!list!and!are!assigned!a!gender!
Demographic!Analysis!• LocaNon!!
(following!Eisenstein!et!al.)!– Tweets!are!geoFtagged!with!laNtude/longitude!– Map!each!geoFcoordinate!to!one!of!33000!!Zip!Code!TabulaNon!Areas!(ZCTAs)!
– Ignore!users!that!tweet!from!more!than!one!ZCTA!– Get!demographic!aLributes!of!ZCTAs!from!2010!Census:!PopulaNon,!Median!Age,!White%,!African!American%,!Asian%,!NaNve!American%,!Hispanic%!
– Each!user!is!now!associated!with!a!demographic!profile!of!their!environment!
Demographic!Analysis!
• LogisNc!regression!– Predicted!variable:!clipping!or!original?!– Features:!demographic!profile!of!users!!
• Gender!• PopulaNon!• Median!Age!• Ethnicity!
New!and!Old!Clippings!Log!od
ds!
All!factors!shown!are!!significant!(p<0.05)!
F0.15!
F0.1!
F0.05!
0!
0.05!
0.1!
New! Old!
Gender!Female!
0!
0.005!
0.01!
0.015!
0.02!
0.025!
0.03!
0.035!
New! Old!
Log!od
ds!
Log10!PopulaNon!
Log!od
ds!
No!Significant!Effects!
New!and!Old!Clippings!Log!od
ds!
F0.005!
F0.004!
F0.003!
F0.002!
F0.001!
0!
0.001!
0.002!
0.003!
0.004!
0.005!
New! Old!
Log!od
ds!
Median!Age! Ethnicity!
Usage!of!Fs!suffix!in!clippings!
0!
0.1!
0.2!
0.3!
0.4!
0.5!
0.6!
Gender!Female!
Log!od
ds!
F0.016!
F0.014!
F0.012!
F0.01!
F0.008!
F0.006!
F0.004!
F0.002!
0!
Log!od
ds!
Median!Age!
adorbs/adorb,!probs/prob,!fams/fam,!awks/awk,!pregs/preg,!defs/def!!
Conclusion!
Hypothesis(Confirmed:!!Women!are!leading!in!the!usage!of!these!new!clippings,!and!it!is!more!
urban/suburban"than!rural!
Are!clippings!a!TwiLer!arNfact?!They!abound!in!longFform!blog!posts!too…!
…!and!TwiLer!users!ooen!lengthen!words!
Are!clippings!a!TwiLer!arNfact?!Baldwin!et!al.!(2013)!measure!average!word!lengths!in!TwiLer!and!different!corpora!
Are!clippings!a!TwiLer!arNfact?!Eisenstein!et!al.!(2013)!find!shortened!forms!!are!mainly!used!in!tweets!of!length!much!less!than!!140!characters.!Shortening!is!not!used!in!order!to!!fit!length!constraints!!
Are!clippings!a!TwiLer!arNfact?!Our!experiment:!avg!lengths!of!tweets!containing!!clippings!compared!to!tweets!with!original!forms!
77.01!81.64!
0!
20!
40!
60!
80!
100!
120!
140!
Clipping! Original!
Old!New!
65.08! 73.74!0!
20!
40!
60!
80!
100!
120!
140!
Clipping! Original!
Future!Work!
• Track!spread!of!clippings!in!TwiLer!over!Nme!– Will!these!clippings!spread!throughout!the!populaNon?!
– Geographic/demographic!dimensions!of!spread?!– When!did!these!clippings!originate?!!!
• MorphoFphonological!study!of!clippings!!