Lynne Grewe and Sushmita Pandey, California State University East Bay, lynne.grewe@csueastbay.edu

Post on 11-Jan-2016

Page 1: Lynne Grewe and Sushmita Pandey California State University East Bay lynne.grewe@csueastbay.edu

Lynne Grewe and Sushmita Pandey

California State University East Bay
lynne.grewe@csueastbay.edu

Page 2: Lynne Grewe and Sushmita Pandey California State University East Bay lynne.grewe@csueastbay.edu

The Goal
Using Social Data to make Social Advertisement Recommendations.

[Diagram labels: PPARS, Social Network Application, Social Network, User and Friends, Advertisements. Example social message: "Your friends Nathan and Marty will like this"]

Page 3: Lynne Grewe and Sushmita Pandey California State University East Bay lynne.grewe@csueastbay.edu

The Problems
What is the Social data?
Which Social Data is usable/best?
How do we capture and analyze it?
How do we relate Social data to Advertisements?
How do we deliver a Social Advertisement?

Page 4: Lynne Grewe and Sushmita Pandey California State University East Bay lynne.grewe@csueastbay.edu

The Environment
Social Network: MySpace, Facebook, Hi5, Orkut, LinkedIn, Netlog, more

Page 5: Lynne Grewe and Sushmita Pandey California State University East Bay lynne.grewe@csueastbay.edu

Overview of Talk
PPARS overview
Data – problem of multiple networks
Example of Data
Parsing
Quantization
Results
Advertisement Recommendation Results
Future Work

Page 6: Lynne Grewe and Sushmita Pandey California State University East Bay lynne.grewe@csueastbay.edu

Our System Overview
PPARS = Peer Pressure Advertisement Recommendation System

[Diagram labels: DATA INPUT; FRONT END (get user-friends quantized); Process groups; Quantized Ad; User Ad choice; Peer-Pressure Ad Selection (Group/Ad matches & socialize); User-origin; Model Ads]

Page 7: Lynne Grewe and Sushmita Pandey California State University East Bay lynne.grewe@csueastbay.edu

Social Data
Every network can provide different social data.
Two main splits: Facebook and OpenSocial (majority of others).

OpenSocial is an open standard adopted by over 30 containers and growing, with an international audience.
Allows for “standardized” access.
Popular containers like MySpace, LinkedIn, Google, Yahoo!, etc.
Corporate support: Google, Yahoo!, IBM, Microsoft, and more.

Page 8: Lynne Grewe and Sushmita Pandey California State University East Bay lynne.grewe@csueastbay.edu

Data Fields
About Me, Activities, Addresses, Age, Body_type, Books, Cars, Children, Current_Location, Date_Of_Birth, Drinker, Emails, Ethnicity, Fashion, Food, Gender, Happiest_when, Has_app, Heroes, Humor, ID, Interests, Job_interests, Jobs, Languages_Spoken, Living_Arrangements, Looking_for, Movies, Music, Name, Network Presence, Nick Name, Pets, Phone, Political Views, Profile song, Profile url, Profile video, Quotes, Relationship status, Religion, Romance, Scared Of, Schools, Sexual Orientation, Sports, Status, Tags, Thumbnail Url, Time Zone, Turn Ons, Turn Offs, TV Shows, URLS

Page 9: Lynne Grewe and Sushmita Pandey California State University East Bay lynne.grewe@csueastbay.edu

Some Example Data

AboutMe Ok, so I am a graduate of with degrees in Philosophy, and Religion. I currently live in with my wife and daughter. I enjoy Snowboarding/skiing, Motorcycles, computers, sports cars, and hanging out with friends.  

Page 10: Lynne Grewe and Sushmita Pandey California State University East Bay lynne.grewe@csueastbay.edu

Some Example Data
Age 33

Books The Professor and the Madman, Plato, Aristotle, Locke, Hume, Kant, luscombe

Movies Things to do in Denver when yer dead, The Departed, Encino Man, Real Genius

Music Very Eclectic, including Pennywise, Disturbed, System of a Down, Linkin Park, Senses Fail, Mudvayne, Goldfinger, and a bunch of others I am sure I cannot remember at this time

Music allen to // chimaira // sw1tched // bleed the sky // destiny // 40 below summer // endo // nothingface // enhancer // watcha // lamb of god // soilwork // skrape // flaw // unearth // slodust // deftones // raunchy // devildriver // reveille // american head charge // nonpoint // stutterfly // factory 81 // in flames // (hed) p.e. // dry kill logic // primer 55 // 36 crazyfists // sevendust // taproot // candiria // bionic jive // funeral for a friend // .....

Television Smallville, heros

Page 11: Lynne Grewe and Sushmita Pandey California State University East Bay lynne.grewe@csueastbay.edu

Some Example Data
Interests Snowboarding/skiing, Motorcycles, computers, sports cars, and hanging out with friends.

Page 12: Lynne Grewe and Sushmita Pandey California State University East Bay lynne.grewe@csueastbay.edu

Some Example Data
Status Married
Status In a Relationship
Smoker No
Drinker Yes
Heroes Father
Heroes Freie Stelle als Held zu vergeben, Bewerbungen bitte an mich... (German: "Hero position open, please send applications to me...")
Looking_for Networking, Friends
Ethnicity White / Caucasian
Children Proud parent
Sexual_Orientation Straight

Page 13: Lynne Grewe and Sushmita Pandey California State University East Bay lynne.grewe@csueastbay.edu

Some Example Data
Schools
University Of Nevada-Reno, Reno, NV. Graduated: N/A. Degree: Master's Degree. Major: Hydrogeology. 2007 to Present
Purdue University-Main Campus, West Lafayette, Indiana. Graduated: 2003. Student status: Alumni. Degree: Bachelor's Degree. Major: Philosophy. Minor: CPT. Clubs: Purdue Student Government Liberal Arts Student Council. Greek: Delta Chi. 2001 to 2003
Reed Hs, Sparks, NV. Graduated: N/A. Student status: Alumni. Degree: High School Diploma

Page 14: Lynne Grewe and Sushmita Pandey California State University East Bay lynne.grewe@csueastbay.edu

Social Data – which?
Not all networks provide access to the same data.
Users can keep information private.
Not all data is “social”.
Not all data is directly useful for advertisers.

Page 15: Lynne Grewe and Sushmita Pandey California State University East Bay lynne.grewe@csueastbay.edu

Data

Current_Location

Date_Of_Birth Addresses Phone

Not typically available / private

Not all data is “social”

Not all data is directly useful for advertisers

ID Name Has_app Nick_Name Network Presence

Profile url Profile song Profile video Thumbnail URL URLs

Drinker Emails Ethnicity Fashion Food

Page 16: Lynne Grewe and Sushmita Pandey California State University East Bay lynne.grewe@csueastbay.edu

Infrequent data
Our scheme needs data that users have in common, so that we can reason over a common feature space.
Data that is NOT frequent: Cars, Fashion, Food, Humor, Political Views, Pets, Heroes

Page 17: Lynne Grewe and Sushmita Pandey California State University East Bay lynne.grewe@csueastbay.edu

Social Data - which
First go-around: based on network availability and commonality, user prevalence, and estimated advertisement usefulness.
Balance between small sample space and feature dimensionality.
About Me, Activities, Age, Gender, Books, TV, Music, Looking For, Drinker, Relationship, Ethnicity, Religion, Language, Interests, Date_Of_Birth, Smoker

Page 18: Lynne Grewe and Sushmita Pandey California State University East Bay lynne.grewe@csueastbay.edu

PPARS – Front End
[Diagram: User Data and Friend 1, Friend 2, ... Friend X data (e.g. "I like cars, have 2 kids,…..", "Movies: Star Wars", "Age= 30 …..") pass through PARSING into Individual Social Data Tokens, then through QUANTIZATION, which draws on Codebooks, Web Services, and the Ontology Codebook, to produce the Set of User and Friend Quantized Data Vectors.]

Page 19: Lynne Grewe and Sushmita Pandey California State University East Bay lynne.grewe@csueastbay.edu

Parsing
Create small social data tokens to pass to Quantization.

[Diagram labels: Raw Social Data; Null Data Test; Hierarchical Segmentation (Split by . / ! / ?, Split by :, Split by -, Split by ;, Split by ,); Individual Social Data Tokens]

Input: I like lots of movies. Like:Star Wars, Star Wars II, Jaws.And I love Harrison Fords acting.

Resulting tokens:
• I like lots of movies
• Like
• Star Wars
• Star Wars II
• Jaws
• And I love Harrison Fords acting.
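The segmentation above can be sketched in a few lines of Python. This is a minimal illustration, not the PPARS implementation: the function name `parse_tokens` is hypothetical, the Null Data Test is assumed to be a simple empty-input check, and the delimiter levels are assumed to be applied in the order listed on the slide.

```python
import re

# Delimiter levels in the order listed on the slide (assumed order of application).
DELIMITER_LEVELS = [r"[.!?]", r":", r"-", r";", r","]


def parse_tokens(raw):
    """Split raw social data into small tokens, one delimiter level at a time."""
    # Null data test (assumption: empty or missing input yields no tokens).
    if raw is None or not raw.strip():
        return []
    segments = [raw]
    for pattern in DELIMITER_LEVELS:
        next_segments = []
        for segment in segments:
            next_segments.extend(re.split(pattern, segment))
        segments = next_segments
    # Drop empty fragments and surrounding whitespace.
    return [s.strip() for s in segments if s.strip()]


tokens = parse_tokens(
    "I like lots of movies. Like:Star Wars, Star Wars II, Jaws."
    "And I love Harrison Fords acting."
)
print(tokens)
```

Splitting on every "-" also breaks hyphenated words, which is one reason purely delimiter-based segmentation is only a first approximation.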

Page 20: Lynne Grewe and Sushmita Pandey California State University East Bay lynne.grewe@csueastbay.edu

Parsing Example
About Me input = "I work as an engineer at Motorola. I work in the peripherals department and do chip design. I am doing some management."

Resulting Social Data Tokens:
I work as an engineer at Motorola
I work in the peripherals department and do chip design
I am doing some management

Page 21: Lynne Grewe and Sushmita Pandey California State University East Bay lynne.grewe@csueastbay.edu

Parsing Example
Interests input = “Internet, Movies, Reading, Karaoke,Building alternate communities”

Resulting Social Data Tokens:
Internet
Movies
Reading
Karaoke
Language
Building alternative communities

Page 22: Lynne Grewe and Sushmita Pandey California State University East Bay lynne.grewe@csueastbay.edu

Parsing Example
Music input = “Bands: Superdrag, Weezer, The Doors, The Beach Boys, Journey Solo Artists: Billy Joel, Albums: Appetite for Destruction - Guns & Roses; Blue - Weezer“

Resulting Social Data Tokens:
Bands
Superdrag
The Doors
Cheap Trick
The Beach Boys
Journey
Solo Artists
Billy Joel
Albums
Appetite for Destruction
Guns & Roses
Blue
Weezer

Lost formatting of line return between Journey and Solo Artists.

Page 23: Lynne Grewe and Sushmita Pandey California State University East Bay lynne.grewe@csueastbay.edu

Parsing
Simple technique of segmentation.
Future work – include semantics of phrases to detect potential “headings”, and syntax rules around delimiters like : and –

Page 24: Lynne Grewe and Sushmita Pandey California State University East Bay lynne.grewe@csueastbay.edu

Quantization
Take a social data token and translate it into a numerical feature vector.
“I like cars” → Cars = 0.2

For each social data field we need to create meaningful feature vector elements.

For each social data field we need to come up with techniques/algorithms to translate the raw social data token into support for its different feature vector elements.

Page 25: Lynne Grewe and Sushmita Pandey California State University East Bay lynne.grewe@csueastbay.edu

Quantization - feature vector
Pattern Recognition and Matching are later parts of PPARS.
These need numerical representations of our user and friend social data, and also of the Ads.

“I like cars” = ??? What ad??

Cars = 0.2 → Ad with cars around 0.2

Page 26: Lynne Grewe and Sushmita Pandey California State University East Bay lynne.grewe@csueastbay.edu

Quantization – feature vector
For each social data element like “About Me”, “Gender”, “Movies” we have designed its own feature vector.

Result of the technique used to quantize the input social token data.

Result of studying keywords/trends in a user database of sample social tokens.

To understand this, let's first discuss the techniques used to quantize social data tokens as they relate to the “type” of data element.

Page 27: Lynne Grewe and Sushmita Pandey California State University East Bay lynne.grewe@csueastbay.edu

Quantization and Social Data Type

Numerical Data
Data is naturally numerical, e.g. Age, Date of Birth.
Can be quickly and effectively translated into a number in some defined range:
Address – can be translated into latitude and longitude
Phone – again limited in digits
Time zone – again predefined ranges

Categorizable Data
Data where there is a predefined accepted taxonomy, e.g. movies and their genre.
Data where categories can be derived through sample analysis and advertisement goals.
Examples: interests, about me, food, fashion

Indexed Data
Data that has defined sets of values specific to either the container or OpenSocial.
Example: smoker = yes, no, occasionally, quit, never
Other examples: gender, relationship, drinker, sexual orientation

Other
Data for which we cannot easily derive an algorithm for categorizing.
Examples: Profile Image, Profile Song URL, etc.

Page 28: Lynne Grewe and Sushmita Pandey California State University East Bay lynne.grewe@csueastbay.edu

Collapsing of Data
Some data fields have almost the same meaning, or their content typically overlaps greatly:
About Me and Interests (and even Status)
Age and Date of Birth

Page 29: Lynne Grewe and Sushmita Pandey California State University East Bay lynne.grewe@csueastbay.edu

Categorizable Data
This is the bulk of the data fields: About Me, Interests, Music, Movies, TV, Books, Looking For, Religion, Ethnicity, Language

Determine Feature Elements:
Accepted “standard” taxonomies
Web Service taxonomies
Advertisement driven taxonomies

Page 30: Lynne Grewe and Sushmita Pandey California State University East Bay lynne.grewe@csueastbay.edu

PPARS – Front End
[Diagram: User Data and Friend 1, Friend 2, ... Friend X data (e.g. "I like cars, have 2 kids,…..", "Movies: Star Wars", "Age= 30 …..") pass through PARSING into Individual Social Data Tokens, then through QUANTIZATION, which draws on Codebooks, Web Services, and the Ontology Codebook, to produce the Set of User and Friend Quantized Data Vectors.]

Page 31: Lynne Grewe and Sushmita Pandey California State University East Bay lynne.grewe@csueastbay.edu

Categorization: Web Service
For some of our social data fields we are able to utilize popular web services to convert our social data tokens into search hits that have categorized information associated with them.

Example: Internet Video Archive and IMDB. Use movie genre.

Page 32: Lynne Grewe and Sushmita Pandey California State University East Bay lynne.grewe@csueastbay.edu

IVA – movie search by actor “Robert Redford”
http://api.internetvideoarchive.com/Video/MoviesByActorName.aspx?DeveloperId=f377f57f-3bad-4704-8e80-1b643b206abd&SearchTerm=Robert+Redford

Some of the Results:
<item>
  <Description>
    <![CDATA[ The Unforeseen movie trailer - starring Robert Redford, Willie Nelson, Ann Richards, Gary Bradley, Judah Folkman, William Greider. Directed by Laura Dunn. Theatrical Release Date: 2/29/2008 Genre: Documentary Rating: Not Rated ]]>
  </Description>
  <Title>THE UNFORESEEN</Title>
  <Language>English</Language>
  <Country>United States</Country>
  <SiteUrl />
  <Studio>Two Birds Films</Studio>
  <StudioID>3018</StudioID>
  <Rating>Not Rated</Rating>
  <Genre>Documentary</Genre>
  <GenreID>13</GenreID>

Page 33: Lynne Grewe and Sushmita Pandey California State University East Bay lynne.grewe@csueastbay.edu

IVA – movie search continued
http://api.internetvideoarchive.com/Video/MoviesByActorName.aspx?DeveloperId=f377f57f-3bad-4704-8e80-1b643b206abd&SearchTerm=Robert+Redford

  <HomeVideoReleaseDate>9/16/2008</HomeVideoReleaseDate>
  <TheatricalReleaseDate>2/29/2008</TheatricalReleaseDate>
  <Director>Laura Dunn</Director>
  <DirectorID>36635</DirectorID>
  <Actor1>Robert Redford</Actor1>
  <ActorId1>7105</ActorId1>
  <Actor2>Willie Nelson</Actor2>
  <ActorId2>8591</ActorId2>
  <Actor3>Ann Richards</Actor3>
  <ActorId3>36642</ActorId3>
  <Actor4>Gary Bradley</Actor4>
  <ActorId4>36637</ActorId4>

Page 34: Lynne Grewe and Sushmita Pandey California State University East Bay lynne.grewe@csueastbay.edu

IVA – movie search continued
http://api.internetvideoarchive.com/Video/MoviesByActorName.aspx?DeveloperId=f377f57f-3bad-4704-8e80-1b643b206abd&SearchTerm=Robert+Redford

  <HomeVideoReleaseDate>9/16/2008</HomeVideoReleaseDate>
  <Link>http://videodetective.com/titledetails.aspx?publishedid=947964</Link>
  <BoxOfficeInMillions>-1</BoxOfficeInMillions>
  <!-- Television Content -->
  <AirDayOfWeek>-1</AirDayOfWeek>
  <AirStartTime />
  <ShowLengthInMinutes>-1</ShowLengthInMinutes>
  <IsTelevisionContent>false</IsTelevisionContent>
  <FirstReleasedYear>2008</FirstReleasedYear>
  <Image>http://content.internetvideoarchive.com/content/photos/1250/05253626_.jpg</Image>
  <Duration>164</Duration>
  <DateCreated>3/20/2008 8:00:00 AM</DateCreated>
  <Media>Movie</Media>
  <PublishedId>947964</PublishedId>
  <DateModified>4/22/2011 1:57:00 PM</DateModified>

AND MORE !!!!

selected GENRE

Page 35: Lynne Grewe and Sushmita Pandey California State University East Bay lynne.grewe@csueastbay.edu

IVA genres --- our movie feature elements

VideoCategory

Not Assigned

Western

Action-Adventure

Children's

Comedy

Drama

Family

Horror

Musical

Mystery-Suspense

Non-Fiction

Sci-Fi

War

Health/ Workout

Documentary

Thriller

Biography

Romance

Page 36: Lynne Grewe and Sushmita Pandey California State University East Bay lynne.grewe@csueastbay.edu

Movie Quantization
For each social data token (“Adam Sandler”, “Star Wars”) we can get multiple hits.

Example, “Robert Redford”, first 8 hits:
Drama = 5
Western = 1
Documentary = 2

Issues:
How do we know if it is an actor name, movie title, director or other?
Multiple hits for actor or director: what do we do? (evidence them all)
Multiple hits for movie title: what do we do? (take first hit)

These genres become our Movie feature elements.

Page 37: Lynne Grewe and Sushmita Pandey California State University East Bay lynne.grewe@csueastbay.edu

Order of Movie Quantization
Given any social data element parsed from the user’s MOVIE data, we cannot know a priori whether it is a title, an actor's name, or a director's name. It may even be the genre of movies a user likes.

1. Title search (take first hit)

2. Actor search (evidence all)

3. Director search (evidence all)

4. Keyword matching (see next)
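The search order above can be sketched as a simple cascade. This is a hedged illustration: the slide gives the order but not the stopping rule, so stopping at the first stage that returns hits is an assumption, and the lookup tables below are hypothetical stand-ins for the real web service calls.

```python
def quantize_movie_token(token, title_search, actor_search,
                         director_search, keyword_match):
    """Return the list of genre names evidenced for one MOVIE token."""
    title_hits = title_search(token)
    if title_hits:
        return [title_hits[0]]        # title: take the first hit only
    actor_hits = actor_search(token)
    if actor_hits:
        return list(actor_hits)       # actor: evidence all hits
    director_hits = director_search(token)
    if director_hits:
        return list(director_hits)    # director: evidence all hits
    return list(keyword_match(token)) # fall back to keyword matching


# Hypothetical lookup tables standing in for service calls.
TITLES = {"star wars": ["Sci-Fi", "Action-Adventure"]}
ACTORS = {"robert redford": ["Drama", "Drama", "Western", "Documentary"]}
DIRECTORS = {}
KEYWORDS = {"drama": ["Drama"]}


def lookup(table):
    """Build a search function over one of the stand-in tables."""
    return lambda token: table.get(token.strip().lower(), [])


searches = (lookup(TITLES), lookup(ACTORS), lookup(DIRECTORS), lookup(KEYWORDS))
title_result = quantize_movie_token("Star Wars", *searches)
actor_result = quantize_movie_token("Robert Redford", *searches)
keyword_result = quantize_movie_token("Drama", *searches)
unknown_result = quantize_movie_token("MacMusical", *searches)
```

Passing the search functions as arguments keeps the cascade independent of any particular service.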

Page 38: Lynne Grewe and Sushmita Pandey California State University East Bay lynne.grewe@csueastbay.edu

Quantization Result 1
Input: Up, Forrest Gump, Rear Window, District 9, Pac-Man, WALL·E, My Flesh and Blood, MacMusical

Yields: MOVIE_FAMILY=0.6, MOVIE_SCIFI=0.2, MOVIE_DOCUMENTARY=0.4, MOVIE_THRILLER=0.2
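One reading consistent with these numbers is that each genre hit adds a fixed amount of evidence to its feature element. The sketch below assumes an increment of 0.2 per hit and a value of -1 for elements with no evidence; both are assumptions, and the hit list is hypothetical rather than actual service output.

```python
from collections import Counter

EVIDENCE_PER_HIT = 0.2   # assumed increment per genre hit
NO_EVIDENCE = -1         # assumed marker for elements with no hits


def accumulate_genres(genre_hits, genres):
    """Turn a list of genre hits into MOVIE_* feature elements."""
    counts = Counter(genre_hits)
    features = {}
    for genre in genres:
        name = "MOVIE_" + genre.upper().replace("-", "")
        if counts[genre]:
            features[name] = round(counts[genre] * EVIDENCE_PER_HIT, 2)
        else:
            features[name] = NO_EVIDENCE
    return features


# Hypothetical hits for the eight tokens in the example above.
hits = ["Family", "Family", "Family", "Sci-Fi",
        "Documentary", "Documentary", "Thriller"]
movie_features = accumulate_genres(
    hits, ["Family", "Sci-Fi", "Documentary", "Thriller", "Western"]
)
print(movie_features)
```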

Page 39: Lynne Grewe and Sushmita Pandey California State University East Bay lynne.grewe@csueastbay.edu

Quantization using other services
TV - IMDB, http://www.imdb.com/search/title?title_type=tv_series&title=".

Books - Google Books Search, http://books.google.com/books/feeds/volumes?

Music - IVA’s music API, http://api.internetvideoarchive.com/Music/**

Page 40: Lynne Grewe and Sushmita Pandey California State University East Bay lynne.grewe@csueastbay.edu

Quantization via Keyword Matching
What do we do when there is no pre-determined taxonomy and no services for database hits?
Natural Language Processing techniques.

Currently we employ a simple (but effective and efficient) technique of keyword matching/lookup:
Create a database of predetermined phrases/keywords.
Use a lookup scheme to quantize social data token(s).

[Diagram: Individual Social Data Tokens are looked up in Codebooks and the Ontology Codebook to produce the Set of User and Friend Quantized Data Vectors.]

“I work as an engineer” → About Me lookup??
“Watch a lot of drama” → Movies lookup??

Page 41: Lynne Grewe and Sushmita Pandey California State University East Bay lynne.grewe@csueastbay.edu

Keyword Database
Used on: About Me / Interests, Religion, Ethnicity, Looking For, Language, Relationship

Secondary use: Books, TV, Music, Movies, when the service fails to provide any hits.

Page 42: Lynne Grewe and Sushmita Pandey California State University East Bay lynne.grewe@csueastbay.edu

Keyword Database Creation
manual scanning of hundreds (at starting level) of user profiles
domain-specific expert (human) knowledge
dictionaries and taxonomies when they exist

Issue: how to determine weights for every entry? Expert determined (consistency) or all equal valued (no sense of importance).

Issue: at the very beginning level, can we create a dictionary for everything? No. Are there more advanced NLP techniques?

Page 43: Lynne Grewe and Sushmita Pandey California State University East Bay lynne.grewe@csueastbay.edu

Some arbitrary Keyword DB entries
ABOUT_ME HOME Cats 0.2
ABOUT_ME HOME Children 0.2
ABOUT_ME HOME Daughter 0.2
ABOUT_ME HOME Dog 0.2
ABOUT_ME HOME home 0.5

Page 44: Lynne Grewe and Sushmita Pandey California State University East Bay lynne.grewe@csueastbay.edu

Some arbitrary Keyword DB entries
ABOUT_ME ENTERTAINMENT Shopping 0.2
ABOUT_ME ENTERTAINMENT Shows 0.2
ABOUT_ME ENTERTAINMENT Sing 0.2
ABOUT_ME ENTERTAINMENT Ski 0.2
ABOUT_ME ENTERTAINMENT Songwriter 0.2

Page 45: Lynne Grewe and Sushmita Pandey California State University East Bay lynne.grewe@csueastbay.edu

Keyword DB- evidence weight

Issue: how to determine weights for every entry? Expert determined (consistency) or all equal valued (no sense of importance).

System options: DB weights can take on different values, with an option to run with all weights equal.

Page 46: Lynne Grewe and Sushmita Pandey California State University East Bay lynne.grewe@csueastbay.edu

Keyword DB - ??
Issue: at the very beginning level, can we create a dictionary for everything? No. Are there more advanced NLP techniques to explore for inferences?

While users can write anything (and do), remember we are focused on Advertisement Recommendation, so the scope of our language is limited to hits related to our feature vector elements. This is a constrained problem.

Home, Entertainment, Smoking, Work, Social, Movies, TV, Shopping, Books, etc.: these are the kinds of areas we are concerned with.

Page 47: Lynne Grewe and Sushmita Pandey California State University East Bay lynne.grewe@csueastbay.edu

Types of Keyword Matching

STRICT
Social data token must match exactly a DB entry.
“Drama” vs. DB entry Drama √
“I like Drama” vs. DB entry Drama X

DB_ENTRY_CONTAINS_DATA_ELEMENT
Data token must exist inside the DB entry.
“Drama” vs. DB entry Drama and Comedy √

DB_ENTRY_PARTOF_DATA_ELEMENT
Part of the data token matches the DB entry (this is further segmenting the data token).
“I like Drama” vs. DB entry Drama √
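A minimal sketch of the three matching modes, using the mode names from the slide. Case-insensitive comparison and plain substring tests are assumptions made for illustration; a fuller version would likely match on word boundaries.

```python
from enum import Enum


class MatchMode(Enum):
    STRICT = "STRICT"
    DB_ENTRY_CONTAINS_DATA_ELEMENT = "DB_ENTRY_CONTAINS_DATA_ELEMENT"
    DB_ENTRY_PARTOF_DATA_ELEMENT = "DB_ENTRY_PARTOF_DATA_ELEMENT"


def matches(token, db_entry, mode):
    """Return True if the social data token matches the DB entry under mode."""
    t = token.strip().lower()
    e = db_entry.strip().lower()
    if mode is MatchMode.STRICT:
        return t == e                 # token equals the entry exactly
    if mode is MatchMode.DB_ENTRY_CONTAINS_DATA_ELEMENT:
        return t in e                 # token appears inside the entry
    if mode is MatchMode.DB_ENTRY_PARTOF_DATA_ELEMENT:
        return e in t                 # entry appears inside the token
    raise ValueError(f"unknown mode: {mode}")


print(matches("I like Drama", "Drama", MatchMode.STRICT))
print(matches("I like Drama", "Drama", MatchMode.DB_ENTRY_PARTOF_DATA_ELEMENT))
```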

Page 48: Lynne Grewe and Sushmita Pandey California State University East Bay lynne.grewe@csueastbay.edu

Quantization Results: different kinds of Keyword Matching
Input: 'I am a student and I work and love cars'

Output, STRICT: no hits
ABOUT_ME_ENTERTAINMENT = -1
ABOUT_ME_WORK = -1
ABOUT_ME_HOME = -1
ABOUT_ME_SOCIAL = -1
ABOUT_ME_FOOD = -1

Page 49: Lynne Grewe and Sushmita Pandey California State University East Bay lynne.grewe@csueastbay.edu

Quantization Results: different kinds of Keyword Matching
Input: 'I am a student and I work and love cars'

Output, DB_ENTRY_CONTAINS_DATA_ELEMENT: no hits
ABOUT_ME_ENTERTAINMENT = -1
ABOUT_ME_WORK = -1
ABOUT_ME_HOME = -1
ABOUT_ME_SOCIAL = -1
ABOUT_ME_FOOD = -1

Page 50: Lynne Grewe and Sushmita Pandey California State University East Bay lynne.grewe@csueastbay.edu

Quantization Results different kinds of Keyword Matching

‘ I am a student and I work and love cars' 

Output, DB_ENTRY_PARTOF_DATA_ELEMENT:
keyword = student: ABOUT_ME_WORK = 0.2
keyword = work: ABOUT_ME_WORK = 0.5
keyword = cars: ABOUT_ME_ENTERTAINMENT = 0.2
keyword = LOVE: ABOUT_ME_HOME = 0.2, ABOUT_ME_SOCIAL = 0.2

ABOUT_ME_ENTERTAINMENT = 0.2
ABOUT_ME_WORK = 0.7
ABOUT_ME_HOME = 0.2
ABOUT_ME_SOCIAL = 0.2
ABOUT_ME_FOOD = -1
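The totals above are what you get if keyword weights are summed per feature element and elements with no hits are left at -1. The sketch below reproduces that arithmetic with the keyword weights shown for this example; the table layout, the function name `quantize_about_me`, and whole-word matching are illustrative assumptions.

```python
import re

# (keyword, feature element, weight) entries taken from the example above.
KEYWORD_DB = [
    ("student", "ABOUT_ME_WORK", 0.2),
    ("work", "ABOUT_ME_WORK", 0.5),
    ("cars", "ABOUT_ME_ENTERTAINMENT", 0.2),
    ("love", "ABOUT_ME_HOME", 0.2),
    ("love", "ABOUT_ME_SOCIAL", 0.2),
]
FEATURES = [
    "ABOUT_ME_ENTERTAINMENT", "ABOUT_ME_WORK", "ABOUT_ME_HOME",
    "ABOUT_ME_SOCIAL", "ABOUT_ME_FOOD",
]
NO_EVIDENCE = -1


def quantize_about_me(token):
    """Sum keyword weights per feature element; -1 where nothing matched."""
    words = set(re.findall(r"[a-z']+", token.lower()))
    totals = {}
    for keyword, feature, weight in KEYWORD_DB:
        if keyword in words:
            totals[feature] = totals.get(feature, 0.0) + weight
    return {
        f: round(totals[f], 2) if f in totals else NO_EVIDENCE
        for f in FEATURES
    }


vector = quantize_about_me("I am a student and I work and love cars")
null_vector = quantize_about_me("i am xing ju")
print(vector)
```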

Page 51: Lynne Grewe and Sushmita Pandey California State University East Bay lynne.grewe@csueastbay.edu

Quantization Results 2 – using DB_ENTRY_PARTOF_DATA_ELEMENT
Input: “Fell in love with computers at 11, never got over it... Nonetheless, I have always understood that human problems are solved by people, not technology. My lifes work has been to empower communities to design and build their own solutions.”

6 data tokens from parsing.

RESULTS:
ABOUT_ME_ENTERTAINMENT = 0.2
ABOUT_ME_WORK = 0.5
ABOUT_ME_HOME = 0.2
ABOUT_ME_SOCIAL = 0.2
ABOUT_ME_FOOD = -1

Page 52: Lynne Grewe and Sushmita Pandey California State University East Bay lynne.grewe@csueastbay.edu

Quantization Result 3 – good null results
Input: i am xing ju. test ABOUT ME for opensocial.

Parsed results:
i am xing ju
test ABOUT ME for opensocial

NO keyword db hits:
ABOUT_ME_ENTERTAINMENT => -1
ABOUT_ME_WORK => -1
ABOUT_ME_HOME => -1
ABOUT_ME_SOCIAL => -1
ABOUT_ME_FOOD => -1

Page 53: Lynne Grewe and Sushmita Pandey California State University East Bay lynne.grewe@csueastbay.edu

Quantization Results
Garbage in and garbage out.

"LoL really dude that is the way to be" → no hits

Is this garbage? “LoL” = lots of love... could you interpret this to mean someone interested in social / friends? Future: deeper interpretation / semantic analysis?

Page 54: Lynne Grewe and Sushmita Pandey California State University East Bay lynne.grewe@csueastbay.edu

Indexed
Smoker, Drinker, Gender, Relationship (some networks), Looking for (some networks), etc.

Example for Drinker:

opensocial.Enum.Drinker.HEAVILY
opensocial.Enum.Drinker.NO
opensocial.Enum.Drinker.OCCASIONALLY
opensocial.Enum.Drinker.QUIT
opensocial.Enum.Drinker.QUITTING
opensocial.Enum.Drinker.REGULARLY
opensocial.Enum.Drinker.SOCIALLY
opensocial.Enum.Drinker.YES
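Indexed values like these can be mapped to numbers by position in the known value set. The sketch below is only one possible encoding: the position-based formula, the -1 marker for missing data, and the function name `quantize_indexed` are assumptions, not the mapping PPARS uses.

```python
# Drinker values in the order listed above.
DRINKER_VALUES = [
    "HEAVILY", "NO", "OCCASIONALLY", "QUIT",
    "QUITTING", "REGULARLY", "SOCIALLY", "YES",
]
NO_DATA = -1  # assumed marker for missing or unrecognized values


def quantize_indexed(value, allowed_values):
    """Map an enum-style value to a number in (0, 1] by its list position."""
    if value is None:
        return NO_DATA
    # Accept either a bare name or a dotted name like opensocial.Enum.Drinker.YES.
    name = value.rsplit(".", 1)[-1].upper()
    if name not in allowed_values:
        return NO_DATA
    return (allowed_values.index(name) + 1) / len(allowed_values)


print(quantize_indexed("opensocial.Enum.Drinker.SOCIALLY", DRINKER_VALUES))
```

A position-based code carries no notion of intensity; an expert-assigned value per entry would be an alternative if the ordering should mean something to the matcher.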

Page 55: Lynne Grewe and Sushmita Pandey California State University East Bay lynne.grewe@csueastbay.edu

Quantized Feature Vector
107 elements
Normalize to 0 to 1.0 (near)

Page 56: Lynne Grewe and Sushmita Pandey California State University East Bay lynne.grewe@csueastbay.edu

Advertisement Description
Experts manually determine the feature vector weighting for each ad.

Future – automate this from a survey or from input directly from the Advertiser.

Is there a way to analyze the ad message or image (image understanding)? Will the results even match the advertiser’s goals?

Page 57: Lynne Grewe and Sushmita Pandey California State University East Bay lynne.grewe@csueastbay.edu

PPARS --- Advertisement Matching
Not the focus of this talk.
Currently doing variations on KNN with different forms of clustering.
Early results with a small advertising database and a beginning Keyword database look good.
What kinds of groups: groups with the user in them or not? Based only on in-common feature elements or not?
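To make the matching idea concrete, here is a hedged nearest-neighbor sketch over feature vectors. It is not the PPARS matcher: the distance measure (root mean square difference over elements both vectors have evidence for, skipping -1 entries), the function names, and the sample vectors are all assumptions.

```python
import math


def distance(a, b):
    """RMS difference over feature elements present (>= 0) in both vectors."""
    shared = [k for k in a if k in b and a[k] >= 0 and b[k] >= 0]
    if not shared:
        return math.inf  # nothing in common to compare
    return math.sqrt(sum((a[k] - b[k]) ** 2 for k in shared) / len(shared))


def nearest_ads(group_vector, ads, k=1):
    """Return the names of the k ads whose vectors are closest to the group."""
    ranked = sorted(ads.items(), key=lambda item: distance(group_vector, item[1]))
    return [name for name, _ in ranked[:k]]


# Hypothetical group and ad vectors.
group = {"CARS": 0.2, "FOOD": -1, "HOME": 0.4}
ads = {
    "car_ad": {"CARS": 0.25, "FOOD": 0.9, "HOME": 0.3},
    "food_ad": {"CARS": 0.9, "FOOD": 0.8, "HOME": 0.1},
}
best = nearest_ads(group, ads, k=1)
print(best)
```

Skipping -1 entries corresponds to matching only on in-common feature elements, which is one of the options raised in the question above.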

Page 58: Lynne Grewe and Sushmita Pandey California State University East Bay lynne.grewe@csueastbay.edu

PPARS- Advertisement Delivery
An area of future work could be effective delivery of the “social message” related to the selected ad. For now, a simple form of direct delivery.

Based on grouping of same gender and age and strong likes in interests on home.

Page 59: Lynne Grewe and Sushmita Pandey California State University East Bay lynne.grewe@csueastbay.edu

PPARS- Advertisement Delivery
An area of future work could be effective delivery of the “social message” related to the selected ad. For now, a simple form of direct delivery.

Based on grouping of same gender and age and drinking. This is a grouping the user is not part of, only friends.

Your friends Nathan and Marty will like this

Page 60: Lynne Grewe and Sushmita Pandey California State University East Bay lynne.grewe@csueastbay.edu

PPARS- Advertisement Delivery
Here the grouping is “loose”, related only by gender and very loosely by age. So the advertisement match is not great.

Question: should we only serve to “strong” groups?

Page 61: Lynne Grewe and Sushmita Pandey California State University East Bay lynne.grewe@csueastbay.edu

Analysis of Advertisement Results
Groupings are tight when data allows.
Matches to advertisements in levels (best, top 10, etc.) are correct.

Page 62: Lynne Grewe and Sushmita Pandey California State University East Bay lynne.grewe@csueastbay.edu

Future Work
Parsing – more syntax and semantics (NLP)

Parsing – differences in different languages.

Quantization – extend to Natural Language Understanding, in addition to or in place of Keyword matching; study the effects of different evidence accumulation.

Data Extrapolation – using inference to create hits in more feature elements.