Opinion Mapping Travelblogs Efthymios Drymonas Alexandros Efentakis Dieter Pfoser Research Center Athena Institute for the Management of Information Systems Athens, Greece http://www.imis.athena-innovation.gr

Post on 17-Dec-2015

Opinion Mapping Travelblogs

Efthymios Drymonas Alexandros Efentakis

Dieter Pfoser

Research Center Athena

Institute for the Management of Information Systems

Athens, Greece

http://www.imis.athena-innovation.gr

Introduction

Users create vast amounts of “geospatial” narratives

…travel diaries, travel blogs…

How to quickly assess them?

Motivation

• Simple assessment of user-generated geospatial content

• Visualization

• Geospatial opinion maps

Opinion Mapping generating steps

1. Relating text to location – Geocoding

2. Relating user sentiment to text – Opinion Coding

3. Relating opinions to location – Opinion Mapping

1. Relating text to location – Geocoding

a) Web crawling

b) Geoparsing

c) Geocoding

1a. Web Crawling

• Crawled for travel blog articles

• Parsed ~ 150k HTML documents

1b. Geoparsing – Processing Pipeline Overview

• GATE

• Cafetiere IE system

• YAHOO! API

– Placemaker

– Placefinder

1b. Linguistic Preprocessing

• Tokeniser & Orthographic Analyser

• Sentence Splitter

• POS Tagger

• Morphological Analysis (WordNet) – e.g. “went south” and “goes south” are both reduced to “go south”
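
The morphological step above can be illustrated with a minimal sketch. It does not use WordNet or GATE; the lemma table `IRREGULAR_LEMMAS` and the helpers `lemma` and `normalize` are hypothetical names introduced only to show how inflected forms collapse to one base form.

```python
# Toy lemma table (hypothetical); the slides name WordNet for this step.
IRREGULAR_LEMMAS = {"went": "go", "goes": "go", "gone": "go", "going": "go"}


def lemma(token: str) -> str:
    """Return the base form of a token, or the lowercased token if unknown."""
    word = token.lower()
    return IRREGULAR_LEMMAS.get(word, word)


def normalize(phrase: str) -> str:
    """Lemmatize every whitespace-separated token in a phrase."""
    return " ".join(lemma(token) for token in phrase.split())


print(normalize("went south"))  # go south
print(normalize("goes south"))  # go south
```

Both phrases map to the same string, so later extraction rules only need to match one form.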

1b. Semantic Analysis: i. Ontology Lookup

Ontology access to retrieve potential semantic class information

1b. Semantic Analysis: ii. Feature Extraction (IE engine)

• Compilation of semantic analysis rules

• IE engine uses all previous info

– Linguistic information (POS tags, orthographic info etc.)

– Semantic and context information

• Extraction of spatial objects

1c. PostProcessor - Geocoding

• Collecting semantic analysis results and annotating them to the original text

• Preparing the input to the geocoder module

1c. Geocoding

• Place name info from semantic analysis transformed to coordinates

• YAHOO! Placemaker for disambiguation

• YAHOO! Placefinder geocoder
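
As a rough illustration of turning a place name into coordinates with a disambiguation hint, here is a minimal sketch. It does not call the YAHOO! services named above; the in-memory `GAZETTEER`, the `geocode` function and its `country_hint` parameter are assumptions made for the example.

```python
from typing import Optional, Tuple

# Hypothetical in-memory gazetteer: lowercase name -> [(country, lat, lon)].
GAZETTEER = {
    "athens": [
        ("Greece", 37.9838, 23.7275),
        ("United States", 33.9519, -83.3576),
    ],
}


def geocode(name: str, country_hint: Optional[str] = None) -> Optional[Tuple[float, float]]:
    """Return (lat, lon) for a place name, or None if the name is unknown.

    If several places share the name, prefer the one in country_hint;
    otherwise fall back to the first candidate.
    """
    candidates = GAZETTEER.get(name.lower(), [])
    if not candidates:
        return None
    if country_hint is not None:
        for country, lat, lon in candidates:
            if country == country_hint:
                return (lat, lon)
    _, lat, lon = candidates[0]
    return (lat, lon)


print(geocode("Athens", country_hint="Greece"))
```

The country hint stands in for whatever document context a real disambiguation service would use.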

Output XML file

• From plain text to structured information

• Also global document info extracted

2. Relating user sentiment to text – Opinion Coding 1/2

• OpinionFinder tool

• Annotates text with positive or negative sentiments

• Retain only paragraphs containing spatial info

• Total positive and negative sentiments for each paragraph

2. Relating user sentiment to text – Opinion Coding 2/2

• Score for this paragraph: +2
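
The slides do not spell out the scoring formula. A minimal sketch, assuming the paragraph score is the number of positive annotations minus the number of negative ones, and that only paragraphs with spatial info are kept. The `Paragraph` class, its fields and the example counts are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Paragraph:
    text: str
    positive: int           # count of positive sentiment annotations
    negative: int           # count of negative sentiment annotations
    has_spatial_info: bool  # True if geoparsing found a place reference


def paragraph_score(p: Paragraph) -> int:
    """Assumed scoring rule: positives minus negatives."""
    return p.positive - p.negative


def score_spatial_paragraphs(paragraphs: List[Paragraph]) -> List[Tuple[Paragraph, int]]:
    """Keep only paragraphs with spatial info and attach their score."""
    return [(p, paragraph_score(p)) for p in paragraphs if p.has_spatial_info]


# Hypothetical counts: 3 positive and 1 negative annotation give a score of +2.
example = Paragraph("Lovely beach near the old town.", 3, 1, True)
print(paragraph_score(example))  # 2
```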

3. Mapping opinions to location – Opinion Mapping

Scoring method

Spatial grid

Aggregation method

Opinion Mapping (Scoring)

• Each paragraph is characterized by an MBR (minimum bounding rectangle)

– Visualized paragraph MBRs do not exceed 0.5° x 0.5°

• Each paragraph’s MBR is mapped to a sentiment color according to users’ opinions
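
The scoring step can be sketched as follows, assuming each paragraph comes with a list of geocoded (lat, lon) points. The function names and the concrete color choices (green, red, gray) are assumptions; the slides state only that the MBR is mapped to a sentiment color.

```python
from typing import Sequence, Tuple

Box = Tuple[float, float, float, float]  # (min_lat, min_lon, max_lat, max_lon)

MAX_EXTENT_DEG = 0.5  # size limit for visualized MBRs, from the slide


def mbr(points: Sequence[Tuple[float, float]]) -> Box:
    """Minimum bounding rectangle of a non-empty list of (lat, lon) points."""
    lats = [lat for lat, _ in points]
    lons = [lon for _, lon in points]
    return (min(lats), min(lons), max(lats), max(lons))


def is_visualized(box: Box) -> bool:
    """True if the box is at most 0.5 degrees in both directions."""
    min_lat, min_lon, max_lat, max_lon = box
    return (max_lat - min_lat) <= MAX_EXTENT_DEG and (max_lon - min_lon) <= MAX_EXTENT_DEG


def sentiment_color(score: int) -> str:
    """Hypothetical color scheme: positive green, negative red, neutral gray."""
    if score > 0:
        return "green"
    if score < 0:
        return "red"
    return "gray"


box = mbr([(37.97, 23.72), (37.99, 23.75)])
print(box, is_visualized(box), sentiment_color(2))
```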

Opinion Mapping (Issues)

Problem:

• Multiple paragraphs may partially target the same area (overlapping areas)

• How to visualize partially overlapping MBRs of different paragraphs and sentiments?

Opinion Mapping (Spatial grid)

Solution:

• We split the earth into small tiles of 0.0045° x 0.0045° (~500m x 500m)

• Each paragraph’s MBR consists of several such small tiles
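
A minimal sketch of the grid, assuming tiles are indexed by integer (row, column) pairs counted from 0° latitude and 0° longitude. The indexing scheme and function names are assumptions. The ~500 m figure follows from roughly 111,320 m per degree of latitude; the east-west width of a tile shrinks with the cosine of the latitude.

```python
import math
from typing import List, Tuple

TILE_DEG = 0.0045                 # tile size in degrees, from the slide
METERS_PER_DEGREE_LAT = 111_320   # approximate length of one degree of latitude

Box = Tuple[float, float, float, float]  # (min_lat, min_lon, max_lat, max_lon)


def tile_index(lat: float, lon: float) -> Tuple[int, int]:
    """Integer (row, col) of the tile containing a point."""
    return (math.floor(lat / TILE_DEG), math.floor(lon / TILE_DEG))


def tiles_for_mbr(box: Box) -> List[Tuple[int, int]]:
    """All tiles touched by a bounding box."""
    min_lat, min_lon, max_lat, max_lon = box
    row0, col0 = tile_index(min_lat, min_lon)
    row1, col1 = tile_index(max_lat, max_lon)
    return [(r, c) for r in range(row0, row1 + 1) for c in range(col0, col1 + 1)]


def tile_height_m() -> float:
    """Approximate north-south extent of one tile in meters."""
    return TILE_DEG * METERS_PER_DEGREE_LAT


tiles = tiles_for_mbr((0.001, 0.001, 0.010, 0.006))
print(len(tiles), round(tile_height_m()))
```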

Opinion Mapping (Aggregation Method) 1/2

• Partially overlapping paragraph MBRs translated to a set of overlapping tiles

– Sentiment aggregation per tile (for drawing purposes)

• Instead of sentiment aggregation per MBR

Opinion Mapping (Aggregation Method) 2/2

An example:

• For one cell/tile there are four scores: -1, -2, 1, 0

• Resulting score is their sum: -2
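
The per-tile sum can be sketched in a few lines. The input format, a list of (tile, score) pairs with one entry per paragraph covering that tile, and the function name `aggregate_per_tile` are assumptions.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple

Tile = Tuple[int, int]


def aggregate_per_tile(tile_scores: Iterable[Tuple[Tile, int]]) -> Dict[Tile, int]:
    """Sum the paragraph scores that fall on each tile."""
    totals: Dict[Tile, int] = defaultdict(int)
    for tile, score in tile_scores:
        totals[tile] += score
    return dict(totals)


# The slide's example: four paragraphs cover the same tile.
tile = (0, 0)
result = aggregate_per_tile([(tile, -1), (tile, -2), (tile, 1), (tile, 0)])
print(result[tile])  # -2
```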

Opinion Mapping examples

Original MBRs of paragraphs

Opinion Mapping examples

Paragraph MBRs divided into tiles – Aggregation per tile

Opinion Mapping examples

Final result

Conclusions

• Aggregating opinions is important for utilizing and assessing user-generated content

• More than 150k web pages/articles were processed in total

• Sentiment information from various articles is aggregated and visualized

• Portions of text are related to locations

• Geospatial opinion map based on user-contributed information

Future Work

• Better approach to sentiment analysis

• More in-depth analysis of the results

• Examine microblogging content streams

• Live-updated sentiment information

End. Questions?