
Post on 19-Dec-2015


Product Review Summarization from a Deeper Perspective

Ly Duy Khang

Supervisor: A/P KAN Min Yen

CS4101 B.COMP. DISSERTATION

Outline

1. Introduction
• Motivation
• Related work
• Problem statement & Our approach
2. Product Facet Identification
• Preliminaries
• Methodology
• Evaluation
• Improvement
3. Subtopic Summarization
• Preliminary
• Methodology
• Evaluation
4. Discussion and Conclusion

1. Introduction

Product review

A medium commonly provided by online merchants for customers to review and express opinions on the products that they have purchased.

Product reviews are an important source of information:
1. More and more people are shopping online as e-commerce expands.
2. Reviews enable customers to find opinions about products easily and to share them with their peers.
3. Reviews give producers a certain degree of feedback.

Problems

1. The number of reviews is often too large, and is still growing rapidly.
2. It is difficult to locate and capture opinions effectively.

Product review summarization system

1. Automatically process a large collection of reviews.
2. Identify topics and opinions in the reviews.
3. Aggregate all information and present a concise summary to the user.

Summarization

The task of extracting and presenting the most important information from the inputs.
• News headline
• Program agenda
• Scientific paper abstract
• …

Review Summarization

Focus on opinions (techniques from Sentiment Analysis):
• Thumbs-up/Thumbs-down indication: [Turney02]
• Facet-based summary: [Hu04a], [Hu04b], [Popescu05]
• Comparative summary: [Hu05]

Product Facet examples:
1. Camera: “battery life”, “lens”, “flash”, “resolution”, etc.
2. Music player: “sound”, “weight”, “size”, “storage”, etc.

[Figure: Google Product and Bing Shopping]

Problem statement

Produce a facet-based summary of product reviews that captures
• Opinions of users.
• Evidence that supports those opinions.

Approach and Contribution

Two main components:
1. Product Facet Identification
• Re-implement the baseline from [Hu04a]
• Contribute a new, effective heuristic to improve the accuracy
2. Subtopic Summarization
• Initiate a sentence clustering solution
• Make the necessary modifications to the sentence semantic similarity measurement (adopted from [Li06] and [Kong07])

2. Product Facet Identification

Why do we want to automate this task?

1. It is hard or even impossible to obtain a complete list of facets.
• e.g., iPhone’s alarm function
2. Users and manufacturers/sellers use different sets of words to describe the same facet.
• e.g., Price vs. Value; Body vs. Case
3. The manufacturer may not want to include the weak facets of their product.
• e.g., iPhone is unable to play Flash on the Web

Explicit/Implicit product facet

Product facets can be expressed explicitly or implicitly.
1. The pictures of this camera are very clear. (explicit)
2. The camera fits nicely into my palm. (implicit)

We only consider explicit facets, which appear as nouns/noun phrases in the sentence.

[Figure: Architecture Overview]

a/ Preprocessing

1. Process each input sentence with a Part-of-Speech (POS) tagger to obtain the POS label for each word.
2. Remove stop words from the result.
3. Stem each word to obtain its root form.
4. Only nouns/noun phrases are fed to the next module.
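The preprocessing steps can be sketched roughly as follows. This is only an illustration under stated assumptions: the input is taken to be already POS-tagged (Penn Treebank style tags), and `STOP_WORDS`, `crude_stem`, and `noun_phrases` are hypothetical stand-ins rather than the tools used in the dissertation, which would rely on a real tagger, a full stop-word list, and a proper stemmer.

```python
# Toy stand-ins (assumptions): a real system would use a trained POS tagger,
# a full stop-word list, and a proper stemmer such as Porter's.
STOP_WORDS = {"the", "of", "this", "is", "a", "an", "my", "very", "are"}


def crude_stem(word):
    """Strip a trailing plural 's' as a placeholder for real stemming."""
    if word.endswith("s") and not word.endswith("ss") and len(word) > 3:
        return word[:-1]
    return word


def noun_phrases(tagged_sentence):
    """Return stemmed noun / noun-phrase candidates from (word, tag) pairs.

    A candidate is a maximal run of consecutive noun-tagged words.
    """
    phrases, current = [], []
    for word, tag in tagged_sentence:
        word = word.lower()
        if tag.startswith("NN") and word not in STOP_WORDS:
            current.append(crude_stem(word))
        else:
            if current:
                phrases.append(" ".join(current))
            current = []
    if current:
        phrases.append(" ".join(current))
    return phrases


tagged = [("The", "DT"), ("battery", "NN"), ("life", "NN"), ("of", "IN"),
          ("this", "DT"), ("camera", "NN"), ("is", "VBZ"), ("great", "JJ")]
print(noun_phrases(tagged))  # ['battery life', 'camera']
```

Only the noun/noun-phrase candidates survive, which is what the next module receives.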

b/ Frequent Mining

Identify all frequent nouns/noun phrases that satisfy the minimum support, which is defined as the minimum number of sentences containing that noun/noun phrase.
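A minimal sketch of the minimum-support filter, assuming each sentence has already been reduced to its list of candidate noun phrases. It uses plain counting as a simplification; the slides do not state which mining algorithm is used, and `frequent_candidates` is a hypothetical name.

```python
from collections import Counter


def frequent_candidates(sentences, min_support):
    """Keep candidates that occur in at least `min_support` sentences.

    sentences: list of lists of candidate noun phrases, one list per sentence.
    Returns a dict mapping each frequent candidate to its sentence count.
    """
    support = Counter()
    for candidates in sentences:
        support.update(set(candidates))  # count each phrase once per sentence
    return {p: n for p, n in support.items() if n >= min_support}


sentences = [
    ["battery life", "camera"],
    ["camera", "lens"],
    ["battery life", "battery life"],
    ["strap"],
]
print(frequent_candidates(sentences, min_support=2))
```

Note that support is counted per sentence, so a phrase repeated within one sentence still contributes only once.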

c/ Post Processing (1/2)

1. Usefulness pruning: remove single-word facets that are likely to be meaningless.
• e.g. life → battery life
2. Compactness pruning: remove facet phrases that are not compact.
• e.g. sample photo → photo

c/ Post Processing (2/2)

3. Infrequent facet discovery: helps discover genuine facets that are not mentioned often.
• Gather the opinion words that modify frequent facets.
• For each sentence that contains no frequent facet but one or more opinion words, include the nearest noun/noun phrase as a facet.
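The infrequent facet discovery rule can be illustrated with the sketch below. It makes several assumptions: sentences arrive as POS-tagged `(word, tag)` pairs, only single-word nouns are considered, and "nearest" is measured as token distance. The function name and inputs are hypothetical.

```python
def infrequent_facets(tagged_sentences, frequent_facets, opinion_words):
    """Collect nouns nearest to an opinion word in sentences that mention
    no frequent facet (single-word nouns only, for simplicity)."""
    found = set()
    for sent in tagged_sentences:
        words = [w.lower() for w, _ in sent]
        if any(w in frequent_facets for w in words):
            continue  # sentence is already covered by a frequent facet
        noun_idx = [i for i, (_, tag) in enumerate(sent) if tag.startswith("NN")]
        if not noun_idx:
            continue
        for i, w in enumerate(words):
            if w in opinion_words:
                nearest = min(noun_idx, key=lambda j: abs(j - i))
                found.add(words[nearest])
    return found


tagged_sentences = [
    [("The", "DT"), ("strap", "NN"), ("is", "VBZ"), ("great", "JJ")],
    [("The", "DT"), ("picture", "NN"), ("is", "VBZ"), ("clear", "JJ")],
]
print(infrequent_facets(tagged_sentences, {"picture"}, {"great", "clear"}))
```

Here "strap" is picked up because its sentence contains the opinion word "great" but no frequent facet, while the second sentence is skipped.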

d/ Sentence Extraction

• Sentences that contain any of the product facets that we have discovered are labeled with the corresponding facet.
• Only opinionated sentences are sent down to the next component.

a/ Experimental Data

From the same dataset as in [Hu04a]:
• 1 Digital Camera (45 reviews)
• 1 DVD Player (99 reviews)
• 1 Cell phone (41 reviews)

b/ Evaluation Measure
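The formulas on this slide did not survive extraction. As a hedged sketch, the code below uses the standard set-based definitions of precision, recall, and the balanced F-measure, which is an assumption about what the slide showed. The harmonic-mean form does agree with the reported baseline camera figures (recall 0.822, precision 0.747, F 0.783). The function names and the toy facet lists are hypothetical.

```python
def f_measure(precision, recall):
    """Balanced F-measure: harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def precision_recall_f(extracted, gold):
    """Compare extracted facets with a manually labeled gold set."""
    extracted, gold = set(extracted), set(gold)
    correct = len(extracted & gold)
    precision = correct / len(extracted) if extracted else 0.0
    recall = correct / len(gold) if gold else 0.0
    return precision, recall, f_measure(precision, recall)


extracted = ["lens", "flash", "hand", "battery life"]
gold = ["lens", "flash", "battery life", "resolution", "zoom"]
print(precision_recall_f(extracted, gold))

# Consistency check against the reported baseline camera figures.
print(round(f_measure(0.747, 0.822), 3))  # 0.783
```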

c/ Experimental Result (Baseline)

Baseline

                Recall   Precision   F
Camera    79    0.822    0.747       0.783
Phone     67    0.761    0.718       0.739
DVD       49    0.797    0.793       0.795
Avg.      65    0.793    0.753       0.772

Improvement - Syntactic Role (1/2)

The baseline produces many noisy results such as “light”, “hand”, “time”, “month”, “hour”, etc.
• These are filtered by considering the word’s syntactic role in the sentence.

Improvement - Syntactic Role (2/2)

During the preprocessing step, nouns/noun phrases that do not appear as the subject or object of the sentence are not passed down to the next module.

Experimental Result (Baseline with Syntactic Role)

          Recall               Precision            F-measure
          Baseline   Improve   Baseline   Improve   Baseline   Improve
Camera    0.822      0.822     0.747      0.802     0.783      0.812
Phone     0.761      0.761     0.718      0.785     0.739      0.773
DVD       0.797      0.797     0.793      0.867     0.795      0.831
Avg.      0.793      0.793     0.753      0.818     0.772      0.805
                     +0%                  +8.6%                +4.3%

3. Subtopic Summarization


How often do subtopics exist?

Camera     Subtopics
Memory     3
LCD        6
Lens       7
…          …
Average    5.125

Phone      Subtopics
Radio      3
Headset    4
Signal     3
…          …
Average    3.5

DVD        Subtopics
Price      1
Remote     4
Format     1
…          …
Average    2


[Figure: Architecture Overview]

a/ Preprocessing

1. General Entity pruning
• Product class name: “camera”, “DVD”, “phone”, etc.
• Brand name: “Nikon”, “Canon”, “iPod”, “Kingston”, etc.
2. Similarity pruning ([Kong07])
• “picture” vs. “image”, “photo”
• “display” vs. “monitor”
• “Megapixel” vs. “Resolution”

b/ Sentence representation & Semantic similarity measurement (1/2)

Adopted from the work of [Li06], a scalable vector formulation is used to represent each sentence. The semantic similarity of two sentences is then measured as the cosine similarity between their vectors.

b/ Sentence representation & Semantic similarity measurement (2/2)

S1 = The battery of my camera is very impressive.
S2 = This camera always has a long battery life.
Joint Concept Vector:
C = {battery, camera, impressive, long, battery life}
V1 = { 1.0 , 1.0 , 1.0 , 0.25, 0.5 }
V2 = { 0.5 , 1.0 , 0.25 , 1.0 , 1.0 }
$\mathrm{sim}(S_1, S_2) = \dfrac{V_1 \cdot V_2}{\lVert V_1 \rVert \, \lVert V_2 \rVert} = \dfrac{2.5}{3.3125} \approx 0.75$
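The similarity value in the example can be checked with a few lines of code. The sketch below computes the cosine of the two concept vectors V1 and V2 given on the slide. How the vector entries themselves are derived (word-level semantic similarity, following [Li06]) is not reproduced here, and `cosine_similarity` is a hypothetical helper name.

```python
import math


def cosine_similarity(v1, v2):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(v1, v2))
    norm1 = math.sqrt(sum(a * a for a in v1))
    norm2 = math.sqrt(sum(b * b for b in v2))
    if norm1 == 0 or norm2 == 0:
        return 0.0
    return dot / (norm1 * norm2)


# Concept vectors from the example, over
# C = {battery, camera, impressive, long, battery life}.
v1 = [1.0, 1.0, 1.0, 0.25, 0.5]
v2 = [0.5, 1.0, 0.25, 1.0, 1.0]
print(round(cosine_similarity(v1, v2), 2))  # 0.75
```

The dot product is 2.5 and both vectors have squared norm 3.3125, which gives approximately 0.755 before rounding.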

c/ Sentence clustering (1/2)

1. Hierarchical clustering
2. Non-hierarchical clustering

c/ Sentence clustering (2/2)

To estimate the number of clusters, we adopt the graph-based algorithm proposed in [Hat01].
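The details of the clustering algorithms did not survive in the slides, so the following is only a generic illustration of hierarchical clustering over a precomputed sentence-similarity matrix. It is an average-link agglomerative procedure, chosen here as an assumption and not necessarily the variant used in the dissertation. The number of clusters `k` is passed in directly, whereas the dissertation estimates it with the graph-based algorithm of [Hat01], which is not reproduced.

```python
def agglomerative(sim, k):
    """Average-link agglomerative clustering on a similarity matrix.

    sim: square matrix (list of lists) of pairwise sentence similarities.
    k:   desired number of clusters (assumed to be estimated elsewhere).
    Returns a list of clusters, each a list of sentence indices.
    """
    clusters = [[i] for i in range(len(sim))]

    def avg_link(a, b):
        return sum(sim[i][j] for i in a for j in b) / (len(a) * len(b))

    while len(clusters) > max(k, 1):
        # Find the pair of clusters with the highest average similarity.
        x, y = max(
            ((x, y) for x in range(len(clusters))
             for y in range(x + 1, len(clusters))),
            key=lambda p: avg_link(clusters[p[0]], clusters[p[1]]),
        )
        clusters[x] = clusters[x] + clusters[y]
        del clusters[y]
    return clusters


sim = [
    [1.0, 0.9, 0.1, 0.2],
    [0.9, 1.0, 0.2, 0.1],
    [0.1, 0.2, 1.0, 0.8],
    [0.2, 0.1, 0.8, 1.0],
]
print(agglomerative(sim, 2))  # [[0, 1], [2, 3]]
```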

d/ Compact presentation

1. Sentences are now grouped into subtopics.
2. Determine the orientation of every sentence in the cluster.
3. For each positive/negative partition P, select the sentence with the maximum representative power to display.
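The slides do not define "representative power". One plausible reading, used here purely as an assumption, is the mean similarity of a sentence to the other sentences in its partition. The sketch below selects the sentence that maximizes this score; `most_representative` is a hypothetical name.

```python
def most_representative(partition, sim):
    """Return the index of the sentence with the highest mean similarity
    to the other sentences in the partition (assumed scoring rule)."""
    if not partition:
        return None
    if len(partition) == 1:
        return partition[0]

    def score(i):
        others = [j for j in partition if j != i]
        return sum(sim[i][j] for j in others) / len(others)

    return max(partition, key=score)


sim = [
    [1.0, 0.8, 0.6],
    [0.8, 1.0, 0.7],
    [0.6, 0.7, 1.0],
]
print(most_representative([0, 1, 2], sim))  # 1
```

Sentence 1 wins with a mean similarity of 0.75, against 0.70 and 0.65 for the other two.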

a/ Experimental Data

From the same dataset used in the previous component, we extract a subset of the high-frequency facets of each product.
• Camera: 8 facets
• Phone: 8 facets
• DVD: 6 facets

Experiment Results – Number of subtopics (average)

          Manual subtopics   SenSim ([Li06])   SenSim (+ADJ)
Camera    5.125              1.875             3.0
Phone     3.5                1.5               2.5
DVD       2                  1.167             1.5

b/ Evaluation Measure (1/2)

1. Purity: rewards the clustering solution that introduces less noise in each cluster.
2. Inverse Purity: rewards the clustering solution that gathers more elements (of the same cluster in the gold standard) into a corresponding cluster.

b/ Evaluation Measure (2/2)

F-measure: the harmonic mean of purity and inverse purity (α = 0.5).
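The formulas for these measures were lost in extraction. The sketch below uses the commonly cited definitions of purity and inverse purity and a weighted harmonic mean with parameter α; treating these as the exact formulas on the slides is an assumption. Clusters and gold-standard classes are given as non-empty lists of sets of sentence ids covering the same items, and the function names are hypothetical.

```python
def purity(clusters, gold):
    """Fraction of items that fall in the best-matching gold class
    of their cluster."""
    n = sum(len(c) for c in clusters)
    return sum(max(len(c & g) for g in gold) for c in clusters) / n


def inverse_purity(clusters, gold):
    """Purity with the roles of system clusters and gold classes swapped."""
    return purity(gold, clusters)


def f_alpha(p, ip, alpha=0.5):
    """Weighted harmonic mean of purity and inverse purity."""
    return 1.0 / (alpha / p + (1.0 - alpha) / ip)


clusters = [{1, 2, 3, 4}, {5}]
gold = [{1, 2}, {3, 4}, {5}]
p = purity(clusters, gold)
ip = inverse_purity(clusters, gold)
print(p, ip, f_alpha(p, ip))
```

In this toy case the system merges two gold subtopics into one cluster, so purity drops to 0.6 while inverse purity stays at 1.0.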

c/ Experiment Results – Performance using SenSim (+ADJ)

The number in parentheses after each product is its average number of manual subtopics. Percentages are relative improvements over Random (200).

                 Method                   Purity            I-Purity          F(0.5)
Camera (5.125)   Random (200)             0.524             0.617             0.542
                 Hierarchical             0.676 (+29.02%)   0.819 (+32.63%)   0.725 (+33.80%)
                 Non-hierarchical (200)   0.714 (+36.21%)   0.828 (+34.13%)   0.753 (+38.89%)
Phone (3.5)      Random (200)             0.647             0.593             0.604
                 Hierarchical             0.682 (+5.54%)    0.783 (+32.00%)   0.717 (+18.74%)
                 Non-hierarchical (200)   0.702 (+8.63%)    0.739 (+24.64%)   0.707 (+17.16%)
DVD (2)          Random (200)             0.825             0.622             0.682
                 Hierarchical             0.904 (+9.60%)    0.795 (+27.72%)   0.837 (+22.73%)
                 Non-hierarchical (200)   0.894 (+8.33%)    0.743 (+19.34%)   0.791 (+15.94%)

4. Discussion and Conclusion

Limitation and Future work

1. We did not conduct a human evaluation of the effectiveness of the newly proposed summary compared to the current ones.
2. Automatic sentiment analysis module integration.
3. Better sentence semantic similarity measurement with deep analysis.
4. Implicit facet handling.
5. Sentence reformulation for summary output.
6. Extend subtopics to other review summarization settings.

Conclusion

1. We designed a complete summarization system targeting the domain of product reviews.
2. We introduced an effective heuristic rule using syntactic role to improve the process of identifying product facets.
3. We showed the existence of subtopics within the discussion of product facets and addressed this limitation of current summarization systems with our proposed clustering component.
4. We extended the sentence semantic similarity measurement with sentiment information.

References

[Barzilay02] Barzilay, R., Elhadad, N., & McKeown, K. (2002). Inferring strategies for sentence ordering in multidocument news summarization. Journal of Artificial Intelligence Research, 17, 35–55.

[Car98b] Carbonell, J., & Goldstein, J. (1998). The use of MMR, Diversity-based Re-ranking for Reordering Documents and Producing Summaries. Proceedings of the 21st annual international ACM SIGIR conference on Research and development in information retrieval, 335–336.

[Ding08] Ding, X., Liu, B., & Yu, P. S. (2008). A Holistic Lexicon-based Approach to Opinion Mining. Proceedings of the international conference on Web search and web data mining (WSDM).

[Hat01] Hatzivassiloglou, V., Klavans, J. L., Holcombe, M. L., Barzilay, R., Kan, M.-Y., & McKeown, K. R. (2001). SimFinder: A flexible clustering tool for summarization. Proceedings of the NAACL Workshop on Automatic Summarization, 41–49.

[Hat97] Hatzivassiloglou, V., & McKeown, K. R. (1997). Predicting the Semantic Orientation of Adjectives. Proceedings of the eighth conference on European chapter of the Association for Computational Linguistics, 174–181.

[Hovy01] Hovy, E. H. (2001). Automated text summarization. Handbook of computational linguistics. Oxford University Press, Oxford.

[Knight00] Knight, K., & Marcu, D. (2000). Statistics-based summarization-step one: Sentence compression. Proceedings of the National Conference on Artificial Intelligence, 703–710.

[Barzilay99] Barzilay, R., McKeown, K. R., & Elhadad, M. (1999). Information fusion in the context of multi-document summarization. Proceedings of the 37th annual meeting of the Association for Computational Linguistics on Computational Linguistics, 550–557.

[Hu04b] Hu, M., & Liu, B. (2004b). Mining Opinion Features in Customer Reviews. Proceedings of the National Conference on Artificial Intelligence, 755–760.

[Hu05] Liu, B., Hu, M., & Cheng, J. (2005). Opinion observer: Analyzing and comparing opinions on the web. Proceedings of the 14th international conference on World Wide Web.

[Kim06] Kim, S. M., & Hovy, E. (2006). Extracting Opinions, Opinion Holders, and Topics Expressed in Online News Media Text. Computational Linguistics.

[Li06] Li, Y., McLean, D., Bandar, Z. A., O'Shea, J. D., & Crockett, K. (2006). Sentence Similarity Based on Semantic Nets and Corpus Statistics. IEEE Trans. on Knowledge and Data Engineering, 18(8), 1138–1150.

[Liu09] Liu, B. (2009). Sentiment Analysis and Subjectivity. Handbook of Natural Language Processing, 1–38.

[Popescu05] Popescu, A. M., & Etzioni, O. (2005). Extracting Product Features and Opinions from Reviews. Computational Linguistics, 339–346.

[Radev04] Radev, D., Jing, H., Styś, M., & Tam, D. (2004). Centroid-based summarization of multiple documents. Information Processing and Management, 40(6), 919–938.

[Turney02] Turney, P. D., & Littman, M. L. (2002). Unsupervised Learning of Semantic Orientation From a Hundred-Billion-Word Corpus.

[Wiebe99] Wiebe, J. M., Bruce, R. F., & O'Hara, T. P. (1999). Development and Use of a Gold-standard Data Set for Subjectivity Classifications. Proceedings of the 37th annual meeting of the Association for Computational Linguistics on Computational Linguistics, 246–253.

[Ye05] Ye, S., Qiu, L., Chua, T., & Kan, M. Y. (2005). NUS at DUC 2005: Understanding Documents via Concept Links. Document Understanding Conference (DUC).

[Yu03] Yu, H., & Hatzivassiloglou, V. (2003). Towards Answering Opinion Questions: Separating Facts from Opinions and Identifying the Polarity of Opinion Sentences. Proceedings of the conference on Empirical methods in natural language processing, 129–136.


Q & A


Thank you for your attention
