project sentiment analysis

Post on 13-Apr-2017

  • A Project Report on

    SENTIMENT ANALYSIS OF MOBILE REVIEWS USING

    SUPERVISED LEARNING METHODS

    A Dissertation submitted in partial fulfillment of the requirements for the award of the

    degree of

    BACHELOR OF TECHNOLOGY

    IN

    COMPUTER SCIENCE AND ENGINEERING

    BY

    Y NIKHIL (11026A0524)

    P SNEHA (11026A0542)

    S PRITHVI RAJ (11026A0529)

    I AJAY RAM (11026A0535)

    E RAJIV (11026A0555)

    Under the esteemed guidance of

    Dr. L. SUMALATHA

    Professor, CSE Department

    DEPARTMENT OF COMPUTER SCIENCE AND ENGINEERING

    UNIVERSITY COLLEGE OF ENGINEERING

    JAWAHARLAL NEHRU TECHNOLOGICAL UNIVERSITY

    KAKINADA, KAKINADA 533003, A.P

    2011 - 2013

  • UNIVERSITY COLLEGE OF ENGINEERING

    JAWAHARLAL NEHRU TECHNOLOGICAL UNIVERSITY

    KAKINADA, KAKINADA 533003

    ANDHRA PRADESH, INDIA

    CERTIFICATE

    This is to certify that the dissertation titled "Sentiment Analysis of Mobile Reviews

    Using Supervised Learning Techniques", submitted by Y. NIKHIL (11026A0524),

    P. SNEHA (11026A0542), S. PRITHVI RAJ (11026A0529), I. AJAY RAM

    (11026A0535) and E. RAJIV (11026A0555), students of B.Tech. (CSE - IIMDP), in

    partial fulfillment of the requirements for the award of the degree of Bachelor of

    Technology in Computer Science and Engineering, is a record of bonafide work

    carried out by them under my supervision.

    Dr. L. Sumalatha

    Internal Guide,

    Head of the Department,

    Professor,

    Department of CSE,

    University College of Engineering,

    JNT University, Kakinada.

  • DECLARATION

    This is to certify that the thesis titled "Sentiment Analysis of Mobile Reviews Using

    Supervised Learning Techniques" is a bonafide work done by us, in partial

    fulfillment of the requirements for the award of the degree B.Tech.(CSE-IIMDP) and

    submitted to the Department of Computer Science & Engineering, University College

    of Engineering, Jawaharlal Nehru Technological University, Kakinada.

    We also declare that this project is the result of our own effort, that it has not been

    copied from anyone, and that we have drawn only on the sources cited in

    the references.

    This work was not submitted earlier at any other University or Institute for the award

    of any degree.

    Place: UCEK, JNTUK

    Date:

    Y NIKHIL (11026A0524)

    P SNEHA (11026A0542)

    S PRITHVI RAJ (11026A0529)

    I AJAY RAM (11026A0535)

    E RAJIV (11026A0555)

  • ACKNOWLEDGEMENTS

    We express our deep gratitude and regards to Dr. L. Sumalatha, Internal Guide,

    Professor and Head of the Department of Computer Science & Engineering, for her

    encouragement and valuable guidance in giving shape to this dissertation.

    We are thankful to all the Professors and Faculty Members in the department for their

    teaching and academic support, and to the technical and non-teaching staff

    in the department for their support.

    Y NIKHIL (11026A0524)

    P SNEHA (11026A0542)

    S PRITHVI RAJ (11026A0529)

    I AJAY RAM (11026A0535)

    E RAJIV (11026A0555)

  • ABSTRACT

    Sentiment analysis or opinion mining is the computational study of people's opinions,

    sentiments, attitudes, and emotions expressed in written language. It is one of the most

    active research areas in natural language processing and text mining in recent years. Its

    popularity is mainly due to two reasons. First, it has a wide range of applications

    because opinions are central to almost all human activities and are key influencers of

    our behaviors. Whenever we need to make a decision, we want to hear others' opinions.

    Second, it presents many challenging research problems, which had never been

    attempted before the year 2000. Part of the reason for the earlier lack of study was that

    there was little opinionated text in digital form. It is thus no surprise that the inception

    and rapid growth of the field coincide with those of social media on the Web.

    In fact, the research has also spread outside of computer science to management

    sciences and social sciences due to its importance to business and society as a whole.

    In this talk, I will start with the discussion of the mainstream sentiment analysis research

    and then move on to describe some recent work on modeling comments, discussions,

    and debates, which represents another kind of analysis of sentiments and opinions.

    Sentiment classification is a way to analyze the subjective information in the text and

    then mine the opinion. Sentiment analysis is the procedure by which information is

    extracted from the opinions, appraisals and emotions of people regarding entities,

    events and their attributes. In decision making, the opinions of others significantly

    affect how easily customers make choices about online shopping, events, products

    and entities. Text sentiment analysis approaches typically work at a particular

    level: phrase, sentence or document. This paper examines a solution for sentiment

    classification at a fine-grained level, namely the sentence level, in which the polarity

    of each sentence is assigned to one of three categories: positive, negative or

    neutral.
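
    The sentence-level, three-category classification described above can be sketched as

    follows. This is only a minimal illustration in Python, not the project's own code: the

    function names (tokenize, train, classify), the add-one smoothing and the six training

    sentences are all assumptions made for this example.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(sentence):
    """Lower-case the sentence, replace special characters with spaces, split into words."""
    return re.sub(r"[^a-z0-9\s]", " ", sentence.lower()).split()


def train(labelled_sentences):
    """Collect per-class word counts, per-class sentence counts and the vocabulary."""
    word_counts = defaultdict(Counter)
    class_counts = Counter()
    for sentence, label in labelled_sentences:
        class_counts[label] += 1
        word_counts[label].update(tokenize(sentence))
    vocabulary = {word for counts in word_counts.values() for word in counts}
    return word_counts, class_counts, vocabulary


def classify(sentence, model):
    """Return the class with the highest Naive Bayes log-probability (add-one smoothing)."""
    word_counts, class_counts, vocabulary = model
    total_sentences = sum(class_counts.values())
    best_label, best_score = None, -math.inf
    for label in class_counts:
        score = math.log(class_counts[label] / total_sentences)  # class prior
        denominator = sum(word_counts[label].values()) + len(vocabulary)
        for word in tokenize(sentence):
            if word in vocabulary:  # words never seen in training are ignored
                score += math.log((word_counts[label][word] + 1) / denominator)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Hypothetical toy training set; a real system would use many labelled reviews.
TRAINING = [
    ("The battery life is great!", "positive"),
    ("Excellent camera and a great screen.", "positive"),
    ("The phone is terrible and slow.", "negative"),
    ("Poor battery, terrible speaker.", "negative"),
    ("The phone comes with a charger.", "neutral"),
    ("It has a 5 inch screen.", "neutral"),
]

model = train(TRAINING)
print(classify("Great camera!", model))
```

    With so little training data the output is only indicative; the sketch is meant to show

    how each sentence receives exactly one of the three polarity labels.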

  • TABLE OF CONTENTS

    1 INTRODUCTION

    1.1 Objective

    1.2 Proposed Approach and Methods to be Employed

    2 LITERATURE SURVEY

    2.1 Models

    2.1.1 Naïve Bayes

    2.1.2 Bag of Words

    2.1.3 Support Vector Machine

    2.1.4 Principal Component Analysis

    3 SYSTEM ANALYSIS AND DESIGN

    3.1 Software and Hardware Requirements

    3.2 MATLAB Technology

    3.3 Data Flow Diagrams

    4 IMPLEMENTATION

    4.1 Elimination of Special Characters and Conversion to Lower Case

    4.2 Word Count

    4.3 Testing and Training

    4.3.1 Naïve Bayes

    4.3.2 Bag of Words

    4.3.3 Support Vector Machine

    4.4 Sample Code

    5 TESTING

    5.1 Testing Strategies

    5.1.1 Unit Testing

    5.1.2 Integration Testing

    5.1.2.1 Top Down Integration Testing

    5.1.2.2 Bottom Up Integration Testing

    5.1.3 System Testing

    5.1.4 Acceptance Testing

    5.1.4.1 Alpha Testing

    5.1.4.2 Beta Testing

    5.2 Testing Methods

    5.2.1 White Box Testing

    5.2.2 Black Box Testing

    5.3 Validation

    5.4 Limitations

    5.