Text Classification/Categorization

NLP For Text Categorization/Classification. Domain: Natural Language Processing. Prepared By: Abhishek Oswal. Guide: Jayshree Ghorpade.

Upload: oswal-abhishek

Post on 23-Jan-2017



TRANSCRIPT

Page 1: Text Classification/Categorization

NLP For Text Categorization/Classification

Domain: Natural Language Processing

Prepared By: Abhishek Oswal

Guide: Jayshree Ghorpade

Page 2: Text Classification/Categorization

Some Questions
• What is NLP?
• Detecting -> Patterns, Features, Models
• What is Classification?
• What is Text Classification?
• Why Text Classification?

Page 3: Text Classification/Categorization

Problem Type
• Supervised
  • You know about it
  • Training data (labeled)
• Fruits analogy
• Unsupervised
  • You don't know about it
  • Unlabeled data (no training labels)

Page 4: Text Classification/Categorization

Supervised

• Regression and Classification
• Regression -> Real estate market: predict the sale price
• Price is a continuous output
• Classification -> Whether it sells for more or less than the asking price; discrete output
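The real estate example can be sketched in Python to show how the same record yields two kinds of target. The listing figures and the function names below are made up for illustration; they are not from the slides.

```python
# Hypothetical listings as (asking_price, sold_price); the numbers are invented.
listings = [(300_000, 310_000), (250_000, 240_000), (400_000, 400_000)]

def regression_target(listing):
    """Regression: the target is the sale price itself, a continuous value."""
    _, sold = listing
    return float(sold)

def classification_target(listing):
    """Classification: the target is a discrete label (sold above asking or not)."""
    asking, sold = listing
    return "more" if sold > asking else "not_more"

regression_targets = [regression_target(l) for l in listings]
classification_targets = [classification_target(l) for l in listings]
```

A regression model would be trained to predict the values in `regression_targets`, while a classifier would be trained to predict the labels in `classification_targets`.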

Page 5: Text Classification/Categorization

Process of Classification
• Data preprocessing
• Training and test set
• Creation of model
• Algorithm
• Classify
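The steps above can be sketched end to end in plain Python. This is only an illustration under stated assumptions: the toy messages, the "spam"/"ham" labels, and the word-overlap scoring rule are all hypothetical stand-ins, not the method used in the presentation.

```python
import random
import re
from collections import Counter, defaultdict

# Hypothetical labeled data, invented for this sketch.
data = [
    ("Win a free prize now", "spam"),
    ("Cheap pills, limited offer", "spam"),
    ("Free offer, click now", "spam"),
    ("Claim your free prize", "spam"),
    ("Meeting moved to Monday", "ham"),
    ("Lunch with the project team", "ham"),
    ("Please review the project report", "ham"),
    ("Monday report is attached", "ham"),
]

def preprocess(text):
    """Step 1: lowercase the text and keep alphabetic tokens only."""
    return re.findall(r"[a-z]+", text.lower())

def split(rows, test_ratio=0.25, seed=0):
    """Step 2: shuffle and split labeled rows into training and test sets."""
    rows = list(rows)
    random.Random(seed).shuffle(rows)
    cut = int(len(rows) * (1 - test_ratio))
    return rows[:cut], rows[cut:]

def train(rows):
    """Steps 3 and 4: build the model, here per-class word counts."""
    model = defaultdict(Counter)
    for text, label in rows:
        model[label].update(preprocess(text))
    return model

def classify(model, text):
    """Step 5: choose the class whose training words overlap most with the text."""
    tokens = preprocess(text)
    return max(sorted(model), key=lambda label: sum(model[label][t] for t in tokens))

train_rows, test_rows = split(data)
model = train(train_rows)
accuracy = sum(classify(model, t) == y for t, y in test_rows) / len(test_rows)
```

The scoring rule in `classify` is deliberately crude; swapping in Naive Bayes or an SVM changes only the `train` and `classify` steps, not the overall process.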

Page 6: Text Classification/Categorization

Methods To Represent
• Document-term matrix
• Bag of words
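Both representations can be built with the standard library. In this minimal sketch the two example documents and the whitespace tokenizer are assumptions made for illustration: a bag of words is a word-count mapping for one document, and the document-term matrix stacks those counts with one row per document and one column per vocabulary term.

```python
from collections import Counter

# Hypothetical documents, invented for this sketch.
docs = ["the cat sat", "the dog sat on the mat"]

def bag_of_words(doc):
    """Count how often each word occurs; word order is discarded."""
    return Counter(doc.lower().split())

def document_term_matrix(docs):
    """Return (vocabulary, matrix) with one row per document, one column per term."""
    bags = [bag_of_words(d) for d in docs]
    vocab = sorted(set().union(*bags))
    matrix = [[bag[term] for term in vocab] for bag in bags]
    return vocab, matrix

vocab, matrix = document_term_matrix(docs)
# vocab  -> ['cat', 'dog', 'mat', 'on', 'sat', 'the']
# matrix -> [[1, 0, 0, 0, 1, 1], [0, 1, 1, 1, 1, 2]]
```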

Page 7: Text Classification/Categorization

Methods To Classify
• Using Probability
  • Naive Bayes
• Using Graphs
  • Support Vector Machine
• Tree
  • Decision Tree

Page 8: Text Classification/Categorization

Naive Bayes
• Predictive model
• Conditional probability
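The conditional probability behind the classifier is Bayes' theorem. The transcript does not preserve the slide's own formula, so the notation here is an assumption: c stands for a class and d for a document.

```latex
P(c \mid d) = \frac{P(d \mid c)\, P(c)}{P(d)}
```

Since P(d) is the same for every class, comparing classes only requires the numerator P(d | c) P(c).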

Page 9: Text Classification/Categorization

Naive Bayes
• Independent features
• Prior
• Likelihood
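The three ingredients above combine into a score of the form P(c) multiplied by P(w | c) for each word w in the document, where treating the words as independent is the "naive" assumption. The sketch below is one common way to implement this for text (multinomial counts, log probabilities, add-one smoothing); the smoothing, the function names, and the toy examples are assumptions not taken from the slides.

```python
import math
from collections import Counter, defaultdict

def train_nb(examples):
    """examples: list of (tokens, label). Returns (priors, word_counts, vocab)."""
    class_counts = Counter(label for _, label in examples)
    word_counts = defaultdict(Counter)
    for tokens, label in examples:
        word_counts[label].update(tokens)
    vocab = {t for tokens, _ in examples for t in tokens}
    # Prior: fraction of training documents that belong to each class.
    priors = {c: n / len(examples) for c, n in class_counts.items()}
    return priors, word_counts, vocab

def predict_nb(model, tokens):
    """Return the class with the highest log(prior) + sum of log(likelihoods)."""
    priors, word_counts, vocab = model
    scores = {}
    for c, prior in priors.items():
        total = sum(word_counts[c].values())
        score = math.log(prior)
        for t in tokens:
            if t in vocab:  # words never seen in training are ignored
                # Likelihood with add-one smoothing (an assumption of this sketch).
                score += math.log((word_counts[c][t] + 1) / (total + len(vocab)))
        scores[c] = score
    return max(sorted(scores), key=scores.get)

# Hypothetical training examples, invented for this sketch.
examples = [
    ("free prize now".split(), "spam"),
    ("free offer".split(), "spam"),
    ("meeting on monday".split(), "ham"),
    ("project meeting".split(), "ham"),
]
nb_model = train_nb(examples)
```

Working in log space avoids numerical underflow when many small probabilities are multiplied together.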

Page 10: Text Classification/Categorization

Naive Bayes

Page 11: Text Classification/Categorization

Naive Bayes

Page 12: Text Classification/Categorization

Another Method

Page 13: Text Classification/Categorization

Support Vector Machine

Page 14: Text Classification/Categorization

SVM
• Optimal plane

Page 15: Text Classification/Categorization

SVM
• Using margin
• Margin is no man's land
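The "no man's land" idea can be made concrete by measuring how far the closest point lies from a candidate separating plane. The sketch below does not train an SVM; it only evaluates the margin of a given plane, and the points, the weights `w`, and the offset `b` are hypothetical values chosen for illustration.

```python
import math

def signed_distance(w, b, x):
    """Signed distance from point x to the plane w.x + b = 0."""
    norm = math.sqrt(sum(wi * wi for wi in w))
    return (sum(wi * xi for wi, xi in zip(w, x)) + b) / norm

def margin(w, b, points):
    """Distance from the plane to the closest point.

    points: list of (x, y) with y in {+1, -1}. The result is negative
    if any point lies on the wrong side of the plane.
    """
    return min(y * signed_distance(w, b, x) for x, y in points)

# Hypothetical two-class data in two dimensions.
points = [((2, 0), 1), ((3, 1), 1), ((-2, 0), -1), ((-3, -1), -1)]

vertical = margin((1, 0), 0, points)   # plane x1 = 0
diagonal = margin((1, 1), 0, points)   # plane x1 + x2 = 0
```

Both planes separate the two classes, but the first keeps every point farther away, so it leaves the wider empty band around the boundary.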

Page 16: Text Classification/Categorization

SVM
• The optimal plane is the one with the biggest margin
• Equation of the hyperplane
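The slide's own equation is not preserved in the transcript. As an assumption, the standard hard-margin formulation is shown here, with weight vector w, offset b, training points x_i, and labels y_i in {+1, -1}.

```latex
\begin{aligned}
&\text{Hyperplane:} && \mathbf{w} \cdot \mathbf{x} + b = 0 \\
&\text{Constraints:} && y_i\,(\mathbf{w} \cdot \mathbf{x}_i + b) \ge 1 \quad \text{for all } i \\
&\text{Margin width:} && \frac{2}{\lVert \mathbf{w} \rVert}
\end{aligned}
```

Under these constraints, the biggest margin corresponds to the smallest norm of w.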

Page 17: Text Classification/Categorization

Applications

• Email classification
• Spam filtering
• News organization
• Classification of documents based on language
• Opinion mining

E.g.
• Sakaal Classifieds
• Gmail spam mail detection

Page 18: Text Classification/Categorization

Example

Page 19: Text Classification/Categorization

Comparison

Naive Bayes
• Easy
• Fast
• Different classes

SVM
• Difficult
• Slow (training time)
• Binary output

Page 20: Text Classification/Categorization

Questions
• My questions first
• Classify my presentation
• Class -> Good / Bad

Page 21: Text Classification/Categorization

References
• Fei Yu, Jiyao An, Hong Li, Mialiang Zhu, and Ouyang Yang, "Intelligence Text Categorization Based on Bayes Algorithm", Proceedings of International Conference on Information Acquisition.