text classification/categorization
NLP For Text Categorization/Classification
Domain: Natural Language Processing
Prepared by: Abhishek Oswal
Guide: Jayshree Ghorpade
Some Questions
• What is NLP?
• Detecting -> Patterns, Features, Models
• What is Classification?
• What is Text Classification?
• Why Text Classification?
Problem Type
• Supervised
• You know the classes beforehand
• Labeled training data
• Fruits analogy
• Unsupervised
• You don't know the classes beforehand
• Unlabeled data
Supervised
• Regression and Classification
• Regression -> Real estate market: predict the price
• Price is a continuous output
• Classification -> Whether it sells for more or less than the asking price: discrete output
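To make the continuous versus discrete distinction concrete, here is a minimal sketch in plain Python. The housing figures are made up for illustration, and the choice of a least-squares line for regression and a 1-nearest-neighbour rule for classification is an assumption of this sketch, not something the slides prescribe.

```python
# Toy housing data, invented for illustration only.
AREAS = [50, 80, 100, 120]                   # floor area
PRICES = [100, 160, 200, 240]                # sale price, arbitrary units
OUTCOMES = ["less", "less", "more", "more"]  # sold for more/less than asked


def fit_line(xs, ys):
    """Least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def predict_price(area):
    """Regression: the output is a real number (continuous)."""
    slope, intercept = fit_line(AREAS, PRICES)
    return slope * area + intercept


def predict_outcome(area):
    """Classification: the output is one of a fixed set of labels (discrete).

    Uses the label of the closest training example (1-nearest neighbour).
    """
    nearest = min(range(len(AREAS)), key=lambda i: abs(AREAS[i] - area))
    return OUTCOMES[nearest]


if __name__ == "__main__":
    print(predict_price(90))     # a number on a continuous scale
    print(predict_outcome(115))  # either "more" or "less"
```

The same input feature feeds both functions; only the type of the output differs.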
Process of Classification
• Data preprocessing
• Training and test set
• Creation of model
• Algorithm
• Classify
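The steps above can be sketched end to end in a few lines of Python. This is only an illustration under stated assumptions: the sentences, the labels, the stop-word list, the fixed every-fourth-item split, and the simple word-vote model are all hypothetical stand-ins chosen to keep the example short.

```python
import re
from collections import Counter, defaultdict

STOPWORDS = {"the", "a", "an"}  # tiny illustrative stop-word list

# Hypothetical labeled data.
DATA = [
    ("The team won the match", "sport"),
    ("The minister won the election", "politics"),
    ("A late goal decided the match", "sport"),
    ("The striker scored a goal", "sport"),
    ("Parliament passed the budget", "politics"),
    ("The coach praised the team", "sport"),
    ("Voters elected a new minister", "politics"),
    ("The minister presented the budget", "politics"),
]


def preprocess(text):
    """Step 1: lowercase, keep alphabetic tokens, drop stop words."""
    return [t for t in re.findall(r"[a-z]+", text.lower()) if t not in STOPWORDS]


def split(data, test_every=4):
    """Step 2: hold out every `test_every`-th example as the test set."""
    train = [ex for i, ex in enumerate(data) if (i + 1) % test_every != 0]
    test = [ex for i, ex in enumerate(data) if (i + 1) % test_every == 0]
    return train, test


def train_model(train):
    """Steps 3 and 4: the 'model' is a per-class word count table."""
    counts = defaultdict(Counter)
    for text, label in train:
        counts[label].update(preprocess(text))
    return counts


def classify(model, text):
    """Step 5: pick the class whose training words overlap the text most."""
    tokens = preprocess(text)
    scores = {label: sum(wc[t] for t in tokens) for label, wc in model.items()}
    return max(scores, key=scores.get)


def accuracy(model, test):
    """Fraction of held-out examples classified correctly."""
    return sum(classify(model, text) == label for text, label in test) / len(test)


if __name__ == "__main__":
    train, test = split(DATA)
    model = train_model(train)
    print(accuracy(model, test))
```

Evaluating on held-out examples rather than on the training data is the point of the training/test split: it estimates how the model behaves on text it has not seen.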
Methods To Represent
• Document-term matrix
• Bag of words
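As a small illustration of both representations, the sketch below builds a bag of words per document and stacks the bags into a document-term matrix. The two example documents and the whitespace tokenizer are assumptions made for brevity.

```python
from collections import Counter


def bag_of_words(text):
    """Map a document to word -> count, ignoring word order."""
    return Counter(text.lower().split())


def document_term_matrix(docs):
    """Return (vocabulary, matrix): one row per document, one column per term."""
    bags = [bag_of_words(doc) for doc in docs]
    vocab = sorted(set().union(*bags))
    matrix = [[bag[term] for term in vocab] for bag in bags]
    return vocab, matrix


if __name__ == "__main__":
    vocab, matrix = document_term_matrix(["free offer free", "meeting at noon"])
    print(vocab)
    for row in matrix:
        print(row)
```

Each row is the bag of words of one document laid out over a shared vocabulary, which is the numeric form a classifier can consume.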
Methods To Classify
• Using Probability
• Naive Bayes
• Using Graphs
• Support Vector Machine
• Using Trees
• Decision Tree
Naive Bayes
• Predictive Model
• Conditional Probability
Naive Bayes
• Independent Features
• Prior
• Likelihood
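The prior, the likelihood and the independence assumption fit together as follows: the score of a class is its prior multiplied by the likelihood of each word given that class, with words treated as independent. Below is a minimal sketch of a multinomial Naive Bayes text classifier. The training sentences are invented, and the use of add-one (Laplace) smoothing and log probabilities is an assumption of this sketch rather than something stated on the slides.

```python
import math
from collections import Counter, defaultdict

# Hypothetical training data.
EXAMPLES = [
    ("win free money now", "spam"),
    ("free prize claim now", "spam"),
    ("meeting agenda for monday", "ham"),
    ("lunch on monday", "ham"),
]


def train_nb(examples):
    """Count documents per class and words per class."""
    class_docs = Counter()
    word_counts = defaultdict(Counter)
    vocab = set()
    for text, label in examples:
        tokens = text.lower().split()
        class_docs[label] += 1
        word_counts[label].update(tokens)
        vocab.update(tokens)
    return class_docs, word_counts, vocab


def predict_nb(model, text):
    """Return the class with the highest log(prior) + sum of log(likelihood)."""
    class_docs, word_counts, vocab = model
    total_docs = sum(class_docs.values())
    best_label, best_score = None, -math.inf
    for label in class_docs:
        score = math.log(class_docs[label] / total_docs)  # log prior
        total_words = sum(word_counts[label].values())
        for token in text.lower().split():
            if token not in vocab:
                continue  # skip words never seen in training
            # Add-one smoothed likelihood P(token | label).
            likelihood = (word_counts[label][token] + 1) / (total_words + len(vocab))
            score += math.log(likelihood)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


if __name__ == "__main__":
    model = train_nb(EXAMPLES)
    print(predict_nb(model, "free money"))
    print(predict_nb(model, "monday meeting"))
```

Summing logarithms instead of multiplying probabilities avoids numerical underflow on longer documents; smoothing keeps a single unseen word from driving a class score to zero.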
Another Method
Support Vector Machine
SVM
• Optimal plane
SVM
• Using margin
• Margin is no man's land
SVM
• Optimal plane would be the one with the biggest margin
• Equation of hyperplane
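In the usual formulation a separating hyperplane is written w · x + b = 0, and a point is assigned to one class or the other by the sign of w · x + b. The sketch below compares two hand-picked candidate planes on a toy data set and keeps the one with the larger margin. The points and the candidate (w, b) pairs are assumptions chosen for illustration; a real SVM finds the maximum-margin plane by solving an optimization problem rather than by comparing a fixed list.

```python
import math

# Toy 2-D points with labels +1 / -1, invented for illustration.
DATA = [((1, 1), -1), ((0, 2), -1), ((3, 3), 1), ((4, 2), 1)]

# Two hypothetical candidate hyperplanes, each given as (w, b).
CANDIDATES = {
    "A": ((1, 1), -4),  # x1 + x2 - 4 = 0
    "B": ((1, 0), -2),  # x1 - 2 = 0
}


def decision_value(w, b, x):
    """Evaluate w . x + b; its sign gives the predicted class."""
    return sum(wi * xi for wi, xi in zip(w, x)) + b


def classify(w, b, x):
    return 1 if decision_value(w, b, x) >= 0 else -1


def geometric_margin(w, b, data):
    """Distance from the plane to the closest point.

    The value is negative if any point lies on the wrong side.
    """
    norm = math.sqrt(sum(wi * wi for wi in w))
    return min(y * decision_value(w, b, x) for x, y in data) / norm


def best_plane(candidates, data):
    """Name of the candidate with the biggest margin."""
    return max(candidates, key=lambda name: geometric_margin(*candidates[name], data))


if __name__ == "__main__":
    for name, (w, b) in CANDIDATES.items():
        print(name, geometric_margin(w, b, DATA))
    print(best_plane(CANDIDATES, DATA))
```

Both candidates separate the toy data correctly, so the margin is what distinguishes them: the wider the empty band around the plane, the more room there is for unseen points to fall on the correct side.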
Applications
• Email Classification
• Spam Filtering
• News Organization
• Classification of documents based on language
• Opinion Mining
• E.g.
• Sakaal Classifieds
• Gmail Spam Mail Detection
Example
Comparison
Naive Bayes
• Easy
• Fast
• Different classes
SVM
• Difficult
• Slow (training time)
• Binary output
Questions
• My Questions First
• Classify my presentation
• Class -> Good / Bad
References
• Fei Yu, Jiyao An, Hong Li, Mialiang Zhu and Ouyang Yang, "Intelligence Text Categorization Based on Bayes Algorithm", Proceedings of International Conference on Information Acquisition.