distributed machine learning examples
STANLEY WANG SOLUTION ARCHITECT, TECH LEAD @SWANG68 http://www.linkedin.com/in/stanley-wang-a2b143b
Topic Modeling
• Topical categorization of blogs, documents, or other objects that can be tagged with text improves the experience for end users;
• Discover sets of topics from large unstructured collections of documents;
• Annotate documents with topics;
• Use the annotations to index, search, and classify documents;
The Intuitions behind LDA
• Latent Dirichlet Allocation (LDA) is an unsupervised, probabilistic text clustering algorithm. LDA defines a generative model that describes how documents are generated given a set of topics and the words in those topics;
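The generative story above can be sketched in a few lines. This is a minimal illustration, not code from the slides: the vocabulary, the function names `sample_dirichlet` and `generate_corpus`, and the hyperparameter values `alpha` and `beta` are all assumptions chosen for the example.

```python
import random

def sample_dirichlet(alpha, rng):
    """Draw one Dirichlet sample by normalizing independent Gamma draws."""
    draws = [rng.gammavariate(a, 1.0) for a in alpha]
    total = sum(draws)
    return [d / total for d in draws]

def generate_corpus(vocab, n_topics, n_docs, doc_len, alpha=0.5, beta=0.5, seed=0):
    """Generate documents following the LDA generative process.

    alpha and beta are illustrative symmetric Dirichlet hyperparameters.
    """
    rng = random.Random(seed)
    # Each topic is a probability distribution over the vocabulary.
    topics = [sample_dirichlet([beta] * len(vocab), rng) for _ in range(n_topics)]
    docs, proportions = [], []
    for _ in range(n_docs):
        # Each document has its own mixture of topics.
        theta = sample_dirichlet([alpha] * n_topics, rng)
        words = []
        for _ in range(doc_len):
            # Pick a topic for this word position, then a word from that topic.
            z = rng.choices(range(n_topics), weights=theta)[0]
            words.append(rng.choices(vocab, weights=topics[z])[0])
        docs.append(words)
        proportions.append(theta)
    return docs, proportions, topics

# Hypothetical toy vocabulary, for illustration only.
VOCAB = ["gene", "dna", "cell", "graph", "node", "edge"]
docs, proportions, topics = generate_corpus(VOCAB, n_topics=2, n_docs=3, doc_len=8)
for words, theta in zip(docs, proportions):
    print([round(p, 2) for p in theta], " ".join(words))
```

Fitting LDA runs this story in reverse: given only the words, it infers the topic distributions and per-document proportions that most plausibly produced them.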
Graphical Model for LDA
• Topic-based text classification;
• Topic modeling can be seen as a pre-processing step before applying other learning methods, such as supervised classifiers or collaborative filtering;
• Finding patterns in genetic data, images, and social networks;
Real Inference with LDA
• A 100-topic LDA model was fitted to 17,000 articles from the journal Science;
• At right are the top 15 most frequent words from the most frequent topics;
• At left are the inferred topic proportions for the example article from the previous slide;
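The two outputs described above (top words per topic and topic proportions per document) can be reproduced on a toy scale. The sketch below uses collapsed Gibbs sampling, which is one common way to fit LDA and is an assumption here; the slides do not say which inference method was used. The corpus, the function names `fit_lda_gibbs` and `top_words`, and the hyperparameters are hypothetical.

```python
import random
from collections import Counter

def fit_lda_gibbs(docs, n_topics, alpha=0.1, beta=0.01, n_iter=100, seed=0):
    """Fit LDA with a basic collapsed Gibbs sampler (illustrative, not optimized)."""
    rng = random.Random(seed)
    vocab_size = len({w for doc in docs for w in doc})
    doc_topic = [[0] * n_topics for _ in docs]          # words in doc d assigned to topic k
    topic_word = [Counter() for _ in range(n_topics)]   # times word w is assigned to topic k
    topic_total = [0] * n_topics                        # total words assigned to topic k
    assignments = []
    # Start from a random topic assignment for every word.
    for d, doc in enumerate(docs):
        z_doc = [rng.randrange(n_topics) for _ in doc]
        for w, k in zip(doc, z_doc):
            doc_topic[d][k] += 1
            topic_word[k][w] += 1
            topic_total[k] += 1
        assignments.append(z_doc)
    for _ in range(n_iter):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                # Remove the current assignment, then resample it.
                k = assignments[d][i]
                doc_topic[d][k] -= 1
                topic_word[k][w] -= 1
                topic_total[k] -= 1
                weights = [
                    (doc_topic[d][t] + alpha)
                    * (topic_word[t][w] + beta)
                    / (topic_total[t] + vocab_size * beta)
                    for t in range(n_topics)
                ]
                k = rng.choices(range(n_topics), weights=weights)[0]
                assignments[d][i] = k
                doc_topic[d][k] += 1
                topic_word[k][w] += 1
                topic_total[k] += 1
    # Smoothed topic proportions for each document.
    theta = [
        [(c + alpha) / (len(doc) + n_topics * alpha) for c in counts]
        for counts, doc in zip(doc_topic, docs)
    ]
    return theta, topic_word

def top_words(topic_word, n):
    """Return the n most frequently assigned words for each topic."""
    return [[w for w, c in tw.most_common(n) if c > 0] for tw in topic_word]

# Hypothetical toy corpus with two loose themes.
corpus = [
    "gene dna cell gene protein".split(),
    "cell dna protein gene cell".split(),
    "graph node edge graph network".split(),
    "network node edge graph node".split(),
]
theta, topic_word = fit_lda_gibbs(corpus, n_topics=2)
print(top_words(topic_word, 3))
print([[round(p, 2) for p in row] for row in theta])
```

Because the sampler is random, the topics it finds depend on the seed and the number of iterations; on a real corpus a library implementation would normally be used instead.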
What is Community Intuition?
In a social network, a community is a group of users who are more closely related to each other than to the rest of the network. The relation between users can be based on the amount of interaction, similar interests, geographical factors, etc.
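One way to make "more closely related to each other than to the rest of the network" measurable is modularity, which compares the share of edges inside each community with the share expected if edges were placed at random. Modularity is a standard measure brought in here for illustration; the slides do not name it. The graph, the user names, and the function `modularity` below are assumptions.

```python
def modularity(edges, community_of):
    """Modularity of a partition of an undirected, unweighted graph.

    edges: list of (u, v) pairs; community_of: dict mapping node -> community label.
    """
    m = len(edges)
    internal = {}  # edges with both endpoints in the same community
    degree = {}    # sum of node degrees per community
    for u, v in edges:
        cu, cv = community_of[u], community_of[v]
        degree[cu] = degree.get(cu, 0) + 1
        degree[cv] = degree.get(cv, 0) + 1
        if cu == cv:
            internal[cu] = internal.get(cu, 0) + 1
    return sum(
        internal.get(c, 0) / m - (degree[c] / (2 * m)) ** 2
        for c in degree
    )

# Hypothetical network: two triangles of users joined by a single edge.
edges = [
    ("a", "b"), ("b", "c"), ("a", "c"),
    ("d", "e"), ("e", "f"), ("d", "f"),
    ("c", "d"),
]
by_triangle = {"a": 0, "b": 0, "c": 0, "d": 1, "e": 1, "f": 1}
mixed = {"a": 0, "b": 0, "d": 0, "c": 1, "e": 1, "f": 1}

print(round(modularity(edges, by_triangle), 3))  # 5/14, about 0.357
print(round(modularity(edges, mixed), 3))        # negative: a worse grouping
```

Grouping users by triangle scores higher than the mixed grouping, which matches the intuition that each triangle is a community. Community detection algorithms search for partitions that score well on measures of this kind.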
Why Detect Social Communities?
• Behavior Analysis
• Location-based Interaction Analysis
• Recommender Systems Development
• Link Prediction
• Customer Interaction and Analysis
• Media & Content Analysis
• Security
• Social Studies