Deep Learning for NLP Applications


Upload: samiur-rahman

Post on 14-Jul-2015


TRANSCRIPT

Page 1: Deep Learning for NLP Applications

Deep Learning for NLP Applications

Page 2: Deep Learning for NLP Applications

What is Deep Learning?

Page 3: Deep Learning for NLP Applications

Just a Neural Network!

"Deep learning" refers to Deep Neural Networks

A Deep Neural Network is simply a Neural Network with multiple hidden layers

Neural Networks have been around since the 1970s

Page 4: Deep Learning for NLP Applications

So why now?

Page 5: Deep Learning for NLP Applications

Large Networks are hard to train

Vanishing gradients make backpropagation harder
Overfitting becomes a serious issue
So we settled (for the time being) for simpler, more useful variations of Neural Networks
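The vanishing-gradient point can be illustrated with a little arithmetic. The sketch below assumes sigmoid activations (the slides do not name an activation function) and ignores the weight terms: because the sigmoid's derivative never exceeds 0.25, the backpropagated gradient is scaled by at most 0.25 per layer, so the bound shrinks geometrically with depth.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def sigmoid_grad(x):
    # Derivative of the sigmoid; its maximum value, 0.25, occurs at x = 0.
    s = sigmoid(x)
    return s * (1.0 - s)


def max_gradient_scale(num_layers):
    # Upper bound on the product of activation derivatives across
    # num_layers stacked sigmoid layers (weights are ignored here).
    return sigmoid_grad(0.0) ** num_layers


for depth in (1, 5, 10, 20):
    print(depth, max_gradient_scale(depth))
```

Even in this best case the bound falls below one in a million by ten layers, which is one way to see why early layers of a deep network learn slowly.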

Page 6: Deep Learning for NLP Applications

Then, suddenly ...

We realized we can stack these simpler Neural Networks, making them easier to train
We derived more efficient parameter estimation and model regularization methods
Also, Moore's law kicked in and GPU computation became viable

Page 7: Deep Learning for NLP Applications

So what's the big deal?

Page 8: Deep Learning for NLP Applications

MASSIVE improvements in Computer Vision

Page 9: Deep Learning for NLP Applications

Speech Recognition

Baidu (with Andrew Ng as their chief scientist) has built a state-of-the-art speech recognition system with Deep Learning
Their dataset: 7000 hours of conversation, coupled with background noise synthesis, for a total of 100,000 hours
They processed this through a massive GPU cluster

Page 10: Deep Learning for NLP Applications

Cross Domain Representations

What if you wanted to take an image and generate a description of it?
The beauty of representation learning is its ability to be distributed across tasks
This is the real power of Neural Networks

Page 11: Deep Learning for NLP Applications

But Samiur, what about NLP?

Page 12: Deep Learning for NLP Applications

Deep Learning NLP

Distributed word representations
Dependency Parsing
Sentiment Analysis
And many others ...

Page 13: Deep Learning for NLP Applications

Word Representations

Standard: Bag of Words

A one-hot encoding
20k to 50k dimensions
Can be improved by factoring in document frequency

Word embedding: Neural Word embeddings

Uses a vector space that attempts to predict a word given a context window
200-400 dimensions

motel [0.06, -0.01, 0.13, 0.07, -0.06, -0.04, 0, -0.04]

hotel [0.07, -0.03, 0.07, 0.06, -0.06, -0.03, 0.01, -0.05]

Word embeddings make it possible to measure semantic similarity and find synonyms
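To make the similarity claim concrete, here is a minimal sketch that compares the motel and hotel vectors from the slide using cosine similarity. The one-hot vectors and their 5-word vocabulary are hypothetical, added only for contrast; cosine similarity is one common choice of measure, not necessarily the one used in the talk.

```python
import math


def cosine(u, v):
    # Cosine similarity: dot product divided by the product of the norms.
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


# Embedding vectors copied from the slide.
motel = [0.06, -0.01, 0.13, 0.07, -0.06, -0.04, 0, -0.04]
hotel = [0.07, -0.03, 0.07, 0.06, -0.06, -0.03, 0.01, -0.05]

# Hypothetical one-hot vectors over a toy 5-word vocabulary.
motel_one_hot = [0, 0, 1, 0, 0]
hotel_one_hot = [0, 1, 0, 0, 0]

embedding_similarity = cosine(motel, hotel)
one_hot_similarity = cosine(motel_one_hot, hotel_one_hot)

print("embeddings:", embedding_similarity)
print("one-hot:", one_hot_similarity)
```

Two different one-hot vectors are always orthogonal, so every pair of distinct words looks equally unrelated, while the dense embeddings place motel and hotel close together.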

Page 14: Deep Learning for NLP Applications

Word embeddings have cool properties:

Page 15: Deep Learning for NLP Applications

Dependency Parsing

Converting sentences to a dependency-based grammar

Simplifying this to the verbs and their agents is called Semantic Role Labeling

Page 16: Deep Learning for NLP Applications

Sentiment Analysis

Recursive Neural Networks
Can model tree structures very well
This makes them great for other NLP tasks too (such as parsing)
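As a rough sketch of how a recursive network follows a tree, the code below applies one shared composition step, parent = tanh(W · [left; right] + b), bottom-up over a binary parse. The 2-dimensional word vectors, the weights, the bias, and the tree itself are all hypothetical values chosen for illustration; a real model would learn them and add a classifier on top of each node.

```python
import math


def compose(left, right, weights, bias):
    # Concatenate the two child vectors, then apply a shared
    # affine map followed by tanh to get the parent vector.
    children = left + right
    return [
        math.tanh(sum(w * x for w, x in zip(row, children)) + b)
        for row, b in zip(weights, bias)
    ]


def encode(tree, vectors, weights, bias):
    # A tree is either a word (leaf) or a (left, right) pair of subtrees.
    if isinstance(tree, str):
        return vectors[tree]
    left, right = tree
    return compose(
        encode(left, vectors, weights, bias),
        encode(right, vectors, weights, bias),
        weights,
        bias,
    )


# Hypothetical 2-d word vectors and parameters (2 x 4 weight matrix).
vectors = {"not": [-0.5, 0.1], "very": [0.2, 0.3], "good": [0.6, -0.2]}
weights = [[0.5, -0.3, 0.8, 0.1], [-0.2, 0.4, 0.3, 0.7]]
bias = [0.0, 0.1]

# Parse of "not (very good)".
sentence_vector = encode(("not", ("very", "good")), vectors, weights, bias)
print(sentence_vector)
```

Because the same weights are reused at every node, the network handles sentences of any length and produces a vector for every phrase in the tree.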

Page 17: Deep Learning for NLP Applications

Get to the applications part already!

Page 18: Deep Learning for NLP Applications

Tools

Python
Theano/PyLearn2
Gensim (for word2vec)
nolearn (uses scikit-learn)

Java/Clojure/Scala
DeepLearning4j
neuralnetworks by Ivan Vasilev

APIs
Alchemy API
Meta Mind

Page 19: Deep Learning for NLP Applications

Problem: Funding Sentence Classifier

Build a binary classifier that is able to take any sentence from a news article and tell if it's about funding or not.

e.g. "Mattermark is today announcing that it has raised a round of $6.5 million"

Page 20: Deep Learning for NLP Applications

Word Vectors

Used Gensim's Word2Vec implementation to train unsupervised word vectors on the UMBC Webbase Corpus (~100M documents, ~48GB of text)
Then, iterated 20 times on text in news articles in the tech news domain (~1M documents, ~300MB of text)

Page 21: Deep Learning for NLP Applications

Sentence Vectors

How can you compose word vectors to make sentence vectors?

Use the paragraph vector model proposed by Quoc Le
Feed into an RNN constructed from a dependency tree of the sentence
Use some heuristic function to combine the string of word vectors

Page 22: Deep Learning for NLP Applications

What did we try?

TF-IDF + Naive Bayes
Word2Vec + Composition Methods
Word2Vec + TF-IDF + Composition Methods
Word2Vec + TF-IDF + Semantic Role Labeling (SRL) + Composition Methods
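The slides do not say how TF-IDF was combined with the word vectors. One common approach, assumed here purely for illustration, is to weight each word's vector by its inverse document frequency before summing, so that frequent words such as "the" contribute little. The documents and 2-dimensional vectors below are hypothetical.

```python
import math


def inverse_document_frequency(documents):
    # idf(word) = log(N / number of documents containing the word)
    n = len(documents)
    doc_freq = {}
    for doc in documents:
        for word in set(doc):
            doc_freq[word] = doc_freq.get(word, 0) + 1
    return {word: math.log(n / count) for word, count in doc_freq.items()}


def weighted_sentence_vector(tokens, vectors, idf):
    # Each occurrence of a word adds idf * vector, so a word's total
    # weight is (term frequency) * idf. Unknown words are skipped.
    dim = len(next(iter(vectors.values())))
    total = [0.0] * dim
    for token in tokens:
        if token in vectors and token in idf:
            for i, x in enumerate(vectors[token]):
                total[i] += idf[token] * x
    return total


# Hypothetical toy data.
documents = [
    ["the", "company", "raised", "funding"],
    ["the", "company", "hired", "staff"],
    ["the", "weather", "is", "nice"],
]
vectors = {"the": [1.0, 1.0], "company": [0.2, 0.1], "raised": [0.5, -0.5]}

idf = inverse_document_frequency(documents)
sentence_vec = weighted_sentence_vector(["the", "company", "raised"], vectors, idf)
print(sentence_vec)
```

Here "the" appears in every document, so its IDF is zero and it drops out of the sentence vector entirely.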

Page 23: Deep Learning for NLP Applications

Composition Methods

Where wi represents the i'th word vector, wv the word vector for the verb, and a0 and a1 are agents
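The formulas for this slide did not survive in the transcript. Since additive and circular-convolution composition are the two methods named later in the talk, here is a minimal sketch of both operations. How the verb and agent vectors were actually combined is not recoverable from the text, so the final combination below, along with the 3-dimensional vectors, is a hypothetical example.

```python
def additive(vectors):
    # Element-wise sum of all word vectors w_i.
    return [sum(column) for column in zip(*vectors)]


def circular_convolution(a, b):
    # c[k] = sum_j a[j] * b[(k - j) mod n]; the output has the same
    # dimensionality as the inputs.
    n = len(a)
    return [sum(a[j] * b[(k - j) % n] for j in range(n)) for k in range(n)]


# Hypothetical 3-d vectors for the first agent, the verb, and the second agent.
a0 = [0.1, 0.4, -0.2]
wv = [0.3, -0.1, 0.5]
a1 = [0.2, 0.2, 0.0]

summed = additive([a0, wv, a1])
# One assumed way to bind the three role vectors together.
bound = circular_convolution(circular_convolution(a0, wv), a1)

print(summed)
print(bound)
```

Unlike addition, circular convolution mixes components across positions, which is why it is often used to bind vectors together without increasing dimensionality.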

Page 24: Deep Learning for NLP Applications

What worked?

Word2Vec + TF-IDF + SRL + Circular Convolution/Additive

The first method, simple TF-IDF/Naive Bayes, performed extremely poorly because of its large dimensionality
Combining TF-IDF with Word2Vec provided a small but noticeable improvement
Adding SRL and a more sophisticated composition method increased performance by almost 5%

Page 25: Deep Learning for NLP Applications

What else could we try?

Can we apply this method to generate general purpose document vectors?

We are currently using LDA (a topic analysis method) or simple TF-IDF to create document vectors
How will this method compare to the already proposed paragraph vector method by Quoc Le?

Can we associate these document vectors with much smaller query strings?

e.g. Search for artificial intelligence against our companies and get better results than keyword search

Page 26: Deep Learning for NLP Applications

Who's doing ML at Mattermark?

mattermark

We need more people! Refer anyone you know who does Data Science/ML