Deep Learning Methods for Natural Language Processing
![Page 1: Language Processing Deep Learning Methods for Natural](https://reader030.vdocuments.mx/reader030/viewer/2022040613/624b98b1041a287767474a21/html5/thumbnails/1.jpg)
Deep Learning Methods for Natural Language Processing
Garrett Hoffman
Director of Data Science @ StockTwits
![Page 3: Language Processing Deep Learning Methods for Natural](https://reader030.vdocuments.mx/reader030/viewer/2022040613/624b98b1041a287767474a21/html5/thumbnails/3.jpg)
Learning Distributed Representations of Words with Word2Vec
![Page 4: Language Processing Deep Learning Methods for Natural](https://reader030.vdocuments.mx/reader030/viewer/2022040613/624b98b1041a287767474a21/html5/thumbnails/4.jpg)
Sparse Representation
![Page 5: Language Processing Deep Learning Methods for Natural](https://reader030.vdocuments.mx/reader030/viewer/2022040613/624b98b1041a287767474a21/html5/thumbnails/5.jpg)
Sparse Representation
![Page 6: Language Processing Deep Learning Methods for Natural](https://reader030.vdocuments.mx/reader030/viewer/2022040613/624b98b1041a287767474a21/html5/thumbnails/6.jpg)
Sparse Representation
![Page 7: Language Processing Deep Learning Methods for Natural](https://reader030.vdocuments.mx/reader030/viewer/2022040613/624b98b1041a287767474a21/html5/thumbnails/7.jpg)
Sparse Representation
![Page 8: Language Processing Deep Learning Methods for Natural](https://reader030.vdocuments.mx/reader030/viewer/2022040613/624b98b1041a287767474a21/html5/thumbnails/8.jpg)
Sparse Representation
![Page 9: Language Processing Deep Learning Methods for Natural](https://reader030.vdocuments.mx/reader030/viewer/2022040613/624b98b1041a287767474a21/html5/thumbnails/9.jpg)
Sparse Representation Drawbacks
![Page 10: Language Processing Deep Learning Methods for Natural](https://reader030.vdocuments.mx/reader030/viewer/2022040613/624b98b1041a287767474a21/html5/thumbnails/10.jpg)
Sparse Representation Drawbacks
![Page 11: Language Processing Deep Learning Methods for Natural](https://reader030.vdocuments.mx/reader030/viewer/2022040613/624b98b1041a287767474a21/html5/thumbnails/11.jpg)
Sparse Representation Drawbacks
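The slide body did not survive extraction, so the following is only a sketch under the assumption that "sparse representation" here means one-hot encoding. It illustrates one commonly cited drawback: every pair of distinct words has zero dot product, so the vectors carry no notion of similarity. The vocabulary and helper names are hypothetical.

```python
def one_hot(word, vocab):
    """Return a one-hot vector for `word` over the ordered list `vocab`."""
    return [1 if w == word else 0 for w in vocab]


def dot(u, v):
    """Plain dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


# Hypothetical toy vocabulary; real vocabularies have many thousands of entries,
# so each vector is almost entirely zeros.
vocab = ["hotel", "motel", "banana"]

hotel = one_hot("hotel", vocab)
motel = one_hot("motel", vocab)

# Related words look exactly as dissimilar as unrelated ones.
similarity = dot(hotel, motel)
```

Here `similarity` is 0 for any two different words, which is the motivation for the dense, distributed representations that follow.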
![Page 12: Language Processing Deep Learning Methods for Natural](https://reader030.vdocuments.mx/reader030/viewer/2022040613/624b98b1041a287767474a21/html5/thumbnails/12.jpg)
Distributed Representation
![Page 13: Language Processing Deep Learning Methods for Natural](https://reader030.vdocuments.mx/reader030/viewer/2022040613/624b98b1041a287767474a21/html5/thumbnails/13.jpg)
Distributed Representation
![Page 14: Language Processing Deep Learning Methods for Natural](https://reader030.vdocuments.mx/reader030/viewer/2022040613/624b98b1041a287767474a21/html5/thumbnails/14.jpg)
Distributed Representation
![Page 15: Language Processing Deep Learning Methods for Natural](https://reader030.vdocuments.mx/reader030/viewer/2022040613/624b98b1041a287767474a21/html5/thumbnails/15.jpg)
Distributed Representation
![Page 16: Language Processing Deep Learning Methods for Natural](https://reader030.vdocuments.mx/reader030/viewer/2022040613/624b98b1041a287767474a21/html5/thumbnails/16.jpg)
Word2Vec
“Distributed Representations of Words and Phrases and their Compositionality”, Mikolov et al. (2013)
![Page 17: Language Processing Deep Learning Methods for Natural](https://reader030.vdocuments.mx/reader030/viewer/2022040613/624b98b1041a287767474a21/html5/thumbnails/17.jpg)
![Page 18: Language Processing Deep Learning Methods for Natural](https://reader030.vdocuments.mx/reader030/viewer/2022040613/624b98b1041a287767474a21/html5/thumbnails/18.jpg)
Word2Vec - Generating Data
McCormick, C. (2016, April 19). Word2Vec Tutorial - The Skip-Gram Model.
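As a rough illustration of how skip-gram training pairs can be generated, the sketch below slides a context window over a tokenized sentence and emits (center word, context word) pairs. The window size of 2 and the example sentence are assumptions chosen for illustration, and `skipgram_pairs` is a hypothetical helper, not code from the talk.

```python
def skipgram_pairs(tokens, window=2):
    """Return (center, context) pairs for every token and its neighbours
    within `window` positions on either side."""
    pairs = []
    for i, center in enumerate(tokens):
        lo = max(0, i - window)
        hi = min(len(tokens), i + window + 1)
        for j in range(lo, hi):
            if j != i:  # a word is not its own context
                pairs.append((center, tokens[j]))
    return pairs


sentence = "the quick brown fox jumps over the lazy dog".split()
pairs = skipgram_pairs(sentence, window=2)
```

The first pairs produced are `("the", "quick")` and `("the", "brown")`; each pair becomes one (input, target) training example for the network.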
![Page 19: Language Processing Deep Learning Methods for Natural](https://reader030.vdocuments.mx/reader030/viewer/2022040613/624b98b1041a287767474a21/html5/thumbnails/19.jpg)
Word2Vec - Skip-gram Network Architecture
McCormick, C. (2016, April 19). Word2Vec Tutorial - The Skip-Gram Model.
![Page 20: Language Processing Deep Learning Methods for Natural](https://reader030.vdocuments.mx/reader030/viewer/2022040613/624b98b1041a287767474a21/html5/thumbnails/20.jpg)
Word2Vec - Skip-gram Network Architecture
McCormick, C. (2016, April 19). Word2Vec Tutorial - The Skip-Gram Model.
![Page 21: Language Processing Deep Learning Methods for Natural](https://reader030.vdocuments.mx/reader030/viewer/2022040613/624b98b1041a287767474a21/html5/thumbnails/21.jpg)
Word2Vec - Embedding Layer
McCormick, C. (2016, April 19). Word2Vec Tutorial - The Skip-Gram Model.
![Page 22: Language Processing Deep Learning Methods for Natural](https://reader030.vdocuments.mx/reader030/viewer/2022040613/624b98b1041a287767474a21/html5/thumbnails/22.jpg)
Word2Vec - Embedding Layer
McCormick, C. (2016, April 19). Word2Vec Tutorial - The Skip-Gram Model.
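One way to read the embedding layer is as a lookup table: multiplying a one-hot input by the hidden-layer weight matrix simply selects one row of that matrix. The sketch below checks this on a tiny example; the weight values and the 3-word, 2-dimension sizes are made up for illustration.

```python
def one_hot_times_matrix(one_hot_vec, weights):
    """Multiply a 1 x V one-hot row vector by a V x D weight matrix."""
    dims = len(weights[0])
    return [
        sum(one_hot_vec[i] * weights[i][d] for i in range(len(weights)))
        for d in range(dims)
    ]


# Hypothetical embedding matrix: 3 words in the vocabulary, 2 dimensions each.
weights = [
    [0.1, 0.2],
    [0.3, 0.4],
    [0.5, 0.6],
]

word_index = 1
one_hot_vec = [0, 1, 0]

via_matmul = one_hot_times_matrix(one_hot_vec, weights)
via_lookup = weights[word_index]  # same result, without the multiplication
```

Both `via_matmul` and `via_lookup` hold the same vector, which is why frameworks implement embedding layers as index lookups rather than matrix products.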
![Page 23: Language Processing Deep Learning Methods for Natural](https://reader030.vdocuments.mx/reader030/viewer/2022040613/624b98b1041a287767474a21/html5/thumbnails/23.jpg)
Word2Vec - Skip-gram Network Architecture
McCormick, C. (2016, April 19). Word2Vec Tutorial - The Skip-Gram Model.
![Page 24: Language Processing Deep Learning Methods for Natural](https://reader030.vdocuments.mx/reader030/viewer/2022040613/624b98b1041a287767474a21/html5/thumbnails/24.jpg)
Word2Vec - Output Layer
McCormick, C. (2016, April 19). Word2Vec Tutorial - The Skip-Gram Model.
![Page 25: Language Processing Deep Learning Methods for Natural](https://reader030.vdocuments.mx/reader030/viewer/2022040613/624b98b1041a287767474a21/html5/thumbnails/25.jpg)
Word2Vec - Intuition
McCormick, C. (2017, January 11). Word2Vec Tutorial Part 2 - Negative Sampling.
![Page 26: Language Processing Deep Learning Methods for Natural](https://reader030.vdocuments.mx/reader030/viewer/2022040613/624b98b1041a287767474a21/html5/thumbnails/26.jpg)
Word2Vec - Negative Sampling
McCormick, C. (2017, January 11). Word2Vec Tutorial Part 2 - Negative Sampling.
![Page 27: Language Processing Deep Learning Methods for Natural](https://reader030.vdocuments.mx/reader030/viewer/2022040613/624b98b1041a287767474a21/html5/thumbnails/27.jpg)
Word2Vec - Negative Sampling
McCormick, C. (2017, January 11). Word2Vec Tutorial Part 2 - Negative Sampling.
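A minimal sketch of the noise distribution used to pick negative samples, assuming the formulation in Mikolov et al. (2013), where each word's unigram count is raised to the 3/4 power before normalizing. The word counts below are invented for illustration and `negative_sampling_probs` is a hypothetical helper.

```python
def negative_sampling_probs(counts, power=0.75):
    """Return P(w) proportional to count(w) ** power for each word."""
    weights = {word: count ** power for word, count in counts.items()}
    total = sum(weights.values())
    return {word: weight / total for word, weight in weights.items()}


# Hypothetical corpus counts.
counts = {"the": 1000, "fox": 10, "quixotic": 1}
probs = negative_sampling_probs(counts)

# Raw unigram probabilities, for comparison.
total_count = sum(counts.values())
unigram = {word: count / total_count for word, count in counts.items()}
```

Compared with `unigram`, the values in `probs` give rare words a somewhat higher chance of being drawn as negatives and frequent words a somewhat lower one.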
![Page 28: Language Processing Deep Learning Methods for Natural](https://reader030.vdocuments.mx/reader030/viewer/2022040613/624b98b1041a287767474a21/html5/thumbnails/28.jpg)
https://www.tensorflow.org/tutorials/word2vec
Word2Vec - Results
![Page 30: Language Processing Deep Learning Methods for Natural](https://reader030.vdocuments.mx/reader030/viewer/2022040613/624b98b1041a287767474a21/html5/thumbnails/30.jpg)
Distributed Representations of Sentences and Documents
Doc2Vec
![Page 31: Language Processing Deep Learning Methods for Natural](https://reader030.vdocuments.mx/reader030/viewer/2022040613/624b98b1041a287767474a21/html5/thumbnails/31.jpg)
Recurrent Neural Networks and their Variants
![Page 32: Language Processing Deep Learning Methods for Natural](https://reader030.vdocuments.mx/reader030/viewer/2022040613/624b98b1041a287767474a21/html5/thumbnails/32.jpg)
Sequence Models
![Page 33: Language Processing Deep Learning Methods for Natural](https://reader030.vdocuments.mx/reader030/viewer/2022040613/624b98b1041a287767474a21/html5/thumbnails/33.jpg)
Recurrent Neural Networks (RNNs)
http://colah.github.io/posts/2015-08-Understanding-LSTMs/
![Page 34: Language Processing Deep Learning Methods for Natural](https://reader030.vdocuments.mx/reader030/viewer/2022040613/624b98b1041a287767474a21/html5/thumbnails/34.jpg)
Recurrent Neural Networks (RNNs)
http://colah.github.io/posts/2015-08-Understanding-LSTMs/
![Page 35: Language Processing Deep Learning Methods for Natural](https://reader030.vdocuments.mx/reader030/viewer/2022040613/624b98b1041a287767474a21/html5/thumbnails/35.jpg)
Recurrent Neural Networks (RNNs)
http://colah.github.io/posts/2015-08-Understanding-LSTMs/
![Page 36: Language Processing Deep Learning Methods for Natural](https://reader030.vdocuments.mx/reader030/viewer/2022040613/624b98b1041a287767474a21/html5/thumbnails/36.jpg)
Long Term Dependency Problem
http://colah.github.io/posts/2015-08-Understanding-LSTMs/
![Page 37: Language Processing Deep Learning Methods for Natural](https://reader030.vdocuments.mx/reader030/viewer/2022040613/624b98b1041a287767474a21/html5/thumbnails/37.jpg)
Long Short Term Memory (LSTMs)
http://colah.github.io/posts/2015-08-Understanding-LSTMs/
![Page 38: Language Processing Deep Learning Methods for Natural](https://reader030.vdocuments.mx/reader030/viewer/2022040613/624b98b1041a287767474a21/html5/thumbnails/38.jpg)
Long Short Term Memory (LSTMs)
http://colah.github.io/posts/2015-08-Understanding-LSTMs/
![Page 39: Language Processing Deep Learning Methods for Natural](https://reader030.vdocuments.mx/reader030/viewer/2022040613/624b98b1041a287767474a21/html5/thumbnails/39.jpg)
Long Short Term Memory (LSTMs)
http://colah.github.io/posts/2015-08-Understanding-LSTMs/
![Page 40: Language Processing Deep Learning Methods for Natural](https://reader030.vdocuments.mx/reader030/viewer/2022040613/624b98b1041a287767474a21/html5/thumbnails/40.jpg)
LSTM - Forget Gate
http://colah.github.io/posts/2015-08-Understanding-LSTMs/
![Page 41: Language Processing Deep Learning Methods for Natural](https://reader030.vdocuments.mx/reader030/viewer/2022040613/624b98b1041a287767474a21/html5/thumbnails/41.jpg)
LSTM - Learn Gate
http://colah.github.io/posts/2015-08-Understanding-LSTMs/
![Page 42: Language Processing Deep Learning Methods for Natural](https://reader030.vdocuments.mx/reader030/viewer/2022040613/624b98b1041a287767474a21/html5/thumbnails/42.jpg)
LSTM - Update Gate
http://colah.github.io/posts/2015-08-Understanding-LSTMs/
![Page 43: Language Processing Deep Learning Methods for Natural](https://reader030.vdocuments.mx/reader030/viewer/2022040613/624b98b1041a287767474a21/html5/thumbnails/43.jpg)
LSTM - Output Gate
http://colah.github.io/posts/2015-08-Understanding-LSTMs/
![Page 44: Language Processing Deep Learning Methods for Natural](https://reader030.vdocuments.mx/reader030/viewer/2022040613/624b98b1041a287767474a21/html5/thumbnails/44.jpg)
Gated Recurrent Unit (GRU)
http://colah.github.io/posts/2015-08-Understanding-LSTMs/
![Page 45: Language Processing Deep Learning Methods for Natural](https://reader030.vdocuments.mx/reader030/viewer/2022040613/624b98b1041a287767474a21/html5/thumbnails/45.jpg)
Types of RNNs
http://karpathy.github.io/2015/05/21/rnn-effectiveness/
![Page 46: Language Processing Deep Learning Methods for Natural](https://reader030.vdocuments.mx/reader030/viewer/2022040613/624b98b1041a287767474a21/html5/thumbnails/46.jpg)
Types of RNNs
http://karpathy.github.io/2015/05/21/rnn-effectiveness/
![Page 47: Language Processing Deep Learning Methods for Natural](https://reader030.vdocuments.mx/reader030/viewer/2022040613/624b98b1041a287767474a21/html5/thumbnails/47.jpg)
LSTM Network Architecture
![Page 48: Language Processing Deep Learning Methods for Natural](https://reader030.vdocuments.mx/reader030/viewer/2022040613/624b98b1041a287767474a21/html5/thumbnails/48.jpg)
Learning Embeddings End-to-End
![Page 49: Language Processing Deep Learning Methods for Natural](https://reader030.vdocuments.mx/reader030/viewer/2022040613/624b98b1041a287767474a21/html5/thumbnails/49.jpg)
Dropout
![Page 50: Language Processing Deep Learning Methods for Natural](https://reader030.vdocuments.mx/reader030/viewer/2022040613/624b98b1041a287767474a21/html5/thumbnails/50.jpg)
Bidirectional LSTM
http://colah.github.io/posts/2015-09-NN-Types-FP/
![Page 51: Language Processing Deep Learning Methods for Natural](https://reader030.vdocuments.mx/reader030/viewer/2022040613/624b98b1041a287767474a21/html5/thumbnails/51.jpg)
Convolutional Neural Networks for Language Tasks
![Page 52: Language Processing Deep Learning Methods for Natural](https://reader030.vdocuments.mx/reader030/viewer/2022040613/624b98b1041a287767474a21/html5/thumbnails/52.jpg)
Computer Vision Models
![Page 53: Language Processing Deep Learning Methods for Natural](https://reader030.vdocuments.mx/reader030/viewer/2022040613/624b98b1041a287767474a21/html5/thumbnails/53.jpg)
Convolutional Neural Networks (CNNs)
![Page 54: Language Processing Deep Learning Methods for Natural](https://reader030.vdocuments.mx/reader030/viewer/2022040613/624b98b1041a287767474a21/html5/thumbnails/54.jpg)
Convolutional Neural Networks (CNNs)
http://colah.github.io/posts/2014-07-Conv-Nets-Modular/
![Page 55: Language Processing Deep Learning Methods for Natural](https://reader030.vdocuments.mx/reader030/viewer/2022040613/624b98b1041a287767474a21/html5/thumbnails/55.jpg)
CNNs - Convolution Function
Input Vector:
0 0 0 0 0 0
0 1 2 1 1 2
0 1 1 1 1 1
1 0 0 0 0 0
0 0 1 1 1 0
0 1 1 1 1 1
Kernel / Filter:
0 0 2
1 2 0
1 2 2
![Page 56: Language Processing Deep Learning Methods for Natural](https://reader030.vdocuments.mx/reader030/viewer/2022040613/624b98b1041a287767474a21/html5/thumbnails/56.jpg)
CNNs - Convolution Function
Input Vector:
0 0 0 0 0 0
0 1 2 1 1 2
0 1 1 1 1 1
1 0 0 0 0 0
0 0 1 1 1 0
0 1 1 1 1 1
Kernel / Filter:
0 0 2
1 2 0
1 2 2
![Page 57: Language Processing Deep Learning Methods for Natural](https://reader030.vdocuments.mx/reader030/viewer/2022040613/624b98b1041a287767474a21/html5/thumbnails/57.jpg)
CNNs - Convolution Function
Input Vector:
0 0 0 0 0 0
0 1 2 1 1 2
0 1 1 1 1 1
1 0 0 0 0 0
0 0 1 1 1 0
0 1 1 1 1 1
Kernel / Filter:
0 0 0
1 0 0
0 2 0
Output Vector:
2
![Page 58: Language Processing Deep Learning Methods for Natural](https://reader030.vdocuments.mx/reader030/viewer/2022040613/624b98b1041a287767474a21/html5/thumbnails/58.jpg)
CNNs - Convolution Function
Input Vector:
0 0 0 0 0 0
0 1 2 1 1 2
0 1 1 1 1 1
1 0 0 0 0 0
0 0 1 1 1 0
0 1 1 1 1 1
Kernel / Filter:
0 0 0
1 0 0
0 2 0
Output Vector:
2 3
![Page 59: Language Processing Deep Learning Methods for Natural](https://reader030.vdocuments.mx/reader030/viewer/2022040613/624b98b1041a287767474a21/html5/thumbnails/59.jpg)
CNNs - Convolution Function
Input Vector:
0 0 0 0 0 0
0 1 2 1 1 2
0 1 1 1 1 1
1 0 0 0 0 0
0 0 1 1 1 0
0 1 1 1 1 1
Kernel / Filter:
0 0 0
1 0 0
0 2 0
Output Vector:
2 3 4
![Page 60: Language Processing Deep Learning Methods for Natural](https://reader030.vdocuments.mx/reader030/viewer/2022040613/624b98b1041a287767474a21/html5/thumbnails/60.jpg)
CNNs - Convolution Function
Input Vector:
0 0 0 0 0 0
0 1 2 1 1 2
0 1 1 1 1 1
1 0 0 0 0 0
0 0 1 1 1 0
0 1 1 1 1 1
Kernel / Filter:
0 0 0
1 0 0
0 2 0
Output Vector:
2 3 4 3
![Page 61: Language Processing Deep Learning Methods for Natural](https://reader030.vdocuments.mx/reader030/viewer/2022040613/624b98b1041a287767474a21/html5/thumbnails/61.jpg)
CNNs - Convolution Function
Input Vector:
0 0 0 0 0 0
0 1 2 1 1 2
0 1 1 1 1 1
1 0 0 0 0 0
0 0 1 1 1 0
0 1 1 1 1 1
Kernel / Filter:
0 0 0
1 0 0
0 2 0
Output Vector:
2 3 4 3
0
![Page 62: Language Processing Deep Learning Methods for Natural](https://reader030.vdocuments.mx/reader030/viewer/2022040613/624b98b1041a287767474a21/html5/thumbnails/62.jpg)
CNNs - Convolution Function
Input Vector:
0 0 0 0 0 0
0 1 2 1 1 2
0 1 1 1 1 1
1 0 0 0 0 0
0 0 1 1 1 0
0 1 1 1 1 1
Kernel / Filter:
0 0 0
1 0 0
0 2 0
Output Vector:
2 3 4 3
0 1
![Page 63: Language Processing Deep Learning Methods for Natural](https://reader030.vdocuments.mx/reader030/viewer/2022040613/624b98b1041a287767474a21/html5/thumbnails/63.jpg)
CNNs - Convolution Function
Input Vector:
0 0 0 0 0 0
0 1 2 1 1 2
0 1 1 1 1 1
1 0 0 0 0 0
0 0 1 1 1 0
0 1 1 1 1 1
Kernel / Filter:
0 0 0
1 0 0
0 2 0
Output Vector:
2 3 4 3
0 1 1 1
1 2 2 2
2 2 3 3
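The completed output on this slide can be reproduced with a short sketch. It assumes stride 1, no padding, and no kernel flip (that is, cross-correlation, which is what deep learning libraries usually call convolution); those settings are inferred from the 6x6 input, 3x3 kernel, and 4x4 output rather than stated on the slide. `conv2d_valid` is a hypothetical helper name.

```python
def conv2d_valid(image, kernel):
    """Slide a square kernel over a square image (stride 1, no padding)
    and return the sum of element-wise products at each position."""
    k = len(kernel)
    out_size = len(image) - k + 1
    output = []
    for r in range(out_size):
        row = []
        for c in range(out_size):
            row.append(
                sum(
                    image[r + i][c + j] * kernel[i][j]
                    for i in range(k)
                    for j in range(k)
                )
            )
        output.append(row)
    return output


image = [
    [0, 0, 0, 0, 0, 0],
    [0, 1, 2, 1, 1, 2],
    [0, 1, 1, 1, 1, 1],
    [1, 0, 0, 0, 0, 0],
    [0, 0, 1, 1, 1, 0],
    [0, 1, 1, 1, 1, 1],
]
kernel = [
    [0, 0, 0],
    [1, 0, 0],
    [0, 2, 0],
]

feature_map = conv2d_valid(image, kernel)
# feature_map == [[2, 3, 4, 3], [0, 1, 1, 1], [1, 2, 2, 2], [2, 2, 3, 3]]
```

For example, the top-left output is 1 * 0 + 2 * 1 = 2: the kernel's only nonzero weights pick out the left cell of the window's middle row and the center cell of its bottom row.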
![Page 64: Language Processing Deep Learning Methods for Natural](https://reader030.vdocuments.mx/reader030/viewer/2022040613/624b98b1041a287767474a21/html5/thumbnails/64.jpg)
CNNs - Max Pooling Function
Input Vector:
2 3 4 3
0 1 1 1
1 2 2 2
2 2 3 3
Output Vector:
3
![Page 65: Language Processing Deep Learning Methods for Natural](https://reader030.vdocuments.mx/reader030/viewer/2022040613/624b98b1041a287767474a21/html5/thumbnails/65.jpg)
CNNs - Max Pooling Function
Input Vector:
2 3 4 3
0 1 1 1
1 2 2 2
2 2 3 3
Output Vector:
3 4
![Page 66: Language Processing Deep Learning Methods for Natural](https://reader030.vdocuments.mx/reader030/viewer/2022040613/624b98b1041a287767474a21/html5/thumbnails/66.jpg)
CNNs - Max Pooling Function
Input Vector:
2 3 4 3
0 1 1 1
1 2 2 2
2 2 3 3
Output Vector:
3 4
2
![Page 67: Language Processing Deep Learning Methods for Natural](https://reader030.vdocuments.mx/reader030/viewer/2022040613/624b98b1041a287767474a21/html5/thumbnails/67.jpg)
CNNs - Max Pooling Function
Input Vector:
2 3 4 3
0 1 1 1
1 2 2 2
2 2 3 3
Output Vector:
3 4
2 3
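The pooled values on this slide are consistent with a 2x2 window moved with stride 2, which is an inference from the 4x4 input and 2x2 output rather than something stated on the slide. Under that assumption, a minimal sketch (with `max_pool2d` as a hypothetical helper name) looks like this:

```python
def max_pool2d(matrix, size=2, stride=2):
    """Take the maximum of each size x size window, moving by `stride`."""
    output = []
    for r in range(0, len(matrix) - size + 1, stride):
        row = []
        for c in range(0, len(matrix[0]) - size + 1, stride):
            row.append(
                max(
                    matrix[r + i][c + j]
                    for i in range(size)
                    for j in range(size)
                )
            )
        output.append(row)
    return output


feature_map = [
    [2, 3, 4, 3],
    [0, 1, 1, 1],
    [1, 2, 2, 2],
    [2, 2, 3, 3],
]

pooled = max_pool2d(feature_map)
# pooled == [[3, 4], [2, 3]]
```

Each output cell keeps only the strongest activation in its window, which halves each spatial dimension here.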
![Page 68: Language Processing Deep Learning Methods for Natural](https://reader030.vdocuments.mx/reader030/viewer/2022040613/624b98b1041a287767474a21/html5/thumbnails/68.jpg)
Convolutional Neural Networks (CNNs)
![Page 69: Language Processing Deep Learning Methods for Natural](https://reader030.vdocuments.mx/reader030/viewer/2022040613/624b98b1041a287767474a21/html5/thumbnails/69.jpg)
CNN Architecture for Text
![Page 70: Language Processing Deep Learning Methods for Natural](https://reader030.vdocuments.mx/reader030/viewer/2022040613/624b98b1041a287767474a21/html5/thumbnails/70.jpg)
State of the Art in NLP - Generalized Language Models
![Page 71: Language Processing Deep Learning Methods for Natural](https://reader030.vdocuments.mx/reader030/viewer/2022040613/624b98b1041a287767474a21/html5/thumbnails/71.jpg)
Generalized Language Modeling
![Page 72: Language Processing Deep Learning Methods for Natural](https://reader030.vdocuments.mx/reader030/viewer/2022040613/624b98b1041a287767474a21/html5/thumbnails/72.jpg)
Types of RNNs
http://karpathy.github.io/2015/05/21/rnn-effectiveness/
![Page 73: Language Processing Deep Learning Methods for Natural](https://reader030.vdocuments.mx/reader030/viewer/2022040613/624b98b1041a287767474a21/html5/thumbnails/73.jpg)
$P(w_n \mid w_1, \ldots, w_{n-1})$
Generalized Language Modeling
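For context, the next-word conditional on this slide is the building block of a full sequence probability. By the chain rule of probability (a standard identity, not something taken from the slide itself), the probability of a whole sequence factorizes into these conditionals:

```latex
P(w_1, \ldots, w_n) = \prod_{i=1}^{n} P(w_i \mid w_1, \ldots, w_{i-1})
```

A language model trained to predict each next word therefore implicitly assigns a probability to every sequence.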
![Page 74: Language Processing Deep Learning Methods for Natural](https://reader030.vdocuments.mx/reader030/viewer/2022040613/624b98b1041a287767474a21/html5/thumbnails/74.jpg)
Current SOTA
![Page 75: Language Processing Deep Learning Methods for Natural](https://reader030.vdocuments.mx/reader030/viewer/2022040613/624b98b1041a287767474a21/html5/thumbnails/75.jpg)
ULMFiT
http://nlp.fast.ai/classification/2018/05/15/introducting-ulmfit.html
![Page 76: Language Processing Deep Learning Methods for Natural](https://reader030.vdocuments.mx/reader030/viewer/2022040613/624b98b1041a287767474a21/html5/thumbnails/76.jpg)
ULMFiT
http://nlp.fast.ai/classification/2018/05/15/introducting-ulmfit.html
![Page 77: Language Processing Deep Learning Methods for Natural](https://reader030.vdocuments.mx/reader030/viewer/2022040613/624b98b1041a287767474a21/html5/thumbnails/77.jpg)
ULMFiT - GLM Pre Training
AWD-LSTM
![Page 78: Language Processing Deep Learning Methods for Natural](https://reader030.vdocuments.mx/reader030/viewer/2022040613/624b98b1041a287767474a21/html5/thumbnails/78.jpg)
ULMFiT
http://nlp.fast.ai/classification/2018/05/15/introducting-ulmfit.html
![Page 79: Language Processing Deep Learning Methods for Natural](https://reader030.vdocuments.mx/reader030/viewer/2022040613/624b98b1041a287767474a21/html5/thumbnails/79.jpg)
ULMFiT - Refine GLM for Target Task
Discriminative Fine-Tuning
Slanted Triangular Learning Rates (STLR)
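A sketch of the slanted triangular schedule, following the formula in the ULMFiT paper (Howard and Ruder, 2018) as I understand it: the rate rises linearly for a short fraction of training, then decays linearly for the remainder. The default values for `lr_max`, `cut_frac`, and `ratio` are the ones I recall the paper suggesting and should be treated as assumptions; the function name is hypothetical.

```python
import math


def slanted_triangular_lr(t, total_steps, lr_max=0.01, cut_frac=0.1, ratio=32):
    """Learning rate at step `t` (0-based) of `total_steps`.

    cut_frac: fraction of steps spent increasing the rate.
    ratio:    how much smaller the lowest rate is than lr_max.
    Assumes total_steps * cut_frac >= 1 so the warm-up is non-empty.
    """
    cut = math.floor(total_steps * cut_frac)
    if t < cut:
        p = t / cut  # linear warm-up
    else:
        p = 1 - (t - cut) / (cut * (1 / cut_frac - 1))  # linear decay
    return lr_max * (1 + p * (ratio - 1)) / ratio


total_steps = 1000
schedule = [slanted_triangular_lr(t, total_steps) for t in range(total_steps)]
peak_step = schedule.index(max(schedule))
```

With these settings the rate starts at `lr_max / ratio`, peaks at `lr_max` after 10% of the steps (`peak_step` is 100), and then decays back toward `lr_max / ratio`.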
![Page 80: Language Processing Deep Learning Methods for Natural](https://reader030.vdocuments.mx/reader030/viewer/2022040613/624b98b1041a287767474a21/html5/thumbnails/80.jpg)
ULMFiT
http://nlp.fast.ai/classification/2018/05/15/introducting-ulmfit.html
![Page 81: Language Processing Deep Learning Methods for Natural](https://reader030.vdocuments.mx/reader030/viewer/2022040613/624b98b1041a287767474a21/html5/thumbnails/81.jpg)
ULMFiT - Target Task Classification Training
Concat Pooling
Gradual Unfreeze
![Page 82: Language Processing Deep Learning Methods for Natural](https://reader030.vdocuments.mx/reader030/viewer/2022040613/624b98b1041a287767474a21/html5/thumbnails/82.jpg)
BERT / GPT-2 - Transformer Model
Transformer Model
![Page 83: Language Processing Deep Learning Methods for Natural](https://reader030.vdocuments.mx/reader030/viewer/2022040613/624b98b1041a287767474a21/html5/thumbnails/83.jpg)
Attention Mechanism
http://www.abigailsee.com/2017/04/16/taming-rnns-for-better-summarization.html
![Page 86: Language Processing Deep Learning Methods for Natural](https://reader030.vdocuments.mx/reader030/viewer/2022040613/624b98b1041a287767474a21/html5/thumbnails/86.jpg)
Transformer Model
http://mlexplained.com/2017/12/29/attention-is-all-you-need-explained/
![Page 87: Language Processing Deep Learning Methods for Natural](https://reader030.vdocuments.mx/reader030/viewer/2022040613/624b98b1041a287767474a21/html5/thumbnails/87.jpg)
Practical Considerations for Modeling with Your Data
![Page 88: Language Processing Deep Learning Methods for Natural](https://reader030.vdocuments.mx/reader030/viewer/2022040613/624b98b1041a287767474a21/html5/thumbnails/88.jpg)
Practical Considerations
![Page 89: Language Processing Deep Learning Methods for Natural](https://reader030.vdocuments.mx/reader030/viewer/2022040613/624b98b1041a287767474a21/html5/thumbnails/89.jpg)
Practical Considerations
![Page 90: Language Processing Deep Learning Methods for Natural](https://reader030.vdocuments.mx/reader030/viewer/2022040613/624b98b1041a287767474a21/html5/thumbnails/90.jpg)
Practical Considerations