Here is my short list of materials related to Deep Learning and NLP that I found useful while exploring these topics. The list is by no means exhaustive, but maybe you will find something useful;)
Videos:
General introduction:
- http://www.fast.ai/ - an example-driven introduction to Deep Learning. There are a lot of Jupyter notebooks supplementing the course: https://github.com/fastai/courses/tree/master/deeplearning1/nbs.
- https://www.coursera.org/specializations/deep-learning - the Deep Learning specialization from Andrew Ng. It's a bit more technical and focuses more on theory than fastai. I completed the Coursera course first and then fastai, but I think it might be easier to do it in the opposite direction: fastai will give you a broad overview, and Coursera's specialization will deepen your understanding of the important details.
NLP:
- https://www.youtube.com/watch?v=IM99o2B_0Nw - a tutorial on NLP using the h2o platform. It's worth skimming the slides (available here: https://www.slideshare.net/0xdata/nlp-with-h2o) to see what it's all about. There are a few lovely pictures, especially the one about word2vec, cappuccino, and expresso;)
Blog posts:
Miscellaneous:
- https://www.kdnuggets.com/2017/12/improve-machine-learning-performance-lessons-andrew-ng.html - a summary of one of Andrew Ng's courses. I think that the concepts of a "Single Number Evaluation Metric" and a "Satisficing and Optimizing Metric" are very, very important and should be a central topic at the beginning of every ML project (not only Deep Learning ones); a minimal metric sketch follows this list.
- https://docs.google.com/presentation/d/1kSuQyW5DTnkVaZEjGYCkfOxvzCqGEFzWBy4e9Uedd9k/preview?imm_mid=0f9b7e&cmp=em-data-na-na-newsltr_20171213#slide=id.g168a3288f7_0_58 - a presentation from Google about machine learning. It might be a good source of inspiration if you need to prepare a presentation about where ML can be applied.
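Since the single number evaluation metric comes up so often, here is a minimal sketch of the classic example, F1, which collapses precision and recall into one number you can rank models by (scikit-learn assumed; the labels are made up):

```python
# A single-number evaluation metric: F1, the harmonic mean of
# precision and recall - one value to rank model variants by.
from sklearn.metrics import f1_score, precision_score, recall_score

y_true = [0, 1, 1, 0, 1, 1]  # toy ground-truth labels
y_pred = [0, 1, 0, 0, 1, 1]  # toy model predictions

p = precision_score(y_true, y_pred)
r = recall_score(y_true, y_pred)
print(p, r)                      # two numbers - hard to rank models by
print(2 * p * r / (p + r))       # F1 - one number, easy to compare
print(f1_score(y_true, y_pred))  # the same value, straight from sklearn
```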
NLP:
- http://mccormickml.com/2016/04/19/word2vec-tutorial-the-skip-gram-model/, http://mccormickml.com/2017/01/11/word2vec-tutorial-part-2-negative-sampling/, http://colah.github.io/posts/2014-07-NLP-RNNs-Representations/, https://towardsdatascience.com/word-vectors-for-non-nlp-data-and-research-people-8d689c692353 - several articles about word embeddings and the intuitions behind them. I recommend reading all of them;) A minimal training example is sketched below.
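To make the tutorials above more concrete, here is a hedged sketch of training skip-gram embeddings with negative sampling using gensim (assuming gensim >= 4 for the parameter names; the toy corpus is made up):

```python
# Training word2vec (skip-gram + negative sampling) on a toy corpus
# with gensim - just enough to see the API, not a real experiment.
from gensim.models import Word2Vec

sentences = [
    ["i", "like", "cappuccino"],
    ["i", "like", "espresso"],
    ["coffee", "with", "milk"],
]

model = Word2Vec(
    sentences,
    vector_size=50,  # dimensionality of the embeddings
    sg=1,            # 1 = skip-gram, as in the first tutorial above
    negative=5,      # negative sampling, as in part 2 of the tutorial
    window=2,        # context window size
    min_count=1,     # keep every word in this tiny corpus
)

print(model.wv["cappuccino"][:5])           # the learned vector
print(model.wv.most_similar("cappuccino"))  # nearest words in the space
```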
Books:
- https://www.manning.com/books/deep-learning-with-r, https://www.manning.com/books/deep-learning-with-python - I list those books together, because their content is very, very similar, only one user R, and the second uses Python. I decided that I’ll first read R version, and then go to the Python version, because currently, I think in R, so it’s easier for me to grasp new ideas when they’re presented in R. But then, by comparing the examples I learn some Python. But I do not recommend this approach to others. I think starting with Python is a better solution (and you need to buy only one book;)).
- https://www.manning.com/books/grokking-deep-learning - I purchased this book as a MEAP (Manning Early Access Program), and I’m still waiting to the completion. It’s a bit different book than the Deep Learning with * because it focuses on the building everything from scratch (backpropagation, gradient descent, etc.). I only skimmed the content, but I think it might be a good read if you are interested in how everything works behind the scene. I hope that I will write more when the book will be completed.
- http://www.deeplearningbook.org/ - I have not read this book yet, but it seems to be the most mature and advanced position in this short list, and worth read. And it’s also free:) For more opinions about this book goes here: https://www.goodreads.com/book/show/24072897-deep-learning.
Articles:
NLP:
- https://arxiv.org/abs/1607.04606 - "Enriching Word Vectors with Subword Information" - this article describes the main idea behind Facebook's fasttext library: the use of subword information to build word embeddings (a tiny illustration of the n-gram idea follows this list).
- https://research.fb.com/publications/tagspace-semantic-embeddings-from-hashtags/ - predicting hashtags for short texts using a CNN.
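The subword idea from the first paper is simple enough to sketch in a few lines: each word is padded with boundary symbols and split into character n-grams, and the word's vector is built from the vectors of those pieces, so even unseen words get a representation. A toy illustration (the function name is mine):

```python
# Character n-grams as used in "Enriching Word Vectors with Subword
# Information": pad the word with boundary symbols and slice it into
# all n-grams of length 3 to 6. The word's embedding is then built
# from the embeddings of these pieces.
def char_ngrams(word, n_min=3, n_max=6):
    padded = "<" + word + ">"  # '<' and '>' mark the word boundaries
    return [
        padded[i:i + n]
        for n in range(n_min, n_max + 1)
        for i in range(len(padded) - n + 1)
    ]

print(char_ngrams("where"))
# ['<wh', 'whe', 'her', 'ere', 're>', '<whe', 'wher', ..., '<where', 'where>']
# (the paper additionally keeps the whole word '<where>' as its own sequence)
```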
Libraries:
- http://www.fast.ai - yup. It’s a deep learning library accompanying the fastai course.
- https://keras.io/ - a high-level deep learning library. It has R bindings (https://keras.rstudio.com/). A minimal Keras example is sketched after this list.
- https://www.tensorflow.org/ - a lower-level library created by Google (and not only a deep learning framework: check out http://edwardlib.org/, which is built on top of TensorFlow). It's better to start with Keras (or fastai).
- https://pytorch.org/ - a library that is a bit lower-level than Keras and fastai. It's the engine behind fastai, so it might be easier to start with fastai and then move to PyTorch.
- https://github.com/facebookresearch/StarSpace - a library from Facebook for neural nets. It supports only a particular set of predefined tasks, like building word embeddings, recommendations, or text classification. Nevertheless, it's worth checking out, because it might work pretty well without a lot of fine-tuning.
- https://fasttext.cc/ - another NLP tool from Facebook. From my perspective, the biggest advantage of fasttext is its ability to handle out-of-vocabulary words by using subword information. This can greatly simplify the data preparation step, because you can feed the data straight into the model without stemming or any other form of preprocessing. It might be an excellent tool for quick and dirty solutions; see the fasttext sketch after this list.
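To show what "high-level" means for Keras in practice, here is a minimal sketch (the input shape is a toy assumption, not tied to any of the courses above): a small classifier is defined, compiled, and ready to train in a handful of lines.

```python
# A tiny binary classifier in Keras - the whole model definition
# and training configuration fit in a few lines.
from tensorflow import keras

model = keras.Sequential([
    keras.Input(shape=(100,)),                    # toy feature vector
    keras.layers.Dense(64, activation="relu"),    # one hidden layer
    keras.layers.Dense(1, activation="sigmoid"),  # binary output
])
model.compile(optimizer="adam",
              loss="binary_crossentropy",
              metrics=["accuracy"])
model.summary()  # then: model.fit(x_train, y_train, epochs=...)
```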
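And a minimal sketch of the out-of-vocabulary behaviour mentioned for fasttext, assuming the official Python bindings and a plain-text training file corpus.txt (a hypothetical path):

```python
# fasttext builds word vectors from subword n-grams, so it can return
# a vector even for a word (or misspelling) it never saw in training.
import fasttext

model = fasttext.train_unsupervised("corpus.txt", model="skipgram")

print(model.get_word_vector("espresso")[:5])  # in-vocabulary word
print(model.get_word_vector("expresso")[:5])  # misspelling: still gets
                                              # a vector from its subwords
```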