I've had a lot of people ask me how to get started with machine learning and/or deep learning. This is a list of some of the resources that I have either found useful myself or heard people who I trust rave about.
Classes
- Coursera's Machine Learning Class - This Coursera class, taught by Andrew Ng, seems to have become the canonical introductory machine learning class everyone uses. Most data scientists I know have at some point used this class to study for interviews.
Reading
- Pattern Recognition and Machine Learning by Chris Bishop - Bishop's book is a common introductory machine learning textbook. While I know some people who have learned machine learning simply by reading this text, I think it can be a bit dense as a first introduction to machine learning. It is a wonderful reference once you have a better idea of how things fit together.
- Machine Learning: A Probabilistic Perspective by Kevin Murphy - Murphy's book is another common introductory machine learning textbook. This is also a wonderful reference but is a bit hard to read cover to cover.
Programming
- scikit-learn - scikit-learn is a Python library for machine learning. Most of its tools are written to be usable by people with minimal machine learning background. With this goal in mind, the documentation often includes helpful explanations of the heuristics and intuition behind the machine learning that the library implements.
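To give a feel for how little code a first model takes, here is a minimal sketch of the library's usual fit/predict workflow (the package is imported as `sklearn`). The choice of the bundled iris dataset, logistic regression, and the 75/25 split are my assumptions for illustration, not recommendations from any of the resources above.

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

# Load a small built-in dataset (150 flowers, 4 features, 3 classes).
X, y = load_iris(return_X_y=True)

# Hold out a quarter of the data so the model is scored on unseen examples.
# The split size and random_state are arbitrary choices for this sketch.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y
)

# Most scikit-learn estimators share the same interface: construct, fit, predict.
model = LogisticRegression(max_iter=1000)
model.fit(X_train, y_train)

accuracy = accuracy_score(y_test, model.predict(X_test))
print(f"Test accuracy: {accuracy:.2f}")
```

Swapping `LogisticRegression` for another estimator usually leaves the rest of the script unchanged, which is a large part of why the library is approachable for beginners.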
Classes
- If you want an intuition for what deep learning is and how it works, 3Blue1Brown has a series of YouTube videos that explain this really well!
- If you want to build something that uses deep learning, Fast.ai is an online course that will get you using deep learning for practical projects within just a few lessons.
Reading
- Neural Networks & Deep Learning - This online book explains a lot of hard concepts relatively intuitively.
- Deep Learning Book - This book seems to have become one of the canonical books on deep learning. It starts with background knowledge and continues on through modern deep learning research. While some of the final section (on modern research) can get a bit confusing, I thought that section one did a marvelous job reviewing the probability, linear algebra, and other background knowledge that is most useful to get going in deep learning, and I thought section two was a really nice overview of some standard deep learning approaches.
Programming
- Keras - Keras is a high-level neural networks API, written in Python. You can think of it as a wrapper around lower-level tools such as TensorFlow and Theano. If you want to get something that uses deep learning up and running quickly, Keras is a great library to use. However, if you need to do a lot of customization to your architecture, there is a good chance that you will end up needing to use some of the lower-level tools (e.g., TensorFlow) too.
- TensorFlow - TensorFlow is a deep learning library developed by Google Brain. It takes some practice to get used to thinking about models in the TensorFlow way, but it is very robust and works well in large and distributed systems.
- PyTorch - PyTorch is an alternative to TensorFlow. While there is some debate about which is better, the general consensus is that PyTorch is often easier to use for smaller projects, research projects, and other projects that do not need to be exceptionally robust and/or distributed.
Visualizations
- TensorFlow Playground - This is a beautiful interactive visualization that came out of Google Brain and the People + AI Research (PAIR) initiative. If you want to get a better intuitive sense of what the parameters in your neural network are doing, this is the perfect place to start.
Reading
- Reinforcement Learning by Sutton & Barto - This is the canonical book on reinforcement learning, and it has been for quite some time. Consequently, this will get you through the basic ideas of reinforcement learning, but to learn about the most modern advances, you'll probably need to find another resource. During my Master's, I learned reinforcement learning by reading this book and implementing each algorithm discussed in Python, and for me, that was a good balance between theory and practice.
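For a sense of what "implementing each algorithm in Python" can look like, here is a minimal sketch of tabular Q-learning (one of the algorithms the book covers) on a tiny text-based maze. The maze layout, reward values, hyperparameters, and all function names are hypothetical choices made for this example; it is not the code I wrote back then.

```python
import random

# Hypothetical 4x4 text maze: S = start, G = goal, # = wall, . = open cell.
MAZE = [
    "S..#",
    ".#..",
    "...#",
    "#..G",
]
ACTIONS = {"up": (-1, 0), "down": (1, 0), "left": (0, -1), "right": (0, 1)}


def find(symbol):
    """Return the (row, col) position of a symbol in the maze."""
    for r, row in enumerate(MAZE):
        c = row.find(symbol)
        if c != -1:
            return (r, c)
    raise ValueError(f"{symbol!r} not found in maze")


START, GOAL = find("S"), find("G")


def step(state, action):
    """Apply an action; walking into a wall or off the grid leaves the agent in place."""
    dr, dc = ACTIONS[action]
    r, c = state[0] + dr, state[1] + dc
    if not (0 <= r < len(MAZE) and 0 <= c < len(MAZE[0])) or MAZE[r][c] == "#":
        r, c = state
    next_state = (r, c)
    done = next_state == GOAL
    # Assumed rewards: +1 at the goal, a small cost per step to favor short paths.
    reward = 1.0 if done else -0.01
    return next_state, reward, done


def greedy_action(q, state):
    """Pick the action with the highest estimated value (unseen pairs count as 0)."""
    return max(ACTIONS, key=lambda a: q.get((state, a), 0.0))


def train(episodes=500, alpha=0.5, gamma=0.95, epsilon=0.1, max_steps=200, seed=0):
    """Learn a table of action values with epsilon-greedy Q-learning."""
    rng = random.Random(seed)
    q = {}
    for _ in range(episodes):
        state = START
        for _ in range(max_steps):
            # Explore with probability epsilon, otherwise act greedily.
            if rng.random() < epsilon:
                action = rng.choice(list(ACTIONS))
            else:
                action = greedy_action(q, state)
            next_state, reward, done = step(state, action)
            # Q-learning target: reward plus discounted value of the best next action.
            best_next = 0.0 if done else max(
                q.get((next_state, a), 0.0) for a in ACTIONS
            )
            old = q.get((state, action), 0.0)
            q[(state, action)] = old + alpha * (reward + gamma * best_next - old)
            state = next_state
            if done:
                break
    return q


def greedy_path(q, max_steps=50):
    """Follow the learned greedy policy from the start and record visited cells."""
    state, path = START, [START]
    for _ in range(max_steps):
        state, _, done = step(state, greedy_action(q, state))
        path.append(state)
        if done:
            break
    return path
```

Calling `q = train()` and then `print(greedy_path(q))` shows the route the trained agent takes through the maze. Writing the environment and the update rule by hand like this is slower than using a toolkit, but it makes every term in the update visible.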
Tools
- OpenAI Gym - The OpenAI Gym is a toolkit for developing and comparing reinforcement learning algorithms. It did not exist yet when I learned reinforcement learning, but it has some great visualizations that make the process of training an agent feel more fun and rewarding than the basic text-based maze navigator that I learned with.
- Distill.pub - This online publication aims to publish clear, understandable explanations of machine learning concepts. Currently, their compilation is by no means exhaustive, but if you happen to find an explanation of your topic here, there is a good chance that you will understand it better after reading it.