A bit over a year ago I got an offer to write a Master's thesis about Deep Learning, but the problem was that I had almost no prior knowledge about it.
So it was not that long ago that I myself was scrambling for resources to learn more about Neural Networks and how to actually work with them. Here are my favorite resources, which helped me immensely in getting to a point where I can now read research papers in the area and design and code my own numerical experiments on Stochastic Gradient Descent using Pytorch.
All of these resources are free, so you do not need access to university or a lot of money to learn these things.
Either scroll down to read about the resources or watch this video I made about the topic:
Youtube Series by the University of Michigan: Deep Learning for Computer Vision, taught by Justin Johnson
This Youtube playlist is a complete semester course on Computer Vision held at the University of Michigan – they provided the full lectures on Youtube for the public, which is simply amazing.
The content of the lectures is very similar to the cs231n Stanford course on Computer Vision, which you can also find on Youtube and which was also partly taught by Justin Johnson back when he was at Stanford. But the Michigan course is more recent – the videos are from the Fall 2019 semester – so I would recommend that one.
Because these are university lectures, they go quite deep into the theory and background, which I appreciated for my thesis work at university. But if you don't have a strong Computer Science background, the lectures might be a bit daunting, and you will not find any coding tutorials in this playlist. Also, since they focus on Computer Vision, they don't cover Natural Language Processing or Time Series tasks in much detail, though you do find a chapter on Recurrent Neural Networks in there as well, which can be used for caption generation on images.
However, I think these lectures are a great intro to many topics, which you can pick and choose from: Backpropagation, Convolutional Networks and various CNN architectures, how to train Neural Networks, and more advanced or specialized topics like Attention and visualization of learning.
Deep Learning Book by Ian Goodfellow, Yoshua Bengio and Aaron Courville
This book is available for free online in HTML format, provided by the authors, and you can find it at this link: https://www.deeplearningbook.org/
“The Deep Learning textbook is a resource intended to help students and practitioners enter the field of machine learning in general and deep learning in particular. The online version of the book is now complete and will remain available online for free.” – from the website itself
This book will serve you for quite a while and give you deep knowledge of all the important concepts. If you ever stumble upon some theoretical concept that you don't know or understand, use this book to look it up. Also consider reading through some of the first chapters up front if you're missing background knowledge in math or machine learning.
The book starts with some mathematical basics, like Linear Algebra, Singular Value Decomposition and Probability Theory, and it also covers Machine Learning basics like Supervised vs Unsupervised Learning. It then dives into Deep Networks, going over how training and regularization work, and covers Convolutional as well as Recurrent and Recursive Nets. The third part is about current research directions – and “current” here means 2016, but many of these areas are still under active research and the information about them is still extremely interesting. You should just be aware that by now there might be more up-to-date information available online that you can read up on.
The authors are all very accomplished researchers in the Deep Learning field, so the fact that they created this book and provide it online for free is amazing. You can buy a print version as well – which I did because I like books and flipping through them, and reading on a computer for long stretches is exhausting to me.
A very useful feature of this book is that it refers to all the important papers that influenced Deep Learning, so you have many further-reading opportunities, and if you are looking for references or topic ideas for your thesis or another university assignment, this can come in extremely handy.
Implementing different parts from scratch
This is the first resource or tip that is more focused on the practical coding side of Deep Learning. If you want to understand how Deep Learning works and not just reuse the most common code snippets, I recommend getting your hands dirty with some numpy code in Python and implementing all the details of, for example, a forward and backward pass of some simple network structures.
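To give you an idea of the scale of such an exercise, here is a minimal sketch of a forward and backward pass of a tiny two-layer network in numpy. All sizes, names and the mean-squared-error loss are just illustrative choices, not the one "right" setup:

```python
import numpy as np

rng = np.random.default_rng(0)

# Tiny illustrative network: input 3 -> hidden 4 (ReLU) -> output 2
W1 = rng.normal(0, 0.1, (3, 4))
b1 = np.zeros(4)
W2 = rng.normal(0, 0.1, (4, 2))
b2 = np.zeros(2)

def forward(x):
    """Forward pass; cache the intermediates needed for the backward pass."""
    z1 = x @ W1 + b1
    h = np.maximum(z1, 0.0)       # ReLU
    y = h @ W2 + b2
    return y, (x, z1, h)

def backward(dy, cache):
    """Backward pass: propagate the loss gradient dy through both layers."""
    x, z1, h = cache
    dW2 = h.T @ dy
    db2 = dy.sum(axis=0)
    dh = dy @ W2.T
    dz1 = dh * (z1 > 0)           # derivative of ReLU
    dW1 = x.T @ dz1
    db1 = dz1.sum(axis=0)
    return dW1, db1, dW2, db2

# One gradient-descent step on a mean-squared-error loss
x = rng.normal(size=(5, 3))
target = rng.normal(size=(5, 2))
y, cache = forward(x)
loss0 = ((y - target) ** 2).mean()
dy = 2 * (y - target) / y.size    # dLoss/dy
dW1, db1, dW2, db2 = backward(dy, cache)
lr = 0.5
W1 -= lr * dW1; b1 -= lr * db1
W2 -= lr * dW2; b2 -= lr * db2
loss1 = ((forward(x)[0] - target) ** 2).mean()
print(loss1 < loss0)              # the step should reduce the loss
```

Once you have something like this working, a good follow-up exercise is to check your handwritten gradients numerically with finite differences.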
Don’t just use the LSTM implementation that Pytorch or Keras offer you, but instead code the different steps and gates yourself.
When something goes wrong in your training, it will then be much easier to find the cause, because you will know the path your data takes through the network and may be able to infer from the wrong results what the cause is and how to fix it.
There is a Youtube series by the user sentdex that does exactly that: https://www.youtube.com/playlist?list=PLQVvvaa0QuDcjD5BAw2DxE6OF2tius3V3. However, I think it does not cover advanced cells like the LSTM, so you have some more work left to do here, and I would recommend always trying to code it yourself first before watching the video. It is always best to actively work through a problem instead of just passively listening to someone who already figured it out. Struggling to get there is 80% of the wisdom you can gain.
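As a starting point for the LSTM exercise, here is a sketch of a single LSTM time step with the gates written out by hand in numpy. The weight layout (all four gates stacked into one matrix) and the sizes are my own illustrative choices:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x, h_prev, c_prev, W, b):
    """One LSTM time step, gate by gate.
    W stacks the weights of all four gates: shape (input_dim + hidden_dim, 4 * hidden_dim)."""
    H = h_prev.shape[-1]
    z = np.concatenate([x, h_prev], axis=-1) @ W + b
    i = sigmoid(z[..., 0*H:1*H])   # input gate
    f = sigmoid(z[..., 1*H:2*H])   # forget gate
    g = np.tanh(z[..., 2*H:3*H])   # candidate cell state
    o = sigmoid(z[..., 3*H:4*H])   # output gate
    c = f * c_prev + i * g         # new cell state
    h = o * np.tanh(c)             # new hidden state
    return h, c

# Unroll the cell over a short random sequence (sizes are arbitrary)
rng = np.random.default_rng(1)
D, H = 3, 4
W = rng.normal(0, 0.1, (D + H, 4 * H))
b = np.zeros(4 * H)
h = np.zeros(H)
c = np.zeros(H)
for t in range(5):
    h, c = lstm_step(rng.normal(size=D), h, c, W, b)
print(h.shape, c.shape)
```

Writing the gates out like this makes it much easier to later recognize what the fused `nn.LSTM` layers in Pytorch are doing under the hood.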
Pytorch Tutorials and Documentation
Okay, look: if you prefer Tensorflow/Keras over Pytorch, I can't help you. I have never worked with them, but I'm sure they also have tutorials and documentation on their website that can help you.
I learned Pytorch because that is what is used in universities and research and, most importantly, it is what my supervisor uses, so that decision was made for me.
Pytorch has some great tutorials on their website that are very readable even for total noobs. I think I used their first tutorial within my first two weeks of studying Deep Learning, and it made a lot of sense to me. Their Quickstart tutorial goes through downloading a dataset, setting up a model and training it, and you can read it in under an hour.
After you learn the basics, the documentation continues to provide good information, as there are explanatory sentences for almost all major functions, such as when to use a specific loss function or which other concepts something works well with.
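To show roughly what the Quickstart tutorial builds up to, here is a minimal Pytorch training loop. I use synthetic regression data in place of the tutorial's downloaded dataset, and the model, loss and learning rate are just illustrative choices:

```python
import torch
from torch import nn

# Synthetic data standing in for a real downloaded dataset
torch.manual_seed(0)
X = torch.randn(64, 3)
y = X @ torch.tensor([[1.0], [-2.0], [0.5]]) + 0.1 * torch.randn(64, 1)

# A small model, a loss function and an optimizer
model = nn.Sequential(nn.Linear(3, 8), nn.ReLU(), nn.Linear(8, 1))
loss_fn = nn.MSELoss()
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)

# Minimal full-batch training loop
losses = []
for epoch in range(50):
    optimizer.zero_grad()          # clear gradients from the previous step
    loss = loss_fn(model(X), y)    # forward pass and loss
    loss.backward()                # backpropagation
    optimizer.step()               # parameter update
    losses.append(loss.item())
print(losses[-1] < losses[0])      # the loss should have gone down
```

The real tutorial adds `DataLoader` batching and a proper evaluation loop on top of this, but the zero_grad / forward / backward / step skeleton stays the same.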
Kaggle Data Sets and Code Solutions
This is a universal machine learning tip that also works well for Deep Learning.
Kaggle.com is a website that provides free data sets you can use to practice building models and working with data in general. Some of these data sets are posted as challenges by companies, and sometimes they even offer prizes for the best solutions.
For each data set, the website usually tells you what task you can try to solve with the data, and there is a “Code” tab where other programmers have submitted their solutions, which can be voted on by the public.
These code solutions by other people are almost more valuable than the datasets, because you can learn what a complete workflow looks like and pick up best practices or ideas on how to code more clearly or more efficiently.
Personally, I used a bird classification data set and built an image classification network that outputs which bird species can be seen in a given image.
I hope these help you get started or progress in your Deep Learning journey!