Andrew Ng’s five-course Deep Learning specialization aims to give newbies and practitioners a crash course on all things deep learning – from fully connected neural networks to convolutional nets to sequence models. I’ve taken all five courses and completed four. (I’ve yet to finish the convolutional nets course, but I’ve seen enough of the material to write a review!)
For more online course recommendations, check out the best online courses to get started with data science. Another well-known data science specialization on Coursera is the Python one from the University of Michigan; check out my course-by-course review here.
Here’s my course-by-course review of Andrew Ng’s Deep Learning specialization.
Neural Networks and Deep Learning
The first course in the specialization focuses on the building blocks of deep learning. It goes over logistic regression interpreted as a one-layer network, shallow networks, and finally deep networks as stacked shallow networks.
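That "logistic regression as a one-layer network" framing is compact enough to sketch in plain NumPy. This is my own toy illustration (made-up data, not the course's notebook code): a forward pass through a single sigmoid unit, plus gradient-descent updates.

```python
import numpy as np

def sigmoid(z):
    return 1 / (1 + np.exp(-z))

# Toy data: 4 examples, 3 features (hypothetical values, for illustration only)
X = np.array([[ 0.5,  1.0, -0.3],
              [ 1.2, -0.7,  0.8],
              [-0.4,  0.2,  1.5],
              [ 0.9,  0.3, -1.1]])
y = np.array([1, 0, 1, 0])

w = np.zeros(3)   # weights of the single "layer"
b = 0.0
lr = 0.1

for _ in range(1000):
    a = sigmoid(X @ w + b)            # forward pass
    dz = a - y                        # gradient of cross-entropy loss w.r.t. z
    w -= lr * (X.T @ dz) / len(y)     # gradient-descent update
    b -= lr * dz.mean()

preds = (sigmoid(X @ w + b) > 0.5).astype(int)
```

Stacking several such layers, with a nonlinearity between them, is essentially the "deep networks as stacked shallow networks" picture the course builds up to.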
If you’ve taken Andrew Ng’s precursor course, Machine Learning, then the first course in Deep Learning is basically an elaboration of its neural network section. The nice thing about this course is that it’s done in pure Python. The older course was done in an inconvenient and not-widely-used language called Octave (a.k.a. free MATLAB), and learning the language was half the battle.
For a complete beginner, this course is absolutely essential to get started with deep learning. Take this course over and over again until you understand everything. The only nitpick I have with the course – and all the other courses in the specialization, for that matter – is that the Jupyter notebook exercises are basically just fill-in-the-blanks. Something a bit more challenging would definitely be appreciated.
Improving Deep Neural Networks
The second course in the sequence is a short one – just three weeks. It focuses on the details of training neural networks – different types of regularization, optimization algorithms, and parameter tuning. It also briefly covers the TensorFlow programming framework, arguably the most popular deep learning library.
To be honest, I was quite shocked by how in-depth and technically the course handled the details of training. This is no pedestrian course. However, the techniques were presented in a way that didn’t really stick in my mind. The course was a collection of information that was good to know but not essential to master in practice. You aren’t really going to implement dropout layers or the Adam optimizer yourself at work, but it’s good to know how they work. Another nitpick is that the coverage of TensorFlow was a bit lacking. For a specialization that aims to be a launchpad for a career in deep learning, the discussion of TensorFlow was just a bit too shallow.
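For intuition, here is roughly what those two techniques boil down to, sketched in NumPy rather than TensorFlow. This is my own simplified rendering of the standard formulas, not code from the course:

```python
import numpy as np

rng = np.random.default_rng(0)

def inverted_dropout(a, keep_prob):
    """Inverted dropout: zero out random units, then rescale so the
    expected activation is unchanged at test time."""
    mask = rng.random(a.shape) < keep_prob
    return a * mask / keep_prob

def adam_step(w, grad, m, v, t, lr=0.001, beta1=0.9, beta2=0.999, eps=1e-8):
    """One Adam update: momentum-like and RMSProp-like moment estimates,
    with bias correction for the early steps."""
    m = beta1 * m + (1 - beta1) * grad          # first-moment (mean) estimate
    v = beta2 * v + (1 - beta2) * grad ** 2     # second-moment estimate
    m_hat = m / (1 - beta1 ** t)                # bias-corrected moments
    v_hat = v / (1 - beta2 ** t)
    w = w - lr * m_hat / (np.sqrt(v_hat) + eps)
    return w, m, v

# Toy usage: minimize f(w) = (w - 3)^2 with Adam
w = np.array([0.0])
m = v = np.zeros_like(w)
for t in range(1, 2001):
    w, m, v = adam_step(w, 2 * (w - 3), m, v, t, lr=0.05)
```

In practice you would of course just call the framework's built-in dropout layer and optimizer, which is exactly the point made above: knowing the mechanics matters more than reimplementing them.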
I wouldn’t say this is an essential course for a first-timer. Everything here is good to know but not essential.
Structuring Machine Learning Projects
Wow, this course is just two weeks long. Why didn’t they just merge it with the previous course? This course focuses on ML strategy, i.e. how to set up an ML project. Specifically, the course talks about setting up the train-test split, choosing the optimization metric, bias, variance, and error analysis, and newer paradigms such as transfer learning.
Like the second course, everything here is nice to know but nonessential; still, I personally found the lessons here a bit more important. While the second course dealt with the technical aspects of training, this one focused more on the practical aspects of setting up a project.
Convolutional Neural Networks
The fourth and fifth courses are most definitely the stars of the specialization. If you’re more interested in working with image data, then the fourth course is for you: conv nets are its focus. You will learn about the two building blocks of convolutional nets, the convolutional layer and the pooling layer, and how stacking these two layers on top of one another gives you amazing performance on image classification tasks. You will also learn about the famous conv net architectures on the market: ResNet, VGG, and Inception. Practical applications such as object detection and facial recognition round out the course.
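To make the two building blocks concrete, here is a bare-bones NumPy sketch of a valid convolution and a max-pooling step. This is my own toy illustration, not course code – real conv nets add channels, strides, padding, and learned filters:

```python
import numpy as np

def conv2d(image, kernel):
    """Valid 2-D convolution (technically cross-correlation, as in most
    deep learning libraries): slide the kernel over the image."""
    H, W = image.shape
    kH, kW = kernel.shape
    out = np.zeros((H - kH + 1, W - kW + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(image[i:i+kH, j:j+kW] * kernel)
    return out

def max_pool(x, size=2):
    """Non-overlapping max pooling with stride equal to the window size."""
    H, W = x.shape
    out = np.zeros((H // size, W // size))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = x[i*size:(i+1)*size, j*size:(j+1)*size].max()
    return out

img = np.arange(36, dtype=float).reshape(6, 6)
edge = np.array([[1.0, -1.0]])   # toy horizontal-edge filter
feat = conv2d(img, edge)         # feature map, shape (6, 5)
pooled = max_pool(feat)          # downsampled map, shape (3, 2)
```

Stacking conv layers (to detect features) and pooling layers (to shrink the spatial dimensions) is exactly the alternation the course drills into you.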
This is the only course I haven’t finished as of this writing. I’m sorry, I’m just not that enthusiastic about working with image data. However, make no mistake: Andrew Ng’s lectures on the basics of conv nets are remarkable. I was able to fully grasp how a convolutional network works in contrast to the vanilla fully connected model. This course is a good first dive into spatial models in deep learning, and I’m planning to complete it soon. My only nitpick would be the coverage of the different conv net architectures. Andrew Ng wasn’t able to explain them in an easily understandable way, and I basically gave up on that part.
Sequence Models
The last course in the sequence works on sequences, no pun intended. Most applications here fall under natural language processing, but applications in music generation and speech recognition are also discussed. Topics covered include recurrent neural networks (RNNs), the long short-term memory (LSTM) network, embedding models like Word2Vec and GloVe, and sequence-to-sequence architectures for translation and transcription.
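The core idea of a vanilla RNN is compact enough to sketch in NumPy (my own illustration with hypothetical dimensions and random weights, not course code): the hidden state is updated at each time step from the current input and the previous state.

```python
import numpy as np

rng = np.random.default_rng(0)

# Arbitrary placeholder dimensions
n_x, n_a = 4, 8                          # input size, hidden-state size
Wax = rng.normal(size=(n_a, n_x)) * 0.1  # input-to-hidden weights
Waa = rng.normal(size=(n_a, n_a)) * 0.1  # hidden-to-hidden (recurrent) weights
ba = np.zeros(n_a)

def rnn_step(x_t, a_prev):
    """One step of a vanilla RNN: new hidden state from the current
    input and the previous hidden state."""
    return np.tanh(Wax @ x_t + Waa @ a_prev + ba)

# Unroll over a sequence of 5 time steps
a = np.zeros(n_a)
xs = rng.normal(size=(5, n_x))
states = []
for x_t in xs:
    a = rnn_step(x_t, a)
    states.append(a)
states = np.stack(states)   # one hidden state per time step: shape (5, 8)
```

The same recurrence, with gates added to control what the state keeps and forgets, is what turns this into an LSTM.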
I absolutely love working with text data, so I had a lot of fun taking this course. I was never able to fully grasp what an RNN does before this course. The first week is absolutely amazing, and Andrew Ng discusses the technical details of RNNs in a very informative way. This should be everybody’s first intro to RNNs – don’t read blogs! They will just confuse you. The section on NLP was also very fun; the coverage of Word2Vec and sentiment analysis was excellent.
However, the third week of the course felt rushed; it was basically an information overload on sequence-to-sequence models. Much more elaboration is needed to discuss beam search and the attention mechanism fully. They’re very complicated models!
If you’re looking for a good reference on machine learning, it’s hard to find a better one than An Introduction to Statistical Learning by James et al. Though it’s presented in R, the exposition of the different techniques is wonderful.
If you want to focus more on deep learning, try Patterson’s Deep Learning.
Neither of these books dwells too much on theory. Practicality all the way.