How to get the most out of the ML course (CS-725) @ IITB?

This blog post is meant for students at IIT Bombay considering taking the Foundations of Machine Learning (CS-725) course taught by Prof. Ganesh Ramakrishnan.

Prerequisites: To do well in this course, you should be thorough with Linear Algebra (LA) and Probability Theory (PT) concepts.

If you are already familiar with LA and PT, you can skip this part.

For Linear Algebra (LA), I suggest you go through the book “Linear Algebra and Its Applications” by Gilbert Strang along with his MIT OCW videos. Once you do this, you should be familiar with things like the rank, row space, column space, and null space of a matrix, the least squares method (*), eigenvalues, and various matrix decompositions like the eigendecomposition, SVD, QR, LU, etc.
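As a quick self-check on the least squares part, here is a minimal NumPy sketch (toy data of my own, nothing course-specific); np.linalg.lstsq solves the problem via an SVD under the hood, tying the two topics together:

```python
import numpy as np

# Toy data: fit y ~ X @ w in the least squares sense.
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 3))           # 100 samples, 3 features
w_true = np.array([2.0, -1.0, 0.5])
y = X @ w_true + 0.1 * rng.normal(size=100)

# Minimize ||X w - y||^2; lstsq also reports the rank and singular values of X.
w_hat, residuals, rank, singular_values = np.linalg.lstsq(X, y, rcond=None)
print(w_hat)  # should be close to w_true
```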


For Probability Theory, go through the book “A First Course in Probability” by Sheldon Ross along with the MIT OCW videos of Prof. Tsitsiklis. Once you go through these, you should be thorough with Bayes' rule (prior, posterior, likelihood) and with distributions like the Gaussian (* very important), binomial, beta, etc.
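As a quick refresher on prior/posterior/likelihood, here is a toy Bayes' rule computation (the numbers are invented purely for illustration):

```python
# Bayes' rule on a toy diagnostic test: posterior = likelihood * prior / evidence.
prior = 0.01                 # P(disease)
sensitivity = 0.95           # P(positive | disease), the likelihood
false_positive_rate = 0.05   # P(positive | no disease)

evidence = sensitivity * prior + false_positive_rate * (1 - prior)
posterior = sensitivity * prior / evidence
print(posterior)  # ~0.161: a positive test alone is far from conclusive
```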


Extra prerequisite for non-CS students: as this course involves 2 programming assignments and 1 project, you are expected to be able to use at least one of Python/R/MATLAB effectively.

Now, coming to the actual course content:

For 2016b (Autumn), the course content was better organised than in the previous offering (2016a, Spring) of the same course. Before the midsem, supervised regression models were covered, and post midsem, supervised classification models and a little bit of unsupervised clustering models were covered.

2016a (Spring) course content is available here.

2016b (Autumn) course content is available here.

The main reading materials for the course are the class slides and the tutorials given in class. Short videos of the class lectures are available on the course site on Bodhi Tree. Even if you miss a lecture, I would suggest you not miss the tutorial solutions discussion.

Whatever I mention below is auxiliary, meant to aid understanding.

I personally preferred the book “Pattern Recognition and Machine Learning” by Christopher Bishop as the reference book. Along with the book, I used Patrick H. Winston's AI course videos, available @ MIT OCW. These videos were particularly helpful for support vector machines (SVM), decision trees, and neural networks. He is an amazing orator who conveys the content in simple terms without losing out on mathematical rigour. If you get time, I would suggest you watch all his videos.


Before midsem:

For linear, ridge, and lasso regression, I studied mainly from online resources (see the sketch after the list below).

Here are some of the materials on YouTube that I found useful (in no particular order):

mathmonk

Ritvik

patrickJMT

Alexander Ihler

Alex Smola
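And here is the sketch promised above: a minimal scikit-learn comparison of the three models (toy data and arbitrary alpha values, just to show the effect of regularization; not from the course):

```python
from sklearn.datasets import make_regression
from sklearn.linear_model import LinearRegression, Ridge, Lasso

# Toy regression problem; alpha is the regularization strength.
X, y = make_regression(n_samples=200, n_features=10, noise=5.0, random_state=0)

for model in [LinearRegression(), Ridge(alpha=1.0), Lasso(alpha=1.0)]:
    model.fit(X, y)
    # Lasso's L1 penalty tends to zero out coefficients; ridge's L2 only shrinks them.
    print(type(model).__name__, model.coef_.round(2))
```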

For support vector regression (SVR), I think Winston's lecture video on SVM plus its mega-recitation video would be sufficient.
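If you want to try SVR hands-on after the videos, here is a minimal scikit-learn sketch (toy data; the hyperparameters are just illustrative):

```python
import numpy as np
from sklearn.svm import SVR

# Toy 1-D regression: epsilon sets the width of the "tube" with zero loss,
# C trades off flatness against violations of the tube.
X = np.linspace(0, 4, 50).reshape(-1, 1)
y = np.sin(X).ravel()

svr = SVR(kernel="rbf", C=1.0, epsilon=0.1)
svr.fit(X, y)
print(svr.predict(X[:5]))
```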

For the KKT optimization conditions, this link would be useful.
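For quick reference, the standard-form KKT conditions (notation mine) for minimizing f(x) subject to g_i(x) ≤ 0 and h_j(x) = 0 are:

```latex
% KKT conditions for: minimize f(x) s.t. g_i(x) <= 0, h_j(x) = 0
\begin{aligned}
&\text{Stationarity:} && \nabla f(x^*) + \sum_i \mu_i \nabla g_i(x^*) + \sum_j \lambda_j \nabla h_j(x^*) = 0 \\
&\text{Primal feasibility:} && g_i(x^*) \le 0, \quad h_j(x^*) = 0 \\
&\text{Dual feasibility:} && \mu_i \ge 0 \\
&\text{Complementary slackness:} && \mu_i \, g_i(x^*) = 0
\end{aligned}
```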

Mercer kernels: I still haven't completely understood these; I'll update once I find a good resource.

Post midsem:

For the perceptron and the sigmoidal classifier (aka logistic regression): class slides and tutorials should be sufficient.

Here is a simple proof of the convergence of the perceptron that I found online.
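To make the update rule concrete, here is a minimal NumPy perceptron (my own sketch, not the course's version):

```python
import numpy as np

def perceptron(X, y, epochs=100):
    """Classic perceptron: y in {-1, +1}, data assumed linearly separable."""
    w = np.zeros(X.shape[1])
    b = 0.0
    for _ in range(epochs):
        mistakes = 0
        for xi, yi in zip(X, y):
            if yi * (xi @ w + b) <= 0:   # misclassified (or on the boundary)
                w += yi * xi             # the perceptron update rule
                b += yi
                mistakes += 1
        if mistakes == 0:                # converged: a full pass with no mistakes
            break
    return w, b

# Tiny separable example: the label is the sign of the first coordinate.
X = np.array([[2.0, 1.0], [1.0, -1.0], [-2.0, 1.0], [-1.0, -2.0]])
y = np.array([1, 1, -1, -1])
print(perceptron(X, y))
```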

For neural networks (*), CNNs, RNNs, LSTMs: I started with Patrick H. Winston's AI video on neural nets, which gives the background and overall idea of a neural network and of backpropagation, which is a must for understanding how neural networks work.

Once you go through Patrick's video, you can watch Andrej Karpathy's CS231n (Stanford) videos; this guy and his blog are amazing. I actually binge-watched all the videos in that playlist over 2 days and really enjoyed it.
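Once the videos make sense, it helps to trace backprop on a tiny network yourself. Here is a minimal NumPy sketch on the XOR toy problem (2-layer net, squared-error loss; purely illustrative, not from the course):

```python
import numpy as np

rng = np.random.default_rng(0)

# XOR: the classic problem a single-layer perceptron cannot solve.
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([[0], [1], [1], [0]], dtype=float)

W1 = rng.normal(size=(2, 8)); b1 = np.zeros(8)
W2 = rng.normal(size=(8, 1)); b2 = np.zeros(1)
sigmoid = lambda z: 1.0 / (1.0 + np.exp(-z))

lr = 1.0
for step in range(5000):
    # Forward pass
    h = sigmoid(X @ W1 + b1)
    out = sigmoid(h @ W2 + b2)
    # Backward pass: chain rule on the squared-error loss
    d_out = (out - y) * out * (1 - out)
    d_h = (d_out @ W2.T) * h * (1 - h)
    # Gradient descent step
    W2 -= lr * h.T @ d_out; b2 -= lr * d_out.sum(axis=0)
    W1 -= lr * X.T @ d_h;   b1 -= lr * d_h.sum(axis=0)

print(out.round(2).ravel())  # should approach [0, 1, 1, 0]
```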

For decision trees, AdaBoost, and support vector classification: again, Patrick's videos along with the slides would suffice.
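If you want to see AdaBoost over decision stumps in action, a minimal scikit-learn sketch (toy data; scikit-learn's default base learner is already a depth-1 tree, matching the stumps in Winston's lecture):

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import AdaBoostClassifier

# AdaBoost reweights the training points after each weak learner,
# focusing the next stump on the examples the ensemble still gets wrong.
X, y = make_classification(n_samples=300, random_state=0)
clf = AdaBoostClassifier(n_estimators=50, random_state=0)
clf.fit(X, y)
print(clf.score(X, y))
```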

For unsupervised learning: again, I haven't quite understood this one either; I'll update this once I find a good resource.
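In the meantime, a quick way to build some intuition is to run k-means, the canonical clustering algorithm. A minimal scikit-learn sketch (toy blobs, not course material):

```python
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs

# k-means alternates two steps: assign each point to the nearest centroid,
# then recompute each centroid as the mean of its assigned points.
X, _ = make_blobs(n_samples=300, centers=3, random_state=0)
km = KMeans(n_clusters=3, n_init=10, random_state=0).fit(X)
print(km.cluster_centers_)
```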

I just found that this presentation by Yann LeCun contains an overview of a lot of ML and deep learning topics (differences + overlap -> slide 24): http://www.cs.nyu.edu/~yann/talks/lecun-ranzato-icml2013.pdf