- Instructor: Kaveh Kavousi (kkavousi at ut.ac.ir) and Hesam Montazeri (hesam.montazeri at ut.ac.ir)
- Teaching Assistants: Fahimeh Palizban (fahimehpalizban at ut.ac.ir) & Zohreh Toghrayee (zohreh.toghrayee at ut.ac.ir)
- Time & Location: Sep-Dec 2019. Lectures are held on Sundays 15:00-17:00 and Tuesdays 13:00-15:00 at 37 Ghods St., Department of Bioinformatics, IBB, Tehran.
- Google Calendar: for the detailed schedule, add the course calendar to your calendars!
- The Elements of Statistical Learning by Jerome H. Friedman, Robert Tibshirani, and Trevor Hastie [ESL]
- An Introduction to Statistical Learning: With Applications in R by Daniela Witten, Gareth James, Robert Tibshirani, and Trevor Hastie [ISL]
- Pattern Recognition and Machine Learning by Christopher Bishop [PRML]
- A First Course in Machine Learning by Simon Rogers and Mark Girolami [FCML]
- Probabilistic Graphical Models by Daphne Koller & Nir Friedman [PGM]
- Learning from Data by Yaser S. Abu-Mostafa, Malik Magdon-Ismail, and Hsuan-Tien Lin [LFD]
- Mathematics for Machine Learning by Marc Peter Deisenroth, A. Aldo Faisal, and Cheng Soon Ong [MML]
- Advances in Kernel Methods: Support Vector Learning by Christopher J.C. Burges, Bernhard Schölkopf and Alexander J. Smola [AKM]
- CS229 lecture notes from Stanford, available here [CS229]
- Final exam, 1/11/1398
| Week | Lecture | Reading Assignments | Homeworks & Whiteboard Notes | By |
|---|---|---|---|---|
| W1 | Logistics (slides)<br>(31/6/1398) Lecture 1: Introduction to machine learning; simple linear regression; the gradient descent algorithm (slides)<br>(2/7/1398) Lecture 2: Linear regression: analytical solution; mathematical formulation in matrix form<br>Tutorial 1: Introduction to R (slides) | Required: FCML, Sec. 1.1-3; CS229, Supervised learning (notes). Highly recommended: linear algebra review from Stanford (notes) | HW1; WB notes*1 | HM |
| W2 | (7/7/1398) Lecture 3: Linear regression in matrix form; polynomial regression; basis functions<br>(9/7/1398) Lecture 4: Ridge regression; the LASSO; generalization error; cross-validation | Required: FCML, Sec. 1.4-6; ESL, pp. 43-46, Sec. 3.4.1-3, 7.10. Recommended: ISL, Sec. 5.1, 6.2 | HW2; WB notes*2 | HM |
| W3 | (14/7/1398) Lecture 5: Bias-variance decomposition; maximum likelihood estimation (slides)<br>(16/7/1398) Lecture 6: Maximum a posteriori estimation; Bayesian interpretation of linear regression | Required: ISL, Sec. 2.1-2, 3.1-4, 6.1 | HW3; WB notes*1 | HM |
| W4 | (21/7/1398) Lecture 7: K-nearest neighbor regression; classification; KNN classifier; logistic regression (slides)<br>(23/7/1398) Lecture 8: Newton's method; iteratively reweighted least squares; the exponential family | Required: ISL, Sec. 2.2.3, 3.5, 4.1-3; ESL, Sec. 4.4.1-4; PRML, Sec. 2.4 (up to 2.4.1). Optional: MML, Sec. 5.7-8 | HW4; WB notes*1 | HM |
| W5 | (28/7/1398) Lecture 9: Exponential family; generalized linear models; discriminative vs. generative models<br>(30/7/1398) Lecture 10: Linear discriminant analysis; naïve Bayes classifier | Required: CS229, parts III-IV; ISL, Sec. 4.4; ESL, Sec. 4.3 | HW5; WB notes*1 | HM |
| W6 | (12/8/1398) Lecture 11: Learning theory; support vector machines (slides) | Required: CS229, part VI; AKM, Ch. 1; KKT notes. Optional: CS229, part V | HW6 | K2 |
| W7 | (19/8/1398) Lecture 12: Soft-margin hyperplane; nonlinear SVM; kernels (slides) | | HW7 | K2 |
| W8 | (25/8/1398) Lecture 13: Convex sets & functions; convex optimization; linear and quadratic programming; Lagrangian duality<br>(2/9/1398) Lecture 14: Subgradients; the coordinate descent algorithm for linear regression and the LASSO; sequential minimal optimization (SMO) | Required: MML, Ch. 7; CS229, part V | HW8; WB notes*1 | HM |
| W9 | (3/9/1398) Lecture 15: Performance assessment of learners (slides)<br>(5/9/1398) Lecture 16: Bootstrapping | Required: ISL, Sec. 5.2 | HW9 | K2, HM |
| W10 | (10/9/1398) Lecture 17: Statistical hypothesis testing; p-values; statistical tests for comparing machine learners<br>(12/9/1398) Lecture 18: Feature selection methods (slides) | Required: Jason Brownlee's notes on comparing machine learners | HW10 | HM, K2 |
| W11 | (17/9/1398) Lecture 19: Decision/regression trees; bagging; random forests<br>(19/9/1398) Lecture 20: Boosting (slides) | Required: ESL, Sec. 8.7, 9.2, 10.1-6, 15.1-3; ISL, Ch. 8 | HW11 | HM |
| W12 | (24/9/1398) Lecture 21: Multiple classifier systems (slides) | | HW12 | K2 |
| W13 | (30/9/1398) Lecture 22: Bayesian inference; conjugate models; Bayesian linear regression; the Laplace approximation (slides)<br>(1/10/1398) Lecture 23: Clustering algorithms<br>(3/10/1398) Lecture 24: Clustering algorithms | Required: FCML, Ch. 4-5; PRML, Sec. 3.3; clustering slides in the shared Google folder | HW13 | HM, K2 |
| W14 | (8/10/1398) Lecture 25: Markov chain Monte Carlo; principal component analysis<br>(10/10/1398) Lecture 26: Neural networks | Required: MML, Sec. 10.1-2; ESL, Sec. 11.3 | HW14 | HM |
| W15 | (15/10/1398) Lecture 27: Debugging learning algorithms<br>(17/10/1398) Lecture 28: A review of common statistical tests | Required: Andrew Ng's slides on ML debugging | | |
* Thanks to Fereshteh Fallah (*1) and Ali Maddi (*2) for kindly sharing their class notes.
** While the uploaded students' WB notes are of high quality, the instructors have not checked all of the detailed derivations for correctness.
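As a small taste of the material covered in Lecture 1 (gradient descent for simple linear regression), here is a minimal, hedged sketch. It is not taken from the course materials (which use R); the data, learning rate, and iteration count are illustrative choices only:

```python
import numpy as np

# Toy data: y = 2x + 1 plus Gaussian noise (illustrative, not course data)
rng = np.random.default_rng(0)
x = rng.uniform(0.0, 1.0, 50)
y = 2.0 * x + 1.0 + rng.normal(0.0, 0.1, 50)

# Gradient descent on the mean squared error L(w0, w1) = mean((y - w0 - w1*x)^2)
w0, w1 = 0.0, 0.0   # intercept and slope, initialized at zero
lr = 0.1            # learning rate (step size)
for _ in range(5000):
    resid = y - (w0 + w1 * x)
    w0 += lr * 2.0 * resid.mean()         # -dL/dw0 = 2*mean(resid)
    w1 += lr * 2.0 * (resid * x).mean()   # -dL/dw1 = 2*mean(resid*x)

print(w0, w1)  # should approach the true intercept (1) and slope (2)
```

With enough iterations the estimates converge to the same values as the analytical least-squares solution discussed in Lecture 2.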