Course:
Instructors: Animesh Garg
Webpage: https://pairlab.github.io/csc2621-w20/#
Teaching assistants: Dylan Turpin and Tingwu Wang
Lecture hours: Tuesday 1 – 3 BA1200
Office hours: AG: Tues 3:15 – 4:15 (after class), Location PT283E // TAs: TBD
Course Breadth: M3/RA16
Discussion: Quercus, Piazza
Course Staff Email: TBD
Urgent contact Email: garg@cs.toronto.edu with subject “CSC2621:
Announcements:
- Jan 7: Welcome.
- Jan 10: Piazza class created; link added. Also, here is the Rosetta stone paper mentioned in class.
- Jan 21: Project guidelines posted
- Feb 22: Project guidelines updated with proposal info. Midterm progress report extended to February 28 at 11:59PM
- Mar 3: Professor Garg will hold extra office hours on Monday March 9 from 11:30-1 for project discussions
Course Overview:
Description
Robots of the future will need to operate autonomously in unstructured and unseen environments. It is imperative that these systems are built on intelligent and adaptive algorithms. Learning by interaction through reinforcement offers a natural mechanism to pose these problems.
This graduate-level seminar course will cover topics and new research frontiers in reinforcement learning (RL). Planned topics include: Model-Based and Model-Free RL, Policy Search, Monte Carlo Tree Search, off-policy evaluation, temporal abstraction/hierarchical approaches, inverse reinforcement learning and imitation learning.
Learning objectives
At the end of this course, you will:
- Acquire familiarity with the state of the art in RL.
- Articulate limitations of current work, identify open frontiers, and scope research projects.
- Constructively critique research papers, and deliver a tutorial-style presentation.
- Work on a research-based project, implement & evaluate experimental results, and discuss future work in a project paper.
Prerequisites
You need to be comfortable with:
- introductory machine learning concepts (such as from CSC411/ECE521 or equivalent),
- linear algebra,
- basic multivariable calculus,
- intro to probability.
You also need to have strong programming skills in Python.
Note: if you don’t meet all the prerequisites above, please contact the instructor by email.
Optional, but recommended: experience with neural networks, such as from CSC321, and introductory-level familiarity with reinforcement learning and control.
Recommended Textbooks
- Marco Wiering and Martijn van Otterlo, Eds., Reinforcement Learning: State-of-the-Art, Springer, 2012. Available for free under UofT library subscription. Install the proxy bookmark, then visit the book page and log in with UofT credentials.
- Richard S. Sutton and Andrew G. Barto, Reinforcement Learning: An Introduction, 2nd edition, MIT Press, 2018.
Grading & Evaluation:
This course will consist of lectures, along with paper presentations & discussions. There will also be a take-home midterm and a group project.
In-Class Paper Presentation: 25%
Take-Home Midterm: 15%
Pop-quizzes & Class Participation: 10%
Project: 50%
Calendar:
This is a draft schedule and is subject to change.
Resources:
Type | Name | Description |
---|---|---|
RL Code base | OpenAI Baseline | Implementations of common reinforcement learning algorithms. |
RL Code base | Google Dopamine | Research framework for fast prototyping of reinforcement learning algorithms. |
RL Code base | Evolution-strategies-starter | Evolution Strategies as a Scalable Alternative to Reinforcement Learning. |
RL Code base | Pytorch-a2c-ppo-acktr | PyTorch implementation of A2C, PPO and ACKTR. |
RL Code base | Model-Agnostic Meta-Learning | Model-Agnostic Meta-Learning for Fast Adaptation of Deep Networks. |
RL Code base | Reptile | Reptile is a meta-learning algorithm that finds a good initialization. |
General Framework | TensorFlow | An open source machine learning framework. |
General Framework | PyTorch | An open source deep learning platform that provides a seamless path from research prototyping to production deployment. |
Environments | OpenAI Gym | Gym is a toolkit for developing and comparing reinforcement learning algorithms. |
Environments | Deepmind Control Suite | A set of Python Reinforcement Learning environments powered by the MuJoCo physics engine. |
Suggested (Free) online computation platform | AWS-EC2 | Amazon Elastic Compute Cloud (EC2) forms a central part of Amazon.com’s cloud-computing platform, Amazon Web Services (AWS), by allowing users to rent virtual computers on which to run their own computer applications. |
Suggested (Free) online computation platform | GCE | Google Compute Engine delivers virtual machines running in Google’s innovative data centers and worldwide fiber network. |
Suggested (Free) online computation platform | Colab | Colaboratory is a free Jupyter notebook environment that requires no setup and runs entirely in the cloud. |
Related Courses | Topics in Machine Learning, Fall 2018 by Jimmy Ba. | |