Learning-Based Control for Robotics (ME461/561)
Semester: Fall 2026
Instructor: Dr. Leilei Cui · Email: lcui@unm.edu
Time: Tue & Thu, 9:30–10:45 AM
Location: Science & Math Learning Center B79
Course Description
Learning-based methods have achieved remarkable progress in robot control, with applications spanning autonomous driving, quadrupeds, and humanoids. At the core of these advances lie optimal control and reinforcement learning (RL). This course develops the mathematical foundations of these methodologies and demonstrates how to apply them to real robotic systems. Topics range from core RL concepts to advanced algorithms (value/policy iteration, TD/Q-learning, actor–critic, DPG/DDPG, TRPO, PPO). The course emphasizes hands-on practice so you can use these tools in your own research.
Course Goals
This course aims to build a solid foundation in reinforcement learning and to show how RL integrates with control for robotics, strengthening your modeling skills, algorithmic thinking, and data-driven decision-making.
Student Learning Outcomes
By the end of the course, you will be able to:
- Explain key RL concepts: value/Q functions, Bellman equations, value/policy iteration.
- Implement RL algorithms (DQN, DDPG, PPO).
- Apply RL to robotic control tasks (e.g., drone flight).
- Interpret RL from an optimal-control perspective.
Prerequisites
- Linear algebra, calculus, probability
- Foundations of dynamics
- Programming experience (Python)
Course Materials
No required textbook. Lecture notes will be posted online. Additional readings will accompany each lecture.
Assessments
| Assessment | Weight | Notes |
|---|---|---|
| Final Exam | 35% | Tue, Dec 9, 9:30–11:30 AM |
| Homework | 30% | 2 written assignments + 3 coding assignments |
| Project | 30% | Final report (15%) + presentation (15%) |
| Participation / Attendance | 5% | Up to 2 unexcused absences permitted |
Schedule (subject to change)
Minor changes will be announced in class; major changes will be provided in writing.
| Week | Dates | Topics | Notes |
|---|---|---|---|
| 1 | 8/19, 8/21 | Syllabus & Introduction; Basic Concepts of RL | |
| 2 | 8/26, 8/28 | Bellman Equation; Bellman Optimality | |
| 3 | 9/2, 9/4 | Value Iteration; VI from Optimal-Control Perspective | HW1 announced (Tue) |
| 4 | 9/9, 9/11 | Policy Iteration; PI from Optimal-Control Perspective | HW1 due (Sun 11pm) |
| 5 | 9/16, 9/18 | Monte Carlo Learning; Stochastic Approximation & SGD | |
| 6 | 9/23, 9/25 | Temporal-Difference Learning; Q-Learning | HW2 announced (Tue) |
| 7 | 9/30, 10/2 | Value Function Approximation; Deep Q-Learning (DQN) | HW2 due (Sun 11pm) |
| 8 | 10/7, 10/9 | Robotic Application of DQN | 10/9 Fall Break · HW3 announced (Tue) |
| 9 | 10/14, 10/16 | Policy-Gradient Methods; Monte Carlo PG | |
| 10 | 10/21, 10/23 | Actor–Critic; Deterministic Policy Gradient (DPG) | HW3 due (Sun 11pm) |
| 11 | 10/28, 10/30 | Deep Deterministic Policy Gradient (DDPG); Robotic Applications | HW4 announced (Tue) |
| 12 | 11/4, 11/6 | Trust Region Policy Optimization (TRPO I & II) | HW4 due (Sun 11pm) |
| 13 | 11/11, 11/13 | Robotic Application of TRPO; Proximal Policy Optimization (PPO) | |
| 14 | 11/18, 11/20 | Robotic Application of PPO (I & II) | HW5 announced (Tue) |
| 15 | 11/25, 11/27 | PG from Optimal-Control Perspective | 11/27 Thanksgiving Break |
| 16 | 12/2, 12/4 | Project Presentations | HW5 due (Sun 11pm) |
| — | 12/8–12/12 | Final Exam (Tue, 9:30–11:30 AM) | Project Report due Wed 11pm |
Office Hours
Thu 10:45–11:30 AM · Mechanical Engineering Building, Room 435
Last updated: July 03, 2026