Learning-Based Control for Robotics (ME461/561)

July 03, 2026

Semester: Fall 2026
Instructor: Dr. Leilei Cui · Email: lcui@unm.edu
Time: Tue & Thu, 9:30–10:45 AM
Location: Science & Math Learning Center B79

Course Description

Learning-based methods have achieved remarkable progress in robot control, with applications spanning autonomous driving, quadrupeds, and humanoids. At the core of these advances lie optimal control and reinforcement learning (RL). This course develops the mathematical foundations of these methodologies and demonstrates how to apply them to real robotic systems. Topics range from core RL concepts (Bellman equations, value/policy iteration, TD/Q-learning) to advanced algorithms (actor–critic, DPG/DDPG, TRPO, PPO). The course emphasizes hands-on practice so you can use these tools in your own research.

Course Goals

Build a solid foundation in reinforcement learning, show how RL integrates with control for robotics, and strengthen modeling skills, algorithmic thinking, and data-driven decision-making.

Student Learning Outcomes

By the end of the course, you will be able to:

Explain key RL concepts: value/Q functions, Bellman equations, value/policy iteration.
Implement RL algorithms (DQN, DDPG, PPO).
Apply RL to robotic control tasks (e.g., drone flight).
Interpret RL from an optimal-control perspective.
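
As a preview of the first outcome, the following is a minimal sketch of value iteration on a tiny deterministic chain problem. The three-state MDP, its rewards, the discount factor, and all names in the code are hypothetical illustrations chosen for this sketch; they are not taken from the course materials or homework.

```python
# Hypothetical 3-state chain MDP (illustration only, not course material).
# States 0 and 1 are non-terminal; state 2 is terminal with value 0.
# TRANSITIONS[state][action] = (next_state, reward), all deterministic.
LEFT, RIGHT = 0, 1
TRANSITIONS = {
    0: {LEFT: (0, 0.0), RIGHT: (1, 0.0)},
    1: {LEFT: (0, 0.0), RIGHT: (2, 1.0)},
}
TERMINAL = 2
GAMMA = 0.9  # assumed discount factor


def value_iteration(transitions, terminal, gamma, tol=1e-8, max_iters=1000):
    """Apply the Bellman optimality backup until the values stop changing."""
    values = {s: 0.0 for s in transitions}
    values[terminal] = 0.0  # the terminal state keeps value 0

    for _ in range(max_iters):
        delta = 0.0
        for s, actions in transitions.items():
            # Bellman optimality backup: V(s) = max_a [ r + gamma * V(s') ].
            # Values are updated in place, so later states in the sweep
            # already see the new estimates of earlier states.
            best = max(r + gamma * values[s2] for (s2, r) in actions.values())
            delta = max(delta, abs(best - values[s]))
            values[s] = best
        if delta < tol:
            break

    # Greedy policy with respect to the converged values.
    policy = {
        s: max(actions, key=lambda a: actions[a][1] + gamma * values[actions[a][0]])
        for s, actions in transitions.items()
    }
    return values, policy


values, policy = value_iteration(TRANSITIONS, TERMINAL, GAMMA)
print(values)  # state values for the toy chain
print(policy)  # greedy action per non-terminal state
```

In this toy problem the only reward is earned by stepping from state 1 into the terminal state, so the greedy policy moves right in both non-terminal states, and the value of state 0 is the value of state 1 discounted by one step.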

Prerequisites

Linear algebra, calculus, probability
Foundations of dynamics
Programming experience (Python)

Course Materials

No required textbook. Lecture notes will be posted online. Additional readings will accompany each lecture.

Assessments

Assessment	Weight	Notes
Final Exam	35%	Tue, Dec 9, 9:30–11:30 AM
Homework	30%	2 written assignments + 3 coding problems
Project	30%	Final report (15%) + presentation (15%)
Participation / Attendance	5%	Up to 2 unexcused absences
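
The weights above combine into a course percentage as a weighted sum. The sketch below illustrates that arithmetic only. It assumes each component is scored from 0 to 100 and combined linearly; the syllabus does not state how individual components are scored, and the component names and example scores are made up for illustration.

```python
# Weights copied from the Assessments table. The 0-100 scoring scale and
# the linear combination are assumptions of this sketch.
WEIGHTS = {
    "final_exam": 0.35,
    "homework": 0.30,
    "project_report": 0.15,
    "project_presentation": 0.15,
    "participation": 0.05,
}


def weighted_grade(scores):
    """Return the weighted course percentage for component scores in [0, 100]."""
    missing = set(WEIGHTS) - set(scores)
    if missing:
        raise ValueError(f"missing components: {sorted(missing)}")
    return sum(WEIGHTS[name] * scores[name] for name in WEIGHTS)


# Hypothetical scores, for illustration only.
example_scores = {
    "final_exam": 80,
    "homework": 90,
    "project_report": 85,
    "project_presentation": 95,
    "participation": 100,
}
course_percentage = weighted_grade(example_scores)
print(round(course_percentage, 2))  # 28 + 27 + 12.75 + 14.25 + 5 = 87.0
```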

Schedule (subject to change)

Minor changes will be announced in class; major changes will be provided in writing.

Week	Dates	Topics	Notes
1	8/19, 8/21	Syllabus & Introduction; Basic Concepts of RL
2	8/26, 8/28	Bellman Equation; Bellman Optimality
3	9/2, 9/4	Value Iteration; VI from Optimal-Control Perspective	HW1 announced (Tue)
4	9/9, 9/11	Policy Iteration; PI from Optimal-Control Perspective	HW1 due (Sun 11pm)
5	9/16, 9/18	Monte Carlo Learning; Stochastic Approximation & SGD
6	9/23, 9/25	Temporal-Difference Learning; Q-Learning	HW2 announced (Tue)
7	9/30, 10/2	Value Function Approximation; Deep Q-Learning (DQN)	HW2 due (Sun 11pm)
8	10/7, 10/9	Robotic Application of DQN	10/9 Fall Break · HW3 announced (Tue)
9	10/14, 10/16	Policy-Gradient Methods; Monte Carlo PG
10	10/21, 10/23	Actor–Critic; Deterministic Policy Gradient (DPG)	HW3 due (Sun 11pm)
11	10/28, 10/30	Deep Deterministic Policy Gradient (DDPG); Robotic Applications	HW4 announced (Tue)
12	11/4, 11/6	Trust Region Policy Optimization (TRPO I & II)	HW4 due (Sun 11pm)
13	11/11, 11/13	Robotic Application of TRPO; Proximal Policy Optimization (PPO)
14	11/18, 11/20	Robotic Application of PPO (I & II)	HW5 announced (Tue)
15	11/25, 11/27	PG from Optimal-Control Perspective	11/27 Thanksgiving Break
16	12/2, 12/4	Project Presentations	HW5 due (Sun 11pm)
—	12/8–12/12	Final Exam (Tue, 9:30–11:30 AM)	Project Report due Wed 11pm

Office Hours

Thu 10:45–11:30 AM · Mechanical Engineering Building, Room 435

Last updated: July 03, 2026