Learning and Control (LC) Lab

Learning-Based Control for Robotics (ME461/561)

Semester: Fall 2026
Instructor: Dr. Leilei Cui · Email: lcui@unm.edu
Time: Tue & Thu, 9:30–10:45 AM
Location: Science & Math Learning Center B79


Course Description

Learning-based methods have achieved remarkable progress in robot control, with applications spanning autonomous driving, quadrupeds, and humanoids. At the core of these advances lie optimal control and reinforcement learning (RL). This course develops the mathematical foundations of these methodologies and demonstrates how to apply them to real robotic systems. Topics range from core RL concepts to advanced algorithms (value/policy iteration, TD/Q-learning, actor–critic, DPG/DDPG, TRPO, PPO). The course emphasizes hands-on practice so you can use these tools in your own research.

Course Goals

Build a solid foundation in reinforcement learning, show how RL integrates with control for robotics, and strengthen skills in modeling, algorithmic thinking, and data-driven decision-making.

Student Learning Outcomes

By the end of the course, you will be able to:

  1. Explain key RL concepts: value/Q functions, Bellman equations, value/policy iteration.
  2. Implement RL algorithms (DQN, DDPG, PPO).
  3. Apply RL to robotic control tasks (e.g., drone flight).
  4. Interpret RL from an optimal-control perspective.
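
As a small illustration of the first outcome, the sketch below runs value iteration on a made-up two-state, two-action MDP. The MDP, the function name `value_iteration`, and the parameters `gamma`, `tol`, and `max_iters` are illustrative assumptions, not course material; the update itself is the standard Bellman optimality backup.

```python
# A minimal value-iteration sketch on a hypothetical toy MDP (not course code).

def value_iteration(P, R, gamma=0.9, tol=1e-8, max_iters=10_000):
    """P[s][a][t]: probability of moving from state s to t under action a.
    R[s][a]: expected immediate reward for taking action a in state s."""
    n_states = len(R)
    n_actions = len(R[0])

    def q_value(V, s, a):
        # Q(s, a) = R(s, a) + gamma * sum_t P(t | s, a) * V(t)
        return R[s][a] + gamma * sum(P[s][a][t] * V[t] for t in range(n_states))

    V = [0.0] * n_states
    for _ in range(max_iters):
        # Bellman optimality backup: V(s) <- max_a Q(s, a)
        V_new = [max(q_value(V, s, a) for a in range(n_actions))
                 for s in range(n_states)]
        delta = max(abs(x - y) for x, y in zip(V_new, V))
        V = V_new
        if delta < tol:
            break

    # Greedy policy with respect to the final value estimate
    policy = [max(range(n_actions), key=lambda a: q_value(V, s, a))
              for s in range(n_states)]
    return V, policy


# Hypothetical MDP: action 0 = stay, action 1 = move to the other state.
P = [
    [[1.0, 0.0], [0.0, 1.0]],  # from state 0: stay -> 0, move -> 1
    [[0.0, 1.0], [1.0, 0.0]],  # from state 1: stay -> 1, move -> 0
]
# Only staying in state 1 is rewarded.
R = [
    [0.0, 0.0],
    [1.0, 0.0],
]

V, policy = value_iteration(P, R)
print(V, policy)
```

With `gamma = 0.9`, staying in state 1 forever is worth 1 / (1 - 0.9) = 10, and state 0 is worth 0.9 × 10 = 9, so the greedy policy moves from state 0 and stays in state 1.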

Prerequisites

Course Materials

No required textbook. Lecture notes will be posted online. Additional readings will accompany each lecture.


Assessments

| Assessment | Weight | Notes |
|---|---|---|
| Final Exam | 35% | Tue, Dec 9, 9:30–11:30 AM |
| Homework | 30% | 2 writing assignments + 3 coding problems |
| Project | 30% | Final report (15%) + presentation (15%) |
| Participation / Attendance | 5% | Up to 2 unexcused absences |

Schedule (subject to change)

Minor changes will be announced in class; major changes will be provided in writing.

| Week | Dates | Topics | Notes |
|---|---|---|---|
| 1 | 8/19, 8/21 | Syllabus & Introduction; Basic Concepts of RL | |
| 2 | 8/26, 8/28 | Bellman Equation; Bellman Optimality | |
| 3 | 9/2, 9/4 | Value Iteration; VI from Optimal-Control Perspective | HW1 announced (Tue) |
| 4 | 9/9, 9/11 | Policy Iteration; PI from Optimal-Control Perspective | HW1 due (Sun 11pm) |
| 5 | 9/16, 9/18 | Monte Carlo Learning; Stochastic Approximation & SGD | |
| 6 | 9/23, 9/25 | Temporal-Difference Learning; Q-Learning | HW2 announced (Tue) |
| 7 | 9/30, 10/2 | Value Function Approximation; Deep Q-Learning (DQN) | HW2 due (Sun 11pm) |
| 8 | 10/7, 10/9 | Robotic Application of DQN | 10/9 Fall Break · HW3 announced (Tue) |
| 9 | 10/14, 10/16 | Policy-Gradient Methods; Monte Carlo PG | |
| 10 | 10/21, 10/23 | Actor–Critic; Deterministic Policy Gradient (DPG) | HW3 due (Sun 11pm) |
| 11 | 10/28, 10/30 | Deep Deterministic Policy Gradient (DDPG); Robotic Applications | HW4 announced (Tue) |
| 12 | 11/4, 11/6 | Trust Region Policy Optimization (TRPO I & II) | HW4 due (Sun 11pm) |
| 13 | 11/11, 11/13 | Robotic Application of TRPO; Proximal Policy Optimization (PPO) | |
| 14 | 11/18, 11/20 | Robotic Application of PPO (I & II) | HW5 announced (Tue) |
| 15 | 11/25, 11/27 | PG from Optimal-Control Perspective | 11/27 Thanksgiving Break |
| 16 | 12/2, 12/4 | Project Presentations | HW5 due (Sun 11pm) |
| | 12/8–12/12 | Final Exam (Tue, 9:30–11:30 AM) | Project Report due Wed 11pm |

Office Hours

Thu 10:45–11:30 AM · Mechanical Engineering Building, Room 435


Last updated: July 03, 2026