Professor: Andrea Bajcsy (abajcsy [at] cmu [dot] edu)
Office Hours: Wed, 12:20 - 1:00 pm (i.e., after class) or by appointment
Office Hours Location: NSH 4629

Lecture Time: Mon & Wed, 11:00 - 12:20 pm
Lecture Location: GHC 4101

Syllabus: PDF
Canvas: https://canvas.cmu.edu/courses/38604

Overview

Robot deployment around real people is rapidly accelerating: autonomous cars navigate through crowded cities on a daily basis, assistive robots increasingly help end-users with daily living tasks, and large teams of human engineers interactively teach robots basic skills. However, robot interaction with humans requires us to re-evaluate the assumptions built into all components of our autonomy algorithms, from decision-making, to machine learning, to safety analysis.

In this graduate seminar class, we will build the mathematical foundations for modeling human-robot interaction, develop the tools to analyze the safety and reliability of robots deployed around people, and investigate algorithms for robot learning from human data. The approaches covered will draw upon a variety of tools such as optimal control, dynamic game theory, Bayesian inference, and modern machine learning. Throughout the class, there will also be guest lectures from experts in the field. Students will practice essential research skills including reviewing papers, writing research project proposals, and technical communication.

News

  • [02/07/24] Homework 1 description updated. Please see Canvas.
  • [01/09/24] New classroom location: GHC 4101.

Schedule (tentative)

Week 1
Mon, Jan 15
No Class (MLK Day)
Week 1
Wed, Jan 17
Lecture Introduction
  • Please check the course syllabus

Materials: Slides

Week 2
Mon, Jan 22
Lecture Dynamical systems model of interaction

Materials: Notes

Week 2
Wed, Jan 24
Lecture Optimal control & decision-making

Materials: Notes

Week 3
Mon, Jan 29
Lecture Multi-agent games & robust optimal control

Further reading:

  • Differential Games I: Introduction. Isaacs. (1954)

Materials: Notes

Week 3
Wed, Jan 31
Lecture Safety Analysis I

Further reading:

  • A Time-Dependent Hamilton–Jacobi Formulation of Reachable Sets for Continuous Dynamic Games. Mitchell, et al. (2005)
  • Hamilton-Jacobi formulation for reach-avoid differential games. Margellos & Lygeros (2009)
  • Reach-avoid problems with time-varying dynamics, targets and constraints. Fisac, et al. (2015)

Materials: Notes

Week 4
Mon, Feb 5
Lecture Safety Analysis II

Due: Project Proposal

Materials: Notes

Week 4
Wed, Feb 7
Paper discussion Computationally scalable safety

Required reading:

Further reading:

  • A Minimum Discounted Reward Hamilton-Jacobi Formulation for Computing Reachable Sets. Akametalu, et al. (2018)
  • Reachability-Based Safety Guarantees using Efficient Initializations. Herbert, et al. (2019)
Week 5
Mon, Feb 12
Guest Lecture Jason Choi (UC Berkeley)

Title: Safety Filters for Uncertain Dynamical Systems: Control Theory & Data-driven Approaches

Abstract: Safety is a primary concern when deploying autonomous robots in the real world. Model-based controllers designed to ensure safety constraints often fail due to model uncertainties present in real physical systems. Providing practical safety guarantees for uncertain systems is a significant challenge, which will be the main focus of this talk. In the first part of the talk, I will review three of the most popular methods for implementing safety filters in general nonlinear systems—Hamilton-Jacobi Reachability, Control Barrier Functions, and Model Predictive Control. The theories underlying each of these methods are well-established for systems with good mathematical models, and have been extended to account for the uncertainties of real-world systems. I will discuss the strengths, drawbacks, and connections of each method. In the second part of the talk, I will discuss how data-driven methods can help resolve the challenge. I will provide an overview of various data-driven safety filters developed during my PhD studies. Finally, I will explore remaining open research problems in addressing safety effectively for real-world robot autonomy.

Materials: Recording


Further Reading:

  • An Efficient Reachability-Based Framework for Provably Safe Autonomous Navigation in Unknown Environments. Bajcsy, et al. (2019)
  • Data-Driven Safety Filters: Hamilton-Jacobi Reachability, Control Barrier Functions, and Predictive Methods for Uncertain Systems. Wabersich, et al. (2023)
  • The safety filter: A unified view of safety-critical control in autonomous systems. Hsu, et al. (2023)
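The "safety filter" pattern surveyed in this talk can be made concrete with a short sketch. The example below is illustrative only (not course code): it uses a control barrier function for a single-integrator robot avoiding a circular obstacle, where the dynamics, barrier choice, and gain `alpha` are all assumptions. For one affine constraint, the filtering QP has a closed-form projection, so no solver is needed.

```python
import numpy as np

def cbf_filter(x, u_des, x_obs, r, alpha=1.0):
    """Minimally modify u_des so the single-integrator state x stays
    outside a disk of radius r around x_obs.
    Safety certificate: h(x) = ||x - x_obs||^2 - r^2 >= 0, enforced via
    the control-barrier-function condition  grad_h . u >= -alpha * h(x)."""
    h = np.dot(x - x_obs, x - x_obs) - r**2
    a = 2.0 * (x - x_obs)           # gradient of h with respect to x
    b = -alpha * h                  # required lower bound on a . u
    slack = np.dot(a, u_des) - b
    if slack >= 0:                  # nominal command already satisfies the constraint
        return u_des
    # otherwise project u_des onto the half-space {u : a . u >= b}
    return u_des + (-slack / np.dot(a, a)) * a

# Usage: a nominal controller drives toward the origin straight through an obstacle;
# the filter deflects the command just enough to respect the barrier condition.
x = np.array([2.0, 0.1])
u_nom = -x                          # head straight for the goal
u_safe = cbf_filter(x, u_nom, x_obs=np.array([1.0, 0.0]), r=0.5)
```

The key design point, shared by the HJ-reachability and MPC variants discussed in the talk, is minimal intervention: the filter returns the nominal command unchanged whenever it is already certified safe.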
Week 5
Wed, Feb 14
Paper discussion Safety filtering around humans

Required reading:

Week 6
Mon, Feb 19
Lecture Human prediction

Due: Homework

Further Reading:

  • Maximum Entropy Inverse Reinforcement Learning. Ziebart, et al. (2008)
  • Activity Forecasting. Kitani, et al. (2012)
  • Predicting Human Reaching Motion in Collaborative Tasks Using Inverse Optimal Control and Iterative Re-planning. Mainprice, et al. (2015)

Materials: Notes
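In the spirit of the readings above (Boltzmann-rational goal inference, as in MaxEnt IRL), here is a minimal sketch of inferring a human's goal from a partial trajectory. This is a hypothetical illustration, not course material: the straight-line cost model and the rationality coefficient `beta` are simplifying assumptions.

```python
import numpy as np

def goal_posterior(traj, goals, beta=1.0, prior=None):
    """Bayesian goal inference: P(g | traj) ∝ P(traj | g) P(g), where the
    likelihood scores how efficiently the observed partial trajectory
    progresses toward g relative to the direct start-to-goal path."""
    traj = np.asarray(traj, dtype=float)
    goals = np.asarray(goals, dtype=float)
    if prior is None:
        prior = np.full(len(goals), 1.0 / len(goals))
    start, current = traj[0], traj[-1]
    # path length actually traveled so far
    traveled = np.sum(np.linalg.norm(np.diff(traj, axis=0), axis=1))
    log_lik = []
    for g in goals:
        c_partial = traveled + np.linalg.norm(g - current)  # cost via observed path
        c_direct = np.linalg.norm(g - start)                # cost of the direct path
        log_lik.append(-beta * (c_partial - c_direct))      # Boltzmann-rational model
    w = np.exp(log_lik) * prior
    return w / w.sum()

# Usage: a human walking rightward makes the rightward goal far more probable.
p = goal_posterior([[0, 0], [1, 0], [2, 0]], goals=[[4, 0], [-4, 0]])
```

Larger `beta` models a more rational human and concentrates the posterior faster; `beta → 0` recovers the prior.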

Week 6
Wed, Feb 21
Guest Lecture Lasse Peters (TU Delft)

Title: Game-Theoretic Models for Multi-Agent Interaction

Abstract: When multiple agents operate in a common environment, their actions are naturally interdependent and this coupling complicates planning. In this lecture, we will approach this problem through the lens of dynamic game theory. We will discuss how to model multi-agent interactions as general-sum games over continuous states and actions, characterize solution concepts of such games, and highlight the key challenges of solving them in practice. Based on this foundation, we will review established techniques to tractably approximate game solutions for online decision-making. Finally, we will discuss extensions of the game-theoretic framework to settings that involve incomplete knowledge about the intent, dynamics, or state of other agents.

Materials: Recording


Further Reading:

  • Social behavior for autonomous vehicles. Schwarting, et al. (2019)
  • Efficient Iterative Linear-Quadratic Approximations for Nonlinear Multi-Player General-Sum Differential Games. Fridovich-Keil, et al. (2019)
  • NashFormer: Leveraging Local Nash Equilibria for Semantically Diverse Trajectory Prediction. Lidard, et al. (2023)
Week 7
Mon, Feb 26
Lecture Human prediction: Data-driven

Further Reading:

  • Trajectron++: Dynamically-Feasible Trajectory Forecasting With Heterogeneous Data. Salzmann and Ivanovic, et al. (2020)
  • Identifying Driver Interactions via Conditional Behavior Prediction. Tolstaya, et al. (2021)
  • MotionLM: Multi-Agent Motion Forecasting as Language Modeling. Seff, et al. (2023)

Materials: Notes

Week 7
Wed, Feb 28
Guest Lecture Dr. Boris Ivanovic (NVIDIA)

Title: Behavior Prediction as a Nucleus of Modern AV Research

Abstract: Research on behavior prediction, the task of predicting the future motion of agents, has had an outsized impact on multiple aspects of autonomous vehicles (AVs). From direct improvements in online driving performance to deeper connections between AV stack modules to enabling closed-loop training and evaluation in simulation with intelligent reactive agents, behavior prediction has served as a nucleus for much of modern AV research. In this lecture, I will discuss recent advancements along each of these directions, covering modern approaches for behavior prediction, generalization to unseen environments, tighter integrations of AV stack components (towards end-to-end AV architectures), and methods for simulating the behaviors of agents. Finally, I will outline some open research problems in modeling human motion and their potential impacts on downstream driving performance.

Materials: Recording

Week 8
Mon, Mar 4
No Class (Spring Break)
Week 8
Wed, Mar 6
No Class (Spring Break)
Week 9
Mon, Mar 11
Paper discussion Embedding human models into safety I

Required Reading:

Week 9
Wed, Mar 13
Guest lecture Prof. David Fridovich-Keil (UT Austin)

Title: Inverse games: an MPEC by any other name…

Abstract: This lecture will introduce mathematical programs with equilibrium constraints (MPECs), and show how they encompass an “inverse” variant of mathematical games in which parameters of players’ costs and constraints must be inferred from data, or designed to yield specific equilibrium outcomes. We will begin with a brief review of the fundamentals of constrained optimization, and discuss how these familiar concepts appear in inverse games and the implications for designing efficient solution methods. The lecture will conclude with a review of several recent papers that present new developments in this space.

Materials: Recording

Week 10
Mon, Mar 18
Paper discussion Embedding human models into safety II

Due: Mid-term Report

Required Reading:

Week 10
Wed, Mar 20
Lecture Sources of human feedback

Further Reading:

  • Learning Robot Objectives from Physical Human Interaction. Bajcsy et al. (2018)
  • Learning Generalizable Robotic Reward Functions from “In-The-Wild” Human Videos. Chen et al. (2021)
  • Correcting Robot Plans with Natural Language Feedback. Sharma et al. (2022)

Materials: Slides

Week 11
Mon, Mar 25
Lecture Reliably learning from human feedback

Further Reading:

  • Quantifying Hypothesis Space Misspecification in Learning from Human-Robot Demonstrations and Physical Corrections. Bobu et al. (2020)
  • Confidence-Aware Imitation Learning from Demonstrations with Varying Optimality. Zhang and Cao. (2021)
Week 11
Wed, Mar 27
Paper discussion Reinforcement learning from human feedback

Required Reading:

Further Reading:

  • Open Problems and Fundamental Limitations of Reinforcement Learning from Human Feedback. Casper et al. (2023)
  • Nash Learning from Human Feedback. Munos et al. (2024)
Week 12
Mon, Apr 1
Guest Lecture Prof. Sanjiban Choudhury (Cornell)

Title: To RL or not to RL

Abstract: Model-based Reinforcement Learning (MBRL) and Inverse Reinforcement Learning (IRL) are powerful techniques that leverage expert demonstrations to learn either models or rewards. However, traditional approaches suffer from a computational weakness: they require repeatedly solving a hard reinforcement learning (RL) problem as a subroutine. This requirement presents a formidable barrier to scalability. Is the RL subroutine necessary? After all, if the expert already provides a distribution of “good” states, does the learner really need to explore? In this work, we demonstrate an informed MBRL and IRL reduction that utilizes the state distribution of the expert to provide an exponential speedup in theory. In practice, we find that we are able to significantly speed up over prior art on continuous control tasks.

Materials: Recording

Week 12
Wed, Apr 3
Paper discussion Alignment

Required Reading:

Further Reading:

  • Getting aligned on representational alignment. Sucholutsky et al. (2023)
  • AI Alignment: A Comprehensive Survey. Ji et al. (2023)
Week 13
Mon, Apr 8
Paper discussion Learning constraints from demonstration

Required Reading:

Week 13
Wed, Apr 10
Paper discussion Latent-space safety

Required reading:

Week 14
Mon, Apr 15
Guest Lecture Prof. Aditi Raghunathan (CMU)

Talk Title: Robust machine learning with foundation models

Abstract: In recent years, foundation models—large pretrained models that can be adapted for a wide range of tasks—have achieved state-of-the-art performance on a variety of tasks. While the pretrained models are trained on broad data, the adaptation (or fine-tuning) process is often performed on limited data. As a result, the challenge of distribution shift, where a model is deployed on a distribution different from the fine-tuning data, remains, albeit in a different form. This talk will provide some concrete instances of this challenge and discuss some principles for developing robust approaches.

Week 14
Wed, Apr 17
Lecture What is safety in interactive robotics?

Further Reading:

Materials: Slides

Week 15
Mon, Apr 22
Final presentations

Due: Slides uploaded to Canvas by Apr. 21, 11:59pm ET

Presenters: Bowen Jiang, Yilin Wu, Weihao (Zack) Zeng, Samuel Li, Sidney Nimako-Boateng, Xilun Zhang

Week 15
Wed, Apr 24
Final presentations

Due: Final report uploaded to Canvas by May 1, 11:59pm ET

Presenters: Jehan Yang, Eliot Xing, Yumeng Xiu, Kavya Puthuveetil