Professor: Andrea Bajcsy (abajcsy [at] cmu [dot] edu)
Office Hours: Wed, 12:20 - 1:00 pm (i.e., after class) or by appointment
Office Hours Location: NSH 4629

Lecture Time: Mon & Wed, 11:00 - 12:20 pm
Lecture Location: GHC 4101

Syllabus: PDF
Canvas: https://canvas.cmu.edu/courses/38604

Overview

Robot deployment around real people is rapidly accelerating: autonomous cars navigate through crowded cities on a daily basis, assistive robots increasingly help end-users with daily living tasks, and large teams of human engineers interactively teach robots basic skills. However, robot interaction with humans requires us to re-evaluate the assumptions built into all components of our autonomy algorithms, from decision-making, to machine learning, to safety analysis.

In this graduate seminar class, we will build the mathematical foundations for modeling human-robot interaction, develop the tools to analyze the safety and reliability of robots deployed around people, and investigate algorithms for robot learning from human data. The approaches covered will draw upon a variety of tools such as optimal control, dynamic game theory, Bayesian inference, and modern machine learning. Throughout the class, there will also be guest lectures from experts in the field. Students will practice essential research skills including reviewing papers, writing research project proposals, and technical communication.

News

  • [02/07/24] Homework 1 description updated. Please see Canvas.
  • [01/09/24] New classroom location: GHC 4101.

Schedule (tentative)

Week 1
Mon, Jan 15
No Class (MLK Day)
Week 1
Wed, Jan 17
Lecture Introduction
  • Please check the course syllabus

Materials: Slides

Week 2
Mon, Jan 22
Lecture Dynamical systems model of interaction

Materials: Notes

Week 2
Wed, Jan 24
Lecture Optimal control & decision-making

Materials: Notes

Week 3
Mon, Jan 29
Lecture Multi-agent games & robust optimal control

Further reading:

  • Differential Games I: Introduction. Isaacs. (1954)

Materials: Notes

Week 3
Wed, Jan 31
Lecture Safety Analysis I

Further reading:

  • A Time-Dependent Hamilton–Jacobi Formulation of Reachable Sets for Continuous Dynamic Games. Mitchell, et al. (2005)
  • Hamilton-Jacobi formulation for reach-avoid differential games. Margellos & Lygeros (2009)
  • Reach-avoid problems with time-varying dynamics, targets and constraints. Fisac, et al. (2015)

Materials: Notes

Week 4
Mon, Feb 5
Lecture Safety Analysis II

Due: Project Proposal

Materials: Notes

Week 4
Wed, Feb 7
Paper discussion Computationally scalable safety

Required reading:

Further reading:

  • A Minimum Discounted Reward Hamilton-Jacobi Formulation for Computing Reachable Sets. Akametalu, et al. (2018)
  • Reachability-Based Safety Guarantees using Efficient Initializations. Herbert, et al. (2019)
Week 5
Mon, Feb 12
Guest Lecture Jason Choi (UC Berkeley)

Title: Safety Filters for Uncertain Dynamical Systems: Control Theory & Data-driven Approaches

Abstract: Safety is a primary concern when deploying autonomous robots in the real world. Model-based controllers designed to ensure safety constraints often fail due to model uncertainties present in real physical systems. Providing practical safety guarantees for uncertain systems is a significant challenge, which will be the main focus of this talk. In the first part of the talk, I will review three of the most popular methods for implementing safety filters in general nonlinear systems—Hamilton-Jacobi Reachability, Control Barrier Functions, and Model Predictive Control. The theories underlying each of these methods are well-established for systems with good mathematical models, and have been extended to account for the uncertainties of real-world systems. I will discuss the strengths, drawbacks, and connections of each method. In the second part of the talk, I will discuss how data-driven methods can help resolve the challenge. I will provide an overview of various data-driven safety filters developed during my PhD studies. Finally, I will explore remaining open research problems in addressing safety effectively for real-world robot autonomy.

Materials: Recording


Further Reading:

  • An Efficient Reachability-Based Framework for Provably Safe Autonomous Navigation in Unknown Environments. Bajcsy, et al. (2019)
  • Data-Driven Safety Filters: Hamilton-Jacobi Reachability, Control Barrier Functions, and Predictive Methods for Uncertain Systems. Wabersich, et al. (2023)
  • The safety filter: A unified view of safety-critical control in autonomous systems. Hsu, et al. (2023)
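The "safety filter" pattern surveyed in this talk can be made concrete with a short sketch. The example below is illustrative only (not course code): it uses a control barrier function for a single-integrator robot avoiding a circular obstacle, where the dynamics, barrier choice, and gain `alpha` are all assumptions. For one affine constraint, the filtering QP has a closed-form projection, so no solver is needed.

```python
import numpy as np

def cbf_filter(x, u_des, x_obs, r, alpha=1.0):
    """Minimally modify u_des so the single-integrator state x stays
    outside a disk of radius r around x_obs.
    Safety certificate: h(x) = ||x - x_obs||^2 - r^2 >= 0, enforced via
    the control-barrier-function condition  grad_h . u >= -alpha * h(x)."""
    h = np.dot(x - x_obs, x - x_obs) - r**2
    a = 2.0 * (x - x_obs)           # gradient of h with respect to x
    b = -alpha * h                  # required lower bound on a . u
    slack = np.dot(a, u_des) - b
    if slack >= 0:                  # nominal command already satisfies the constraint
        return u_des
    # otherwise project u_des onto the half-space {u : a . u >= b}
    return u_des + (-slack / np.dot(a, a)) * a

# Usage: a nominal controller drives toward the origin straight through an obstacle;
# the filter deflects the command just enough to respect the barrier condition.
x = np.array([2.0, 0.1])
u_nom = -x                          # head straight for the goal
u_safe = cbf_filter(x, u_nom, x_obs=np.array([1.0, 0.0]), r=0.5)
```

The key design point, shared by the HJ-reachability and MPC variants discussed in the talk, is minimal intervention: the filter returns the nominal command unchanged whenever it is already certified safe.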
Week 5
Wed, Feb 14
Paper discussion Safety filtering around humans

Required reading:

Week 6
Mon, Feb 19
Lecture Human prediction

Due: Homework

Further Reading:

  • Maximum Entropy Inverse Reinforcement Learning. Ziebart, et al. (2008)
  • Activity Forecasting. Kitani, et al. (2012)
  • Predicting Human Reaching Motion in Collaborative Tasks Using Inverse Optimal Control and Iterative Re-planning. Mainprice, et al. (2015)

Materials: Notes
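In the spirit of the readings above (Boltzmann-rational goal inference, as in MaxEnt IRL), here is a minimal sketch of inferring a human's goal from a partial trajectory. This is a hypothetical illustration, not course material: the straight-line cost model and the rationality coefficient `beta` are simplifying assumptions.

```python
import numpy as np

def goal_posterior(traj, goals, beta=1.0, prior=None):
    """Bayesian goal inference: P(g | traj) ∝ P(traj | g) P(g), where the
    likelihood scores how efficiently the observed partial trajectory
    progresses toward g relative to the direct start-to-goal path."""
    traj = np.asarray(traj, dtype=float)
    goals = np.asarray(goals, dtype=float)
    if prior is None:
        prior = np.full(len(goals), 1.0 / len(goals))
    start, current = traj[0], traj[-1]
    # path length actually traveled so far
    traveled = np.sum(np.linalg.norm(np.diff(traj, axis=0), axis=1))
    log_lik = []
    for g in goals:
        c_partial = traveled + np.linalg.norm(g - current)  # cost via observed path
        c_direct = np.linalg.norm(g - start)                # cost of the direct path
        log_lik.append(-beta * (c_partial - c_direct))      # Boltzmann-rational model
    w = np.exp(log_lik) * prior
    return w / w.sum()

# Usage: a human walking rightward makes the rightward goal far more probable.
p = goal_posterior([[0, 0], [1, 0], [2, 0]], goals=[[4, 0], [-4, 0]])
```

Larger `beta` models a more rational human and concentrates the posterior faster; `beta → 0` recovers the prior.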

Week 6
Wed, Feb 21
Guest Lecture Lasse Peters (TU Delft)

Title: Game-Theoretic Models for Multi-Agent Interaction

Abstract: When multiple agents operate in a common environment, their actions are naturally interdependent and this coupling complicates planning. In this lecture, we will approach this problem through the lens of dynamic game theory. We will discuss how to model multi-agent interactions as general-sum games over continuous states and actions, characterize solution concepts of such games, and highlight the key challenges of solving them in practice. Based on this foundation, we will review established techniques to tractably approximate game solutions for online decision-making. Finally, we will discuss extensions of the game-theoretic framework to settings that involve incomplete knowledge about the intent, dynamics, or state of other agents.

Materials: Recording


Further Reading:

  • Social behavior for autonomous vehicles. Schwarting, et al. (2019)
  • Efficient Iterative Linear-Quadratic Approximations for Nonlinear Multi-Player General-Sum Differential Games. Fridovich-Keil, et al. (2019)
  • NashFormer: Leveraging Local Nash Equilibria for Semantically Diverse Trajectory Prediction. Lidard, et al. (2023)
Week 7
Mon, Feb 26
Lecture Human prediction: Data-driven

Further Reading:

  • Trajectron++: Dynamically-Feasible Trajectory Forecasting With Heterogeneous Data. Salzmann and Ivanovic, et al. (2020)
  • Identifying Driver Interactions via Conditional Behavior Prediction. Tolstaya, et al. (2021)
  • MotionLM: Multi-Agent Motion Forecasting as Language Modeling. Seff, et al. (2023)

Materials: Notes

Week 7
Wed, Feb 28
Guest Lecture Dr. Boris Ivanovic (NVIDIA)

Title: Behavior Prediction as a Nucleus of Modern AV Research

Abstract: Research on behavior prediction, the task of predicting the future motion of agents, has had an outsized impact on multiple aspects of autonomous vehicles (AVs). From direct improvements in online driving performance to deeper connections between AV stack modules to enabling closed-loop training and evaluation in simulation with intelligent reactive agents, behavior prediction has served as a nucleus for much of modern AV research. In this lecture, I will discuss recent advancements along each of these directions, covering modern approaches for behavior prediction, generalization to unseen environments, tighter integrations of AV stack components (towards end-to-end AV architectures), and methods for simulating the behaviors of agents. Finally, I will outline some open research problems in modeling human motion and their potential impacts on downstream driving performance.

Materials: Recording

Week 8
Mon, Mar 4
No Class (Spring Break)
Week 8
Wed, Mar 6
No Class (Spring Break)
Week 9
Mon, Mar 11
Paper discussion Embedding human models into safety I

Required Reading:

Week 9
Wed, Mar 13
Guest lecture Prof. David Fridovich-Keil (UT Austin)

Title: Inverse games: an MPEC by any other name…

Abstract: This lecture will introduce mathematical programs with equilibrium constraints (MPECs), and show how they encompass an “inverse” variant of mathematical games in which parameters of players’ costs and constraints must be inferred from data, or designed to yield specific equilibrium outcomes. We will begin with a brief review of the fundamentals of constrained optimization, and discuss how these familiar concepts appear in inverse games and the implications for designing efficient solution methods. The lecture will conclude with a review of several recent papers that present new developments in this space.

Materials: Recording

Week 10
Mon, Mar 18
Paper discussion Embedding human models into safety II

Due: Mid-term Report

Required Reading:

Week 10
Wed, Mar 20
Lecture Sources of human feedback

Further Reading:

  • Learning Robot Objectives from Physical Human Interaction. Bajcsy et al. (2018)
  • Learning Generalizable Robotic Reward Functions from “In-The-Wild” Human Videos. Chen et al. (2021)
  • Correcting Robot Plans with Natural Language Feedback. Sharma et al. (2022)

Materials: Slides

Week 11
Mon, Mar 25
Lecture Reliably learning from human feedback

Further Reading:

  • Quantifying Hypothesis Space Misspecification in Learning from Human-Robot Demonstrations and Physical Corrections. Bobu et al. (2020)
  • Confidence-Aware Imitation Learning from Demonstrations with Varying Optimality. Zhang and Cao. (2021)
Week 11
Wed, Mar 27
Paper discussion Reinforcement learning from human feedback

Required Reading:

Further Reading:

  • Open Problems and Fundamental Limitations of Reinforcement Learning from Human Feedback. Casper et al. (2023)
  • Nash Learning from Human Feedback. Munos et al. (2024)
Week 12
Mon, Apr 1
Guest Lecture Prof. Sanjiban Choudhury (Cornell)

Title: To RL or not to RL

Abstract: Model-based Reinforcement Learning (MBRL) and Inverse Reinforcement Learning (IRL) are powerful techniques that leverage expert demonstrations to learn either models or rewards. However, traditional approaches suffer from a computational weakness: they require repeatedly solving a hard reinforcement learning (RL) problem as a subroutine. This requirement presents a formidable barrier to scalability. Is the RL subroutine necessary? After all, if the expert already provides a distribution of “good” states, does the learner really need to explore? In this work, we demonstrate an informed MBRL and IRL reduction that utilizes the state distribution of the expert to provide an exponential speedup in theory. In practice, we find that we are able to significantly speed up over prior art on continuous control tasks.

Materials: Recording

Week 12
Wed, Apr 3
Paper discussion Alignment

Required Reading:

Further Reading:

  • Getting aligned on representational alignment. Sucholutsky et al. (2023)
  • AI Alignment: A Comprehensive Survey. Ji et al. (2023)
Week 13
Mon, Apr 8
Paper discussion Learning constraints from demonstration

Required Reading:

Week 13
Wed, Apr 10
Paper discussion Latent-space safety

Required reading:

Week 14
Mon, Apr 15
Guest Lecture Prof. Aditi Raghunathan (CMU)

Talk Title: Robust machine learning with foundation models

Abstract: In recent years, foundation models—large pretrained models that can be adapted for a wide range of tasks—have achieved state-of-the-art performance on a variety of tasks. While the pretrained models are trained on broad data, the adaptation (or fine-tuning) process is often performed on limited data. As a result, the challenge of distribution shift, where a model is deployed on a distribution different from the fine-tuning data, remains, albeit in a different form. This talk will provide some concrete instances of this challenge and discuss some principles for developing robust approaches.

Week 14
Wed, Apr 17
Lecture What is safety in interactive robotics?

Further Reading:

Materials: Slides

Week 15
Mon, Apr 22
Final presentations

Due: Slides uploaded to Canvas by Apr. 21, 11:59pm ET

Presenters: Bowen Jiang, Yilin Wu, Weihao (Zack) Zeng, Samuel Li, Sidney Nimako-Boateng, Xilun Zhang

Week 15
Wed, Apr 24
Final presentations

Due: Final report uploaded to Canvas by May 1, 11:59pm ET

Presenters: Jehan Yang, Eliot Xing, Yumeng Xiu, Kavya Puthuveetil