This is a research-oriented, graduate-level seminar course: we will read many interesting research papers, brainstorm ideas on the latest research topics, and code and write up fun and novel projects!
The course is based on grounded natural language processing for robotics and will cover several recent research topics, such as:
- interpreting and executing verbal instructions for navigation, articulation, manipulation, assembly, skill learning, etc.
- human-robot collaboration and dialogue for learning new subactions, mediating shared perceptual basis, referring expression generation, etc.
- grounding and language learning via dialogue-based and interactive games.
- automatic language generation for embodied tasks.
- grounded reinforcement learning.
- grounded knowledge representations (mapping language to world).
- machine learning models (structured and deep), datasets, and metrics for embodied language.
Please email me or drop by my office if you have any questions!
Since this is a graduate research-level class, some machine learning and coding experience is expected (see references below). Moreover, some NLP and RL background is highly recommended.
Grading will consist of:
- project presentations and write-ups (midterm = 15% and final = 25%; total = 40%)
- paper presentations (15%)
- written paper summaries (25%)
- class participation, discussion, and brainstorming (20%)
Details are in the intro lecture slides from the first class. There will not be any exams. All submissions should be emailed to: firstname.lastname@example.org
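As a quick sanity check on the grading weights, here is a minimal sketch of how a weighted course total could be computed. The component names and the 0-100 scoring scale are assumptions made for illustration; only the percentages come from the list above.

```python
from typing import Mapping

# Percent weights copied from the grading list; the key names are hypothetical labels.
WEIGHTS = {
    "midterm_project": 15,
    "final_project": 25,
    "paper_presentations": 15,
    "paper_summaries": 25,
    "participation": 20,
}


def course_total(scores: Mapping[str, float]) -> float:
    """Return the weighted course total, assuming each component is scored 0-100."""
    missing = set(WEIGHTS) - set(scores)
    if missing:
        raise KeyError(f"missing components: {sorted(missing)}")
    # Weights are whole percentages, so divide by 100 at the end.
    return sum(WEIGHTS[name] * scores[name] for name in WEIGHTS) / 100
```

For example, under these assumptions, a student with 80 on the midterm project and 100 on everything else would end up with a total of 97.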
Written summaries are due *before class* by email to email@example.com. The first paper summary submission will incur no late penalty. After that, each week's summary submission will lose 25% of its value per late day. Other lateness policies (for projects, etc.) will be sent via email during the semester.
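The late policy for summaries can be illustrated with a small sketch. It assumes the 25% reduction is taken from the full value for each late day (so a summary is worth nothing after four late days) and that the value never goes below zero. The syllabus does not say whether the reduction is linear or compounding, so treat this as one possible reading. The function and parameter names are hypothetical.

```python
def late_summary_score(raw_score: float, days_late: int,
                       is_first_summary: bool = False) -> float:
    """Apply a 25%-per-late-day reduction (assumed linear in the full value)."""
    if days_late < 0:
        raise ValueError("days_late must be non-negative")
    # The first summary of the semester carries no late penalty.
    if is_first_summary or days_late == 0:
        return raw_score
    # Assumption: the penalty is capped so the score cannot go below zero.
    penalty_fraction = min(1.0, 0.25 * days_late)
    return raw_score * (1.0 - penalty_fraction)
```

Under this reading, a summary graded 10 and submitted one day late would count as 7.5, and the same summary submitted four or more days late would count as 0.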
Paper summaries must be written and submitted individually. Projects are encouraged to be done in pairs, with clearly outlined contributions from each team member (individual projects are fine too, e.g., if the project relates to your current research).
|Date||Topic||Readings||Discussion Leaders||To-dos|
|Jan 17||Intro to the Course (and Example Papers)||slides||Mohit||--|
|Jan 24||Navigational Instruction Following||(1) "Learning to Interpret Natural Language Navigation Instructions from Observations"; (2) "Weakly Supervised Learning of Semantic Parsers for Mapping Instructions to Actions"; (3) "Listen, Attend, and Walk: Neural Mapping of Navigational Instructions to Action Sequences"||Adam A (1, 3), Bhavya V (2)||2 paper summaries|
|Jan 31||Navigation and Manipulation Instruction Following||(1) "Learning to Follow Navigational Directions"; (2) "Following Directions Using Statistical Machine Translation"; (3) "Understanding Natural Language Commands for Robotic Navigation and Mobile Manipulation"||Evan D (1--3)||2 paper summaries|
|Feb 07||Manipulation, Assembly, and Game Instruction Following||(1) "A Natural Language Planner Interface for Mobile Manipulators"; (2) "Reinforcement Learning for Mapping Instructions to Actions"; (3) "Learning From Natural Instructions"; (4) "Natural Language Communication with Robots"||Biao J (1, 2), Shiwei F (3, 4)||2 paper summaries|
|Feb 14||Recipe Instruction Following||(1) "Interpreting and Executing Recipes with a Cooking Robot"; (2) "Tell Me Dave: Context-Sensitive Grounding of Natural Language to Mobile Manipulation Instructions"; (3) "Robobarista: Object Part based Transfer of Manipulation Trajectories from Crowd-sourcing in 3D Pointclouds"||Andrew B (1, 2), Alan K (1, 3)||2 paper summaries|
|Feb 21||Colloquium Speaker (Lillian Lee, Cornell University) -- 10-11am, FB007||"Big data pragmatics!", or, "Putting computational linguistics in computational social science"||--||Talk summary|
|Feb 28||Instruction and Question Generation by Robots||(1) "Clarifying commands with information-theoretic human-robot dialog"; (2) "Asking for help using inverse semantics"; (3) "Navigational Instruction Generation as Inverse Reinforcement Learning with Neural Machine Translation"||Hao T, Ram P||2 paper summaries|
|Mar 07||Human-Robot Dialog 1||(1) "PLOW: A Collaborative Task Learning Agent"; (2) "Back to the Blocks World: Learning New Actions through Situated Human-Robot Dialogue"; (3) "Learning to Interpret Natural Language Commands through Human-Robot Dialog"||Alyssa B, Chris G||2 paper summaries|
|Mar 14||Spring break (no class)||--||--||--|
|Mar 21||Traveling (no class)||--||--||--|
|Mar 28||Midterm Project Presentations||Midterm Project Presentations||--||Project write-ups due Apr 1, midnight|
|Apr 04||Grounded Language Learning Through Dialog 2||(1) "Learning Language Games through Interaction"; (2) "Collaborative Models for Referring Expression Generation in Situated Dialogue"; (3) "Emergence of Grounded Compositional Language in Multi-Agent Populations"; (4) "Learning Cooperative Visual Dialog Agents with Deep Reinforcement Learning"||Adam A, Andrew B||2 paper summaries|
|Apr 11||Machine Learning Models for Dialog Generation||(1) "On-line Active Reward Learning for Policy Optimisation in Spoken Dialogue Systems"; (2) "Coherent Dialogue with Attention-based Language Models"; (3) "A Hierarchical Latent Variable Encoder-Decoder Model for Generating Dialogues"; (4) "Adversarial Learning for Neural Dialogue Generation"||Evan D, Alyssa B||2 paper summaries|
|Apr 18||Gestures, Turn-taking, Gaze in Human-Robot Interaction||(1) "Generation of Nodding, Head Tilting and Eye Gazing for Human-Robot Dialogue Interaction"; (2) "Conversational Gaze Aversion for Humanlike Robots"; (3) "Effects of Responding to, Initiating and Ensuring Joint Attention in Human-Robot Interaction"; (4) "Simon plays Simon says: The timing of turn-taking in an imitation game"||Shiwei F, Alan K||2 paper summaries|
|Apr 25||Final Project Presentations||Final Project Presentations||--||Final project write-ups due May 5, midnight|
Jointly Learning Grounded Task Structures from Language Instruction and Visual Demonstration
Toward Interactive Grounded Language Acquisition
Timing in Multimodal Turn-Taking Interactions: Control and Analysis Using Timed Petri Nets
The professor reserves the right to make changes to the syllabus, including project due dates. These changes will be announced as early as possible.