Generalist Language Grounding Agents Challenge

Embodied AI Workshop @ CVPR 2023

Recent embodied agents have been successful in learning navigation and interaction skills from large-scale datasets, but progress has been limited to single-setting domains such as instruction following or dialogue-driven tasks. To avoid over-specializing models to specific datasets and tasks, this challenge encourages the development of generalist language grounding agents whose architectures transfer language-understanding and decision-making capabilities across tasks. For this first iteration, we unify aspects of the ALFRED and TEACh datasets. While both datasets are set in the AI2-THOR simulator, they differ along several axes:

  • Declarative (ALFRED) vs Dialogue (TEACh) language introduces grounding and alignment challenges
  • Differing agent heights affect depth estimation and segmentation pipelines
  • Changes to the action spaces and room layout require world model generalization

Participants will submit independently to each leaderboard, and submissions will be ranked by a combined unseen success metric.

🏆 Challenge Winners 🏆

1st Place 🥇

Yonsei VnL: Jinyeon Kim, Byeonghwi Kim, Cheolhong Min, Yuyeong Kim, Taewoong Kim, Jonghyun Choi
Yonsei University

Can you do even better? Check out ALFRED and TEACh.