Leaderboard is now live for submissions to the ALFRED challenge! Humans have a success rate of 91% on unseen environments, but our models have a 0.4% success rate. 😢 Can you do better?! Leaderboard submissions are eligible to win a cash prize (details coming soon) and a chance to speak at the workshop. Results will be frozen August 5, 2020.
The focus of this workshop is on embodied visual tasks that require the grounding of language to actions in real-world settings. Specifically, we want to draw focus to challenges like partial observability, continuous state spaces, and irrevocable actions for language-guided agents in visual environments.
Such challenges are not captured by current datasets for grounding and embodiment [1, 2, 3].
Egocentric and Robotic vision
Navigation and Motion Planning
Learning from Demonstration
Task and Symbolic Planning
Deep Reinforcement Learning
To encourage research in embodied vision & language, the workshop includes a benchmark challenge based on ALFRED.
This benchmark captures real-world complexities like object state changes, and requires long-horizon planning. This workshop exists to bring together Vision, Robotics, and NLP researchers to tackle the unique challenges of this three-field intersection that are often avoided when focusing only on vision-and-language or vision-and-robotics (i.e., 'embodied AI').
Challenge Papers
Participants are required to upload their model to our evaluation server. The evaluation server automatically evaluates the models on an unseen test set. Final numbers for the challenge will be frozen for the camera ready on July 10.
Publication Options: Archival vs Unofficial
Papers can be submitted for publication either in the official proceedings (archival) or for hosting on this website (unofficial). Both submission types can be presented at the workshop, but opting out of the proceedings allows you to submit your work for publication at another venue.