ALFRED (Action Learning From Realistic Environments and Directives) is a new benchmark for learning a mapping from natural language instructions and egocentric vision to sequences of actions for household tasks. Long composition rollouts with non-reversible state changes are among the phenomena we include to shrink the gap between research benchmarks and real-world applications.
Rank | Submission | Created | Unseen Success Rate | Seen Success Rate | Seen PLWSR | Unseen PLWSR | Seen GC | Unseen GC | Seen PLW GC Success Rate | Unseen PLW GC Success Rate |
---|---|---|---|---|---|---|---|---|---|---|
See this guide for downloading the dataset, running models, and performing data augmentation. Also see the paper for a description of the challenge.
We use a model-agnostic evaluation process to measure the performance of your trained agent. See the guide for running a model on the test sets to create a JSON dump of action-sequences executed by the agent. These action-sequences are replayed on the leaderboard server to compute the performance metrics.
We will report the following metrics:
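The leaderboard columns point to three kinds of score: task success rate, goal-condition (GC) success, and path-length-weighted (PLW) variants of each. The following is a minimal sketch of how such scores could be computed per episode, assuming the path-length weighting described in the paper, where a score is scaled by the expert demonstration length divided by the larger of the expert and agent path lengths. The function and parameter names here are hypothetical and are not taken from the ALFRED codebase; the leaderboard server's own implementation is authoritative.

```python
def goal_condition_rate(completed: int, total: int) -> float:
    """Fraction of goal conditions satisfied at the end of an episode."""
    if total <= 0:
        raise ValueError("total must be positive")
    return completed / total


def path_length_weighted(score: float, expert_len: int, agent_len: int) -> float:
    """Scale a score in [0, 1] by expert_len / max(expert_len, agent_len).

    Assumption: expert_len is the number of actions in the expert
    demonstration and agent_len is the number of actions the agent took.
    An agent that is no longer than the expert keeps its full score.
    """
    if expert_len <= 0:
        raise ValueError("expert_len must be positive")
    return score * expert_len / max(expert_len, agent_len)


if __name__ == "__main__":
    # A successful episode that took twice as many steps as the expert.
    print(path_length_weighted(1.0, expert_len=50, agent_len=100))
    # Two of four goal conditions met, same path lengths.
    gc = goal_condition_rate(completed=2, total=4)
    print(path_length_weighted(gc, expert_len=50, agent_len=100))
```

Under this weighting, an agent that solves a task with a much longer path than the expert is credited with only a fraction of the success, which is why the PLW columns are typically lower than their unweighted counterparts.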
Run your model on the seen and unseen test sets, and create an action-sequence dump of your agent:
$ python models/eval/leaderboard.py --model_path <model_path>/best_seen.pth --model models.model.seq2seq_im_mask --data data/json_feat_2.1.0 --gpu --num_threads 5
This will create a JSON file, e.g. task_results_20191218_081448_662435.json, inside the <model_path> folder. Email this file to askforalfred@googlegroups.com, preferably through a storage link on a platform like Google Drive or Dropbox.
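Before emailing the file, it may be worth confirming that the dump is valid JSON and not empty or truncated. The sketch below is a generic sanity check and deliberately makes no assumption about the internal layout of the dump: it only reports the file size, the top-level type, and the number of items under each top-level key. The helper name `summarize_dump` is hypothetical and is not part of the ALFRED repository.

```python
import json
import sys
from pathlib import Path


def summarize_dump(path):
    """Load a JSON file and return a small summary of its top-level structure."""
    path = Path(path)
    with path.open("r", encoding="utf-8") as f:
        data = json.load(f)  # raises json.JSONDecodeError if the file is malformed

    summary = {
        "file": path.name,
        "size_bytes": path.stat().st_size,
        "top_level_type": type(data).__name__,
    }
    if isinstance(data, dict):
        # Count items under each key; scalar values are reported as None.
        summary["entries"] = {
            key: len(value) if isinstance(value, (list, dict)) else None
            for key, value in data.items()
        }
    elif isinstance(data, list):
        summary["entries"] = len(data)
    return summary


if __name__ == "__main__":
    # Usage: python check_dump.py <model_path>/<dump_file>.json
    if len(sys.argv) != 2:
        sys.exit("usage: python check_dump.py <path-to-json-dump>")
    print(json.dumps(summarize_dump(sys.argv[1]), indent=2))
```

If the script raises a decoding error or reports zero entries, re-running the evaluation is likely cheaper than using up the weekly submission on a broken file.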
We provide a pre-trained Seq2Seq+PM (Both) model described in the paper.
Only one submission is allowed every 7 days. All submissions will be made public. Please do not create anonymous emails for multiple submissions. Use the val set to iterate on your agent.
If you need any help, please email us at askforalfred@googlegroups.com and mention that you are asking about the ALFRED leaderboard. Please be specific about your query, and include a submission URL if you are asking about a specific submission.