What is IES?

Given two images (source and target), each containing blocks arranged in some configuration, Image-based Event-Sequencing (IES) is the task of predicting the sequence of actions required to rearrange the objects from the source configuration into the target configuration.
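To make the task concrete, the reasoning half of IES can be sketched as a search over block configurations, assuming the visual half has already turned each image into a symbolic state. Everything in this sketch is an illustrative assumption and not BIRD's actual format or method: the state encoding (a list of stacks, each listed bottom to top), the action encoding `(block, destination)`, and the helper names `canonical`, `successors`, and `plan` are all hypothetical. Breadth-first search is used simply because it returns a shortest sequence.

```python
from collections import deque


def canonical(stacks):
    """Drop empty stacks and sort, so the left-to-right order of stacks is ignored."""
    return tuple(sorted(tuple(s) for s in stacks if s))


def successors(state):
    """Yield (action, next_state) for every legal single-block move."""
    for i, src in enumerate(state):
        block = src[-1]  # only the top block of a stack can be moved
        rest = [list(s) for j, s in enumerate(state) if j != i]
        remaining = list(src[:-1])
        # Move the block to the table (pointless if it is already on the table).
        if remaining:
            yield (block, "table"), canonical(rest + [remaining, [block]])
        # Move the block onto the top block of another stack.
        for k in range(len(rest)):
            new = [list(s) for s in rest]
            new[k].append(block)
            yield (block, rest[k][-1]), canonical(new + [remaining])


def plan(source, target):
    """Return a shortest list of (block, destination) moves, or None if impossible."""
    start, goal = canonical(source), canonical(target)
    # If the two configurations contain different blocks, no sequence exists.
    if sorted(b for s in start for b in s) != sorted(b for s in goal for b in s):
        return None
    queue = deque([(start, [])])
    seen = {start}
    while queue:
        state, actions = queue.popleft()
        if state == goal:
            return actions
        for action, nxt in successors(state):
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, actions + [action]))
    return None


# Source: B on A, C alone on the table. Target: a single stack C, B, A (bottom to top).
print(plan([["A", "B"], ["C"]], [["C", "B", "A"]]))
# → [('B', 'C'), ('A', 'B')]
```

In this toy model the only way to get `None` is a mismatch between the two block sets; a dataset may define "no sequence possible" under different constraints.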

What is BIRD?

The Blocksworld Image Reasoning Dataset (BIRD) is a testbed for the IES task that combines learning and reasoning. Predicting an event sequence requires first understanding the spatial configuration of objects through vision, and then reasoning about that configuration to solve a visual reasoning task such as IES.


  • 1 million image pairs in the Blocksworld setting (available both as raw images and as structured representations of the images)
  • 900 image pairs of natural images of indoor scenes
  • Ground truth event-sequence(s) for each image pair (“no sequence possible” for the given image pair is a valid case)
  • Evaluation scripts for testing inductive generalization capability of a method

Instructions for downloading the latest version of BIRD can be found here.


  1. Blocksworld Revisited: Learning and Reasoning to Generate Event-Sequences from Image Pairs (Download the paper)
  2. Cooking With Blocks: A Recipe for Visual Reasoning on Image Pairs, CVPR 2019 Vision Meets Cognition Workshop (Download the paper)


For further information, contact Tejas Gokhale (tgokhale@asu.edu).