O-DRUM @ CVPR 2023

Workshop on Open-Domain Reasoning Under Multi-Modal Settings

June 19, 2023
8:30 AM -- 5:00 PM PDT

ODRUM 2022 Archive: [Webpage] [YouTube]

AI has undergone a paradigm shift in the past decade: the connection between vision and language (V+L) is now an integral part of AI, with deep impact beyond vision and NLP. Robotics, graphics, cybersecurity, and HCI are utilizing V+L tools, and there are direct industrial implications for software, arts, and media. The link between vision and language is much more complex than simple image-text alignment: the use of language for reasoning beyond the visible (for example, physical reasoning, spatial reasoning, commonsense reasoning, and embodied reasoning) is being pursued. Open-Domain Reasoning Under Multi-Modal Settings (ODRUM 2023) provides a platform for discussions on multimodal (vision+language) topics with special emphasis on reasoning capabilities.

The aim of ODRUM 2023 is to address the emerging topic of visual reasoning using multiple modalities (such as text, images, videos, and audio). The workshop will feature invited talks by experts on reasoning-related topics such as embodied AI, navigation, learning via interaction and collaboration with humans, building large V+L models that can perform multiple tasks, visual grounding, and the use of language to instruct robots. Participants and speakers will convene for a panel discussion on the importance of reasoning (a core AI topic with a rich history dating back to the 1950s) to computer vision, its relevance to recent progress in visual reasoning, and trends and challenges in open-domain reasoning, from the different perspectives of NLP, vision, machine learning, and robotics researchers.

Confirmed Speakers

Kristen Grauman
University of Texas at Austin

Jiajun Wu
Assistant Professor
Stanford University

Alane Suhr
Young Investigator
Allen Institute for AI

Jean-Baptiste Alayrac
Research Scientist

Angel Xuan Chang
Assistant Professor
Simon Fraser University

Tentative Schedule

08:30 -- 08:45  Welcome and Introduction
08:45 -- 09:35  Invited Talk 1
09:35 -- 10:00  Spotlight Talks
10:00 -- 10:45  Poster Session + Coffee Break
10:45 -- 11:35  Invited Talk 2
11:35 -- 12:25  Invited Talk 3
12:25 -- 13:25  Lunch
13:25 -- 14:15  Invited Talk 4
14:15 -- 15:05  Invited Talk 5
15:05 -- 16:00  Poster Session 2 + Coffee Break
16:00 -- 17:15  Panel Discussion + Concluding Remarks

Call for Papers

We invite submissions related to the broad topic area of multi-modal understanding, reasoning, and comprehension, including but not limited to the following topics:

We encourage two types of submissions:

All submitted materials must be anonymized and formatted according to the CVPR 2023 author guidelines and template. Accepted papers will be presented as posters at the workshop, where attendees, invited speakers, and organizers will engage in discussion. We intend to highlight the three best papers with spotlight presentations during the workshop session. Authors of all accepted papers will be given the option to opt in to or out of the CVPR proceedings.

Important Dates:

Submission Deadline: March 24, 2023, 23:59 PDT
Notification of Decision: March 31, 2023
Camera Ready Deadline: April 08, 2023, 12:00 PDT
Submission website (CMT): https://cmt3.research.microsoft.com/ODRUM2023/


Please contact Man Luo (mluo26@asu.edu) or Tejas Gokhale (tgokhale@asu.edu) for additional details.
The workshop is supported by US National Science Foundation grants 1816039 and 2132724 as part of Research, Education, and Outreach activities.

