Skip to main content

Showing 1–1 of 1 results for author: Khangaonkar, K

  1. arXiv:2306.05392  [pdf, other

    cs.CL

    Modular Visual Question Answering via Code Generation

    Authors: Sanjay Subramanian, Medhini Narasimhan, Kushal Khangaonkar, Kevin Yang, Arsha Nagrani, Cordelia Schmid, Andy Zeng, Trevor Darrell, Dan Klein

    Abstract: We present a framework that formulates visual question answering as modular code generation. In contrast to prior work on modular approaches to VQA, our approach requires no additional training and relies on pre-trained language models (LMs), visual models pre-trained on image-caption pairs, and fifty VQA examples used for in-context learning. The generated Python programs invoke and compose the o… ▽ More

    Submitted 8 June, 2023; originally announced June 2023.

    Comments: ACL 2023