Showing 1–3 of 3 results for author: Al-Tahan, H

  1. arXiv:2405.17247  [pdf, other]

    cs.LG

    An Introduction to Vision-Language Modeling

    Authors: Florian Bordes, Richard Yuanzhe Pang, Anurag Ajay, Alexander C. Li, Adrien Bardes, Suzanne Petryk, Oscar Mañas, Zhiqiu Lin, Anas Mahmoud, Bargav Jayaraman, Mark Ibrahim, Melissa Hall, Yunyang Xiong, Jonathan Lebensold, Candace Ross, Srihari Jayakumar, Chuan Guo, Diane Bouchacourt, Haider Al-Tahan, Karthik Padthe, Vasu Sharma, Hu Xu, Xiaoqing Ellen Tan, Megan Richards, Samuel Lavoie, et al. (16 additional authors not shown)

    Abstract: Following the recent popularity of Large Language Models (LLMs), several attempts have been made to extend them to the visual domain. From having a visual assistant that could guide us through unfamiliar environments to generative models that produce images using only a high-level text description, the vision-language model (VLM) applications will significantly impact our relationship with technol…

    Submitted 27 May, 2024; originally announced May 2024.

  2. arXiv:2110.07082  [pdf, other]

    cs.CV

    The Impact of Spatiotemporal Augmentations on Self-Supervised Audiovisual Representation Learning

    Authors: Haider Al-Tahan, Yalda Mohsenzadeh

    Abstract: Contrastive learning of auditory and visual perception has been extremely successful when investigated individually. However, there are still major questions on how we could integrate principles learned from both domains to attain effective audiovisual representations. In this paper, we present a contrastive framework to learn audiovisual representations from unlabeled videos. The type and strengt…

    Submitted 13 October, 2021; originally announced October 2021.

  3. arXiv:2010.09542  [pdf, other]

    cs.SD cs.LG eess.AS

    CLAR: Contrastive Learning of Auditory Representations

    Authors: Haider Al-Tahan, Yalda Mohsenzadeh

    Abstract: Learning rich visual representations using contrastive self-supervised learning has been extremely successful. However, it is still a major question whether we could use a similar approach to learn superior auditory representations. In this paper, we expand on prior work (SimCLR) to learn better auditory representations. We (1) introduce various data augmentations suitable for auditory data and ev…

    Submitted 19 October, 2020; originally announced October 2020.