Showing 1–1 of 1 results for author: Wasim, S T

  1. arXiv:2304.03307  [pdf, other]

    cs.CV eess.IV

    Vita-CLIP: Video and text adaptive CLIP via Multimodal Prompting

    Authors: Syed Talal Wasim, Muzammal Naseer, Salman Khan, Fahad Shahbaz Khan, Mubarak Shah

    Abstract: Adopting contrastive image-text pretrained models like CLIP towards video classification has gained attention due to its cost-effectiveness and competitive performance. However, recent works in this area face a trade-off. Finetuning the pretrained model to achieve strong supervised performance results in low zero-shot generalization. Similarly, freezing the backbone to retain zero-shot capability…

    Submitted 6 April, 2023; originally announced April 2023.

    Comments: Accepted at CVPR-2023. Codes/models available at https://github.com/TalalWasim/Vita-CLIP