Showing 1–3 of 3 results for author: Hanu, L

  1. arXiv:2309.10783  [pdf, other]

    cs.CV cs.AI cs.CL

    Language as the Medium: Multimodal Video Classification through text only

    Authors: Laura Hanu, Anita L. Verő, James Thewlis

    Abstract: Despite an exciting new wave of multimodal machine learning models, current approaches still struggle to interpret the complex contextual relationships between the different modalities present in videos. Going beyond existing methods that emphasize simple activities or objects, we propose a new model-agnostic approach for generating detailed textual descriptions that captures multimodal video info…

    Submitted 19 September, 2023; originally announced September 2023.

    Comments: Accepted at "What is Next in Multimodal Foundation Models?" (MMFM) workshop at ICCV 2023

  2. arXiv:2210.10820  [pdf, other]

    cs.CV cs.CL cs.IR cs.LG

    VTC: Improving Video-Text Retrieval with User Comments

    Authors: Laura Hanu, James Thewlis, Yuki M. Asano, Christian Rupprecht

    Abstract: Multi-modal retrieval is an important problem for many applications, such as recommendation and search. Current benchmarks and even datasets are often manually constructed and consist of mostly clean samples where all modalities are well-correlated with the content. Thus, current video-text retrieval literature largely focuses on video titles or audio transcripts, while ignoring user comments, sin…

    Submitted 19 October, 2022; originally announced October 2022.

    Comments: Accepted paper at the European Conference on Computer Vision (ECCV) 2022

  3. arXiv:2001.11055  [pdf, other]

    cs.CV cs.LG

    Evaluating Robustness to Context-Sensitive Feature Perturbations of Different Granularities

    Authors: Isaac Dunn, Laura Hanu, Hadrien Pouget, Daniel Kroening, Tom Melham

    Abstract: We cannot guarantee that training datasets are representative of the distribution of inputs that will be encountered during deployment. So we must have confidence that our models do not over-rely on this assumption. To this end, we introduce a new method that identifies context-sensitive feature perturbations (e.g. shape, location, texture, colour) to the inputs of image classifiers. We produce th…

    Submitted 23 October, 2020; v1 submitted 29 January, 2020; originally announced January 2020.