Showing 1–7 of 7 results for author: Killick, G

  1. arXiv:2403.06289  [pdf, other]

    cs.CV cs.AI cs.LG

    Understanding and Mitigating Human-Labelling Errors in Supervised Contrastive Learning

    Authors: Zijun Long, Lipeng Zhuang, George Killick, Richard McCreadie, Gerardo Aragon Camarasa, Paul Henderson

    Abstract: Human-annotated vision datasets inevitably contain a fraction of human mislabelled examples. While the detrimental effects of such mislabelling on supervised learning are well-researched, their influence on Supervised Contrastive Learning (SCL) remains largely unexplored. In this paper, we show that human-labelling errors not only differ significantly from synthetic label errors, but also pose uni…

    Submitted 10 March, 2024; originally announced March 2024.

    Comments: arXiv admin note: substantial text overlap with arXiv:2311.16481

  2. arXiv:2402.14551  [pdf, other]

    cs.CV cs.AI cs.LG

    CLCE: An Approach to Refining Cross-Entropy and Contrastive Learning for Optimized Learning Fusion

    Authors: Zijun Long, George Killick, Lipeng Zhuang, Gerardo Aragon-Camarasa, Zaiqiao Meng, Richard McCreadie

    Abstract: State-of-the-art pre-trained image models predominantly adopt a two-stage approach: initial unsupervised pre-training on large-scale datasets followed by task-specific fine-tuning using Cross-Entropy loss (CE). However, it has been demonstrated that CE can compromise model generalization and stability. While recent works employing contrastive learning address some of these limitations by enhancing…

    Submitted 22 February, 2024; originally announced February 2024.

    Comments: arXiv admin note: text overlap with arXiv:2308.14893

  3. arXiv:2312.01450  [pdf, other]

    cs.CV cs.AI cs.LG

    Foveation in the Era of Deep Learning

    Authors: George Killick, Paul Henderson, Paul Siebert, Gerardo Aragon-Camarasa

    Abstract: In this paper, we tackle the challenge of actively attending to visual scenes using a foveated sensor. We introduce an end-to-end differentiable foveated active vision architecture that leverages a graph convolutional network to process foveated images, and a simple yet effective formulation for foveated image sampling. Our model learns to iteratively attend to regions of the image relevant for cl…

    Submitted 3 December, 2023; originally announced December 2023.

    Comments: Accepted at BMVC2023

    ACM Class: I.2.10; I.5.1; I.4.8

  4. arXiv:2311.16481  [pdf, other]

    cs.CV

    Elucidating and Overcoming the Challenges of Label Noise in Supervised Contrastive Learning

    Authors: Zijun Long, George Killick, Lipeng Zhuang, Richard McCreadie, Gerardo Aragon Camarasa, Paul Henderson

    Abstract: Image classification datasets exhibit a non-negligible fraction of mislabeled examples, often due to human error when one class superficially resembles another. This issue poses challenges in supervised contrastive learning (SCL), where the goal is to cluster together data points of the same class in the embedding space while distancing those of disparate classes. While such methods outperform tho…

    Submitted 25 November, 2023; originally announced November 2023.

  5. arXiv:2310.10221  [pdf, other]

    cs.RO cs.CV

    RoboLLM: Robotic Vision Tasks Grounded on Multimodal Large Language Models

    Authors: Zijun Long, George Killick, Richard McCreadie, Gerardo Aragon Camarasa

    Abstract: Robotic vision applications often necessitate a wide range of visual perception tasks, such as object detection, segmentation, and identification. While there have been substantial advances in these individual tasks, integrating specialized models into a unified vision pipeline presents significant engineering challenges and costs. Recently, Multimodal Large Language Models (MLLMs) have emerged as…

    Submitted 23 February, 2024; v1 submitted 16 October, 2023; originally announced October 2023.

  6. arXiv:2309.01516  [pdf, other]

    cs.CV cs.AI cs.LG cs.MM

    MultiWay-Adapater: Adapting large-scale multi-modal models for scalable image-text retrieval

    Authors: Zijun Long, George Killick, Richard McCreadie, Gerardo Aragon Camarasa

    Abstract: As Multimodal Large Language Models (MLLMs) grow in size, adapting them to specialized tasks becomes increasingly challenging due to high computational and memory demands. Indeed, traditional fine-tuning methods are costly, due to the need for extensive, task-specific training. While efficient adaptation methods exist that aim to reduce these costs, in practice they suffer from shallow inter-modal…

    Submitted 5 February, 2024; v1 submitted 4 September, 2023; originally announced September 2023.

  7. arXiv:2308.14893  [pdf, other]

    cs.CV cs.AI cs.LG

    When hard negative sampling meets supervised contrastive learning

    Authors: Zijun Long, George Killick, Richard McCreadie, Gerardo Aragon Camarasa, Zaiqiao Meng

    Abstract: State-of-the-art image models predominantly follow a two-stage strategy: pre-training on large datasets and fine-tuning with cross-entropy loss. Many studies have shown that using cross-entropy can result in sub-optimal generalisation and stability. While the supervised contrastive loss addresses some limitations of cross-entropy loss by focusing on intra-class similarities and inter-class differe…

    Submitted 28 August, 2023; originally announced August 2023.