Showing 1–45 of 45 results for author: Avrithis, Y

  1. arXiv:2405.15587  [pdf, other]

    cs.CV

    Composed Image Retrieval for Remote Sensing

    Authors: Bill Psomas, Ioannis Kakogeorgiou, Nikos Efthymiadis, Giorgos Tolias, Ondrej Chum, Yannis Avrithis, Konstantinos Karantzalos

    Abstract: This work introduces composed image retrieval to remote sensing. It allows querying a large image archive by image examples alternated by a textual description, enriching the descriptive power over unimodal queries, either visual or textual. Various attributes can be modified by the textual part, such as shape, color, or context. A novel method fusing image-to-image and text-to-image similarity is…

    Submitted 29 May, 2024; v1 submitted 24 May, 2024; originally announced May 2024.

    Comments: Accepted for ORAL presentation at the 2024 IEEE International Geoscience and Remote Sensing Symposium

  2. arXiv:2405.06502  [pdf, other]

    cs.CV

    Multi-Target Unsupervised Domain Adaptation for Semantic Segmentation without External Data

    Authors: Yonghao Xu, Pedram Ghamisi, Yannis Avrithis

    Abstract: Multi-target unsupervised domain adaptation (UDA) aims to learn a unified model to address the domain shift between multiple target domains. Due to the difficulty of obtaining annotations for dense predictions, it has recently been introduced into cross-domain semantic segmentation. However, most existing solutions require labeled data from the source domain and unlabeled data from multiple target…

    Submitted 10 May, 2024; originally announced May 2024.

  3. arXiv:2404.15024  [pdf, other]

    cs.CV cs.LG

    A Learning Paradigm for Interpretable Gradients

    Authors: Felipe Torres Figueroa, Hanwei Zhang, Ronan Sicre, Yannis Avrithis, Stephane Ayache

    Abstract: This paper studies interpretability of convolutional networks by means of saliency maps. Most approaches based on Class Activation Maps (CAM) combine information from fully connected layers and gradient through variants of backpropagation. However, it is well understood that gradients are noisy and alternatives like guided backpropagation have been proposed to obtain better visualization at infere…

    Submitted 23 April, 2024; originally announced April 2024.

    Comments: VISAPP 2024

  4. arXiv:2404.14996  [pdf, other]

    cs.CV

    CA-Stream: Attention-based pooling for interpretable image recognition

    Authors: Felipe Torres, Hanwei Zhang, Ronan Sicre, Stéphane Ayache, Yannis Avrithis

    Abstract: Explanations obtained from transformer-based architectures in the form of raw attention can be seen as a class-agnostic saliency map. Additionally, attention-based pooling serves as a form of masking in the feature space. Motivated by this observation, we design an attention-based pooling mechanism intended to replace Global Average Pooling (GAP) at inference. This mechanism, called Cross-Attenti…

    Submitted 23 April, 2024; originally announced April 2024.

    Comments: CVPR XAI4CV workshop 2024

  5. arXiv:2404.01524  [pdf, other]

    cs.CV cs.AI

    On Train-Test Class Overlap and Detection for Image Retrieval

    Authors: Chull Hwan Song, Jooyoung Yoon, Taebaek Hwang, Shunghyun Choi, Yeong Hyeon Gu, Yannis Avrithis

    Abstract: How important is it for training and evaluation sets to not have class overlap in image retrieval? We revisit Google Landmarks v2 clean, the most popular training set, by identifying and removing class overlap with Revisited Oxford and Paris [34], the most popular evaluation set. By comparing the original and the new RGLDv2-clean on a benchmark of reproduced state-of-the-art methods, our findings…

    Submitted 1 April, 2024; originally announced April 2024.

    Comments: CVPR2024 Accepted

  6. arXiv:2312.12334  [pdf, other]

    cs.CL

    PowMix: A Versatile Regularizer for Multimodal Sentiment Analysis

    Authors: Efthymios Georgiou, Yannis Avrithis, Alexandros Potamianos

    Abstract: Multimodal sentiment analysis (MSA) leverages heterogeneous data sources to interpret the complex nature of human sentiments. Despite significant progress in multimodal architecture design, the field lacks comprehensive regularization methods. This paper introduces PowMix, a versatile embedding space regularizer that builds upon the strengths of unimodal mixing-based regularization approaches and…

    Submitted 19 December, 2023; originally announced December 2023.

    Comments: Preprint

  7. arXiv:2311.05538  [pdf, other]

    cs.LG cs.CV

    Embedding Space Interpolation Beyond Mini-Batch, Beyond Pairs and Beyond Examples

    Authors: Shashanka Venkataramanan, Ewa Kijak, Laurent Amsaleg, Yannis Avrithis

    Abstract: Mixup refers to interpolation-based data augmentation, originally motivated as a way to go beyond empirical risk minimization (ERM). Its extensions mostly focus on the definition of interpolation and the space (input or feature) where it takes place, while the augmentation process itself is less studied. In most methods, the number of generated examples is limited to the mini-batch size and the nu…

    Submitted 9 November, 2023; originally announced November 2023.

    Comments: Accepted to NeurIPS 2023. arXiv admin note: substantial text overlap with arXiv:2206.14868

  8. arXiv:2310.19996  [pdf, other]

    cs.CV

    Adaptive Anchor Label Propagation for Transductive Few-Shot Learning

    Authors: Michalis Lazarou, Yannis Avrithis, Guangyu Ren, Tania Stathaki

    Abstract: Few-shot learning addresses the issue of classifying images using limited labeled data. Exploiting unlabeled data through the use of transductive inference methods such as label propagation has been shown to improve the performance of few-shot learning significantly. Label propagation infers pseudo-labels for unlabeled data by utilizing a constructed graph that exploits the underlying manifold str…

    Submitted 30 October, 2023; originally announced October 2023.

    Comments: published in ICIP 2023

  9. arXiv:2310.08584  [pdf, other]

    cs.CV

    Is ImageNet worth 1 video? Learning strong image encoders from 1 long unlabelled video

    Authors: Shashanka Venkataramanan, Mamshad Nayeem Rizve, João Carreira, Yuki M. Asano, Yannis Avrithis

    Abstract: Self-supervised learning has unlocked the potential of scaling up pretraining to billions of images, since annotation is unnecessary. But are we making the best use of data? How much more economical can we be? In this work, we attempt to answer this question by making two contributions. First, we investigate first-person videos and introduce a "Walking Tours" dataset. These videos are high-resolution,…

    Submitted 23 May, 2024; v1 submitted 12 October, 2023; originally announced October 2023.

    Comments: Accepted to ICLR 2024 (Best paper honorable mention). Project Page: https://shashankvkt.github.io/dora

  10. arXiv:2309.15915  [pdf, other]

    cs.CV

    Zero-Shot and Few-Shot Video Question Answering with Multi-Modal Prompts

    Authors: Deniz Engin, Yannis Avrithis

    Abstract: Recent vision-language models are driven by large-scale pretrained models. However, adapting pretrained models on limited data presents challenges such as overfitting, catastrophic forgetting, and the cross-modal gap between vision and language. We introduce a parameter-efficient method to address these challenges, combining multimodal prompt learning and a transformer-based mapping network, while…

    Submitted 27 September, 2023; originally announced September 2023.

    Comments: ICCV2023 CLVL Workshop (Oral). Project page: https://engindeniz.github.io/vitis

  11. arXiv:2309.06891  [pdf, other]

    cs.CV cs.LG

    Keep It SimPool: Who Said Supervised Transformers Suffer from Attention Deficit?

    Authors: Bill Psomas, Ioannis Kakogeorgiou, Konstantinos Karantzalos, Yannis Avrithis

    Abstract: Convolutional networks and vision transformers have different forms of pairwise interactions, pooling across layers and pooling at the end of the network. Does the latter really need to be different? As a by-product of pooling, vision transformers provide spatial attention for free, but this is most often of low quality unless self-supervised, which is not well studied. Is supervision really the p…

    Submitted 13 September, 2023; originally announced September 2023.

    Comments: ICCV 2023. Code and models: https://github.com/billpsomas/simpool

    Journal ref: International Conference on Computer Vision (2023)

  12. arXiv:2304.14281  [pdf, other]

    cs.CV

    Adaptive manifold for imbalanced transductive few-shot learning

    Authors: Michalis Lazarou, Yannis Avrithis, Tania Stathaki

    Abstract: Transductive few-shot learning algorithms have shown substantially superior performance over their inductive counterparts by leveraging the unlabeled queries. However, the vast majority of such methods are evaluated on perfectly class-balanced benchmarks. It has been shown that they undergo a remarkable drop in performance under a more realistic, imbalanced setting. To this end, we propose a novel…

    Submitted 27 April, 2023; originally announced April 2023.

  13. arXiv:2303.09554  [pdf, other]

    cs.CV

    PartNeRF: Generating Part-Aware Editable 3D Shapes without 3D Supervision

    Authors: Konstantinos Tertikas, Despoina Paschalidou, Boxiao Pan, Jeong Joon Park, Mikaela Angelina Uy, Ioannis Emiris, Yannis Avrithis, Leonidas Guibas

    Abstract: Impressive progress in generative models and implicit representations gave rise to methods that can generate 3D shapes of high quality. However, being able to locally control and edit shapes is another essential property that can unlock several content creation applications. Local control can be achieved with part-aware models, but existing methods require 3D supervision and cannot produce texture…

    Submitted 21 March, 2023; v1 submitted 16 March, 2023; originally announced March 2023.

    Comments: To appear in CVPR 2023, Project Page: https://ktertikas.github.io/part_nerf

  14. arXiv:2301.07002  [pdf, other]

    cs.CV

    Opti-CAM: Optimizing saliency maps for interpretability

    Authors: Hanwei Zhang, Felipe Torres, Ronan Sicre, Yannis Avrithis, Stephane Ayache

    Abstract: Methods based on class activation maps (CAM) provide a simple mechanism to interpret predictions of convolutional neural networks by using linear combinations of feature maps as saliency maps. By contrast, masking-based methods optimize a saliency map directly in the image space or learn it by training another network on additional data. In this work we introduce Opti-CAM, combining ideas from C…

    Submitted 5 April, 2024; v1 submitted 17 January, 2023; originally announced January 2023.

    Comments: This work is under consideration at "Computer Vision and Image Understanding"

  15. arXiv:2210.11909  [pdf, other]

    cs.CV cs.IR cs.LG

    Boosting vision transformers for image retrieval

    Authors: Chull Hwan Song, Jooyoung Yoon, Shunghyun Choi, Yannis Avrithis

    Abstract: Vision transformers have achieved remarkable progress in vision tasks such as image classification and detection. However, in instance-level image retrieval, transformers have not yet shown good performance compared to convolutional networks. We propose a number of improvements that make transformers outperform the state of the art for the first time. (1) We show that a hybrid architecture is more…

    Submitted 21 October, 2022; originally announced October 2022.

    Comments: WACV 2023

  16. arXiv:2206.14868  [pdf, other]

    cs.LG cs.CV

    Teach me how to Interpolate a Myriad of Embeddings

    Authors: Shashanka Venkataramanan, Ewa Kijak, Laurent Amsaleg, Yannis Avrithis

    Abstract: Mixup refers to interpolation-based data augmentation, originally motivated as a way to go beyond empirical risk minimization (ERM). Yet, its extensions focus on the definition of interpolation and the space where it takes place, while the augmentation itself is less studied: For a mini-batch of size $m$, most methods interpolate between $m$ pairs with a single scalar interpolation factor $λ$. I…

    Submitted 29 June, 2022; originally announced June 2022.

  17. What to Hide from Your Students: Attention-Guided Masked Image Modeling

    Authors: Ioannis Kakogeorgiou, Spyros Gidaris, Bill Psomas, Yannis Avrithis, Andrei Bursuc, Konstantinos Karantzalos, Nikos Komodakis

    Abstract: Transformers and masked language modeling are quickly being adopted and explored in computer vision as vision transformers and masked image modeling (MIM). In this work, we argue that image token masking differs from token masking in text, due to the amount and correlation of tokens in an image. In particular, to generate a challenging pretext task for MIM, we advocate a shift from random masking…

    Submitted 22 July, 2022; v1 submitted 23 March, 2022; originally announced March 2022.

    Comments: ECCV 2022. Codes and models are available at https://github.com/gkakogeorgiou/attmask

    Journal ref: European Conference on Computer Vision (2022)

  18. arXiv:2107.08000  [pdf, other]

    cs.CV

    All the attention you need: Global-local, spatial-channel attention for image retrieval

    Authors: Chull Hwan Song, Hye Joo Han, Yannis Avrithis

    Abstract: We address representation learning for large-scale instance-level image retrieval. Apart from backbone, training pipelines and loss functions, popular approaches have focused on different spatial pooling and attention mechanisms, which are at the core of learning a powerful global image representation. There are different forms of attention according to the interaction of elements of the feature t…

    Submitted 16 July, 2021; originally announced July 2021.

  19. arXiv:2106.05321  [pdf, other]

    cs.CV

    Tensor feature hallucination for few-shot learning

    Authors: Michalis Lazarou, Tania Stathaki, Yannis Avrithis

    Abstract: Few-shot learning addresses the challenge of learning how to address novel tasks given not just limited supervision but limited data as well. An attractive solution is synthetic data generation. However, most such methods are overly sophisticated, focusing on high-quality, realistic data in the input space. It is unclear whether adapting them to the few-shot regime and using them for the downstrea…

    Submitted 4 January, 2022; v1 submitted 9 June, 2021; originally announced June 2021.

    Comments: Accepted at WACV 2022. arXiv admin note: text overlap with arXiv:2104.09467

  20. arXiv:2106.04990  [pdf, other]

    cs.LG cs.CV

    It Takes Two to Tango: Mixup for Deep Metric Learning

    Authors: Shashanka Venkataramanan, Bill Psomas, Ewa Kijak, Laurent Amsaleg, Konstantinos Karantzalos, Yannis Avrithis

    Abstract: Metric learning involves learning a discriminative representation such that embeddings of similar classes are encouraged to be close, while embeddings of dissimilar classes are pushed far apart. State-of-the-art methods focus mostly on sophisticated loss functions or mining strategies. On the one hand, metric learning losses consider two or more examples at a time. On the other hand, modern data a…

    Submitted 28 February, 2022; v1 submitted 9 June, 2021; originally announced June 2021.

    Comments: Accepted to ICLR 2022

  21. arXiv:2104.09467  [pdf, ps, other]

    cs.CV

    Few-shot learning via tensor hallucination

    Authors: Michalis Lazarou, Yannis Avrithis, Tania Stathaki

    Abstract: Few-shot classification addresses the challenge of classifying examples given only limited labeled data. A powerful approach is to go beyond data augmentation, towards data synthesis. However, most data augmentation/synthesis methods for few-shot classification are overly complex and sophisticated, e.g. training a wGAN with multiple regularizers or training a network to transfer latent diversit…

    Submitted 19 April, 2021; originally announced April 2021.

    Comments: Accepted as oral at ICLR2021 workshop: "Synthetic Data Generation: Quality, Privacy, Bias"

  22. arXiv:2103.15375  [pdf, other]

    cs.CV

    AlignMixup: Improving Representations By Interpolating Aligned Features

    Authors: Shashanka Venkataramanan, Ewa Kijak, Laurent Amsaleg, Yannis Avrithis

    Abstract: Mixup is a powerful data augmentation method that interpolates between two or more examples in the input or feature space and between the corresponding target labels. Many recent mixup methods focus on cutting and pasting two or more objects into one image, which is more about efficient processing than interpolation. However, how to best interpolate images is not well defined. In this sense, mixup…

    Submitted 25 March, 2022; v1 submitted 29 March, 2021; originally announced March 2021.

    Comments: Accepted to CVPR 2022

  23. arXiv:2103.14517  [pdf, other]

    cs.CV

    On the hidden treasure of dialog in video question answering

    Authors: Deniz Engin, François Schnitzler, Ngoc Q. K. Duong, Yannis Avrithis

    Abstract: High-level understanding of stories in video such as movies and TV shows from raw data is extremely challenging. Modern video question answering (VideoQA) systems often use additional human-made sources like plot synopses, scripts, video descriptions or knowledge bases. In this work, we present a new approach to understand the whole story without such external sources. The secret lies in the dialo…

    Submitted 19 August, 2021; v1 submitted 26 March, 2021; originally announced March 2021.

    Comments: ICCV 2021

  24. arXiv:2101.01480  [pdf, other]

    cs.CV

    Local Propagation for Few-Shot Learning

    Authors: Yann Lifchitz, Yannis Avrithis, Sylvaine Picard

    Abstract: The challenge in few-shot learning is that available data is not enough to capture the underlying distribution. To mitigate this, two emerging directions are (a) using local image representations, essentially multiplying the amount of data by a constant factor, and (b) using more unlabeled data, for instance by transductive inference, jointly on a number of queries. In this work, we bring these tw…

    Submitted 5 January, 2021; originally announced January 2021.

    Comments: ICPR 2020

  25. arXiv:2012.07962  [pdf, other]

    cs.LG cs.AI cs.CV

    Iterative label cleaning for transductive and semi-supervised few-shot learning

    Authors: Michalis Lazarou, Tania Stathaki, Yannis Avrithis

    Abstract: Few-shot learning amounts to learning representations and acquiring knowledge such that novel tasks may be solved with both supervision and data being limited. Improved performance is possible by transductive inference, where the entire test set is available concurrently, and semi-supervised learning, where more unlabeled data is available. Focusing on these two settings, we introduce a new algori…

    Submitted 28 March, 2023; v1 submitted 14 December, 2020; originally announced December 2020.

    Comments: published in ICCV 2021

  26. arXiv:2006.16331  [pdf, other]

    cs.CV

    Asymmetric metric learning for knowledge transfer

    Authors: Mateusz Budnik, Yannis Avrithis

    Abstract: Knowledge transfer from large teacher models to smaller student models has recently been studied for metric learning, focusing on fine-grained classification. In this work, focusing on instance-level image retrieval, we study an asymmetric testing task, where the database is represented by the teacher and queries by the student. Inspired by this task, we introduce asymmetric metric learning, a nov…

    Submitted 29 June, 2020; originally announced June 2020.

  27. arXiv:2002.07522  [pdf, other]

    cs.CV

    Few-Shot Few-Shot Learning and the role of Spatial Attention

    Authors: Yann Lifchitz, Yannis Avrithis, Sylvaine Picard

    Abstract: Few-shot learning is often motivated by the ability of humans to learn new tasks from few examples. However, standard few-shot classification benchmarks assume that the representation is learned on a limited amount of base class data, ignoring the amount of prior knowledge that a human may have accumulated before learning new tasks. At the same time, even if a powerful representation is available,…

    Submitted 18 February, 2020; originally announced February 2020.

  28. Walking on the Edge: Fast, Low-Distortion Adversarial Examples

    Authors: Hanwei Zhang, Yannis Avrithis, Teddy Furon, Laurent Amsaleg

    Abstract: Adversarial examples of deep neural networks are receiving ever increasing attention because they help in understanding and reducing the sensitivity to their input. This is natural given the increasing applications of deep neural networks in our everyday lives. When white-box attacks are almost always successful, it is typically only the distortion of the perturbations that matters in their evalua…

    Submitted 5 December, 2019; v1 submitted 4 December, 2019; originally announced December 2019.

    Comments: 13 pages, 9 figures

  29. arXiv:1912.00384  [pdf, other]

    cs.CV

    Training Object Detectors from Few Weakly-Labeled and Many Unlabeled Images

    Authors: Zhaohui Yang, Miaojing Shi, Chao Xu, Vittorio Ferrari, Yannis Avrithis

    Abstract: Weakly-supervised object detection attempts to limit the amount of supervision by dispensing with the need for bounding boxes, but still assumes image-level labels on the entire training set. In this work, we study the problem of training an object detector from one or few images with image-level labels and a larger set of completely unlabeled images. This is an extreme case of semi-supervised learning…

    Submitted 20 July, 2021; v1 submitted 1 December, 2019; originally announced December 2019.

    Comments: Accepted by Pattern Recognition

  30. arXiv:1911.08177  [pdf, ps, other]

    cs.CV cs.LG

    Rethinking deep active learning: Using unlabeled data at model training

    Authors: Oriane Siméoni, Mateusz Budnik, Yannis Avrithis, Guillaume Gravier

    Abstract: Active learning typically focuses on training a model on few labeled examples alone, while unlabeled ones are only used for acquisition. In this work we depart from this setting by using both labeled and unlabeled data during model training across active learning cycles. We do so by using unsupervised feature learning at the beginning of the active learning pipeline and semi-supervised learning at…

    Submitted 19 November, 2019; originally announced November 2019.

  31. arXiv:1910.00324  [pdf, other]

    cs.CV cs.LG

    Graph convolutional networks for learning with few clean and many noisy labels

    Authors: Ahmet Iscen, Giorgos Tolias, Yannis Avrithis, Ondrej Chum, Cordelia Schmid

    Abstract: In this work we consider the problem of learning a classifier from noisy labels when a few clean labeled examples are given. The structure of clean and noisy data is modeled by a graph per class and Graph Convolutional Networks (GCN) are used to predict class relevance of noisy examples. For each class, the GCN is treated as a binary classifier, which learns to discriminate clean from noisy exampl…

    Submitted 24 August, 2020; v1 submitted 1 October, 2019; originally announced October 2019.

  32. arXiv:1905.06358  [pdf, other]

    cs.CV

    Local Features and Visual Words Emerge in Activations

    Authors: Oriane Siméoni, Yannis Avrithis, Ondrej Chum

    Abstract: We propose a novel method of deep spatial matching (DSM) for image retrieval. Initial ranking is based on image descriptors extracted from convolutional neural network activations by global pooling, as in recent state-of-the-art work. However, the same sparse 3D activation tensor is also approximated by a collection of local features. These local features are then robustly matched to approximate t…

    Submitted 15 May, 2019; originally announced May 2019.

    Journal ref: CVPR 2019

  33. arXiv:1904.04717  [pdf, other]

    cs.CV cs.LG

    Label Propagation for Deep Semi-supervised Learning

    Authors: Ahmet Iscen, Giorgos Tolias, Yannis Avrithis, Ondrej Chum

    Abstract: Semi-supervised learning is becoming increasingly important because it can combine data carefully labeled by humans with abundant unlabeled data to train deep neural networks. Classic methods on semi-supervised learning that have focused on transductive learning have not been fully exploited in the inductive framework followed by modern deep learning. The same holds for the manifold assumption---t…

    Submitted 9 April, 2019; originally announced April 2019.

    Comments: Accepted to CVPR 2019

  34. Smooth Adversarial Examples

    Authors: Hanwei Zhang, Yannis Avrithis, Teddy Furon, Laurent Amsaleg

    Abstract: This paper investigates the visual quality of the adversarial examples. Recent papers propose to smooth the perturbations to get rid of high frequency artefacts. In this work, smoothing has a different meaning as it perceptually shapes the perturbation according to the visual content of the image to be attacked. The perturbation becomes locally smooth on the flat areas of the input image, but it m…

    Submitted 28 March, 2019; originally announced March 2019.

  35. arXiv:1903.05050  [pdf, other]

    cs.CV

    Dense Classification and Implanting for Few-Shot Learning

    Authors: Yann Lifchitz, Yannis Avrithis, Sylvaine Picard, Andrei Bursuc

    Abstract: Training deep neural networks from few examples is a highly challenging and key problem for many computer vision tasks. In this context, we are targeting knowledge transfer from a set with abundant data to other sets with few available examples. We propose two simple and effective solutions: (i) dense classification over feature maps, which for the first time studies local activations in the domai…

    Submitted 12 March, 2019; originally announced March 2019.

    Comments: CVPR 2019

  36. arXiv:1807.08692  [pdf, ps, other]

    cs.CV

    Hybrid Diffusion: Spectral-Temporal Graph Filtering for Manifold Ranking

    Authors: Ahmet Iscen, Yannis Avrithis, Giorgos Tolias, Teddy Furon, Ondrej Chum

    Abstract: State-of-the-art image retrieval performance is achieved with CNN features and manifold ranking using a k-NN similarity graph that is pre-computed off-line. The two most successful existing approaches are temporal filtering, where manifold ranking amounts to solving a sparse linear system online, and spectral filtering, where eigen-decomposition of the adjacency matrix is performed off-line and th…

    Submitted 22 November, 2018; v1 submitted 23 July, 2018; originally announced July 2018.

  37. arXiv:1803.11285  [pdf, other]

    cs.CV

    Revisiting Oxford and Paris: Large-Scale Image Retrieval Benchmarking

    Authors: Filip Radenović, Ahmet Iscen, Giorgos Tolias, Yannis Avrithis, Ondřej Chum

    Abstract: In this paper we address issues with image retrieval benchmarking on the standard and popular Oxford 5k and Paris 6k datasets. In particular, annotation errors, the size of the dataset, and the level of challenge are addressed: new annotation for both datasets is created with extra attention to the reliability of the ground truth. Three new protocols of varying difficulty are introduced. The protoc…

    Submitted 29 March, 2018; originally announced March 2018.

    Comments: CVPR 2018

  38. arXiv:1803.11095  [pdf, other]

    cs.CV

    Mining on Manifolds: Metric Learning without Labels

    Authors: Ahmet Iscen, Giorgos Tolias, Yannis Avrithis, Ondrej Chum

    Abstract: In this work we present a novel unsupervised framework for hard training example mining. The only input to the method is a collection of images relevant to the target application and a meaningful initial representation, provided e.g. by a pre-trained CNN. Positive examples are distant points on a single manifold, while negative examples are nearby points on different manifolds. Both types of example…

    Submitted 29 March, 2018; originally announced March 2018.

  39. arXiv:1709.04725  [pdf, other]

    cs.CV

    Unsupervised object discovery for instance recognition

    Authors: Oriane Siméoni, Ahmet Iscen, Giorgos Tolias, Yannis Avrithis, Ondrej Chum

    Abstract: Severe background clutter is challenging in many computer vision tasks, including large-scale image retrieval. Global descriptors, which are popular due to their memory and search efficiency, are especially prone to corruption by such clutter. Eliminating the impact of the clutter on the image descriptor increases the chance of retrieving relevant images and prevents topic drift due to actually r…

    Submitted 24 January, 2018; v1 submitted 14 September, 2017; originally announced September 2017.

  40. arXiv:1704.06591  [pdf, other]

    cs.CV

    Panorama to panorama matching for location recognition

    Authors: Ahmet Iscen, Giorgos Tolias, Yannis Avrithis, Teddy Furon, Ondrej Chum

    Abstract: Location recognition is commonly treated as visual instance retrieval on "street view" imagery. The dataset items and queries are panoramic views, i.e. groups of images taken at a single location. This work introduces a novel panorama-to-panorama matching process, either by aggregating features of individual images in a group or by explicitly constructing a larger panorama. In either case, multipl…

    Submitted 21 April, 2017; originally announced April 2017.

  41. arXiv:1704.03755  [pdf, other]

    cs.CV cs.IR

    Unsupervised part learning for visual recognition

    Authors: Ronan Sicre, Yannis Avrithis, Ewa Kijak, Frederic Jurie

    Abstract: Part-based image classification aims at representing categories by small sets of learned discriminative parts, upon which an image representation is built. Considered as a promising avenue a decade ago, this direction has been neglected since the advent of deep neural networks. In this context, this paper brings two contributions: first, it shows that despite the recent success of end-to-end holis…

    Submitted 12 April, 2017; originally announced April 2017.

  42. arXiv:1703.06935  [pdf, other]

    cs.CV

    Fast Spectral Ranking for Similarity Search

    Authors: Ahmet Iscen, Yannis Avrithis, Giorgos Tolias, Teddy Furon, Ondrej Chum

    Abstract: Despite the success of deep learning on representing images for particular object retrieval, recent studies show that the learned representations still lie on manifolds in a high dimensional space. This makes the Euclidean nearest neighbor search biased for this task. Exploring the manifolds online remains expensive even if a nearest neighbor graph has been computed offline. This work introduces a…

    Submitted 29 March, 2018; v1 submitted 20 March, 2017; originally announced March 2017.

  43. arXiv:1611.05113  [pdf, other]

    cs.CV

    Efficient Diffusion on Region Manifolds: Recovering Small Objects with Compact CNN Representations

    Authors: Ahmet Iscen, Giorgos Tolias, Yannis Avrithis, Teddy Furon, Ondrej Chum

    Abstract: Query expansion is a popular method to improve the quality of image retrieval with both conventional and CNN representations. It has been so far limited to global image similarity. This work focuses on diffusion, a mechanism that captures the image manifold in the feature space. The diffusion is carried out on descriptors of overlapping image regions rather than on a global image descriptor like i…

    Submitted 1 July, 2019; v1 submitted 15 November, 2016; originally announced November 2016.

    Comments: CVPR 2017

  44. arXiv:1611.04413  [pdf, other]

    cs.CV

    Automatic discovery of discriminative parts as a quadratic assignment problem

    Authors: Ronan Sicre, Julien Rabin, Yannis Avrithis, Teddy Furon, Frederic Jurie

    Abstract: Part-based image classification consists in representing categories by small sets of discriminative parts upon which a representation of the images is built. This paper addresses the question of how to automatically learn such parts from a set of labeled training images. The training of parts is cast as a quadratic assignment problem in which optimal correspondences between image regions and parts…

    Submitted 14 November, 2016; originally announced November 2016.

  45. arXiv:1603.09596  [pdf, ps, other]

    cs.CG

    High-dimensional approximate nearest neighbor: k-d Generalized Randomized Forests

    Authors: Yannis Avrithis, Ioannis Z. Emiris, Georgios Samaras

    Abstract: We propose a new data-structure, the generalized randomized kd forest, or kgeraf, for approximate nearest neighbor searching in high dimensions. In particular, we introduce new randomization techniques to specify a set of independently constructed trees where search is performed simultaneously, hence increasing accuracy. We omit backtracking, and we optimize distance computations, thus acceleratin…

    Submitted 31 March, 2016; originally announced March 2016.