Showing 1–21 of 21 results for author: de Melo, C

  1. arXiv:2407.06839  [pdf, other]

    cs.CV

    A Mamba-based Siamese Network for Remote Sensing Change Detection

    Authors: Jay N. Paranjape, Celso de Melo, Vishal M. Patel

    Abstract: Change detection in remote sensing images is an essential tool for analyzing a region at different times. It finds varied applications in monitoring environmental changes, man-made changes as well as corresponding decision-making and prediction of future trends. Deep learning methods like Convolutional Neural Networks (CNNs) and Transformers have achieved remarkable success in detecting significan…

    Submitted 8 July, 2024; originally announced July 2024.

    Comments: 11 pages, 7 figures

  2. arXiv:2406.13123  [pdf, other]

    cs.AI cs.CV

    ViLCo-Bench: VIdeo Language COntinual learning Benchmark

    Authors: Tianqi Tang, Shohreh Deldari, Hao Xue, Celso De Melo, Flora D. Salim

    Abstract: Video language continual learning involves continuously adapting to information from video and text inputs, enhancing a model's ability to handle new tasks while retaining prior knowledge. This field is a relatively under-explored area, and establishing appropriate datasets is crucial for facilitating communication and research in this field. In this study, we present the first dedicated benchmark…

    Submitted 18 June, 2024; originally announced June 2024.

    Comments: 14 pages, 4 figures, 8 tables, under review

  3. arXiv:2403.14874  [pdf, other]

    cs.CV cs.LG

    WeatherProof: Leveraging Language Guidance for Semantic Segmentation in Adverse Weather

    Authors: Blake Gella, Howard Zhang, Rishi Upadhyay, Tiffany Chang, Nathan Wei, Matthew Waliman, Yunhao Ba, Celso de Melo, Alex Wong, Achuta Kadambi

    Abstract: We propose a method to infer semantic segmentation maps from images captured under adverse weather conditions. We begin by examining existing models on images degraded by weather conditions such as rain, fog, or snow, and found that they exhibit a large performance drop as compared to those captured under clear weather. To control for changes in scene structures, we propose WeatherProof, the first…

    Submitted 7 May, 2024; v1 submitted 21 March, 2024; originally announced March 2024.

    Comments: arXiv admin note: substantial text overlap with arXiv:2312.09534

  4. arXiv:2312.14126  [pdf, other]

    cs.CV

    Entropic Open-set Active Learning

    Authors: Bardia Safaei, Vibashan VS, Celso M. de Melo, Vishal M. Patel

    Abstract: Active Learning (AL) aims to enhance the performance of deep models by selecting the most informative samples for annotation from a pool of unlabeled data. Despite impressive performance in closed-set settings, most AL methods fail in real-world scenarios where the unlabeled data contains unknown categories. Recently, a few studies have attempted to tackle the AL problem for the open-set setting.…

    Submitted 21 December, 2023; originally announced December 2023.

    Comments: Accepted in AAAI 2024

  5. arXiv:2312.02914  [pdf, other]

    cs.CV cs.LG

    Unsupervised Video Domain Adaptation with Masked Pre-Training and Collaborative Self-Training

    Authors: Arun Reddy, William Paul, Corban Rivera, Ketul Shah, Celso M. de Melo, Rama Chellappa

    Abstract: In this work, we tackle the problem of unsupervised domain adaptation (UDA) for video action recognition. Our approach, which we call UNITE, uses an image teacher model to adapt a video student model to the target domain. UNITE first employs self-supervised pre-training to promote discriminative feature learning on target domain videos using a teacher-guided masked distillation objective. We then…

    Submitted 20 April, 2024; v1 submitted 5 December, 2023; originally announced December 2023.

    Comments: Accepted at CVPR 2024. 13 pages, 4 figures. Approved for public release: distribution unlimited

  6. arXiv:2312.02151  [pdf, other]

    cs.CV cs.AI cs.LG

    Guarding Barlow Twins Against Overfitting with Mixed Samples

    Authors: Wele Gedara Chaminda Bandara, Celso M. De Melo, Vishal M. Patel

    Abstract: Self-supervised Learning (SSL) aims to learn transferable feature representations for downstream applications without relying on labeled data. The Barlow Twins algorithm, renowned for its widespread adoption and straightforward implementation compared to its counterparts like contrastive learning methods, minimizes feature redundancy while maximizing invariance to common corruptions. Optimizing fo…

    Submitted 4 December, 2023; originally announced December 2023.

    Comments: Code and checkpoints are available at: https://github.com/wgcban/mix-bt.git

  7. arXiv:2309.16650  [pdf, other]

    cs.RO cs.CV

    ConceptGraphs: Open-Vocabulary 3D Scene Graphs for Perception and Planning

    Authors: Qiao Gu, Alihusein Kuwajerwala, Sacha Morin, Krishna Murthy Jatavallabhula, Bipasha Sen, Aditya Agarwal, Corban Rivera, William Paul, Kirsty Ellis, Rama Chellappa, Chuang Gan, Celso Miguel de Melo, Joshua B. Tenenbaum, Antonio Torralba, Florian Shkurti, Liam Paull

    Abstract: For robots to perform a wide variety of tasks, they require a 3D representation of the world that is semantically rich, yet compact and efficient for task-driven perception and planning. Recent approaches have attempted to leverage features from large vision-language models to encode semantics in 3D representations. However, these approaches tend to produce maps with per-point feature vectors, whi…

    Submitted 28 September, 2023; originally announced September 2023.

    Comments: Project page: https://concept-graphs.github.io/ Explainer video: https://youtu.be/mRhNkQwRYnc

  8. arXiv:2307.10018  [pdf, other]

    cs.RO cs.AI

    RobôCIn Small Size League Extended Team Description Paper for RoboCup 2023

    Authors: Aline Lima de Oliveira, Cauê Addae da Silva Gomes, Cecília Virginia Santos da Silva, Charles Matheus de Sousa Alves, Danilo Andrade Martins de Souza, Driele Pires Ferreira Araújo Xavier, Edgleyson Pereira da Silva, Felipe Bezerra Martins, Lucas Henrique Cavalcanti Santos, Lucas Dias Maciel, Matheus Paixão Gumercindo dos Santos, Matheus Lafayette Vasconcelos, Matheus Vinícius Teotonio do Nascimento Andrade, João Guilherme Oliveira Carvalho de Melo, João Pedro Souza Pereira de Moura, José Ronald da Silva, José Victor Silva Cruz, Pedro Henrique Santana de Morais, Pedro Paulo Salman de Oliveira, Riei Joaquim Matos Rodrigues, Roberto Costa Fernandes, Ryan Vinicius Santos Morais, Tamara Mayara Ramos Teobaldo, Washington Igor dos Santos Silva, Edna Natividade Silva Barros

    Abstract: RobôCIn has participated in RoboCup Small Size League since 2019, won its first world title in 2022 (Division B), and is currently a three-times Latin-American champion. This paper presents our improvements to defend the Small Size League (SSL) division B title in RoboCup 2023 in Bordeaux, France. This paper aims to share some of the academic research that our team developed over the past year. Ou…

    Submitted 19 July, 2023; originally announced July 2023.

  9. arXiv:2303.18177  [pdf, other]

    cs.CV

    STMT: A Spatial-Temporal Mesh Transformer for MoCap-Based Action Recognition

    Authors: Xiaoyu Zhu, Po-Yao Huang, Junwei Liang, Celso M. de Melo, Alexander Hauptmann

    Abstract: We study the problem of human action recognition using motion capture (MoCap) sequences. Unlike existing techniques that take multiple manual steps to derive standardized skeleton representations as model input, we propose a novel Spatial-Temporal Mesh Transformer (STMT) to directly model the mesh sequences. The model uses a hierarchical transformer with intra-frame off-set attention and inter-fra…

    Submitted 31 March, 2023; originally announced March 2023.

    Comments: CVPR 2023

  10. arXiv:2303.10280  [pdf, other]

    cs.CV

    Synthetic-to-Real Domain Adaptation for Action Recognition: A Dataset and Baseline Performances

    Authors: Arun V. Reddy, Ketul Shah, William Paul, Rohita Mocharla, Judy Hoffman, Kapil D. Katyal, Dinesh Manocha, Celso M. de Melo, Rama Chellappa

    Abstract: Human action recognition is a challenging problem, particularly when there is high variability in factors such as subject appearance, backgrounds and viewpoint. While deep neural networks (DNNs) have been shown to perform well on action recognition tasks, they typically require large amounts of high-quality labeled data to achieve robust performance across a variety of conditions. Synthetic data h…

    Submitted 17 March, 2023; originally announced March 2023.

    Comments: ICRA 2023. The first two authors contributed equally. Dataset available at: https://github.com/reddyav1/RoCoG-v2

  11. AZTR: Aerial Video Action Recognition with Auto Zoom and Temporal Reasoning

    Authors: Xijun Wang, Ruiqi Xian, Tianrui Guan, Celso M. de Melo, Stephen M. Nogar, Aniket Bera, Dinesh Manocha

    Abstract: We propose a novel approach for aerial video action recognition. Our method is designed for videos captured using UAVs and can run on edge or mobile devices. We present a learning-based approach that uses customized auto zoom to automatically identify the human target and scale it appropriately. This makes it easier to extract the key features and reduces the computational overhead. We also presen…

    Submitted 2 March, 2023; originally announced March 2023.

    Comments: Accepted for publication at ICRA 2023

  12. arXiv:2302.07241  [pdf, other]

    cs.CV cs.AI cs.RO

    ConceptFusion: Open-set Multimodal 3D Mapping

    Authors: Krishna Murthy Jatavallabhula, Alihusein Kuwajerwala, Qiao Gu, Mohd Omama, Tao Chen, Alaa Maalouf, Shuang Li, Ganesh Iyer, Soroush Saryazdi, Nikhil Keetha, Ayush Tewari, Joshua B. Tenenbaum, Celso Miguel de Melo, Madhava Krishna, Liam Paull, Florian Shkurti, Antonio Torralba

    Abstract: Building 3D maps of the environment is central to robot navigation, planning, and interaction with objects in a scene. Most existing approaches that integrate semantic concepts with 3D maps largely remain confined to the closed-set setting: they can only reason about a finite set of concepts, pre-defined at training time. Further, these maps can only be queried using class labels, or in recent wor…

    Submitted 23 October, 2023; v1 submitted 14 February, 2023; originally announced February 2023.

    Comments: RSS 2023. Project page: https://concept-fusion.github.io Explainer video: https://www.youtube.com/watch?v=rkXgws8fiDs Code: https://github.com/concept-fusion/concept-fusion

  13. arXiv:2211.05883  [pdf, other]

    cs.CV

    Open-Set Automatic Target Recognition

    Authors: Bardia Safaei, Vibashan VS, Celso M. de Melo, Shuowen Hu, Vishal M. Patel

    Abstract: Automatic Target Recognition (ATR) is a category of computer vision algorithms which attempts to recognize targets on data obtained from different sensors. ATR algorithms are extensively used in real-world scenarios such as military and surveillance applications. Existing ATR algorithms are developed for traditional closed-set methods where training and testing have the same class distribution. Th…

    Submitted 10 November, 2022; originally announced November 2022.

    Comments: 5 pages, 3 figures. Submitted to ICASSP 2023

  14. arXiv:2207.00925  [pdf]

    cs.GT

    The Impact of Partner Expressions on Felt Emotion in the Iterated Prisoner's Dilemma: An Event-level Analysis

    Authors: Maria Angelika-Nikita, Celso M. de Melo, Kazunori Terada, Gale Lucas, Jonathan Gratch

    Abstract: Social games like the prisoner's dilemma are often used to develop models of the role of emotion in social decision-making. Here we examine an understudied aspect of emotion in such games: how an individual's feelings are shaped by their partner's expressions. Prior research has tended to focus on other aspects of emotion. Research on felt-emotion has focused on how an individual's feelings shape…

    Submitted 2 July, 2022; originally announced July 2022.

    Comments: 18 pages, 7 figures, Ninth Annual Conference on Advances in Cognitive Systems

  15. On Structuring Functional Programs with Monoidal Profunctors

    Authors: Alexandre Garcia de Oliveira, Mauro Jaskelioff, Ana Cristina Vieira de Melo

    Abstract: We study monoidal profunctors as a tool to reason and structure pure functional programs both from a categorical perspective and as a Haskell implementation. From the categorical point of view we approach them as monoids in a certain monoidal category of profunctors. We study properties of this monoidal category and construct and implement the free monoidal profunctor. We study the relationship of…

    Submitted 2 July, 2022; originally announced July 2022.

    Comments: In Proceedings MSFP 2022, arXiv:2206.09534

    Journal ref: EPTCS 360, 2022, pp. 134-150

  16. arXiv:2206.10779  [pdf, other]

    cs.CV

    Not Just Streaks: Towards Ground Truth for Single Image Deraining

    Authors: Yunhao Ba, Howard Zhang, Ethan Yang, Akira Suzuki, Arnold Pfahnl, Chethan Chinder Chandrappa, Celso de Melo, Suya You, Stefano Soatto, Alex Wong, Achuta Kadambi

    Abstract: We propose a large-scale dataset of real-world rainy and clean image pairs and a method to remove degradations, induced by rain streaks and rain accumulation, from the image. As there exists no real-world dataset for deraining, current state-of-the-art methods rely on synthetic data and thus are limited by the sim2real domain gap; moreover, rigorous evaluation remains a challenge due to the absenc…

    Submitted 28 August, 2022; v1 submitted 21 June, 2022; originally announced June 2022.

  17. arXiv:2203.14779  [pdf, other]

    cs.CV cs.HC cs.SD eess.AS

    A Joint Cross-Attention Model for Audio-Visual Fusion in Dimensional Emotion Recognition

    Authors: R. Gnana Praveen, Wheidima Carneiro de Melo, Nasib Ullah, Haseeb Aslam, Osama Zeeshan, Théo Denorme, Marco Pedersoli, Alessandro Koerich, Simon Bacon, Patrick Cardinal, Eric Granger

    Abstract: Multimodal emotion recognition has recently gained much attention since it can leverage diverse and complementary relationships over multiple modalities (e.g., audio, visual, biosignals, etc.), and can provide some robustness to noisy modalities. Most state-of-the-art methods for audio-visual (A-V) fusion rely on recurrent networks or conventional attention mechanisms that do not effectively lever…

    Submitted 6 July, 2024; v1 submitted 28 March, 2022; originally announced March 2022.

    Comments: arXiv admin note: text overlap with arXiv:2111.05222

  18. arXiv:2203.11111  [pdf, other]

    cs.CV

    Facial Expression Analysis Using Decomposed Multiscale Spatiotemporal Networks

    Authors: Wheidima Carneiro de Melo, Eric Granger, Miguel Bordallo Lopez

    Abstract: Video-based analysis of facial expressions has been increasingly applied to infer health states of individuals, such as depression and pain. Among the existing approaches, deep learning models composed of structures for multiscale spatiotemporal processing have shown strong potential for encoding facial dynamics. However, such models have high computational complexity, making for a difficult deplo…

    Submitted 21 March, 2022; originally announced March 2022.

  19. arXiv:1910.02319  [pdf, ps, other]

    cs.CV cs.LG

    Covariance-free Partial Least Squares: An Incremental Dimensionality Reduction Method

    Authors: Artur Jordao, Maiko Lie, Victor Hugo Cunha de Melo, William Robson Schwartz

    Abstract: Dimensionality reduction plays an important role in computer vision problems since it reduces computational cost and is often capable of yielding more discriminative data representation. In this context, Partial Least Squares (PLS) has presented notable results in tasks such as image classification and neural network optimization. However, PLS is infeasible on large datasets, such as ImageNet, bec…

    Submitted 10 November, 2020; v1 submitted 5 October, 2019; originally announced October 2019.

    Comments: Accepted for publication at Winter Conference on Applications of Computer Vision (WACV) 2021

  20. arXiv:1907.01578  [pdf, other]

    cs.DC

    The Information Processing Factory: Organization, Terminology, and Definitions

    Authors: Eberle A. Rambo, Bryan Donyanavard, Minjun Seo, Florian Maurer, Thawra Kadeed, Caio B. de Melo, Biswadip Maity, Anmol Surhonne, Andreas Herkersdorf, Fadi Kurdahi, Nikil Dutt, Rolf Ernst

    Abstract: The Information Processing Factory (IPF) project has recently introduced the abstraction of complex architectures as self-aware information processing factories. These factories consist of a set of highly configurable resources, e.g., processing elements and interconnects, whose use is monitored, planned, and configured during runtime. Managing a factory involves multiple facets, such as efficienc…

    Submitted 2 July, 2019; originally announced July 2019.

  21. arXiv:1708.06637  [pdf, other]

    cs.CV

    Activity Recognition based on a Magnitude-Orientation Stream Network

    Authors: Carlos Caetano, Victor H. C. de Melo, Jefersson A. dos Santos, William Robson Schwartz

    Abstract: The temporal component of videos provides an important clue for activity recognition, as a number of activities can be reliably recognized based on the motion information. In view of that, this work proposes a novel temporal stream for two-stream convolutional networks based on images computed from the optical flow magnitude and orientation, named Magnitude-Orientation Stream (MOS), to learn the m…

    Submitted 22 August, 2017; originally announced August 2017.

    Comments: 8 pages, SIBGRAPI 2017